I'm working in the Hibernate and Infinispan teams at JBoss, caring about Lucene integration in products we support, striving to make it easier to use and to integrate in well known APIs and patterns, and finally to make it scale better; I love clean and well performing code.

I've been an early adopter of cloud deployments, scaling Lucene to a huge number of requests on EC2 using Hibernate Search, and after that I worked with Sourcesense to make JIRA clusterable via Infinispan. I have also been a trainer on Seam and Hibernate courses.

Location: Newcastle, UK
Occupation: Doing stuff at JBoss, a Division of Red Hat Inc
After 14 months of hard work, please welcome Hibernate Search 5!

Let's have a look at the highlights of why you should be eager to upgrade:

  • Upgraded to Lucene 4.10
  • Lots of internal improvements, especially performance
  • Thanks to Hibernate Search abstraction, most of your code should be upgradable easily despite the massive changes in Lucene APIs
  • Numeric properties now indexed as NumericField by default
  • Requires JDK 7
  • Compatible with Hibernate ORM 4.3 and WildFly 8.x
  • Stable

How to get it

Everything you need is available on Hibernate Search's web site. Download the full distribution from here. And don't hesitate to reach out to us on our forums.

If you are new to Hibernate Search, the best place to start is our getting started guide.

Feature list

Let's dive into the feature list.

Lucene 4.10

Hibernate Search 4 was stuck on the quite outdated 3.6.x version of Apache Lucene, while the Lucene 4 series introduced lots of improvements. Lucene has now reached version 4.10.3 and is considered stable, reliable and significantly more efficient than previous versions; you can now benefit from all these improvements. Some APIs changed, so you might need to make some adjustments to your code, such as Analyzer class names. Generally, though, if you were using the Hibernate Search API, the trickiest Lucene changes are encapsulated and won't affect your code directly.

Why version 5.0

The major number was increased because the Lucene upgrade is a significant change, and because it forced us to break the API compatibility promise which we apply to minor versions. Don't assume that this will require Hibernate ORM at version 5 too: Hibernate Search 5 still depends on Hibernate ORM 4.3.x (as did Hibernate Search 4.5) and is still compatible with WildFly 8, and we expect it will be compatible with WildFly 9 as well. It is possible that Hibernate Search 5 will be compatible with ORM version 5; we'll certainly aim for that, but cannot guarantee it.

So if you have an application using Hibernate ORM 4.3.x and Hibernate Search 4.5.x, it should be simple to upgrade as you won't have to upgrade ORM and can focus on changes needed for Search and Lucene only.

Indexing Performance

The indexing engine has been revisited, providing great performance enhancements and also simplifying configuration: you no longer need to configure a number of backend workers.

Both asynchronous indexing and synchronous indexing have been redesigned.

For the asynchronous indexing backend you now have a per-index index_flush_interval property, which you can use to limit the time between an update being committed to the database and the related index commit.
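As a minimal sketch of how such a setting could be assembled programmatically, the snippet below builds the two properties for one index using only java.util.Properties. The class and method names are hypothetical; the key prefix follows the usual hibernate.search.[default|index name] convention, and I'm assuming the interval is expressed in milliseconds, so please check the reference documentation before relying on it.

```java
import java.util.Properties;

class AsyncIndexingSettings {

    // Builds the settings for one index; pass "default" to apply them to all indexes.
    // Assumption: the flush interval is expressed in milliseconds.
    static Properties asyncIndexing(String indexName, long flushIntervalMillis) {
        Properties settings = new Properties();
        String prefix = "hibernate.search." + indexName + ".";
        // Index updates are applied asynchronously, after the database commit.
        settings.setProperty(prefix + "worker.execution", "async");
        // Upper bound on the delay between a database commit and the index commit.
        settings.setProperty(prefix + "index_flush_interval", Long.toString(flushIntervalMillis));
        return settings;
    }

    public static void main(String[] args) {
        Properties settings = asyncIndexing("default", 500);
        settings.list(System.out);
    }
}
```

The same two keys can of course go straight into your persistence.xml or hibernate.properties instead.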

The synchronous backend is now able to merge write requests from multiple parallel transactions, so you get the benefits of batched writes on the index while still having synchronous updates. This new model delivers performance similar to what was previously only possible with the NRT backend, without drawbacks such as being incompatible with the Infinispan Directory.
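To give an idea of what merging write requests means, here is a small, self-contained sketch of the general "group commit" pattern: callers block until their change is committed, while a single writer thread commits everything that queued up in one go. This is not the actual Hibernate Search implementation; BatchingSyncWriter and all its members are hypothetical names, and the "index" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class BatchingSyncWriter implements AutoCloseable {

    // One pending change plus the future its caller blocks on.
    private static final class Pending {
        final String change;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(String change) { this.change = change; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final List<String> index = new CopyOnWriteArrayList<>(); // stand-in for a real index
    private final AtomicInteger commits = new AtomicInteger();
    private final Thread writer = new Thread(this::drainLoop, "index-writer");
    private volatile boolean running = true;

    BatchingSyncWriter() {
        writer.setDaemon(true);
        writer.start();
    }

    // Synchronous from the caller's point of view: returns only after the commit.
    void write(String change) {
        Pending pending = new Pending(change);
        queue.add(pending);
        pending.done.join();
    }

    private void drainLoop() {
        List<Pending> batch = new ArrayList<>();
        while (running) {
            try {
                batch.add(queue.take());      // wait for at least one request
            } catch (InterruptedException e) {
                return;                       // close() was called
            }
            queue.drainTo(batch);             // merge whatever else queued up meanwhile
            for (Pending p : batch) index.add(p.change);
            commits.incrementAndGet();        // a single commit covers the whole batch
            for (Pending p : batch) p.done.complete(null);
            batch.clear();
        }
    }

    int indexedCount() { return index.size(); }
    int commitCount() { return commits.get(); }

    // Sketch only: writes submitted after close() are not handled.
    @Override
    public void close() {
        running = false;
        writer.interrupt();
    }

    // Runs `writes` blocking writes on `threads` threads; returns {indexed, commits}.
    static int[] demo(int threads, int writes) {
        BatchingSyncWriter w = new BatchingSyncWriter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < writes; i++) {
                final int n = i;
                pending.add(pool.submit(() -> w.write("change-" + n)));
            }
            for (Future<?> f : pending) f.get();
            return new int[] { w.indexedCount(), w.commitCount() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            w.close();
        }
    }

    public static void main(String[] args) {
        int[] result = demo(4, 20);
        System.out.println("indexed=" + result[0] + " commits=" + result[1]);
    }
}
```

Every write is committed before its caller continues, yet the number of commits can be lower than the number of writes whenever requests arrive in parallel; how much lower depends on timing.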

OSGi, Apache Karaf, JBoss FUSE

The project code and build have been refactored to produce nice OSGi compatible libraries. We run integration tests with Apache Karaf, so our artefacts should be safe to consume via JBoss FUSE. The Lucene jars are still a bit troublesome, but if you have any problem with them please let us know: we might be able to find a solution.

JDK 7, 8 and 9 compatibility

Hibernate Search 5 now requires a Java 7 runtime, but we also test regularly with Java 8 and previews of Java 9.

Automatic bridge discovery for property conversion

For those developers defining custom domain types, it's now possible to automatically bind a given Java type to a FieldBridge. You won't have to copy/paste those @FieldBridge annotations all over your model. This feature is explained in the BridgeProvider section of the documentation. You could use it for example to contribute the missing converters for Java 8 Date/Time types.
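The idea behind this can be pictured with a small plain Java sketch: converters are registered once per Java type and then looked up from the runtime type of the value, instead of being declared on every property. This is only an illustration of the concept, not the BridgeProvider SPI itself; TypeBridgeRegistry and its methods are hypothetical names, and the real contract is the one described in the documentation.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class TypeBridgeRegistry {

    // Maps a Java type to the function turning its instances into an indexable string.
    private final Map<Class<?>, Function<Object, String>> bridges = new LinkedHashMap<>();

    <T> void register(Class<T> type, Function<? super T, String> toIndexedString) {
        bridges.put(type, value -> toIndexedString.apply(type.cast(value)));
    }

    // Picks the first registered bridge able to handle the value, if any.
    Optional<String> convert(Object value) {
        for (Map.Entry<Class<?>, Function<Object, String>> entry : bridges.entrySet()) {
            if (entry.getKey().isInstance(value)) {
                return Optional.of(entry.getValue().apply(value));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TypeBridgeRegistry registry = new TypeBridgeRegistry();
        // Register once: every LocalDate property is then handled the same way.
        registry.register(LocalDate.class, date -> date.toString()); // ISO-8601 form
        System.out.println(registry.convert(LocalDate.of(2014, 12, 11)));
        System.out.println(registry.convert(42)); // no bridge registered for Integer
    }
}
```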

MoreLikeThis queries

Using the new MoreLikeThis query capabilities you don't have to target specific fields but can provide an instance of an indexed object. This model is also known as query by example and will trigger a similarity query matching all fields (or a subset of your choice). A full example can be seen in this previous blog post.

Dropped dependency to Apache Solr

Until this version Hibernate Search depended on Apache Lucene for most of the work, and also on Lucene's sister project Apache Solr to provide a richer set of analyzers. Since the Lucene project incorporated this functionality from Solr, there is no longer any need to depend on Solr artifacts.

Improved modularity: clean WildFly integration

Several requirements stretched and tested our integration API and modularity extensively: OSGi support, other projects such as CapeDwarf and Infinispan integrating Hibernate Search (but excluding dependencies on Hibernate ORM), and the advanced needs of the Hibernate OGM project. The result is lots of improvements which you might not directly notice, but which make it much easier to avoid dependency conflicts with any other library you might use, or to integrate nicely in your favorite container or framework.

One example is the new structure of the modules we provide for easy WildFly integration: highly encapsulated, and with significantly fewer dependencies than previous versions.

For example the JGroups backend can use a JGroups version of your choice. It doesn't need to match the JGroups version of Infinispan, even if Hibernate Search is using Infinispan as well (which depends on its own JGroups version). JGroups isn't even exposed to your application, so in theory you could be using a third, different version of the clustering library in your app directly. In practice you would probably want to keep the versions aligned, but if you prefer otherwise it won't be a problem.

Numeric Fields

Any numeric property, including Calendar and Date types, is now indexed as a NumericField by default. A NumericField is more efficient for range queries, so we think this is what you should be using in most cases. Of course it's still possible to explicitly annotate the property to revert to the old behaviour: this is just a change in the defaults.

Please keep this change in mind when running queries, as you'll now need to query these properties as a NumericField. If you use our Query builder DSL this is handled transparently, but if you use the Lucene native APIs to create queries against the old string form, the results won't match and you won't get any kind of warning.
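To see why the two encodings can't be mixed, here is a deliberately simplified plain Java illustration; it does not use Lucene and does not reproduce its actual numeric encoding. It runs the same range check once on the numeric values and once on their string form, compared lexicographically the way keyword terms would be. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NumericVsKeywordRange {

    // Range check on the numeric value itself.
    static List<Integer> numericRange(List<Integer> values, int min, int max) {
        List<Integer> hits = new ArrayList<>();
        for (int value : values) {
            if (value >= min && value <= max) hits.add(value);
        }
        return hits;
    }

    // Range check on the unpadded string form, compared lexicographically.
    static List<Integer> keywordRange(List<Integer> values, int min, int max) {
        String lower = String.valueOf(min);
        String upper = String.valueOf(max);
        List<Integer> hits = new ArrayList<>();
        for (int value : values) {
            String term = String.valueOf(value);
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) hits.add(value);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> prices = Arrays.asList(5, 9, 10, 42, 100);
        System.out.println(numericRange(prices, 9, 50)); // prints [9, 10, 42]
        System.out.println(keywordRange(prices, 9, 50)); // prints []
    }
}
```

The string comparison silently returns nothing, because "9" sorts after "50". A query built for one representation and run against the other fails in the same quiet way.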

Migration Guide

We normally keep track of any API change in our wiki's migration guide; that's the right place to look for API / compatibility changes between any specific version.

For a summary of the changes for people jumping from version 4.x to 5.x, we created a new dedicated Migration page on the website which you can find from the Documentation page.

Index Migration

Technically it is possible that this latest version of Lucene could read your existing indexes. However, given such a large jump in Lucene versions, the numeric mapping changes and the many changes in the Analyzers over time, we highly recommend you replace your old indexes and use the MassIndexer to trigger a fresh rebuild.

What's next?

We have several interesting plans ahead, but our priority is defined by feedback. Please let us know what you'd need; and even if it works great for you, it's nice for us to hear about it and what you do with it. You can get in touch with us through any of these media; the forums should be a good starting point.

This is what we hope to work on in the near future:

  • Dynamically defined models (not strictly bound to annotated classes)
  • Alternatives to embedded Lucene backends: Apache Solr or ElasticSearch seem to be good candidates for this
  • Support for the new Java 8 types
  • Integration in WildFly 9
  • Support for Forge
  • Openshift / Docker / Kubernetes templates and guides
  • Improve performance (Always!)
  • Improved clustering functionality (master election?) on Infinispan/JGroups
  • Take better advantage of the new Lucene 4 capabilities (faceting, query-time join, etc.). Do you have suggestions?

This list is long, and I could easily expand it. We could really use your help, especially as our small core team is not familiar with many of the other technologies mentioned: even if you don't feel like coding but are in the mood for bleeding edge testing, that would be great.

Version 5.0.0.CR1 of Hibernate Search is now available.

Numeric Field(s) being used by default

If you don't specify any FieldBridge for your numeric attributes, or for Date or Calendar fields, Hibernate Search will now encode them by default using Lucene's specialized NumericField format. This format has long been available in both Lucene and Hibernate Search, but so far you had to enable it explicitly, as Hibernate Search would by default stick to the backwards compatible format of transforming these types into keywords (strings). The NumericField format is much more efficient for range queries, which we expect to be common for these types.

Remember that, unless you had explicit field configuration, this implies that you might need to fix how your queries are created. If you use the Hibernate Search Query DSL, you will get an exception warning you if you try to force the wrong type. If you're using the Lucene API directly, make sure to check you're getting the results you expect.

API changes

I have no other major changes to report regarding our public API; however, power users and other frameworks integrating with Hibernate Search might notice a significant reorganization of our SPI. We've documented all relevant changes in the Migration Guide.

The final release of version 5 is coming very soon, so please make sure you test this quickly. Any comment is welcome on the mailing list or via IRC.

Sanne

If you are around in London on the evening of the 14th of January, I would love to see you at our monthly JBUG event.

We'll start the evening with an introduction to Hibernate Search, including the basic concepts of Apache Lucene, and then discuss the novelties you'll find in Hibernate Search 5.0, before moving on to the more advanced features. That should be interesting both for those of you already familiar with the technology, and for those who have never heard of it and are now wondering if and how it could help.

The event will be at Skills Matter, organized by our partner C2B2, and after the demo we'll have plenty of time for pizza, beers and face to face discussions about all things Hibernate.

Please find all venue details on meetup.com and don't forget to register here.

Today we're releasing two maintenance versions of Hibernate Search:

Backporting performance improvements

Normally we would not backport new features to maintenance releases, but some of the great performance improvements of the new indexing engine of the upcoming Hibernate Search 5, such as HSEARCH-1693, HSEARCH-1699 and HSEARCH-1725, seem to be very desirable. These do not introduce any API or functionality change, so we could backport them at virtually no risk.

This means you can now easily upgrade your Hibernate Search 4.4.x and 4.5.x applications without necessarily needing to migrate to Hibernate Search 5. Remember though: there are a lot more improvements coming in 5! If you want all the nice improvements you'll have to eventually migrate.

What was not backported

These new backends were created because performance testing of the Infinispan indexing engine highlighted some problems in our backend when using an Infinispan Directory. So while these patches provide an impressive boost on their own, they will be far more effective when paired with the latest Infinispan 7, as some changes were applied to Infinispan too. But we're not upgrading these maintenance branches of Hibernate Search 4 to Infinispan 7, as that would break all of your configurations. To benefit from the updated Infinispan integration you'll need Hibernate Search 5. Another great reason to move to Hibernate Search 5 is of course the update to the latest Apache Lucene. So the updates announced today should be a nice and easy performance boost, but if you are serious about needing the highest speed please keep testing version 5.

Feedback needed!

While these impressive improvements were created after specific diagnostics work on Infinispan, the benefits are not Infinispan specific: you should be able to experience a significant throughput boost with any storage. The exception is if you were using the NRT backend: I don't expect you to see any benefit in that case. Although if you were forced to use NRT because of throughput needs but didn't like the tradeoffs, you might no longer need to use NRT as the new non-NRT backend could be nearly as efficient.

You can now upgrade to Version 5.0.0.Beta3 of Hibernate Search, and benefit from the following improvements:

Indexing Performance

We did some further polishing of the shiny new backend improvements introduced last week. I would be really happy to get some feedback on this, as you should be able to get a very significant performance boost on index writing, whatever storage technology you're using. We're preparing some large scale tests, but the environments we can test on are limited, so I'd be happy if you could send us a note on what your experience with it looks like.

The new design should bring a significant improvement in throughput. It also requires less locking, needs fewer threads, and puts less pressure on the GC as it has a lower allocation rate.

JDK9 compatibility

We now have continuous integration running on Java 9 (preview builds) as well. Except for the OSGi integration tests running in Apache Karaf, everything seems to work fine.

API changes

We're now polishing the API, and it's possible that this might be the last Beta. Two very frequently used interfaces were renamed; please don't miss the Migration Guide.

As always, I'm looking forward to hearing about your experience with it! Ideas and suggestions are welcome on the mailing list or via IRC.

Sanne
