After 14 months of hard work, please welcome Hibernate Search 5!

Let's have a look at the highlights of why you should be eager to upgrade:

  • Upgraded to Lucene 4.10
  • Lots of internal improvements, especially performance
  • Thanks to the Hibernate Search abstraction layer, most of your code should upgrade easily despite the massive changes in Lucene APIs
  • Numeric properties now indexed as NumericField by default
  • Requires JDK 7
  • Compatible with Hibernate ORM 4.3 and WildFly 8.x
  • Stable

How to get it

Everything you need is available on Hibernate Search's web site. Download the full distribution from here. And don't hesitate to reach out to us on our forums.

If you are new to Hibernate Search, the best place to start is our getting started guide.

Feature list

Let's dive into the feature list.

Lucene 4.10

Hibernate Search 4 was stuck with the quite outdated 3.6.x version of Apache Lucene, while the Lucene 4 series introduced lots of improvements. Lucene has now reached version 4.10.3 and is considered stable, reliable and significantly more efficient than previous versions; you can now benefit from all these improvements. Some APIs changed, so you might need to make some adjustments to your code, such as Analyzer class names. Generally, though, if you were using the Hibernate Search API, the trickiest Lucene changes are encapsulated and won't affect your code directly.

Why version 5.0

The major number was increased because the Lucene upgrade is a significant change, and because it forced us to break our API compatibility promise which we apply on minor versions. Don't assume that this will require Hibernate ORM at version 5 too: it still depends on Hibernate ORM versions 4.3.x (as did Hibernate Search 4.5) and is still compatible with WildFly 8, and we expect it will be compatible with WildFly 9 as well. It is possible that Hibernate Search 5 will be compatible with ORM version 5; we'll certainly aim for that, but cannot guarantee it.

So if you have an application using Hibernate ORM 4.3.x and Hibernate Search 4.5.x, it should be simple to upgrade as you won't have to upgrade ORM and can focus on changes needed for Search and Lucene only.

Indexing Performance

The indexing engine has been revisited, providing great performance enhancements and also simplifying configuration: you no longer need to configure a number of backend workers.

Both asynchronous indexing and synchronous indexing have been redesigned.

For the asynchronous indexing backend you now have a per-index index_flush_interval property, which you can use to limit the time between an update being committed to the database and the corresponding index commit.
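As a minimal sketch of how such a setting might be assembled, the snippet below builds the configuration as plain `Properties`. The helper class and method names are hypothetical, and the exact key names and the millisecond unit are assumptions based on the usual `hibernate.search.<index name>.<option>` convention, so check them against the reference documentation before relying on them.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class AsyncIndexingSettings {

    // Assumed key names; "default" is taken to mean "every index without
    // a more specific override".
    static final String EXECUTION = "hibernate.search.default.worker.execution";
    static final String FLUSH_INTERVAL = "hibernate.search.default.index_flush_interval";

    // Builds settings for asynchronous indexing with the given flush
    // interval (assumed to be expressed in milliseconds).
    static Properties withFlushInterval(long millis) {
        if (millis <= 0) {
            throw new IllegalArgumentException("flush interval must be positive");
        }
        Properties settings = new Properties();
        settings.setProperty(EXECUTION, "async");
        settings.setProperty(FLUSH_INTERVAL, Long.toString(millis));
        return settings;
    }

    public static void main(String[] args) {
        // Print the resulting key/value pairs (iteration order is not guaranteed).
        withFlushInterval(500).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Settings like these could then be handed to whatever bootstraps your persistence unit, for instance as the property map of `Persistence.createEntityManagerFactory(unitName, properties)`, or written directly into `persistence.xml`.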

The synchronous backend is now able to merge write requests from multiple parallel transactions, providing the benefits of batched writes on the index while still performing synchronous updates. This new model delivers performance similar to what was previously only possible when selecting the NRT backend, but without its drawbacks, such as not being compatible with the Infinispan Directory.
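To make the idea of merging write requests more concrete, here is a deliberately simplified sketch. It is not Hibernate Search's implementation: the class `BatchingWriter` and all of its methods are hypothetical, and the "commit" is just a counter. Each transaction submits its change and waits on a future, while a single writer drains everything queued so far and performs one commit for the whole batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative only: a toy model of batching synchronous index writes.
final class BatchingWriter {

    // One queued change plus the future its submitting transaction waits on.
    private static final class Pending {
        final String change;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(String change) { this.change = change; }
    }

    private final List<Pending> queue = new ArrayList<>();
    private int commits = 0;

    // Called by each transaction. The caller would block on join() of the
    // returned future, so the update stays synchronous from its point of view.
    synchronized CompletableFuture<Void> submit(String change) {
        Pending pending = new Pending(change);
        queue.add(pending);
        return pending.done;
    }

    // Called by the single writer: takes everything queued so far, performs
    // ONE commit for the whole batch, then releases all waiting callers.
    // Returns the number of changes covered by that commit.
    int drainAndCommit() {
        List<Pending> batch;
        synchronized (this) {
            if (queue.isEmpty()) {
                return 0;
            }
            batch = new ArrayList<>(queue);
            queue.clear();
            commits++; // stands in for writing the batch and committing the index
        }
        for (Pending pending : batch) {
            pending.done.complete(null);
        }
        return batch.size();
    }

    synchronized int commitCount() {
        return commits;
    }
}
```

With three changes submitted before the writer runs, a single `drainAndCommit()` call covers all three: every caller is released only after the commit, yet the index is committed once rather than three times.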

OSGi, Apache Karaf, JBoss FUSE

The project code and build have been refactored to produce nice OSGi compatible libraries. We run integration tests with Apache Karaf so our artefacts should be safe to consume via JBoss FUSE. The Lucene jars are still a bit troublesome, but if you have any problem with them please let us know: we might be able to find a solution.

JDK 7, 8 and 9 compatibility

Hibernate Search 5 now requires a Java 7 runtime, but we also test regularly with Java 8 and previews of Java 9.

Automatic bridge discovery for property conversion

For those developers defining custom domain types, it's now possible to automatically bind a given Java type to a FieldBridge. You won't have to copy/paste those @FieldBridge annotations all over your model. This feature is explained in the BridgeProvider section of the documentation. You could use it for example to contribute the missing converters for Java 8 Date/Time types.

MoreLikeThis queries

Using the new MoreLikeThis query capabilities you don't have to target specific fields but can provide an instance of an indexed object. This model is also known as query by example and will trigger a similarity query matching all fields (or a subset of your choice). A full example can be seen in this previous blog post.

Dropped dependency to Apache Solr

Until this version Hibernate Search depended on Apache Lucene for most of the work, and also on Lucene's sister project Apache Solr to provide a richer set of analyzers. Since the Lucene project incorporated this functionality from Solr, there is no longer any need to depend on Solr artifacts.

Improved modularity: clean WildFly integration

Requirements such as OSGi support, other projects like CapeDwarf and Infinispan integrating Hibernate Search (but excluding dependencies to Hibernate ORM), and the advanced needs of the Hibernate OGM project extensively stretched and tested our integration API and modularity. This resulted in lots of improvements which you might not directly notice, but which make it much easier to avoid dependency conflicts with any other library you might use, or to integrate nicely in your favorite container / framework.

One example is the new structure of the modules we provide for easy WildFly integration: highly encapsulated, and with significantly fewer dependencies than previous versions.

For example the JGroups backend can use a JGroups version of your choice, and it doesn't need to match the JGroups version of Infinispan even if Hibernate Search is using Infinispan as well (which depends on its own JGroups version); this will not be a problem, and JGroups wouldn't even be exposed to your application so in theory you could be using a third different version of the clustering library in your app directly. In practice you would probably want to keep the versions aligned, but if you prefer otherwise it won't be a problem.

Numeric Fields

Any numeric property, including Calendar and Date types, is now indexed as a NumericField by default. A NumericField is more efficient for range queries, so we think this is what you should be using in most cases. Of course it's still possible to explicitly annotate the property to revert to the old behaviour: this is just a change in the defaults.
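One way to see why range queries and plain string keywords are an awkward match is the sketch below. It uses no Lucene code at all and assumes the numbers were indexed as unpadded decimal strings, which is a simplification. The class and method names are hypothetical. A range over string terms compares lexicographically, so it can match values that lie outside the numeric range.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: compares a lexicographic (keyword-style) range check
// with a true numeric range check over the same values.
final class RangeEncodingDemo {

    // Range check as it would behave on unpadded string terms.
    static List<Integer> keywordRange(List<Integer> values, int from, int to) {
        String lo = Integer.toString(from);
        String hi = Integer.toString(to);
        return values.stream()
                .filter(v -> {
                    String term = Integer.toString(v);
                    return term.compareTo(lo) >= 0 && term.compareTo(hi) <= 0;
                })
                .collect(Collectors.toList());
    }

    // Range check on the numeric values themselves.
    static List<Integer> numericRange(List<Integer> values, int from, int to) {
        return values.stream()
                .filter(v -> v >= from && v <= to)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(5, 9, 10, 42, 100);
        System.out.println(keywordRange(ages, 10, 50)); // prints [5, 10, 42, 100]
        System.out.println(numericRange(ages, 10, 50)); // prints [10, 42]
    }
}
```

String-based indexing therefore needs workarounds such as padding before ranges behave, whereas a dedicated numeric encoding is designed for this kind of query from the start.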

Please keep this change in mind when running queries, as you'll now need to query these as a NumericField. If you use our Query builder DSL this is going to be correct transparently, but if you use the Lucene native APIs to create queries the results won't match and you won't get any kind of warning.

Migration Guide

We normally keep track of any API change in our wiki's migration guide; that's the right place to look for API / compatibility changes between any specific version.

For a summary of the changes for people jumping from version 4.x to 5.x, we created a new dedicated Migration page on the website which you can find from the Documentation page.

Index Migration

Technically it is possible that this latest version of Lucene could read your existing indexes, but with such a large version increase of Lucene's code, and considering the numeric mapping changes and the many changes in the Analyzers over time, we highly recommend you replace your old indexes and use the MassIndexer to trigger a fresh rebuild.

What's next?

We have several interesting plans ahead, but our priority is defined by feedback. Please let us know what you need; even if everything works great for you, it's nice for us to hear about it and about what you do with it. You can get in touch with us through any of these media; the forums should be a good starting point.

This is what we hope to work on in the near future:

  • Dynamically defined models (not strictly bound to annotated classes)
  • Alternatives to embedded Lucene backends: Apache Solr or ElasticSearch seem to be good candidates for this
  • Support for the new Java 8 types
  • Integration in WildFly 9
  • Support for Forge
  • Openshift / Docker / Kubernetes templates and guides
  • Improve performance (Always!)
  • Improved clustering functionality (master election?) on Infinispan/JGroups
  • Take better advantage of the new Lucene 4 capabilities (faceting, query-time join, etc.). Any suggestions?

This list is long, and I could easily expand it. We could really use your help, especially as our small core team is not familiar with many of the other mentioned technologies: even if you don't feel like coding but are in the mood for bleeding edge testing, that would be great.

Today is a big day. The first release of Hibernate OGM with final status. Ever! Don't be fooled by the 4.1 number. Hibernate OGM is an object mapper for various NoSQL stores and offers the familiar JPA APIs. This final version offers mapping for MongoDB, Neo4J, Infinispan and Ehcache.

It has been a long journey to reach this release, much longer than we thought. And there is a long journey ahead of us to implement our full (and exciting!) vision. But today is the time to celebrate: download this puppy and try it out.

What is Hibernate OGM?

Hibernate OGM is an object mapper. It persists object graphs into your NoSQL datastore.

We took great care to map the object structures in the most natural way possible; we considered all the best practices for each of the NoSQL stores we support. Storing an association in a document store is vastly different from storing the same association in a graph database.

// Unidirectional one to many association

// Basket
{
  "_id" : "davide_basket",
  "owner" : "Davide",
  "products" : [ "Beer", "Pretzel" ]
}

// Products
{
  "_id" : "Pretzel",
  "description" : "Glutino Pretzel Sticks"
}
{
  "_id" : "Beer",
  "description" : "Tactical nuclear penguin"
}

Hibernate OGM is 90% Hibernate ORM. We changed the parts that are specific to SQL and JDBC, but most of the engine remains untouched. Same power, same flexibility.

What is the API like?

Very simple. It's JPA. Or Hibernate ORM. Map your entities with JPA annotations (or via XML), then use JPA or the Hibernate native APIs to manipulate your objects.

@PersistenceContext EntityManager em;

// the transaction boundary is really here to express the flush time
@Transactional
public void createSomeUser() {
    // typed query, so the result needs no cast
    Employer redHat =
        em.createQuery("from Employer e where e.name = :name", Employer.class)
        .setParameter("name", "Red Hat")
        .getSingleResult();
    User emmanuel = new User("Emmanuel", "Bernard");
    emmanuel.setTwitterHandle("emmanuelbernard");
    emmanuel.setEmployer(redHat);
    em.persist(emmanuel);
}

Our goal is to have a zero barrier of entry to NoSQL object mappers for people familiar with JPA or Hibernate ORM.

Hibernate OGM also has a flexible option system that lets you customize some of the NoSQL store specifics or mapping options. For example, you can set the MongoDB Write Concern for an entity (see the code example) or choose whether associations should be stored in the owning entity document.

@Entity
@WriteConcern(JOURNALED) // MongoDB write concern
public class User {
    ...
}

And queries?

We cannot talk about JPA without mentioning JP-QL. Offering JP-QL support is challenging at many levels. To mention only two, joins usually don't exist in NoSQL, and each store has a very different set of query capabilities.

Hibernate OGM can convert JP-QL queries to the underlying native query language of the datastore. This functionality is still limited, however. Besides, some queries will never map to JP-QL. So we also let you write native queries specific to your NoSQL store and map the results to managed entities.

// native query using Cypher
String query = "MATCH ( n:Poem { name:'Portia', author:'Oscar Wilde' } ) RETURN n";
Poem poem = (Poem) em.createNativeQuery( query, Poem.class ).getSingleResult();

Where can I use Hibernate OGM?

It works anywhere Hibernate ORM or any JPA provider works. Java SE, Java EE, all should be good. We do require JPA 2.1 though. If you use WildFly (8.2), we have a dedicated module to make things even easier.

For which NoSQL store?

MongoDB, Neo4J, Infinispan and Ehcache are the ones we consider stable. We are working on CouchDB and Cassandra. But really, any motivated person can try and map other NoSQL stores: that's how a few got started. We have an API that has proven flexible enough so far.

Can Hibernate OGM do X, Y, Z?

Probably. Maybe not.

The best approach is to talk to us in our forums, check our documentation (we spent a lot of time on it), and simply give it a try.

Generally, the mapping support is complete. Our query support is still a bit limited compared to where we want it to be. It will improve quickly now that the foundations are here.

We want to know what you need out of Hibernate OGM, which features you miss, and which ones should be changed. Come and talk to us in our forum or anywhere you can find us.

How to get started?

Most of what you need is available on our web site. There is a getting started guide, and the more complete reference documentation. Get the full Hibernate OGM distribution. And last but not least, for any help, reach us via our forum.

It would be impossible to mention all the people who contributed to Hibernate OGM and how: conversations, support, code, documentation, bootstrapping new datastore providers... Many thanks to all of you for making this a reality.

We are not done yet, far from it. We have plenty of ideas on where we want to bring Hibernate OGM. That's a discussion for another day.

12. Dec 2014, 11:58 CET, by Hardy Ferentschik

While working on the finishing touches for Hibernate Search 5 and updating the projects integrating with Hibernate Search, we discovered a couple of issues which we needed to address to proceed. So without further ado, here comes:

Hibernate Search 5.0.0.CR2

Enjoy!

After a huge push, we are now one release away from our Final version. So without further ado, I present you Hibernate OGM 4.1.0.CR1. To sum it up:

  • stable mapping for each of the supported datastores (Infinispan, Ehcache, MongoDB, Neo4J)
  • new and better one-cache-per-entity structure for key/value stores
  • improvements in Neo4J and MongoDB around embedded objects and composite ids
  • better documentation

Our goal for Hibernate OGM 4.1 is to offer a good Object Mapper for each of the primary datastores we target. Go test it before the final and give us feedback!

Mapping stability and documentation

This CR release signals the stable version of how we persist each data structure on the various datastores. We strive to offer a mapping that is natural to each individual datastore. We made some final improvements and are confident we can support this version.

For each datastore, we documented how each JPA mapping is persisted (entities, star-to-one, star-to-many, embedded id, etc.). It makes for a tree-killer documentation but shows what is the truth for each mapping.

We took the opportunity to improve the documentation even further and plan to finish that work for the final version.

Additional key/value cache structure

Our tests have shown that storing each entity type and association in a dedicated cache is actually more efficient than sharing the same cache for all entities. Since we also think it is a more natural mapping, we now offer this option and make it the default.

A User entity and an Address entity would lead to the following caches:

  • User: contains the users
  • Address: contains the addresses
  • associations_User_Address: contains the navigation from a user to its list of addresses

An interesting side effect is that it makes the keys smaller in size and faster to compare. Both Infinispan and Ehcache are benefiting from this.

Improvements around embedded objects, embedded ids and properties

In Neo4J, (non id) embedded objects are now represented as individual nodes. It is more in line with the connection behavior of graph databases.

In MongoDB, embedded id foreign keys have been improved and are now represented as nested documents, as embedded ids already were. We did not hear complaints about this one, so we think you guys don't use composite ids with MongoDB. That's good, keep doing that :)

The null properties are no longer stored in any of the data stores. While it might make some queries involving null values a bit harder, it is the more natural mapping for the datastores.

Go, go, go

Version 5.0.0.CR1 of Hibernate Search is now available.

Numeric Field(s) being used by default

If you don't specify any FieldBridge for your numeric attributes, or for Date or Calendar fields, Hibernate Search will now encode them by default using Lucene's specialized NumericField format. This format has long been available in both Lucene and Hibernate Search, but so far you had to explicitly enable it, as Hibernate Search would by default stick to the backwards compatible format of transforming these types into keywords (strings). The NumericField format is much more efficient for range queries, which we expect to be common for these types.

Remember that - unless you had explicit field configuration - this implies that you might need to fix how your queries are created. By using the Hibernate Search Query DSL you will get an exception to warn you if you try to force it using the wrong type. If you're using the Lucene API directly, make sure to check you're getting the results you expect.

API changes

I have no other major changes to report regarding our public API; however for power users and other frameworks integrating with Hibernate Search you might notice a significant reorganization of our SPI. We've documented all relevant changes in the Migration Guide.

The final release of version 5 is coming very soon, so please make sure you test this quickly. Any comment is welcome on the mailing list or via IRC.

Sanne
