
New releases of both Hibernate Entity Manager and Hibernate Annotations are available.

Hibernate Entity Manager now fully supports .par archives, including auto-discovery of entities and hbm.xml files (see the previous blog entry for more details on this feature), and fixes some critical bugs (see the release notes: http://sourceforge.net/project/shownotes.php?release_id=346915).

The new Hibernate Annotations release focuses on better support for Hibernate-specific features (support for all Hibernate id generators, column index generation, non-PK referencedColumnName support for @OneToOne, @ManyToOne, and @OneToMany...) and, of course, on bug fixes (see the release notes for more details: http://sourceforge.net/project/shownotes.php?release_id=346914).

These releases are compatible with the latest Hibernate core release (3.1 beta1).

05. Aug 2005, 01:34 CET, by Emmanuel Bernard

Packaging has always been a manual operation in the ORM world. In Hibernate, you have to list the mapped entities either through the configuration API or through the hibernate.cfg.xml file. A while ago, JBoss AS introduced the notion of the .har archive: basically an archive scanned by the deployer to discover the Hibernate configuration and the hbm.xml files inside it.

Packaging inside an EJB3 container

The EJB3 expert group has introduced the same notion in the EJB3 public draft. A PAR archive is basically a jar file with the .par extension. All you have to do is put your annotated entities in the archive, and the container has the responsibility to scan it and find all annotated entities. A PAR archive is a persistence unit definition that will be used to create an EntityManagerFactory (aka SessionFactory in the Hibernate world). You will then be able to use your persistence unit (by looking up or injecting an EntityManager or an EntityManagerFactory) under the name of the PAR file without the extension (i.e. mypersistenceunit.par will be referred to as mypersistenceunit).
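For instance, here is how you could consume such a persistence unit through injection (a minimal sketch with a hypothetical bean and entity; annotation details are those of the public draft, imports omitted as in the other snippets):

@Stateless
public class CustomerManagerBean implements CustomerManager {

    // the container injects an EntityManager bound to the persistence
    // unit named after the PAR file, here mypersistenceunit.par
    @PersistenceContext(unitName="mypersistenceunit")
    private EntityManager em;

    public Customer findCustomer(Long id) {
        return (Customer) em.find(Customer.class, id);
    }
}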

Since you might want to customize your persistence unit configuration, a persistence.xml file can be added in the META-INF directory.

<?xml version="1.0" encoding="UTF-8"?>
<entity-manager>
   <name>FinancialPU</name>
   <provider>org.hibernate.ejb.HibernatePersistence</provider>
   <jta-data-source>jdbc/MyDB</jta-data-source>
   <class>com.acme.MyClass</class>
   <jar-file>externalEntities.jar</jar-file>
   <properties>
       <property name="hibernate.max_fetch_depth" value="4"/>
   </properties>
</entity-manager>

Let's analyze this small but comprehensive example.

The name element allows you to override the persistence unit name (which defaults to the PAR file name minus the .par suffix).

The provider element allows you to specify the Entity Manager implementation you want to use for this persistence unit. It defaults to Hibernate Entity Manager if none is specified. This is an interesting one: it basically means that you can use several Entity Manager implementations in the same application, or use the Hibernate Entity Manager implementation in lieu of your vendor's EJB3 persistence implementation, in a standard way!

The jta-data-source element, along with non-jta-data-source, lets you specify the datasource the persistence unit will work on.

The class element allows you to explicitly add entities to be mapped. These entities are typically outside the PAR archive, and the Entity Manager will look for them in the EAR classpath. This is particularly convenient for sharing the same entity definition across several persistence units.

The jar-file element allows you to ask the entity manager implementation to add all the entities contained in a particular JAR to the configuration. In the case of the Hibernate Entity Manager, it will also look at the hbm.xml files. This is particularly convenient for sharing a set of entity definitions across several persistence units.

There is also a mapping-file element, currently not supported by the Hibernate Entity Manager implementation.

The properties element is a way to pass implementation-specific properties to your entity manager. In the case of Hibernate, you can set most of the hibernate.* properties. You can also define the second level cache information using hibernate.ejb.classcache.* and hibernate.ejb.collectioncache.*; please refer to the reference documentation for more information.
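For instance, caching an entity and one of its collections could look like this (the class and collection names are invented for the example):

<properties>
    <property name="hibernate.ejb.classcache.com.acme.MyClass" value="read-write"/>
    <property name="hibernate.ejb.collectioncache.com.acme.MyClass.orders" value="read-write"/>
</properties>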

This is good news for JBoss users: the .har archive is now standardized. Packaging, which has always been a strong concept in J2EE, is now extended to the ORM world in a very easy-to-use manner.

Packaging in a J2SE environment

What is really new is that the simplicity of PAR packaging works in exactly the same manner in the J2SE world. The only difference is that you define your datasource not through the jta-data-source element but through the classic hibernate.* connection properties. The PAR archive is still scanned to find its contained entities and hbm.xml files. In order to let the Hibernate Entity Manager discover PAR files, each needs a persistence.xml file in its META-INF directory (Hibernate Entity Manager basically requests any resource named META-INF/persistence.xml and deduces the PAR archive location from it).

Let's imagine the following acmedomainmodel.par archive structure:

com/acme/model/Animal.class (an @Entity annotated class)
com/acme/model/Dog.class (an @Entity annotated class)
com/acme/model/Cat.class (an @Entity annotated class)
com/acme/model/Customer.class (a non annotated POJO)
com/acme/model/Customer.hbm.xml (the metadata definitions of Customer)
META-INF/persistence.xml

where persistence.xml is

<?xml version="1.0" encoding="UTF-8"?>
<entity-manager>
   <properties>
       <property name="hibernate.max_fetch_depth" value="4"/>
       <property name="hibernate.dialect" value="org.hibernate.dialect.MySQLInnoDBDialect"/>
       <property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver"/>
       <property name="hibernate.connection.username" value="emmanuel"/>
       <property name="hibernate.connection.password" value="secret"/>
       <property name="hibernate.connection.url" value="[=>jdbc:mysql:///test]"/>
       <property name="hibernate.cache.provider_class" value="org.hibernate.cache.EhCacheProvider"/>
   </properties>
</entity-manager>

My persistence unit, named acmedomainmodel, will then automatically contain Animal, Dog, Cat, and Customer. Note that the Customer class doesn't really have to be in the PAR archive; it just needs to be in the classpath, as long as its hbm.xml definition itself is inside the PAR archive.

Note that you can tune the discovery mechanism through the hibernate.ejb.autodetection property. The possible values are none (no auto-detection), class (auto-detection of annotated entities), hbm (auto-detection of hbm.xml files), and class,hbm (auto-detection of both annotated entities and hbm.xml files).
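For example, to only auto-detect annotated classes and ignore hbm.xml files, you would add:

<property name="hibernate.ejb.autodetection" value="class"/>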

With a simple Ant task you can then create a PAR archive that automatically contains your persistent domain model. No need to manually add the mapped entities to a hibernate.cfg.xml file anymore.
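Such a task boils down to a regular jar build; the paths below are illustrative:

<jar jarfile="build/acmedomainmodel.par">
    <!-- the annotated classes and hbm.xml files to be auto-discovered -->
    <fileset dir="build/classes" includes="com/acme/model/**"/>
    <!-- persistence.xml ends up in META-INF, enabling discovery -->
    <metainf dir="src/etc" includes="persistence.xml"/>
</jar>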

Several persistence units in my application

You can of course use several PAR archives in your application. The appropriate PAR archive will be processed based on the name you provide.

// create and keep the emf for later entity manager creations
EntityManagerFactory emf = Persistence.createEntityManagerFactory("acmedomainmodel");
...
EntityManager em = emf.createEntityManager();
em.getTransaction().begin();
em.persist(customer);
Dog wolfy = (Dog) em.find(Dog.class, wolfyId);
em.getTransaction().commit();
em.close();

Note that if there is only one PAR archive in your classpath, you don't have to pass the name to the createEntityManagerFactory() method; doing so is considered good practice, however.

The PAR archive mechanism offers a very convenient and standard way to package your ORM persistence units. Thanks to its auto-discovery mechanism, packaging setup scales in a very elegant manner.

20. Jul 2005, 20:45 CET, by Steve Ebersole

As I mentioned in my previous blog about Bulk Operations, both UPDATE and DELETE statements are challenging to handle against single entities contained across multiple tables (not counting associations), which might be the case with:

  • inheritance using <joined-subclass/>
  • inheritance using <union-subclass/>
  • entity mapping using the <join/> construct

For illustration purposes, let's use the following inheritance hierarchy:

        Animal
       /      \
  Mammal       Reptile
   /  \
Human   Dog

all of which is mapped using the joined-subclass strategy.

Deletes

There are three related challenges with deletes.

  • deletes against a multi-table entity need to recursively cascade to:
      • all sub-class(es) row(s) matched by primary key (PK) value
      • its super-class row
  • all these orchestrated deletes need to occur in an order that avoids constraint violations
  • which rows need to get deleted?

Consider the following code:

session.createQuery( "delete Mammal m where m.age > 150" ).executeUpdate();

Obviously we need to delete from the MAMMAL table. Additionally, every row in the MAMMAL table has a corresponding row in the ANIMAL table; so for any row deleted from the MAMMAL table, we need to delete that corresponding ANIMAL table row. This fulfills cascading to the super-class. If the Animal entity itself had a super-class, we'd need to delete that row also, etc.

Next, rows in the MAMMAL table might have corresponding rows in either the HUMAN table or the DOG table; so, again, for each row deleted from the MAMMAL table, we need to make sure that any corresponding row gets deleted from the HUMAN or DOG table. This fulfills cascading to the sub-class. If either the Human or Dog entities had further sub-classes, we'd need to delete any of those rows also, etc.

The other challenge I mentioned is proper ordering of the deletes to avoid violating any constraints. The typical foreign key (FK) set up in our example structure is to have the FKs pointing up the hierarchy. Thus, the MAMMAL table has a FK from its PK to the PK of the ANIMAL table, etc. So we need to be certain that we order the deletes:

( HUMAN | DOG ) -> MAMMAL -> ANIMAL

Here, it does not really matter whether we delete from the HUMAN table first, or from the DOG table first.

So exactly which rows need to get deleted (a lot of this discussion applies to update statements as well)? Most databases do not support joined deletes, so we definitely need to perform the deletes separately against the individual tables involved. The naive approach is to simply use a subquery returning the restricted PK values as the restriction for each delete statement. That actually works in the example given before. But consider another example:

session.createQuery( "delete Human h where h.firstName = 'Steve'" ).executeUpdate();

I said before that we need to order the deletes so as to avoid violating defined FK constraints. Here, that means that we need to delete from the HUMAN table first; so we'd issue some SQL like:

delete from HUMAN where ID IN (select ID from HUMAN where f_name = 'Steve')

So far so good; perhaps not the most efficient way, but it works. Next we need to delete the corresponding row from the MAMMAL table; so we'd issue some more SQL:

delete from MAMMAL where ID IN (select ID from HUMAN where f_name = 'Steve')

Oops! This won't work because we previously deleted any such rows from the HUMAN table.

So how do we get around this? Definitely we need to pre-select and store the PK values matching the given where-clause restriction. One approach is to select the PK values through JDBC and store them within the JVM memory space; then later the PK values are bound into the individual delete statements. Something like:

// select and hold the matching PK values
PreparedStatement ps = connection.prepareStatement(
        "select ID from HUMAN where f_name = 'Steve'"
);
ResultSet rs = ps.executeQuery();
HashSet ids = extractIds( rs );  // helper: copy the ID column into a Set
int idCount = ids.size();

rs.close();
ps.close();

....

// issue the delete from HUMAN
ps = connection.prepareStatement(
        "delete from HUMAN where ID IN (" +
        generateCommaSeparatedParameterHolders( idCount ) +  // helper: "?, ?, ..."
        ")"
);
bindParameters( ps, ids );  // helper: bind each held id to a placeholder
ps.executeUpdate();

...

The other approach, the one taken by Hibernate, is to utilize temporary tables, where the matching PK values are stored on the database server itself. This is far more performant in quite a number of ways, which is the main reason this approach was chosen. Now we have something like:

// where HT_HUMAN is the temporary table (varies by DB)
PreparedStatement ps = connection.prepareStatement( 
        "insert into HT_HUMAN (ID) select ID from HUMAN where f_name = 'Steve'"
);
int idCount = ps.executeUpdate();
ps.close();

....

// issue the delete from HUMAN 
ps = connection.prepareStatement(
        "delete from HUMAN where ID IN (select ID from HT_HUMAN)"
);
ps.executeUpdate();

In the first step, we avoid the overhead of potential network communication associated with returning the results; we also avoid some JDBC overhead; we also avoid the memory overhead of needing to store the id values. In the second step, we again minimized the amount of data traveling between us and the database server; the driver and server can also recognize this as a repeatable prepared statement and avoid execution plan creation overhead.
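Putting it all together, the complete sequence issued for the delete Human example would look roughly like this (the exact temporary table handling varies by database):

-- snapshot the matching ids first
insert into HT_HUMAN (ID) select ID from HUMAN where f_name = 'Steve'

-- then delete bottom-up, satisfying the FK ordering
delete from HUMAN where ID IN (select ID from HT_HUMAN)
delete from MAMMAL where ID IN (select ID from HT_HUMAN)
delete from ANIMAL where ID IN (select ID from HT_HUMAN)

-- finally clear the snapshot
delete from HT_HUMAN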

Updates

There are really only two challenges with multi-table update statements:

  • partitioning the assignments from the set-clause
  • which rows need to get updated? This one was already discussed above...

Consider the following code:

session.createQuery( "update Mammal m set m.firstName = 'Steve', m.age = 20" )
        .executeUpdate();

We saw before that the age property is actually defined on the Animal super-class and is thus mapped to the ANIMAL.AGE column, whereas the firstName property is defined on the Mammal class and thus mapped to the MAMMAL.F_NAME column. So here we know that we need to perform updates against both the ANIMAL and MAMMAL tables (no other tables are touched, even though the Mammal might further be a Human or a Dog). Partitioning the assignments really just means identifying which tables are affected by the individual assignments and then building appropriate update statements. A minor challenge here was accounting for this fact when actually binding user-supplied parameters. Though, for the most part, partitioning the assignments and parameters was a fairly academic exercise.
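For illustration, the SQL generated for the update above might look roughly like this, assuming the ids of the affected rows have been snapshotted into a temporary table HT_MAMMAL as described for deletes:

update MAMMAL set F_NAME = 'Steve' where ID IN (select ID from HT_MAMMAL)
update ANIMAL set AGE = 20 where ID IN (select ID from HT_MAMMAL)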

20. Jul 2005, 00:24 CET, by Steve Ebersole

The EJB3 persistence specification calls for implementors to support Bulk Operations in EJB-QL (the EJB Query Language). As part of Hibernate's implementation of EJB3 persistence, HQL (the Hibernate Query Language, which is a superset of EJB-QL) needed to support these Bulk Operations. This support is now code complete, even going beyond what is offered in the EJB3 persistence specification. There is one outstanding task against this bulk operation support in HQL, but it is completely beyond the scope of the support called for in the EJB3 persistence specification. I'll blog about that one later as it simply rocks ;)

So what exactly are Bulk Operations? Well, for those of you familiar with SQL, they are analogous to Data Manipulation Language (DML) but, just like HQL and EJB-QL, defined in terms of the object model. What is DML? DML comprises the SQL statements that actually manipulate the state of the tabular data: INSERT, UPDATE, and DELETE.

Essentially, all that is to say that EJB-QL and HQL now support UPDATE and DELETE statements (HQL also supports INSERT statements, but more about that at a later time).

In its basic form, this support is not really all that difficult. I mean, Hibernate already knows all the information pertaining to tables and columns; it already knows how to parse WHERE-clauses and the like. So what's the big deal? Well, during implementation we ran across a few topics that make this support more challenging; which of course made it all the more fun to implement ;)

Update Statements

From the EJB3 persistence specification:

Bulk update and delete operations apply to entities of a single entity class 
(together with its subclasses, if any). Only one entity abstract schema type 
may be specified in the FROM or UPDATE clause.

The specification-defined pseudo-grammar for the update syntax:

update_statement ::= update_clause [where_clause]

update_clause ::= UPDATE abstract_schema_name [[AS] identification_variable]
    SET update_item {, update_item}*

update_item ::= [identification_variable.]state_field = new_value

new_value ::=
    simple_arithmetic_expression |
    string_primary |
    datetime_primary |
    boolean_primary

The basic gist is:

  • There can only be a single entity (abstract_schema_name) named in the update-clause; it can optionally be aliased. If the entity name is aliased, then any property references must be qualified using that alias; if the entity name is not aliased, then it is illegal for any property references to be qualified.
  • No joins (either implicit or explicit) can be specified in the update. Sub-queries may be used in the where-clause; the subqueries, themselves, can contain joins.
  • The where-clause is also optional.

Two interesting things to point out:

  • According to the specification, an UPDATE against a versioned entity should not cause the version to be bumped
  • According to the specification, the assigned new_value does not allow subqueries; HQL supports this!

Even though the spec disallows bumping the version on an update of a versioned entity, bumping it is more often than not the desired behavior. Because of the spec, Hibernate cannot do this by default, so we introduced a new keyword, VERSIONED, into the grammar instead. The syntax is update versioned MyEntity ..., which will cause the version column values to get bumped for any affected entities.
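For example, reusing the Mammal entity from my multi-table discussion (a sketch; it assumes Mammal is mapped with a version column, and the parameter names are illustrative):

// the VERSIONED keyword makes Hibernate bump the version column
// of every Mammal matched by the restriction
int updatedCount = session.createQuery(
        "update versioned Mammal m set m.firstName = :name where m.age > :age" )
        .setString( "name", "Steve" )
        .setInteger( "age", 20 )
        .executeUpdate();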

Delete Statements

From the EJB3 persistence specification:

Bulk update and delete operations apply to entities of a single entity class 
(together with its subclasses, if any). Only one entity abstract schema type 
may be specified in the FROM or UPDATE clause.

A delete operation only applies to entities of the specified class and its 
subclasses. It does not cascade to related entities.

The specification-defined pseudo-grammar for the delete syntax:

delete_statement ::= delete_clause [where_clause]

delete_clause ::= DELETE FROM abstract_schema_name [[AS] identification_variable]

The basic gist is:

  • There can only be a single entity (abstract_schema_name) named in the from-clause; it can optionally be aliased. If the entity name is aliased, then any property references must be qualified using that alias; if the entity name is not aliased, then it is illegal for any property references to be qualified.
  • No joins (either implicit or explicit) can be specified in the delete. Sub-queries may be used in the where-clause; the subqueries, themselves, can contain joins.
  • The where-clause is also optional.

One very interesting thing to point out here: the specification specifically disallows cascading the delete to related entities (not including, obviously, db-level cascades).

Caching

Automatic and transparent object/relational mapping is concerned with the management of object state. This implies that the object state is available in memory. Bulk Operations, to a large extent, undermine that concern. The biggest issue is that of caching performed by the ORM tool/EJB3 persistence implementor.

The spec even makes a point to caution regarding this:

Caution should be used when executing bulk update or delete operations because 
they may result in inconsistencies between the database and the entities in the 
active persistence context. In general, bulk update and delete operations 
should only be performed within a separate transaction or at the beginning of a 
transaction (before entities have been accessed whose state might be affected 
by such operations).

In Hibernate terms, be sure to perform any needed Bulk Operations before pulling entities into the session; failing to do so poses a risk of inconsistencies between the session (the active persistence context) and the database.
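In code, the safe ordering looks something like this (a sketch reusing the Mammal example from my earlier entry):

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

// bulk operation first, while the persistence context is still empty
session.createQuery( "delete Mammal m where m.age > 150" ).executeUpdate();

// only now load entities; their in-memory state cannot be stale
List mammals = session.createQuery( "from Mammal" ).list();

tx.commit();
session.close();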

Hibernate also offers, as do most ORM tools, a shared cache (the second level cache). Executing Bulk Operations also poses a risk of inconsistencies between the shared cache and the database. Hibernate actually takes on the responsibility of managing this risk for you. Upon completion of a Bulk Operation, Hibernate invalidates any needed region(s) within the shared cache to maintain consistency. It has to be done through invalidation because the UPDATE or DELETE is executed solely on the database server; thus Hibernate has no idea of the ids of any affected entities, nor (in the case of updates) what the new state might be.

Conclusion

Bulk Operations are complementary to the functionality provided by ORM tools. Especially in the case of batch processes, Bulk Operations coupled with the new StatelessSession functionality (available in 3.1 beta1 and later) offer a more performant alternative to the normal row-based ORM focus.
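As a teaser, a batch process built on StatelessSession might look roughly like this (a sketch only; the entity accessors are assumed, refer to the 3.1 documentation for the definitive API):

StatelessSession ss = sessionFactory.openStatelessSession();
Transaction tx = ss.beginTransaction();

// a StatelessSession keeps no persistence context and no caches:
// every operation hits the database immediately
ScrollableResults mammals = ss.createQuery( "from Mammal" ).scroll();
while ( mammals.next() ) {
    Mammal mammal = (Mammal) mammals.get( 0 );
    mammal.setAge( mammal.getAge() + 1 );  // accessor assumed for the example
    ss.update( mammal );                   // immediate UPDATE, no dirty checking
}

tx.commit();
ss.close();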

This-n-that

Entities which are contained across multiple tables (not counting associations) cause particular challenges that I'll blog about later.

Have a look at the reference manual for discussion of these Bulk Operations within HQL.

For those of you familiar with ANTLR and its grammar definitions, the authoritative source for what is supported by HQL is the grammar files themselves.

The first edition of Hibernate in Action has spread quite successfully. When training or consulting somewhere on-site, I often see people with a copy on their desk. And it has proven invaluable for me (and others at JBoss) to bring a few copies along every time. There is simply no better additional training material than a professionally edited full-length book. The only downside is that it only covers Hibernate 2.x.

Soon after the release of Hibernate in Action about a year ago, we thought about an update. After all, development on Hibernate3 had already started, and we knew that interesting stuff would happen in EJB3 persistence as well. I've mentioned a second edition a few times on the forum, but we haven't been very specific about release dates and new or updated content; hence this blog entry to keep everybody up-to-date. The reason why we were quiet for some time is that we simply had to finish Hibernate3 first, an effort that was completed only a few months ago. But the EJB3 specification and its influence on Hibernate also had to be watched before we could start updating the book. Since Hibernate 3.0 has now been stable for a while, 3.1 is already on the horizon, and EJB 3.0 is available in public draft, we can continue updating the manuscript for Hibernate in Action, Second Edition.

I guess most of you first want to know when it is going to be available. Both Hibernate 3.1 and EJB 3.0 are being finalized (despite the current alpha tag on Hibernate 3.1, it's soon feature complete and not a big release anyway), but some things might still change. Usually these minor changes don't have much impact on development but can make whole sections of documentation obsolete. After discussing the issue with our editor and publisher at Manning, we think that the updated edition can be available at the end of September 2005, or early in Q4. As always, the eBook edition might be available earlier than the print version.

We'll update the book for Hibernate3 and EJB3, and based on the feedback we got from readers and during training (our first Hibernate training last year followed the book's structure), we'll make some major changes:

  • the Toolset chapter will be removed and integrated into a new beginner's tutorial that also shows the new Eclipse-based and Ant tools, with a hands-on basic project setup
  • a new chapter will be added with best practices, patterns, and general tips & tricks - this will include a lot of FAQs from the forum and our customers, such as caching tricks, metadata-driven applications, dealing with large values, complex deployment scenarios, etc.
  • more illustrations will be included with many mapping examples

So, you can expect quite a lot of new content, especially with regard to EJB3 API usage, for all of you who want to learn the new interfaces and lifecycle (it's easy if you know Hibernate...), and more best practices.

We also have an updated version of CaveatEmptor for the second edition. I've packaged an alpha release you can already download. It includes a complete mapping of the domain model with EJB3/Hibernate3 annotations and ready-to-run EJB3 persistence unit tests in straightforward J2SE, using Hibernate EntityManager and Hibernate Annotations.

I'll keep you updated here and release new versions of CaveatEmptor as I work on the manuscript.

P.S. Don't miss the new EJB3 TrailBlazer tutorial for the JBoss EJB3 Application Server, and send feedback to the expert group on the public draft.
