25. Aug 2004, 19:20 CET, by Gavin King

We were recently doing some work with a customer on a very large project, and they were concerned about traceability of the SQL issued by Hibernate. Their problem is one that I guess is common: suppose I see something wrong in the Hibernate log (say, an N+1 selects problem); how do I know which of my business classes is producing it? All I've got in the Hibernate log is org.hibernate.SQL, line 224 as the source of the log message!

I started to explain how Hibernate3 can embed comments into the generated SQL, so you could at least track the problem back to a particular HQL query. But then Steve remembered that log4j provides the /nested diagnostic context/. Now, I've seen a lot of projects using log4j, but I've never actually seen this used anywhere. I think it might be a better alternative to adding entry and exit logging everywhere, since we can see this context even if the entry/exit log categories are disabled. It's a good way to track the source of SQL in the Hibernate log. All you need to do is add calls to push() and pop() in your DAO:

public List getCustomersByName(String pattern) {
    NDC.push("CustomerDAO.getCustomersByName()");
    try {
        return getSession()
            .createQuery("from Customer c where c.name like :pattern")
            .setString("pattern", pattern)
            .list();
    }
    finally {
        NDC.pop();
    }
}

Then, if I set my pattern right:

log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m (%x)%n

I'll get a log message like this:

20:59:38,249 DEBUG [=>SQL:244] - select .... like ? (CustomerDAO.getCustomersByName())
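To make the mechanism a little more concrete, here is a minimal, stdlib-only sketch of what a nested diagnostic context does: a per-thread stack of labels that the logging layout can render next to each message. The class SimpleNdc and its render() method are hypothetical names used for illustration; this is not log4j's implementation, and in a real application you would simply call log4j's NDC.push() and NDC.pop() as shown above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified stand-in for a nested diagnostic context (illustration only, not log4j's code). */
public class SimpleNdc {

    // One stack of labels per thread, so concurrent requests never see each other's context.
    private static final ThreadLocal<Deque<String>> STACK =
        ThreadLocal.withInitial(ArrayDeque::new);

    /** Enters a nested context. */
    public static void push(String label) {
        STACK.get().addLast(label);
    }

    /** Leaves the innermost context; returns "" if there is none (a choice made for this sketch). */
    public static String pop() {
        Deque<String> stack = STACK.get();
        return stack.isEmpty() ? "" : stack.removeLast();
    }

    /** Renders the whole stack, outermost label first, separated by spaces. */
    public static String render() {
        return String.join(" ", STACK.get());
    }

    public static void main(String[] args) {
        push("CustomerService.findVips()");
        push("CustomerDAO.getCustomersByName()");
        try {
            // A layout would append the rendered context to every message logged here.
            System.out.println("select ... like ? (" + render() + ")");
        }
        finally {
            pop();
            pop();
        }
    }
}
```

Because the stack is thread-local, the push() and pop() calls must be paired on the same thread, which is why the DAO method above pops in a finally block.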

Just thought I'd mention it, in case it helps someone.

25. Aug 2004, 13:12 CET, by Gavin King

One of the joys of working on an open source project with commercial competitors is having to implement features that our users simply don't ask for, and probably won't use in practice, just because those competitors try to spin their useless features as a competitive advantage. We realized ages ago that it's really hard to tell people that they don't need and shouldn't use a feature if you don't have it.

Multi-table mappings started out as a good example of that kind of feature. We have been repeating the "your object model should be at /least/ as fine-grained as your relational schema" mantra for years now. Unfortunately, we keep hearing this echoed back as "Hibernate can't do multitable mappings". Nobody has ever once shown me a truly compelling usecase for multitable mappings in a real application, but apparently, if our competitors are to be believed, it is common to find schemas with attributes of the same entity scattered randomly across several different physical tables. I'll have to take their word on that one. I'm not saying you will /never/ run into this kind of thing and, indeed, I've seen a few borderline cases, though nothing that wasn't at least arguably better represented as an association. But certainly, to my mind, valid usecases for multitable mappings are not something you run into commonly enough for this to be an important feature. Perhaps the difference in perception is due to the fact that only /sane/ organizations use Hibernate.

Anyway, we introduced the <join/> mapping, just so we could tell people not to use it. Actually, it was fun to implement, and helped me make some really nice refactorings to the EntityPersister hierarchy.

Then a funny thing happened. I started to think of all kinds of useful things to do with <join/>, none of which had anything much to do with multitable mappings, as usually understood. And I'm pretty certain that these things were not what the other guys were talking about!

The first application I came up with is a mixed inheritance mapping strategy. Before, you had a choice between <subclass/> and <joined-subclass/> (now also <union-subclass/>), and you had to stick with that one strategy for the whole hierarchy.

It's now possible to write a mapping like this:

<class name="Superclass" 
        table="parent"
        discriminator-value="0">
    <id name="id">.....</id>
    <discriminator column="type" type="int"/>
    <property ...../>
    ...
    
    <subclass name="Subclass" 
            discriminator-value="1">
        <property ..../>
        ...
    </subclass>
    
    <subclass name="JoinedSubclass" 
            discriminator-value="-1">
        <join table="child">
            <property ...../>
            ....
        </join>
    </subclass>
    
</class>

That's /really/ useful.

The next thing that <join/> can be used for required a little tweak. I added an inverse attribute to the join element, to declare that the joined table should not be updated by the owning entity. Now, it's possible to map an association (link) table - which usually represents a many-to-many association - with one-to-many multiplicity in the domain model. First, we have a basic many-to-many mapping, on the Parent side:

<class name="Parent">
    ...
    <set name="children" table="ParentChild" lazy="true">
        <key column="parentId"/>
        <many-to-many column="childId" class="Child"/>
    </set>
</class>

Now, we use a <join> mapping, to hide the association table from the Child end:

<class name="Child">
    ...
    <join table="ParentChild" inverse="true">
        <key column="childId"/>
        <many-to-one name="parent" column="parentId"/>
    </join>
</class>

Well, I'm not sure really how useful this is, but I was always jealous of the TopLink guys when they bragged how they could do this, and we got it /almost/ for free!

A third trick was also inspired by TopLink. A number of former TopLink users porting code to Hibernate found that Hibernate's table-per-class mapping strategy has significantly different performance characteristics to TopLink's. Hibernate has what seems to be a unique implementation of the table-per-class mapping strategy, in that no discriminator column is required to achieve polymorphism. Instead, Hibernate performs an outer join across all subclass tables, and checks which primary key values are null in each returned row of results in order to determine the subclass that the row represents. In most circumstances, this offers an excellent performance balance, since it is not vulnerable to the dreaded N+1 selects problem. Furthermore, it does not require the addition of a type discriminator column to the table of the root class, which really feels extremely unnatural and redundant for this relational model.

An alternative approach, which TopLink uses, is to perform an initial query, check the value of a discriminator column, and then issue an extra query if the row represents a subclass instance. This isn't usually very efficient for shallow inheritance trees, but what we've seen is that some ex-TopLink users have created very deep or wide inheritance trees, in which case Hibernate's strategy can result in a single query with simply too many joins.

So, I added the outer-join attribute to <join/>. Its effect is slightly subtle. Consider the following mapping:

<class name="Foo" table="foos" discriminator-value="0">
    <id name="id">...</id>
    <discriminator column="type" type="int"/>
    <property name="name"/>
    <subclass name="Bar" discriminator-value="1">
        <join table="bars">
            <key column="fooId"/>
            <property name="amount"/>
        </join>
    </subclass>
</class>

When we execute an HQL query against the subclass Bar, Hibernate will generate SQL with an inner join between foos and bars. If we query against the superclass Foo, Hibernate will use an outer join.

(Note that you would not write the above mapping in practice; instead you would use <joined-subclass/> and eliminate the need for the discriminator.)

Suppose we set outer-join="false":

<class name="Foo" table="foos" discriminator-value="0">
    <id name="id">...</id>
    <discriminator column="type" type="int"/>
    <property name="name"/>
    <subclass name="Bar" discriminator-value="1">
        <join table="bars" outer-join="false">
            <key column="fooId"/>
            <property name="amount"/>
        </join>
    </subclass>
</class>

Now, when we query the subclass, the same SQL inner join will be used. But when we query the superclass, Hibernate won't use an outer join. Instead, it will issue an initial query against the foos table, and a sequential select against the bars table, whenever it finds a row with a discriminator value of 1.

Well, that's not such a great idea in this case. But imagine if Foo had a very large number of immediate subclasses. Then we might be avoiding a query with very many outer joins, in favor of several queries with no joins. Well, perhaps some people will find this useful....
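The trade-off can be put in numbers with a small back-of-the-envelope model. The sketch below is purely illustrative (the class FetchStrategySketch and its methods are hypothetical and involve no Hibernate code): it counts the SQL statements a query against the superclass would need under each setting, given the discriminator values of the rows returned from the root table.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the statement counts described above; it does not call Hibernate. */
public class FetchStrategySketch {

    /** Default behavior: one SELECT, carrying one outer join per joined subclass table. */
    public static int statementsWithOuterJoin(List<Integer> discriminators) {
        return 1;
    }

    /**
     * outer-join="false": one SELECT against the root table with no joins, plus one
     * sequential SELECT for every row whose discriminator marks it as the joined subclass.
     */
    public static int statementsWithSequentialSelect(List<Integer> discriminators,
                                                     int joinedSubclassValue) {
        int statements = 1; // the initial query against the root table
        for (int discriminator : discriminators) {
            if (discriminator == joinedSubclassValue) {
                statements++; // follow-up select against the subclass table
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // Five rows from foos; three of them are Bars (discriminator value 1).
        List<Integer> rows = Arrays.asList(0, 1, 1, 0, 1);
        System.out.println("outer join:        " + statementsWithOuterJoin(rows));
        System.out.println("sequential select: " + statementsWithSequentialSelect(rows, 1));
    }
}
```

With a single subclass table the sequential strategy only adds statements, which matches the remark that it is not a great idea for this mapping. The picture changes when the single outer-join query would have to join dozens of subclass tables, most of which contribute nothing to any given row.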

Hibernate3 is now ready for a public test, go get it! It has all (well almost all) features we'll ever need for object/relational mapping, and if it doesn't have it, it's easy to subclass, extend, and implement.

We still have some things left on our TODO for the beta (no release date yet on the final), but it's getting better every day and we might have a very stable first beta. If you want to help, we are still looking for documentation translators.

Incidentally, the Hibernate project is now 1000 days old, if you believe the SourceForge stats. We actually had the Hibernate3 alpha finished for the anniversary, but then Gavin's laptop didn't agree with its owner anymore. At least it was an excuse to finish some website redesign.

P.S. The first copies of Hibernate in Action arrived! Mine was sent to an old address (that's the problem if you need years to finish something) and I'm going to hunt it down now. I already received a Thank You! email from the finder...

23. Aug 2004, 17:06 CET, by Gavin King

There's been a certain amount of noise recently surrounding simple JDBC frameworks like iBATIS. I've liked the idea of iBATIS myself, for use in applications which don't need an object-oriented domain model, and don't work with deep graphs of associated entities in a single transaction. A JDBC framework also makes good sense if you are working with some kind of insane legacy database; ORM solutions tend to assume that associations are represented as nice clean foreign keys with proper referential integrity constraints (Hibernate3 much less so than Hibernate 2.x).

Some people even suggest that JDBC frameworks are a suitable alternative to ORM, even for those systems to which ORM is best suited: object-oriented applications with clean relational schemas. They argue that you are /always/ better off with hand-written SQL than generated SQL. Well, I don't think this is true, not only because the overwhelming bulk of SQL code needed by most applications is of the tedious kind, and simply does not require human intervention, but also because a JDBC framework operates at a different semantic level to ORM. A solution like iBATIS knows a lot less about the semantics of the SQL it is issuing, and of the resulting datasets. This means that there is much less opportunity for performance optimizations such as efficient caching. (By efficient, I am referring mainly to efficient cache /invalidation strategies, which are crucial to the usefulness of the cache/.) Furthermore, whenever we have seen handwritten SQL, we have seen N+1 selects problems. It is extremely tedious to write a new SQL query for each combination of associations I might need to fetch together. HQL helps /significantly/ here, since HQL is much less verbose than SQL. For a JDBC framework to be able to make the kind of optimizations that an ORM can make, it would have to evolve to a similar level of sophistication. Essentially, it would need to become an ORM, minus SQL generation. In fact, we already start to see this evolution taking place in existing JDBC frameworks. This begins to erode one of the stated benefits: the claimed simplicity.

It also raises the following interesting thought: if, by gradually adding stuff, a JDBC framework will eventually end up as ORM, minus SQL generation, why not just take an existing ORM solution like, ooh, um ... Hibernate, maybe ... and subtract the SQL generation?

The Hibernate team has long recognized the need to mix and match generated SQL with the occasional handwritten query. In older versions of Hibernate, our solution was simply to expose the JDBC connection Hibernate is using, so you can execute your own prepared statement. This started to change a while ago, and Max Andersen has recently done a lot of work on this. Now, in Hibernate3, it is possible to write an entire application with no generated SQL, while still taking advantage of all of Hibernate's other features.

Do we really expect or intend people to use Hibernate in this way? Well, not really - I doubt there are many people out there who really enjoy writing tedious INSERT, UPDATE, DELETE statements all day. On the other hand, we do think that quite a few people need to customize the occasional query. But to prove a point, I'll show you how you can do it, if you really want to.

Let's take a simple Person-Employment-Organization domain model. (You can find the code in the org.hibernate.test.sql package, so I'm not going to reproduce it here.) The simplest class is Person; here's the mapping:

<class name="Person" lazy="true">
    <id name="id" unsaved-value="0">
        <generator class="increment"/>
    </id>
    
    <property name="name" not-null="true"/>
    
    <loader query-ref="person"/>
    
    <sql-insert>INSERT INTO PERSON (NAME, ID) VALUES ( UPPER(?), ? )</sql-insert>
    <sql-update>UPDATE PERSON SET NAME=UPPER(?) WHERE ID=?</sql-update>
    <sql-delete>DELETE FROM PERSON WHERE ID=?</sql-delete>
</class>

The first thing to notice is the handwritten INSERT, UPDATE and DELETE statements. The order of the ? parameters matches the order in which the properties are listed above (we'll have to eventually support named parameters, I suppose). I guess there is nothing especially interesting there.

More interesting is the <loader> tag: it defines a reference to a named query which is to be used anytime we load a person using get(), load(), or lazy association fetching. In particular, the named query might be a native SQL query, which it is, in this case:

<sql-query name="person">
    <return alias="p" class="Person" lock-mode="upgrade"/>
    SELECT NAME AS {p.name}, ID AS {p.id} FROM PERSON WHERE ID=? FOR UPDATE
</sql-query>

(A native SQL query may return multiple columns of entities; this is the simplest case, where just one entity is returned.)

Employment is a bit more complex; in particular, not all properties are included in the INSERT and UPDATE statements:

<class name="Employment" lazy="true">
    <id name="id" unsaved-value="0">
        <generator class="increment"/>
    </id>
    
    <many-to-one name="employee" not-null="true" update="false"/>
    <many-to-one name="employer" not-null="true" update="false"/>
    <property name="startDate" not-null="true" update="false" 
        insert="false"/>
    <property name="endDate" insert="false"/>
    <property name="regionCode" update="false"/>
    
    <loader query-ref="employment"/>
    
    <sql-insert>
        INSERT INTO EMPLOYMENT 
            (EMPLOYEE, EMPLOYER, STARTDATE, REGIONCODE, ID) 
            VALUES (?, ?, CURRENT_DATE, UPPER(?), ?)
    </sql-insert>
    <sql-update>UPDATE EMPLOYMENT SET ENDDATE=? WHERE ID=?</sql-update>
    <sql-delete>DELETE FROM EMPLOYMENT WHERE ID=?</sql-delete>
</class>

<sql-query name="employment">
    <return alias="emp" class="Employment"/>
    SELECT EMPLOYEE AS {emp.employee}, EMPLOYER AS {emp.employer}, 
        STARTDATE AS {emp.startDate}, ENDDATE AS {emp.endDate},
        REGIONCODE as {emp.regionCode}, ID AS {emp.id}
    FROM EMPLOYMENT
    WHERE ID = ?
</sql-query>

The mapping for Organization has a collection of Employments:

<class name="Organization" lazy="true">
    <id name="id" unsaved-value="0">
        <generator class="increment"/>
    </id>
    
    <property name="name" not-null="true"/>
    
    <set name="employments" 
        lazy="true" 
        inverse="true">
        
        <key column="employer"/> <!-- only needed for DDL generation -->
        
        <one-to-many class="Employment"/>
        
        <loader query-ref="organizationEmployments"/>
    </set>
    
    <loader query-ref="organization"/>
    
    <sql-insert>
        INSERT INTO ORGANIZATION (NAME, ID) VALUES ( UPPER(?), ? )
    </sql-insert>
    <sql-update>UPDATE ORGANIZATION SET NAME=UPPER(?) WHERE ID=?</sql-update>
    <sql-delete>DELETE FROM ORGANIZATION WHERE ID=?</sql-delete>
</class>

Not only is there a <loader> query for Organization, but also for its collection of Employments:

<sql-query name="organization">
    <return alias="org" class="Organization"/>
    SELECT NAME AS {org.name}, ID AS {org.id} FROM ORGANIZATION
    WHERE ID=?
</sql-query>

<sql-query name="organizationEmployments">
    <return alias="empcol" collection="Organization.employments"/>
    <return alias="emp" class="Employment"/>
    SELECT {empcol.*}, 
        EMPLOYER AS {emp.employer}, EMPLOYEE AS {emp.employee},
        STARTDATE AS {emp.startDate}, ENDDATE AS {emp.endDate},
        REGIONCODE as {emp.regionCode}, ID AS {emp.id}
    FROM EMPLOYMENT empcol
    WHERE EMPLOYER = :id AND DELETED_DATETIME IS NULL
</sql-query>    

When I was writing this code, I really started to feel the advantages of having Hibernate write the SQL for me. In just this simple example, I would have eliminated more than 35 lines of code that I would have to later maintain.

Finally, for ad hoc querying, we can use a native SQL query (a named query, or one embedded in the Java code). For example:

<sql-query name="allOrganizationsWithEmployees">
    <return alias="org" class="Organization"/>
    SELECT DISTINCT NAME AS {org.name}, ID AS {org.id} 
    FROM ORGANIZATION org
    INNER JOIN EMPLOYMENT e ON e.EMPLOYER = org.ID
</sql-query>

Personally, I prefer programming in Java to programming in XML, so all this stuff is much too XML-heavy for my liking. I think I'll stick with SQL generation, wherever I can, which is almost everywhere. It's not that I don't like SQL. In fact, I am a great fan of SQL, and just love watching the queries scroll past when I turn Hibernate's logging on. It's just that Hibernate is much better at writing SQL than I am.

14. Aug 2004, 14:11 CET, by Gavin King

Just had an interesting discussion on ejb3-feedback@sun.com, started by David Cherryhomes, which saw me stupidly insisting that something can't be done when in fact, now that I think about it, /I realize I've actually done it before/, and that even the Hibernate AdminApp example uses this pattern!

So, just so I don't forget this pattern again, I'm going to write it down, and also write a reusable class implementing it.

The basic problem is pagination. I want to display next and previous buttons to the user, but disable them if there are no more, or no previous query results. But I don't want to retrieve all the query results in each request, or execute a separate query to count them. So, here's the correct approach:

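import java.util.List;

import org.hibernate.Query; // Hibernate3 package name

public class Page {
   
   private List results;
   private int pageSize;
   private int page;
   
   public Page(Query query, int page, int pageSize) {
       
       this.page = page;
       this.pageSize = pageSize;
       // Ask for one row more than we display; the extra row tells us
       // whether a next page exists, without a separate count query.
       results = query.setFirstResult(page * pageSize)
           .setMaxResults(pageSize+1)
           .list();
   
   }
   
   public boolean isNextPage() {
       return results.size() > pageSize;
   }
   
   public boolean isPreviousPage() {
       return page > 0;
   }
   
   public List getList() {
       // subList's end index is exclusive, so this drops only the extra row.
       return isNextPage() ?
           results.subList(0, pageSize) :
           results;
   }

}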

You can return this object to your JSP, and use it in Struts, WebWork or JSTL tags. Getting a page in your persistence logic is as simple as:

public Page getPosts(int page) {
    return new Page( 
        session.createQuery("from Posts p order by p.date desc"),
        page,
        40
    );
}

The Page class works in both Hibernate and EJB 3.0.
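If you want to convince yourself that the "fetch one extra row" trick behaves correctly at the boundaries, the following stdlib-only sketch replays it against an in-memory list. Everything in it is an assumption made for illustration: PageSketch is a hypothetical class, and fetch() merely stands in for what setFirstResult() and setMaxResults() do on a real query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** In-memory replay of the pagination pattern; no Hibernate involved. */
public class PageSketch {

    /** Hypothetical stand-in for query.setFirstResult(first).setMaxResults(max).list(). */
    public static <T> List<T> fetch(List<T> source, int first, int max) {
        int from = Math.min(first, source.size());
        int to = Math.min(first + max, source.size());
        return new ArrayList<>(source.subList(from, to));
    }

    /** True when the extra (pageSize + 1)th row came back, meaning another page exists. */
    public static <T> boolean hasNext(List<T> source, int page, int pageSize) {
        return fetch(source, page * pageSize, pageSize + 1).size() > pageSize;
    }

    /** Rows to display: at most pageSize of them, with the probe row dropped. */
    public static <T> List<T> visibleRows(List<T> source, int page, int pageSize) {
        List<T> results = fetch(source, page * pageSize, pageSize + 1);
        // subList's end index is exclusive, so (0, pageSize) keeps exactly pageSize rows.
        return results.size() > pageSize ? results.subList(0, pageSize) : results;
    }

    public static void main(String[] args) {
        List<Integer> posts = Arrays.asList(1, 2, 3, 4, 5);
        for (int page = 0; page < 3; page++) {
            System.out.println(visibleRows(posts, page, 2) + " next=" + hasNext(posts, page, 2));
        }
    }
}
```

With five rows and a page size of two, the first two pages are full and report a next page, while the third page holds the single remaining row and reports none.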
