Help

One of the joys of working on an open source project with commercial competitors is having to implement features that our users simply don't ask for, and probably won't use in practice, just because those competitors try to spin their useless features as a competitive advantage. We realized ages ago that it's really hard to tell people that they don't need and shouldn't use a feature if you don't have it.

Multi-table mappings started out as a good example of that kind of features. We have been repeating the your object model should be at /least/ as fine-grained as your relational schema mantra for years now. Unfortunately, we keep hearing this echo back as Hibernate can't do multitable mappings. Nobody has ever once shown me a truly compelling usecase for multitable mappings in a real application, but apparently, if our competitors are to be believed, it is common to find schemas with attributes of the same entity scattered randomly across several different physical tables. I'll have to take their word on that one. I'm not saying you will /never/ run into this kind of thing and, indeed, I've seen a few borderline cases, though nothing that wasn't at least arguably better represented as an association. But certainly, to my mind, valid usecases for multitable mappings are not something you run into commonly enough for this to be an important feature. Perhaps the difference in perception is due to the fact that only /sane/ organizations use Hibernate.

Anyway, we introduced the <join/> mapping, just so we could tell people not to use it. Actually, it was fun to implement, and helped me make some really nice refactorings to the EntityPersister hierarchy.

Then a funny thing happened. I started to think of all kinds of useful things to do with <join/>, none of which had anything much to do with multitable mappings, as usually understood. And I'm pretty certain that these things were not what the other guys were talking about!

The first application I came up with is a mixed inheritance mapping strategy. Before, you had a choice between <subclass/> and <joined-subclass/> (now also <union-subclass/>), and you had to stick with that one strategy for the whole hierarchy.

It's now possible to write a mapping like this:


<class name="Superclass" 
        table="parent"
        discriminator-value="0">
    <id name="id">.....</id>
    <discriminator column="type" type="int"/>
    <property ...../>
    ...
    
    <subclass name="Subclass" 
            discriminator-value="1">
        <property .... >
        ...
    </subclass>
    
    <subclass name="JoinedSubclass" 
            discriminator-value="-1">
        <join table="child">
            <property ...../>
            ....
        </join>
    </subclass>
    
</class>

That's /really/ useful.

The next thing that <join/> can be used for required a little tweak. I added an inverse attribute to the join element, to declare that the joined table should not be updated by the owning entity. Now, it's possible to map an association (link) table - which usually represents a many-to-many association - with one-to-many multiplicity in the domain model. First, we have a basic many-to-many mapping, on the Parent side:


<class name="Parent">
    ...
    <set name="children" table="ParentChild" lazy="true">
        <key column="parentId"/>
        <many-to-many column="childId" class="Child"/>
    </set>
</class>

Now, we use a <join> mapping, to hide the association table from the Child end:


<class name="Child">
    ...
    <join table="ParentChild" inverse="true">
        <key column="childId"/>
        <many-to-one name="parent" column="parentId"/>
    </join>
</class>

Well, I'm not sure really how useful this is, but I was always jealous of the TopLink guys when they bragged how they could do this, and we got it /almost/ for free!

A third trick was also inspired by TopLink. A number of former TopLink users porting code to Hibernate found that Hibernate's table-per-class mapping strategy has significantly different performance characteristics to TopLink's. Hibernate has what seems to be a unique implementation of the table-per-class mapping strategy, in that no discriminator column is required to achieve polymorphism. Instead, Hibernate performs an outer join across all sublass tables, and checks which primary keys values are null in each returned row of results in order to determine the subclass that the row represents. In most circumstances, this offers an excellent performance balance, since it is not vulnerable to the dreaded N+1 selects problem. Furthermore, it does not require the addition of a type discriminator column to the table of the root class, which really feels extremely unnatural and redundant for this relational model.

An alternative approach, that TopLink uses, is to perform an initial query, check the value of a discriminator column, and then issue an extra query if the row represents a subclass instance. This isn't usually very efficient for shallow inheritance trees, but what we've seen is that some ex-TopLink users have created very deep or wide inheritance trees, in which case Hibernate's strategy can result in a single query with simply too many joins.

So, I added the outer-join attribute to <join/>. Its effect is slightly subtle. Consider the following mapping:


<class name="Foo" table="foos" discriminator-value="0">
    <id name="id">...</id>
    <discriminator column="type" type="int"/>
    <property name="name"/>
    <subclass name="Bar" discriminator-value="1">
        <join table="bars">
            <key column="fooId"/>
            <property name="amount"/>
        </join>
    </subclass>
</class>

When we execute a HQL query against the subclass Bar, Hibernate will generate SQL with an inner join between foos and bars. If we query against the superclass Foo, Hibernate will use an outer join.

(Note that you would not write the above mapping in practice; instead you would use <joined-subclass/> and eliminate the need for the discriminator.)

Suppose we set outer-join="false":


<class name="Foo" table="foos" discriminator-value="0">
    <id name="id">...</id>
    <discriminator column="type" type="int"/>
    <property name="name"/>
    <subclass name="Bar" discriminator-value="1">
        <join table="bars" outer-join="false">
            <key column="fooId"/>
            <property name="amount"/>
        </join>
    </subclass>
</class>

Now, when we query the subclass, the same SQL inner join will be used. But when we query the superclass, Hibernate won't use an outer join. Instead, it will issue an initial query against the foos table, and a sequential select against the bars table, whenever it finds a row with a discriminator value of 1.

Well, that's not such a great idea in this case. But imagine if Foo had a very large number of immediate subclasses. Then we might be avoiding a query with very many outer joins, in favor of several queries with no joins. Well, perhaps some people will find this useful....

4 comments:
 
30. Aug 2004, 15:09 CET | Link
Leonid Shlyapnikov
In my previous project we couldn't change database schema and interfaces of some entities, and we needed multitable mappings with hibernate 2.1.x.
josh, if you need multitable mapping you are able to use decomposition + delegation, it looks like adapter design pattern.
ReplyQuote
 
26. Aug 2004, 21:12 CET | Link
Adam Greene | ccm682(AT)hotmail.com
"Have yet to see any one use Multitable" I 'inherited' a project from the Government of Canada, that required 3 different tables to complete the definition of a 'user' (and that was just so they could log in!!!) It would have been great to have had Hibernate 3 to work on that project (as the whole project was done in JDBC and JSP and desperately needed a redo, but couldn't because of the table problems).
 
29. Dec 2004, 10:12 CET | Link
Frans Bouma
"Nobody has ever once shown me a truly compelling usecase for multitable mappings in a real application, but apparently, if our competitors are to be believed, it is common to find schemas with attributes of the same entity scattered randomly across several different physical tables. I'll have to take their word on that one."
If you take ORM (Object Role Modelling, http://www.orm.net) or NIAM (predecessor of ORM) and you have a supertype employee and a subtype manager and you define a relationship between manager and 'company car', it's hard to create that physical relationship in an entity-relational model without 2 tables for the supertype/subtype construction.

I.o.w.: the 2-table approach (all employee attributes in the employee table and the manager specific attributes in the manager table) is required to have the RDBMS do the FK constraint checking for manager - company car. If you flatten the supertype-subtype hierarchy, you'll get an FK attribute in employee which in theory could be filled for a non-manager as well, which requires external check code which is never that great.

However as Nijssen / Halpin also suggest in their book Conceptual schema and relational database design (NIAM): flattening supertype/subtype hierarchies before converting an abstract relational model to an entity-relational model is often more practical, i.e.: will thus result in a 1-table per supertype-subtype hierarchy.

For the rest I agree with you, and I'd go even a step further: inheritance for entities is overrated. Often you can replace the inheritance by a role attribute, as that's more flexible and often solves a problem that is unsolveable with inheritance (person <- customer and person <- employee, employee is also customer.. oops)

It was refreshing to read your article. Finally someone of a widely used O/R mapper team speaks about this in other terms than "Inheritance/multi-table/the more mapping types the better"-marketing goo.

Frans Bouma, lead developer LLBLGen Pro (O/R mapper for .NET, one of the marketleaders there)
 
26. Aug 2004, 20:03 CET | Link
This is an excellent addition to hibernate, one that we will be able to use. Our database is indeed modeled as a wide AND deep hierarchy, and the discriminator value is at the very top. This database design predated my employment, and it has been painful to work with. But with this sort of H3 support it will be more tractable.

Is there a timeframe for H3's solidity? Are you still embroiled in the EJB3 JSR creation?
Post Comment