Help

In Hibernate there is a particular branch of logic where we need to parse and validate an org.xml.sax.InputSource that might represent a Hibernate mapping (hbm.xml) file, a 1.0-compliant orm.xml file, or a 2.0-compliant orm.xml file. Hibernate mapping files are defined by a DTD, and both versions of orm.xml by an XSD. Currently the code builds a SAXReader with DTD and Schema validation enabled and tries to read in the source. It first maps Schema validation to the 2.0 version of the XSD; if an error occurs, it re-parses with Schema validation mapped to the 1.0 version of the XSD.

Now I am not an XML guru, but this seemed wasteful to me. Try as I might, I could not find a way to say that the XSD should resolve to one (local) file if the root element defines version=2.0 as an attribute and to another if it defines version=1.0. Really, I guess it's a matter of conditionally resolving the schemaLocation. Anyone know if this is possible?

My next thought was to leverage the javax.xml.validation.Validator contract added in JDK 1.5. Basically, I would enable DTD validation of the document during parse and simply parse the document initially. Then I would peek at the root element to see whether the document is a Hibernate mapping or an orm.xml. If it is an orm.xml, I would check its version attribute, load a Validator based on the proper XSD, and do a Validator.validate( new DOMSource(...) ). First, given the separate parse and validate steps, just how much slower will this be?
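To make the parse-then-validate idea concrete, here is a minimal sketch using plain JAXP DOM rather than the dom4j SAXReader the real code uses. The class name, method name, and the two tiny inline schemas are hypothetical stand-ins for the real orm_1_0.xsd and orm_2_0.xsd, and DTD validation of hbm.xml files is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class VersionedValidationSketch {

    // Hypothetical toy schemas; the real ones would be loaded from local XSD files.
    static final String XSD_V1 = xsd("1.0", "");
    static final String XSD_V2 = xsd("2.0",
            "<xs:element name='cacheable' type='xs:boolean' minOccurs='0'/>");

    private static String xsd(String version, String extraElements) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='entity-mappings'><xs:complexType><xs:sequence>"
             + "<xs:element name='entity' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
             + extraElements
             + "</xs:sequence>"
             + "<xs:attribute name='version' type='xs:string' fixed='" + version + "' use='required'/>"
             + "</xs:complexType></xs:element></xs:schema>";
    }

    /**
     * Parses the document once, then validates the resulting DOM against the
     * schema selected by the root element's version attribute.
     * Returns the version used, or null if the root is not entity-mappings.
     */
    static String parseAndValidate(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // schema validation of a DOM needs namespace info
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        Element root = doc.getDocumentElement();
        if (!"entity-mappings".equals(root.getLocalName())) {
            return null; // e.g. a hibernate-mapping document; no XSD to apply
        }

        String version = root.getAttribute("version");
        String xsdText = "2.0".equals(version) ? XSD_V2 : XSD_V1;

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsdText)));
        // With no ErrorHandler set, a Validator throws SAXParseException on errors.
        schema.newValidator().validate(new DOMSource(doc));
        return version;
    }
}
```

The Schema objects could be built once and cached, since a Schema is immutable; the sketch rebuilds them on each call only to stay short.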

Also, I had a very irksome issue with this approach anyway. In my tests I created a document that is valid according to the 2.0 schema, but I identified the version as 1.0 and used the 1.0 version of the Schema. When running in (Sun) JDK 1.6 the test was successful in that the validator did in fact complain; but on (Sun) JDK 1.5 the validator simply did not complain at all. 1.5 and 1.6 appear to be using different internal SchemaFactory implementations. 1.6 used com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory, while 1.5 used com.sun.org.apache.xerces.internal.jaxp.validation.xs.SchemaFactoryImpl.
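A quick way to see which SchemaFactory implementation a given JVM hands out is to print the class name of the factory instance. This is only a small diagnostic sketch (the class and method names are made up for illustration); the name it prints depends on the JDK and on anything placed on the classpath or in the endorsed directory.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

final class SchemaFactoryProbe {

    /** Returns the fully qualified class name of the W3C XML Schema factory in use. */
    static String implementationName() {
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .getClass()
                .getName();
    }

    public static void main(String[] args) {
        System.out.println(implementationName());
    }
}
```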

Perhaps I am just doing something wrong? The code can be seen at https://fisheye2.atlassian.com/browse/Hibernate/core/trunk/core/src/main/java/org/hibernate/util/xml/MappingReader.java?r=20321#l118 (it's the commented-out code).

Assuming I did not make a mistake, what is the best way around this?

4 comments:
 
09. Sep 2010, 08:59 CET

I have had problems with differences in the JAXB implementation between Java 5 and 6 as well. I think both the supported version and the implementation changed between these two releases. The easiest approach is to just not use javax.xml.validation.Validator and stick with what we have. In Hibernate Validator we explicitly define a dependency on jaxb-api (2.2) and jaxb-impl (2.1.12). The scope of the dependency is set to 'provided'. People using Java 6 don't have to do anything, but on Java 5 you will have to add the dependencies to ensure you have the right JAXB version and implementation. Whether you want to go down this path in Hibernate Core is another question.

 
09. Sep 2010, 09:23 CET
Grzegorz Grzybek | gr.grzybek(AT)gmail.com

I have also had problems with the JAXB implementations provided by Java SE. Since then I always explicitly declare a dependency on the JAXB impl, currently:

<dependency>
  <groupId>com.sun.xml.bind</groupId>
  <artifactId>jaxb-impl</artifactId>
  <version>2.2.1.1</version>
</dependency>

(that way I bypass the version that comes with the container's Java EE conformance and use my own).

For validation, I had a strange error with the standard jdk1.6.020 implementation: in multithreaded scenarios I was receiving random validation errors (the same schema, the same document!), and because Sun (Oracle) is not using JIRA, I didn't have time to file an issue :) The validation problems disappeared after I switched to Xerces (either 2.10.0 or 2.9.1) placed in the endorsed directory. That's not the case for Hibernate configuration validation, but be aware that the standard validation implementation (which is com.sun.org.apache.xerces.*) is somehow different.

And for the last and most important case: I've also had problems with different XMLs using either DTDs or different schemas. You're in a better position, since you know the schema/DTD by looking at the root element. But that makes you think there should be a way to automatically select the proper XSD/DTD in one pass. I had to read the schema/DTD location by parsing the beginning of the XML.

My suggestion (to make you feel better regarding the speed loss resulting from double parsing) is to use StAX API (or SAX handler exiting after you read what's interesting) for first pass just to read the first element...
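A rough sketch of such a first pass with StAX might look like the following. The class and method names are hypothetical, and DTD support is switched off on the assumption that the peek should not try to fetch an external DTD such as the Hibernate mapping one.

```java
import java.io.Reader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class RootElementPeek {

    /**
     * Reads only as far as the first start tag and returns
     * { root element local name, value of its "version" attribute or null }.
     * Returns null if the input contains no element at all.
     */
    static String[] peek(Reader in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption: the first pass only needs the root tag, so skip DTD processing.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);

        XMLStreamReader reader = factory.createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    // A null namespace URI means the attribute namespace is not checked.
                    return new String[] {
                        reader.getLocalName(),
                        reader.getAttributeValue(null, "version")
                    };
                }
            }
            return null;
        } finally {
            reader.close(); // closes the StAX reader, not the underlying Reader
        }
    }
}
```

Because the pull parser stops as soon as the caller stops asking for events, the rest of the document is never tokenized; the full validating parse can then be configured with the right XSD in a second pass.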

regards

 
09. Sep 2010, 19:36 CET
Hardy Ferentschik wrote on Sep 09, 2010 02:59:
In Hibernate Validator we explicitly define a dependency to jaxb-api (2.2) and jaxb-impl (2.1.12). The scope of the dependency is set to 'provided'. People using Java 6 don't have to do anything, but on Java 5 you will have to add the dependencies to ensure you have the right jaxb version and implementation.

Right, this is the option (2) listed in the referenced code. As for whether to go this route, well I'd prefer it JustWorked(tm) and would be much more inclined to if that were the case.

 
12. Sep 2010, 01:38 CET
Grzegorz Grzybek wrote on Sep 09, 2010 03:23:
My suggestion (to make you feel better regarding the speed loss resulting from double parsing) is to use StAX API (or SAX handler exiting after you read what's interesting) for first pass just to read the first element...

Hey Grzegorz, thanks for the suggestion. I did in fact try this approach (using SAX) and it worked great here locally. I was nervous because this code receives an InputSource which, probably 90% of the time, is built with an InputStream obtained from classpath lookups. My concern was that I had to explicitly mark/reset the stream, and whether mark/reset works is going to be very environment-specific. So I check up front using markSupported, and only if it returns true do I run this SAX code as an optimization.
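For illustration, the markSupported guard can be sketched roughly like this. It is not the actual MappingReader code: the class name, method name, and the 8 KB limit are assumptions, and instead of handing the live stream to a parser it copies a bounded prefix so the read can never run past the mark limit.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class MarkResetSketch {

    // Assumed upper bound on how much of the document is needed to see the root tag.
    static final int PEEK_LIMIT = 8 * 1024;

    /**
     * Returns up to PEEK_LIMIT leading bytes of the stream and rewinds it,
     * or returns null (leaving the stream untouched) if mark/reset is unsupported.
     */
    static byte[] peekPrefix(InputStream stream) throws IOException {
        if (!stream.markSupported()) {
            return null; // caller falls back to the non-optimized path
        }
        stream.mark(PEEK_LIMIT);
        try {
            byte[] buffer = new byte[PEEK_LIMIT];
            int total = 0;
            int read;
            // Never read more than PEEK_LIMIT bytes, so the mark stays valid.
            while (total < buffer.length
                    && (read = stream.read(buffer, total, buffer.length - total)) != -1) {
                total += read;
            }
            return Arrays.copyOf(buffer, total);
        } finally {
            stream.reset(); // rewind so the full validating parse sees the whole document
        }
    }
}
```

The returned prefix can be wrapped in a ByteArrayInputStream and given to the lightweight SAX or StAX peek, while the original stream goes to the full parse unchanged.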
