
Today Oracle's Mark Reinhold published for comment a public draft of the requirements for the Java module system being proposed for inclusion in the Java SE 8 platform. The requirements as given are fairly high-level yet comprehensive, and many of them align well with the goals and design of our own JBoss Modules system, which is not only a strong validation of our design but also, I think, a good sign for the future of the Java platform.

The requirements as posted, however, contain some things that are specifically noteworthy and some that in my view should definitely be changed. This is based on my own experience implementing the JBoss Modules environment and applying it to JBoss AS 7 and 6 (which we did on an experimental basis a while back), as well as on feedback from my colleagues responsible for JBoss OSGi, which is presently implemented atop JBoss Modules. Being very sizable and very different projects, these efforts have taught us a lot about the practical nature of modularity in Java.

Since this is a big document, I'm going to comment on it a bit at a time and see where we end up.

The Requirements

The document, which may be found here, is broken up into a number of sections, which I will attempt to address approximately in order.

Fundamentals: Versions

The items in the Fundamentals section all indicate a strong sense that the Jigsaw project is on the right general track. However, I found a couple of sections noteworthy:

Resolution in all phases — Build-time, install-time, and run-time module resolution and linking must all be supported. UNIX has all three models and they are all still in use 40 years later.

coupled with the following paragraph:

Fidelity across all phases — The set of modules seen by a library or application must be computed by the same algorithm at build time, install time, and run time.

and this one:

Versioning — The module system must support common version-string schemes [...]. It must be possible to specify a range of allowable versions when declaring a module dependence.

These seemingly reasonable statements hide a particularly tricky little goblin; namely, build reproducibility. At build-time, an artifact (to use the Maven term) is identified in part by its version. And that identity has a specific meaning. If I check out version 1.2.3 of jboss-cobwobble from source control and build it, I should get an artifact which is equivalent to one which anyone else who checks out and builds the same version of the same project gets.

However, that means that such builds cannot use version ranges. By definition, ranges create a way for two parties building the same project to get different output, thus blurring the meaning of what a version is. In the first paragraph, Mark emphasises the similarities to the shared library model and recommends that similar practices be followed; however, there is a key difference: when one builds against a shared library, the linkage is much less invasive. A change to a compiled shared library does not generally affect the compilation results of a project that uses it, whereas in Java a change to a dependency can easily alter the compiled output, since class files are a much richer format than a shared object's symbol table.

Therefore it is very important to allow (better yet, require) builds to use specific dependency versions when building, and allow version ranges only at install and run time, where version ranges are a much better fit.

Another very important consideration is that after an artifact is produced, the set of dependency versions it is known to be compatible with evolves over time as new versions of various things are released. Thus dependency version ranges with a closed upper end cannot be a fixed part of the module's internal metadata, or there is a risk over time that the module will have to be repackaged to accommodate new version compatibilities. My personal recommendation, in fact, is that modules only ever support a lower bound on version ranges for package/run-time constraining.
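To make the recommendation concrete, here is a minimal sketch of a lower-bound-only version constraint: any version at or above the declared minimum satisfies the dependency, so no upper bound ever has to be baked into the module's metadata. The class and method names are purely illustrative, not part of any real module system API.

```java
import java.util.Arrays;

public class LowerBoundConstraint {
    // Compare dotted version strings numerically, segment by segment;
    // missing trailing segments are treated as zero ("1.2" == "1.2.0").
    static int compare(String a, String b) {
        int[] va = Arrays.stream(a.split("\\.")).mapToInt(Integer::parseInt).toArray();
        int[] vb = Arrays.stream(b.split("\\.")).mapToInt(Integer::parseInt).toArray();
        for (int i = 0; i < Math.max(va.length, vb.length); i++) {
            int x = i < va.length ? va[i] : 0;
            int y = i < vb.length ? vb[i] : 0;
            if (x != y) return Integer.compare(x, y);
        }
        return 0;
    }

    // A lower-bound-only constraint: any version >= the minimum satisfies it,
    // so newly released versions never require repackaging the dependent module.
    static boolean satisfies(String candidate, String minimum) {
        return compare(candidate, minimum) >= 0;
    }
}
```

At build time, by contrast, the dependency would be pinned to one exact version, keeping the build reproducible.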

Fundamentals: Target constraints on modules

I admit I do not understand the purpose of this paragraph, which reads as follows:

Target constraints on modules — On a particular target platform it must be possible to declare that only modules with specific properties can be installed, e.g., particular authors, publishers, or licenses. The resolution algorithm must ignore any module that does not satisfy the target platform’s module constraints.

If the purpose of this is security, which I can very much get behind, then I think the only sensible approach is to constrain module loading to signed modules. Otherwise, if the goal is simply to create a mechanism by which administrators can annoy each other, then I guess this fits the bill; the restriction can be bypassed by changing the metadata, which means that it provides neither security nor convenience.

Overall I think this mechanism also risks fragmentation of the module ecosystem that would (hopefully) arise around this enhancement. Perl, for example, benefits greatly from a single repository of modules which is nevertheless open to contributors with relatively little restriction; yet operating system distributors (know any of those?) have established best practices for the distribution of these modules. And of course today's Java developers often use the Maven central repository and expect it to contain everything in reasonable locations. By creating these filtering mechanisms, however, the problem of many publishers is being solved in the wrong place, and it allows for some potentially very surprising behavior (I can see the FAQ now: I installed a module but Java says it's not found... what gives?).

Fundamentals: Native code

This section basically says that modules need to support native code somehow, though it seems to imply that the native library has to live inside the module itself, which I think is probably unnecessary (though of course it would have to exist within the module packaging, else there would be no way to actually install it). In JBoss Modules, we simply require that native libraries exist in filesystem-backed resource loaders (as opposed to jar-backed resource loaders, which one would normally use for classes). In addition, the filesystem resource loader uses the name of the OS and hardware platform to allow for packaging modules which contain native bits for more than one platform (actually not unlike how Perl's DynaLoader does it, if I recall correctly).
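The platform-keyed layout can be sketched as follows: normalize the JVM's `os.name` and `os.arch` properties into a directory name, and resolve native libraries under it. The directory convention shown here is hypothetical, chosen only to illustrate the idea of one module carrying native bits for several platforms.

```java
public class NativePath {
    // Normalize an os.name/os.arch pair into a directory name such as
    // "linux-amd64"; only the first word of os.name is kept, lowercased,
    // so "Mac OS X" becomes "mac". A real implementation would need a
    // fuller normalization table than this.
    static String platformDir(String osName, String osArch) {
        String os = osName.toLowerCase().split(" ")[0];
        return os + "-" + osArch.toLowerCase();
    }

    // Resolve a native library inside a module's filesystem-backed
    // resource root, keyed by the current platform.
    static String libraryPath(String moduleRoot, String osName, String osArch,
                              String libFileName) {
        return moduleRoot + "/lib/" + platformDir(osName, osArch) + "/" + libFileName;
    }
}
```

At run time one would pass `System.getProperty("os.name")` and `System.getProperty("os.arch")`, and only the directory matching the running platform is ever consulted.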

Fundamentals: Package subsets and Shared class loaders

These sections I have a serious problem with:

Package subsets — In support of platform modularization, it must be possible for the types defined in a Java package to be provided by more than one module yet still be loaded by the same class loader at run time. This is required so that CDC-sized subsets of large legacy packages such as java.util can be defined.
Shared class loaders — In support of platform modularization, it must be possible to declare that the types defined in a specific set of modules must be loaded by the same class loader.

I am very much against this notion. One module, one class loader, period, end of story. If you split a single class loader among many modules, you haven't created modules; you've created one module with a class path of several JARs, some of whose contents may be missing, plus a hell of a lot of useless complexity, all for the privilege of saying sure, we modularized this. There is no isolation of any sort between them beyond the standard access controls provided by the Java language. Without a strong definition of a module as an encapsulating vehicle for a class loader, the whole idea is weakened - visibility is only sensibly enforced in terms of whole class loaders (yes, while it is probably possible to use a different unit of division, such as the class, it would definitely have a negative performance impact).

Rather than thus weakening the definition of a module, I recommend that modules be allowed to be packaged in portions, which may be separately shipped and installed - perhaps a module/submodule concept could be introduced, where the contract of a submodule is clearly defined as sharing the parent module's class loader. In other words, the label should be clearly marked, so to speak. There is no reasonable expectation that a package may be split between modules, and doing so is definitely a violation of the Principle of Least WTF.
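The "one module, one class loader" rule can be illustrated with a minimal sketch (this is not the JBoss Modules API, and the names are invented for illustration): each module owns exactly one dedicated loader, created with it, and visibility would be enforced by delegating only to the loaders of declared dependencies.

```java
import java.util.List;

public class ModuleSketch {
    static final class SketchModule {
        final String name;
        final List<SketchModule> dependencies;
        final ClassLoader loader;

        SketchModule(String name, List<SketchModule> dependencies) {
            this.name = name;
            this.dependencies = dependencies;
            // Exactly one loader per module, with no parent delegation;
            // a real implementation would override findClass to load bytes
            // from this module's own resource loaders, and consult the
            // dependency modules' loaders for imported packages.
            this.loader = new ClassLoader(null) {};
        }
    }
}
```

Because the loader is part of the module's identity, two modules can never share one, which is precisely the property the "shared class loaders" requirement gives up.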

Security: Where art thou?

Though there are various paragraphs which allude to it, there is no specific section addressing security, which I think is a somewhat serious oversight. I would personally like to see some enhancements to the standard security provider mechanism to accommodate modules and module signers more conveniently than the mechanisms at our disposal today, and I commented earlier about module signing, to say nothing of modular security provider discovery (okay, this kind of thing is currently listed as a non-requirement, to be fair - but it will always be in the back of my mind, and you can bet that it will show up in JBoss AS 7.x at some point if I have anything to say about it).

Summary, Part 1

That's about it for the first chapter. Overall the Fundamentals leave me with a sense that things are somewhat more under control in Jigsaw-land... in particular, it's good news that the brakes are apparently being gently applied to implementation in order to come up with real requirements, which are essential to ensuring that a project is firmly on the correct planetary surface. Just ask any of my colleagues, they'll tell you how I like a good requirements document!

Next time I'll try to cover a little more ground, as the subsequent chapters are mostly rather shorter than Fundamentals is.

15 comments:
 
26. May 2011, 10:46 CET | Link

The issue with builds not being able to use version ranges is solved by doing the build in the context of a target platform, which restricts what versions are allowed in that context.

That sounds doable to me without breaking Fidelity across all phases — The set of modules seen by a library or application must be computed by the same algorithm at build time, install time, and run time.

 

--max

 
26. May 2011, 11:21 CET | Link
A change to a compiled library used by a project does not generally affect the compilation results of that project, whereas in Java this is much easier to do, since class files are a much richer format than a shared object symbol table.

David, can you elaborate or provide an example how this becomes a problem when linking to a later version of a jar?

 
26. May 2011, 14:08 CET | Link
Sakuraba | saku(AT)raba.jp
It amazes me that in the Java world it takes us that long to get something like this done, while non-company backed languages like Ruby (gems), Python (pip) and even JavaScript (npm for nodejs) have been getting modules done right for a long time.
 
26. May 2011, 14:13 CET | Link
Package subsets — In support of platform modularization, it must be possible for the types defined in a Java package to be provided by more than one module yet still be loaded by the same class loader at run time. This is required so that CDC-sized subsets of large legacy packages such as java.util can be defined.

AIUI this is required to allow the JDK to be modularized. So yes, were this aimed only at greenfield, I guess it wouldn't be needed.

 
26. May 2011, 14:50 CET | Link
Jim Tyrrell

I love the idea that if I build something, I get the exact same class file. One thing to think about is that, at least in the past, a date/time was included in the compiled class file. It would be great to have complete binary compatibility between binaries built from the same source.

Jim

 
26. May 2011, 15:40 CET | Link
Anonymous
Pete Muir wrote on May 26, 2011 08:13:

Package subsets — In support of platform modularization, it must be possible for the types defined in a Java package to be provided by more than one module yet still be loaded by the same class loader at run time. This is required so that CDC-sized subsets of large legacy packages such as java.util can be defined.

AIUI this is required to allow the JDK to be modularized. So yes, were this aimed only at greenfield, I guess it wouldn't be needed.

I think that's the whole point, it is fine that the JDK needs this, so the Jigsaw team can still implement this feature internally and use it, but it shouldn't be part of the end-developer facing "standard" Java module system, since it does not promote modularity.
 
26. May 2011, 16:17 CET | Link
The issue with builds not being able to use version ranges is solved by doing the build in context of a target platform which adds restrictions on to the builds on what versions are allowed in that context.

Again this really blurs the definition of a version, since I can take the same version of the same code and compile it in two different places and get two different results, which is not good.

 
26. May 2011, 16:19 CET | Link
David, can you elaborate or provide an example how this becomes a problem when linking to a later version of a jar?

Off the top of my head - you can add a more specific method, which would change how the same code is compiled the next time it is compiled; you can change the return type of a method to make it covariant, causing linking to occur to the new bridge method instead of the original; I'm sure there are quite a few generics changes you can do which cause compilation against a bridge method. Etc, etc.
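The first point can be shown in a few lines. Imagine a hypothetical library whose version 1 had only `m(Object)`; version 2 adds `m(String)`. The client source below is unchanged between the two versions, yet recompiling it against version 2 links the call to the new, more specific overload, because Java overload resolution happens at compile time:

```java
public class OverloadDemo {
    // "Version 2" of the library: m(String) was added alongside the
    // original m(Object). Against "version 1", only m(Object) existed.
    static String m(Object o) { return "m(Object)"; }
    static String m(String s) { return "m(String)"; }

    // Client code, identical in source between library versions. Compiled
    // against v1 this call could only bind to m(Object); compiled against
    // v2, the compiler selects the more specific m(String).
    static String client() {
        return m("hello");
    }
}
```

So the same client source produces different bytecode depending on which dependency version was on the compilation class path - exactly the reproducibility problem version ranges introduce at build time.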

 
26. May 2011, 16:23 CET | Link

While I don't buy that this is necessary for the JDK to be modularized - I think it's due in large part to Java ME (Mark expounds on that a very little bit in the document) - I acknowledge that there is very, very little chance that an outside party will be able to affect this requirement; thus I think at the least it should be mitigated to avoid this unfortunate problem from affecting everyone who uses modules. By introducing a submodule concept the module contract can include class loader association and yet the requirement can still be met.

 
26. May 2011, 17:29 CET | Link
Anonymous
David Lloyd wrote on May 26, 2011 10:23:
While I don't buy that this is necessary for the JDK to be modularized - I think it's due in large part to Java ME (Mark expounds on that a very little bit in the document) - I acknowledge that there is very, very little chance that an outside party will be able to affect this requirement; thus I think at the least it should be mitigated to avoid this unfortunate problem from affecting everyone who uses modules. By introducing a submodule concept the module contract can include class loader association and yet the requirement can still be met.

+1

But I agree, it is unlikely that this plea will be heard.

 
26. May 2011, 18:00 CET | Link
Rob Cernich
David Lloyd wrote on May 26, 2011 10:23:
While I don't buy that this is necessary for the JDK to be modularized - I think it's due in large part to Java ME (Mark expounds on that a very little bit in the document) - I acknowledge that there is very, very little chance that an outside party will be able to affect this requirement; thus I think at the least it should be mitigated to avoid this unfortunate problem from affecting everyone who uses modules. By introducing a submodule concept the module contract can include class loader association and yet the requirement can still be met.

I don't see why there couldn't be a java.util module which exposes the API, but is implemented by private sub-modules. I wouldn't be surprised if this is behind the friend requirement.

Is there any consideration for making module resolution an SPI? It seems like this would allow different module implementations to interact natively without having to change their packaging specifications (e.g. OSGi).

 
26. May 2011, 18:12 CET | Link
Anonymous
Rob Cernich wrote on May 26, 2011 12:00:
Is there any consideration for making module resolution an SPI? It seems like this would allow different module implementations to interact natively without having to change their packaging specifications (e.g. OSGi).

That would only really be helpful if there is agreement about the metadata of a module. For example, it would be difficult for an OSGi resolver to reason about side-by-side version consistency, since Jigsaw modules won't include uses constraint information.

 
26. May 2011, 22:35 CET | Link
Solerman Kaplon | solerman(AT)wonder.com.br
I admit I do not understand the purpose of this paragraph, which reads as follows: Target constraints on modules — On a particular target platform it must be possible to declare that only modules with specific properties can be installed, e.g., particular authors, publishers, or licenses. The resolution algorithm must ignore any module that does not satisfy the target platform’s module constraints.

Simple: as a company, I don't want to mix GPL and non-GPL code, period. Get it? And then there is the whole patent thingie getting companies scared all around (and they follow lawyers, not coders).

 
27. May 2011, 20:25 CET | Link
as a company, I don't want to mix GPL and non-GPL code, period

I don't think it's wise to let your legal obligations be determined by some package metadata. I don't think any judge will agree when you say in court but we had the dont-allow-gpl flag set! really!

 
21. Dec 2011, 19:07 CET | Link
I just have the strong feeling that this is not enough - too little, too late?

The "new" big picture

    http://cr.openjdk.java.net/~mr/jigsaw/notes/jigsaw-big-picture-01

turns out to be a very small one. If you really need modularization today, take a look at the really big pictures drawn by Maven, Spring, and OSGi.

And here are my thoughts:

    http://blog.oio.de/2011/12/21/project-jigsaw-jdk8-big-picture/