
The latest Hibernate Search beta v. 4.2.0.Beta2 is available!

This iteration introduces Apache Tika integration, lets Spatial Queries sort on distance, and as usual includes a list of less noticeable improvements.

Apache Tika integration

Apache Tika allows you to extract text from any kind of document and index it: MP3 metadata, PDF text, office files and so on. You can annotate a Blob field if you load the media files from a database, or have a String field point to a resource or file path.

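import java.sql.Blob;

import javax.persistence.Basic;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Lob;

import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.TikaBridge;

@Entity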
@Indexed
public class Book {

	Integer id;
	Blob content;

	@Id @GeneratedValue
	public Integer getId() {
		return id;
	}

	public void setId(Integer id) {
		this.id = id;
	}

	@Lob @Basic(fetch = FetchType.LAZY)
	@Field @TikaBridge // <- just add the TikaBridge as an adaptor to make the Blob indexable
	public Blob getContent() {
		return content;
	}

	public void setContent(Blob content) {
		this.content = content;
	}
}

The @TikaBridge annotation supports more options to tune the kind of text extraction; refer to the documentation for more details. Consider this feature experimental for now: we didn't add an option to make the text extraction asynchronous yet, so we might need to change the API to introduce that.

Spatial Queries sorted by distance

Thanks to all of Nicolas Helleringer's work, it's now easy to:

  • Return the distance from the search center to each hit (via a projection)
  • Apply a sort criterion on the distance

Let's see an example from our large collection of self-documenting examples (the testsuite!):

QueryBuilder builder = em.getSearchFactory().buildQueryBuilder().forEntity( Cafe.class ).get();

org.apache.lucene.search.Query luceneQuery = builder.spatial()
    .onCoordinates( "location" )
    .within( 100, Unit.KM )
        .ofLatitude( centerLatitude )
        .andLongitude( centerLongitude )
    .createQuery();

FullTextQuery hibQuery = em.createFullTextQuery( luceneQuery, Cafe.class );

Sort distanceSort = new Sort( new DistanceSortField( centerLatitude, centerLongitude, "location" ) );

hibQuery.setSort( distanceSort );

hibQuery.setProjection( FullTextQuery.THIS, FullTextQuery.SPATIAL_DISTANCE );

hibQuery.setSpatialParameters( centerLatitude, centerLongitude, "location" );

List results = hibQuery.getResultList();
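
To make it clearer what the distance projection and the distance sort give you, here is a rough plain-Java sketch of the same idea outside of Hibernate Search. It is only an illustration under stated assumptions: the DistanceSortSketch class, its nested Cafe type and the nearest() helper are hypothetical names, the haversine formula and the Earth radius constant are choices made for this sketch, and none of it is meant to describe how Hibernate Search computes distances internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of the Hibernate Search API.
final class DistanceSortSketch {

    // Mean Earth radius in kilometres (an assumption of this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    // Hypothetical stand-in for an indexed entity carrying coordinates.
    static final class Cafe {
        final String name;
        final double latitude;
        final double longitude;

        Cafe(String name, double latitude, double longitude) {
            this.name = name;
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    // Great-circle distance in km between two points given in degrees (haversine formula).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Keeps the cafes within radiusKm of the center and returns rows of
    // { cafe, distance in km }, nearest first. The row layout mirrors the
    // two-element projection used in the query above.
    static List<Object[]> nearest(List<Cafe> cafes, double centerLat, double centerLon, double radiusKm) {
        List<Object[]> rows = new ArrayList<>();
        for (Cafe cafe : cafes) {
            double distance = distanceKm(centerLat, centerLon, cafe.latitude, cafe.longitude);
            if (distance <= radiusKm) {
                rows.add(new Object[] { cafe, distance });
            }
        }
        rows.sort(Comparator.comparingDouble(row -> (Double) row[1]));
        return rows;
    }
}
```

With a projection of two elements, each result row is an Object[] holding the entity first and the distance second, which is the shape the sketch imitates.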

Several more reasons to upgrade

  • Apache Lucene upgraded to version 3.6.1
  • JMS and JMX integrations improved
  • The MassIndexer now correctly applies EntityIndexingInterceptor
  • Lower memory usage
  • Spatial Queries improved
  • Improved some classloaders for better integration with other libraries

The complete list of changes can be found here. Check the Migration Guide.

It has been a while since 4.2.0.Beta1 but the summer is over, so try these quickly as we'll move to the Final soon! As always, feedback is very welcome.

Comments:
 
19. Oct 2012, 08:16 CET | Link

Yah! Against all odds, we are progressing this week :)

19. Oct 2012, 09:32 CET | Link
MDMD
"The MassIndexer now correctly applies EntityIndexingInterceptor" - good, thanks!
 
19. Oct 2012, 19:50 CET | Link
"The MassIndexer now correctly applies EntityIndexingInterceptor"

So how does this actually work? I always thought that the MassIndexer worked with raw data and the EntityIndexingInterceptor works with entity references. BTW that captcha is darn hard

 
20. Oct 2012, 01:48 CET | Link
"I always thought that the MassIndexer worked with raw data and the EntityIndexingInterceptor works with entity references."

It's partially raw, not all of it. It's a concurrent pipeline with different stages of transformation; at some point it actually incarnates the usual entity so we have a chance to apply interception and avoid a good deal of work - but not all of it. It's also nice to take advantage of 2nd level caching.

If HSEARCH-499 were resolved it could even skip some data loading in the first two stages, but these are usually very quick anyway, while the complexity would rise significantly. I'd need time for some experiments, or someone to volunteer to try out the different approaches.

 