Quick Intro to Hibernate Search and Lucene

In my previous post, “Adding the Power of Search to Your Hibernate App – The Easy Way”, I talked about when you might want to integrate a search capability into your application using Hibernate Search, and about how Hibernate Search relates to Hibernate Core, Lucene and Solr. In this post, we’re going to take a quick look at a sample application (really, it’s a JUnit test case). It uses Hibernate Core, with Java Persistence API (JPA) annotations, to persist a simple entity to a relational database, and Hibernate Search to run some searches against the Lucene indexes that are created and updated as the Hibernate-managed entities change in the database.

There are a few things I believe are worth noting before we dig in:

  1. The example is based on Hibernate Search, version 4.3.0. I’ve attached the source code to this post as an Eclipse project. Simply use the Eclipse “import” function. You may need the m2eclipse plugin. However, you may be able to build and run the unit tests without it.
  2. Hibernate Search automatically creates the Lucene index and keeps it up-to-date in the background, as persistent entities are created/updated/deleted in the database. There’s no code to write to update the index.
  3. Annotations – both JPA and Hibernate Search annotations – trigger the Hibernate framework to do most of the work for us relative to creating the database table for the entity, persisting the test data in the database and causing Lucene to build and populate its indexes with the data we want to be able to search.
  4. As this is a JUnit test, we’re taking advantage of the in-memory database capability of the H2 Database, as well as the in-memory Lucene index capability. If we wanted to create a persistent, on-disk database and indexes, only a few lines in the Hibernate XML configuration file would change.
  5. The purpose of this post is to illustrate how you can use Hibernate Search to make Hibernate-managed entities searchable, and how you can then search for those entities in the Lucene indexes. It should give you an idea of how little it takes to add search capability to a Hibernate-based application. There’s a lot more to know about Lucene, including how text tokenizers and analyzers work, and there’s a lot more you can do with Lucene and Hibernate Search; we’re just scratching the surface here. If this post piques your interest, I’d encourage you to explore the Lucene and Hibernate Search projects at their respective web sites.
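To make point 2 a little more concrete, here is a toy sketch in plain Java of the general idea: a store notifies an index whenever something is saved or deleted, so the calling code never touches the index itself. Every name in it (ToyStore, ToyIndex, ChangeListener) is made up for illustration. It is not how Hibernate Search is implemented, only a rough picture of the behavior you get for free.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy illustration only: none of these types exist in Hibernate or Lucene. */
final class ToySyncDemo {

    /** Callback fired by the toy store whenever an entity changes. */
    interface ChangeListener {
        void onSave(long id, String text);
        void onDelete(long id);
    }

    /** Stand-in for the database: keeps rows and notifies the listener of changes. */
    static final class ToyStore {
        private final Map<Long, String> rows = new HashMap<>();
        private final ChangeListener listener;

        ToyStore(ChangeListener listener) {
            this.listener = listener;
        }

        void save(long id, String text) {
            rows.put(id, text);
            listener.onSave(id, text);
        }

        void delete(long id) {
            rows.remove(id);
            listener.onDelete(id);
        }
    }

    /** Stand-in for the search index: maps each lower-cased word to matching ids. */
    static final class ToyIndex implements ChangeListener {
        private final Map<String, Set<Long>> postings = new HashMap<>();

        @Override
        public void onSave(long id, String text) {
            onDelete(id); // drop stale entries so an update replaces the old document
            for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!word.isEmpty()) {
                    postings.computeIfAbsent(word, k -> new TreeSet<>()).add(id);
                }
            }
        }

        @Override
        public void onDelete(long id) {
            for (Set<Long> ids : postings.values()) {
                ids.remove(id);
            }
        }

        /** Returns a copy of the ids whose text contained the given word. */
        Set<Long> search(String word) {
            Set<Long> ids = postings.get(word.toLowerCase(Locale.ROOT));
            return ids == null ? new TreeSet<Long>() : new TreeSet<Long>(ids);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyStore store = new ToyStore(index);
        store.save(1L, "This is a true classic");
        System.out.println("after save: " + index.search("classic"));
        store.delete(1L);
        System.out.println("after delete: " + index.search("classic"));
    }
}
```

The calling code in main only ever talks to the store; the index follows along on its own, which is the experience Hibernate Search aims to give you with real entities.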

I’ll illustrate how you can get started with Hibernate Search using a very simple example. We have a single entity, a Car, that we want to persist, and our unit test will persist it to the H2 Database in-memory store. I’ll use JPA annotations to instruct Hibernate on how to persist this entity; Hibernate will also automatically create the database table to store our Car objects. I’ll use Hibernate Search annotations to tell that framework that we want our Car objects indexed, which Car fields we want indexed, and whether we want their values actually stored within the index (versus requiring them to be retrieved from the H2 Database). Finally, we’ll put it together by configuring Hibernate and Hibernate Search in a Hibernate configuration file and creating a JUnit test case that saves some test Cars to the H2 Database, which in turn triggers Hibernate Search to pass the annotated fields to Lucene for indexing. Each individual test within the test case illustrates a slightly different approach to using the Hibernate Search and Lucene APIs to achieve the same search.
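The difference between indexing a field and storing its value in the index is easy to blur, so here is a small plain-Java sketch of it. The class ToyStoredFieldIndex and its methods are hypothetical and have nothing to do with the real Lucene API. The point is only that a field can be searchable without its original value being retrievable from the index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy illustration only: a hypothetical index, not the Lucene or Hibernate Search API. */
final class ToyStoredFieldIndex {

    // "field:word" -> ids of the documents containing that word in that field
    private final Map<String, List<Long>> postings = new HashMap<>();
    // document id -> original values of the fields that were stored
    private final Map<Long, Map<String, String>> storedValues = new HashMap<>();

    /** Indexes a field value; keeps the original value only when stored is true. */
    void addField(long docId, String field, String value, boolean stored) {
        for (String word : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            List<Long> ids = postings.computeIfAbsent(field + ":" + word, k -> new ArrayList<>());
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
        if (stored) {
            storedValues.computeIfAbsent(docId, k -> new HashMap<>()).put(field, value);
        }
    }

    /** Returns a copy of the ids of documents whose field contains the word. */
    List<Long> search(String field, String word) {
        List<Long> ids = postings.get(field + ":" + word.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<Long>() : new ArrayList<Long>(ids);
    }

    /** Returns the stored value, or null when the field was indexed but not stored. */
    String storedValue(long docId, String field) {
        Map<String, String> fields = storedValues.get(docId);
        return fields == null ? null : fields.get(field);
    }

    public static void main(String[] args) {
        ToyStoredFieldIndex index = new ToyStoredFieldIndex();
        index.addField(1L, "model", "Bel Air", true);                          // indexed and stored
        index.addField(1L, "description", "This is a true classic", false);   // indexed only
        System.out.println(index.search("description", "classic"));
        System.out.println(index.storedValue(1L, "model"));
        System.out.println(index.storedValue(1L, "description"));
    }
}
```

In this sketch the description can be found by a search, but its text would still have to come from the database, while the model can be read straight from the index.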

Below is the Car entity we want Hibernate to persist for us. We apply JPA annotations, such as @Entity, @Id and @GeneratedValue to tell Hibernate JPA that this class represents a persistent entity, that the “id” field is the primary key field and that we want to have the database auto-generate the “id” values for us, respectively. We use several Hibernate Search annotations as well:

  • @Indexed: indicates that this entity should be indexed by Lucene, making it searchable.
  • @Analyzer: tells Hibernate Search which Lucene analyzer to use when tokenizing its fields and updating the Lucene index. IMPORTANT: It will be very important when you search later on that you use the same analyzer that was used by Lucene to index the documents you’re searching for. While using a different analyzer may indeed return desired results, there is no guarantee – so, always research the analyzers you select for both indexing and searching and choose wisely. Although many domain-specific applications seem to create their own custom analyzers, the Lucene StandardAnalyzer seems to be the best “middle-of-the-road” implementation among the out-of-the-box analyzers, so start there.
  • @DocumentId: indicates that the Car’s “id” field should be used as the ID of the Lucene documents in the index. This is almost always the same field as the entity’s primary key in the database.
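  • @Field: tells Hibernate Search to index this field and provides other instructions as to how that field should be treated within the index.

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.search.annotations.Analyzer;
import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Store;

@Entity
@Indexed
@Analyzer(impl = org.apache.lucene.analysis.standard.StandardAnalyzer.class)
public class Car {

	// Primary key in the database and document ID in the Lucene index
	@Id
	@GeneratedValue
	@DocumentId
	private Long id;

	@Field(store = Store.YES)
	private String make;

	@Field(store = Store.YES)
	private String model;

	@Field(store = Store.YES)
	private short year;

	// Indexed for searching, but the value itself is not kept in the index
	@Field(store = Store.NO)
	private String description;

	// No-argument constructor required by Hibernate
	public Car() {
	}

	public Car(String make, String model, short year, String description) {
		this.make = make;
		this.model = model;
		this.year = year;
		this.description = description;
	}

	public String getMake() {
		return make;
	}

	// more getters/setters
}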

That’s all it takes to tell Hibernate everything it needs to know about persisting Car objects and Hibernate Search all it needs to know to make Cars searchable. Now, let’s take a look at how we can search for Cars. First, we need to load up the Hibernate configuration, then create a Hibernate database Session:

public void setUp() throws Exception {
	// Load the test configuration file (assumed to be on the classpath)
	Configuration configuration = new Configuration();
	configuration.configure("hibernate-test-cfg.xml");
	ServiceRegistry serviceRegistry = new ServiceRegistryBuilder().applySettings(configuration.getProperties())
			.buildServiceRegistry();
	hibernateSessionFactory = configuration.buildSessionFactory(serviceRegistry);
	hibernateSession = hibernateSessionFactory.openSession();
}

We can now persist some Car objects for testing:

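private void populateDBWithTestData() {
	Car[] testCars = { new Car("Shelby American", "GT 350", (short) 1967, "This is Tim's car!"),
			new Car("Chevrolet", "Bel Air", (short) 1957, "This is a true classic") };

	Transaction tx = hibernateSession.beginTransaction();

	// Saving through Hibernate is all we do here; there is no index code to write
	for (Car car : testCars) {
		hibernateSession.save(car);
	}

	tx.commit();
}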

Our two test Cars are now saved to the H2 Database. But, they’ve also now been indexed by Lucene! We can now search Lucene for our Cars:

public void testUsingLuceneBooleanQueryReturningFullEntity() throws Exception {
	FullTextSession fullTextSession = Search.getFullTextSession(hibernateSession);

	// Match Cars whose model is either "GT 350" or "Bel Air"
	BooleanQuery bq = new BooleanQuery();
	TermQuery gt350TermQuery = new TermQuery(new Term("model", "GT 350"));
	TermQuery belAirTermQuery = new TermQuery(new Term("model", "Bel Air"));
	bq.add(gt350TermQuery, BooleanClause.Occur.SHOULD);
	bq.add(belAirTermQuery, BooleanClause.Occur.SHOULD);
	// Parse the query text with the same analyzer that was used for indexing
	Query q = new QueryParser(Version.LUCENE_36, "cs-method", new StandardAnalyzer(Version.LUCENE_36)).parse(bq
			.toString());

	org.hibernate.Query hibernateQuery = fullTextSession.createFullTextQuery(q, Car.class);
	@SuppressWarnings("unchecked")
	List<Car> searchResults = hibernateQuery.list();

	boolean foundShelby = false;
	boolean foundBelAir = false;
	for (Car car : searchResults) {
		if (car.getModel().equals("GT 350")) {
			foundShelby = true;
		} else if (car.getModel().equals("Bel Air")) {
			foundBelAir = true;
		}
	}
	Assert.assertEquals(2, searchResults.size());
	Assert.assertTrue(foundShelby && foundBelAir);
}

Here are some key points about the above search code:

  • First, we obtain a Hibernate Search FullTextSession. Notice that this session decorates a normal Hibernate database Session object. This allows Hibernate Search to be aware of inserts/updates/deletes of indexed entities, so that it can keep the Lucene index up-to-date.
  • Next, we build a query to search for both of our test Cars. We choose to use Lucene’s BooleanQuery, although, as you’ll see if you download the full source code, there are many ways to build this same search. The search we’re performing is this: “Find all Cars with the model of EITHER ‘GT 350’ OR ‘Bel Air’”. We build up the query, have the Lucene QueryParser parse the query and then have the FullTextSession translate that to a standard Hibernate Query.
  • The rest of the code checks that our search did indeed return the Cars we expected to find in the Lucene index and asserts such.
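The earlier advice about using the same analyzer for indexing and searching can be illustrated without Lucene at all. The sketch below is a rough stand-in, assuming an analyzer that does nothing more than split text into tokens and lower-case them (the real StandardAnalyzer does more than this). The class ToyAnalyzerDemo and its methods are invented for this post. It shows that a raw, unanalyzed term such as "GT 350" does not line up with the analyzed tokens in the index, while analyzing the query text first and OR-ing the resulting tokens does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy illustration only: mimics tokenizing and lower-casing, not Lucene's real analysis chain. */
final class ToyAnalyzerDemo {

    // analyzed token -> ids of the documents containing it
    private final Map<String, Set<Long>> postings = new HashMap<>();

    /** Lower-cases the text and splits it on anything that is not a letter or digit. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Indexes the analyzed tokens of a document. */
    void index(long docId, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Exact lookup with no analysis applied to the term. */
    Set<Long> searchRawTerm(String term) {
        Set<Long> ids = postings.get(term);
        return ids == null ? new TreeSet<Long>() : new TreeSet<Long>(ids);
    }

    /** Analyzes each phrase first, then returns documents matching ANY resulting token. */
    Set<Long> searchAnyOf(String... phrases) {
        Set<Long> result = new TreeSet<>();
        for (String phrase : phrases) {
            for (String token : analyze(phrase)) {
                result.addAll(searchRawTerm(token));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyAnalyzerDemo index = new ToyAnalyzerDemo();
        index.index(1L, "GT 350");
        index.index(2L, "Bel Air");
        System.out.println("raw term:      " + index.searchRawTerm("GT 350"));
        System.out.println("analyzed OR:   " + index.searchAnyOf("GT 350", "Bel Air"));
    }
}
```

Note that the OR in searchAnyOf is deliberately loose: any document sharing a single token with the query matches, which is fine for two test Cars but is something to think about with real data.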

Download the Hibernate Search sample code for this post if you want to see other ways to perform the same search or just want to load up the code and play around with Hibernate Search on your own.

The Hibernate configuration file for our test case is very simple. It configures our H2 Database, points Hibernate to our Car entity class and tells Hibernate Search where to store the Lucene index. I just want to point out one specific section of that file (it’s named “hibernate-test-cfg.xml” in the downloadable project):

<!-- Store index in memory, so no index cleanup required after tests -->
<property name="hibernate.search.default.directory_provider">ram</property>

<!-- Would set this in production application. Index stored on disk. -->
<!--
<property name="hibernate.search.default.directory_provider">filesystem</property>
<property name="hibernate.search.default.indexBase">c:/temp/lucene/indexes</property>
-->

<!-- Define Hibernate entity mappings. Standard Hibernate stuff - not specific to Hibernate Search. -->
<mapping class="net.timontech.hibernate.search.Car"/>

As I mentioned above, we’re storing both the database and Lucene index in memory. We only do this for testing, so that our unit tests have no cleanup to do. It’s very unlikely you’d want to do this in a production application. Hibernate Search provides a RAMDirectoryProvider, which stores the index in memory, as well as an FSDirectoryProvider, which stores the index on disk at a location you specify. Switching from in-memory to on-disk index storage is as simple as changing the directory provider property and pointing the indexBase property at the directory you want to use.
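If you would rather make that switch in code than in XML, the same two settings can be expressed as plain java.util.Properties. This is only a sketch: the helper class SearchIndexSettings and its inMemory flag are hypothetical, and the values "ram" and "filesystem" are the short provider names as I understand them from the Hibernate Search documentation, so verify them against the version you use. The property keys themselves are the ones from the configuration file above.

```java
import java.util.Properties;

/** Sketch only: the helper name and the inMemory flag are hypothetical. */
final class SearchIndexSettings {

    private static final String PROVIDER_KEY = "hibernate.search.default.directory_provider";
    private static final String INDEX_BASE_KEY = "hibernate.search.default.indexBase";

    /** Builds the index-storage settings for in-memory tests or for on-disk use. */
    static Properties indexSettings(boolean inMemory, String indexBase) {
        Properties props = new Properties();
        if (inMemory) {
            // Assumed short name for the in-memory provider
            props.setProperty(PROVIDER_KEY, "ram");
        } else {
            // Assumed short name for the on-disk provider, plus the index location
            props.setProperty(PROVIDER_KEY, "filesystem");
            props.setProperty(INDEX_BASE_KEY, indexBase);
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(indexSettings(true, null));
        System.out.println(indexSettings(false, "c:/temp/lucene/indexes"));
    }
}
```

The resulting Properties object could then be handed to the Hibernate Configuration before the SessionFactory is built, which would let a test and a production deployment share one XML file.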

The “mapping” element in the above configuration tells Hibernate Core to examine the Car class for annotations that will instruct it on how to persist the Car entity to the database.

There’s obviously a lot more to learn about both Hibernate Search and Lucene. Lucene in itself is a very powerful and flexible library. However, if you haven’t yet explored Hibernate Search and/or Lucene, I hope this post at least gave you a flavor for these technologies and enough information to get started. If you’re new to Hibernate Search, I’d encourage you to download the attached project, import it into Eclipse, take a look at the Hibernate Search and Lucene APIs and play around with the capabilities. After that, I’d recommend studying the Lucene documentation if Lucene is new to you.

Download source code for this article here.

Posted in Database Programming, Java Programming, Software Development
