Lucene

What is Lucene?

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java.
It is suitable for nearly any application that requires full-text search, especially cross-platform applications.
Lucene can index plain text and numeric values such as integers. PDF and Office documents can be indexed as well, once their text has been extracted (Lucene itself does not parse these formats).

How does Lucene index for faster search? The inverted index concept

Lucene creates something called an inverted index. Normally we map

document -> terms in the document (here a document is a collection of information, i.e. a searchable item)

Lucene does the reverse. It creates an index that maps

term -> list of documents containing the term

This makes search faster, because a query looks up its terms directly instead of scanning every document.
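To make the term -> documents mapping concrete, here is a minimal plain-Java sketch of an inverted index. It is not Lucene code and does not use the Lucene API; the class name InvertedIndexSketch and the simple lower-case-and-split tokenization are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustrative only: a toy inverted index, not how Lucene stores its index.
class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Split the text into lower-cased terms and record docId under each term.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // One map lookup per term; no document is scanned at search time.
    public SortedSet<Integer> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(), new TreeSet<Integer>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Lucene is a search library");
        index.add(2, "An inverted index makes search fast");
        index.add(3, "Maven manages dependencies");
        System.out.println(index.lookup("search")); // [1, 2]
        System.out.println(index.lookup("maven"));  // [3]
    }
}
```

The cost of a lookup depends on the number of distinct terms in the query, not on the number of documents, which is the point of inverting the mapping.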

How to use the Apache Lucene library in your application through Maven?

Maven Dependency

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>${version}</version>
    <scope>compile</scope>
</dependency>

Download Dependency

• Download Lucene from http://lucene.apache.org/ and

• add lucene-core.jar to the classpath

Note: The current Apache Lucene version is 5.2.x (as of 7 August 2015). Replace ${version} with the version you want to use.

Lucene Indexing flow to enable faster search

Let's read the picture from bottom to center first.

The raw text is used to create a Lucene "Document", which is analyzed using the provided Lucene Analyzer. The Document is then added to the index according to the Store, TermVector and Analyzed properties of its Fields.

Next, the search side, from top to center. The user specifies the query in text form. A Query object is built from the query text, and the result of executing the query is returned as TopDocs.

Note: Document is a class provided by the Lucene core library.
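The two halves of this flow can be modelled in a few lines of plain Java. The sketch below is a toy, not the Lucene API: IndexingFlowSketch, analyze, addDocument and search are hypothetical names, the "analyzer" only lower-cases and splits on non-word characters, and ranking by summed term counts is an assumption chosen for simplicity rather than Lucene's scoring.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mimics raw text -> document -> analysis -> index,
// and query text -> query terms -> ranked top hits.
class IndexingFlowSketch {
    // Stand-in for stored documents: position in the list is the doc id.
    private final List<String> storedTexts = new ArrayList<>();
    // term -> (docId -> number of occurrences of the term in that document)
    private final Map<String, Map<Integer, Integer>> index = new HashMap<>();

    // Stand-in for an Analyzer: lower-case, split on non-word characters.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }

    // Indexing side (bottom to center): store the text, analyze it, index its terms.
    public int addDocument(String rawText) {
        int docId = storedTexts.size();
        storedTexts.add(rawText);
        for (String term : analyze(rawText)) {
            index.computeIfAbsent(term, k -> new HashMap<>()).merge(docId, 1, Integer::sum);
        }
        return docId;
    }

    // Search side (top to center): analyze the query text, score matches, return top n ids.
    public List<Integer> search(String queryText, int n) {
        Map<Integer, Integer> scores = new HashMap<>();
        for (String term : analyze(queryText)) {
            index.getOrDefault(term, Collections.<Integer, Integer>emptyMap())
                 .forEach((docId, freq) -> scores.merge(docId, freq, Integer::sum));
        }
        List<Integer> hits = new ArrayList<>(scores.keySet());
        hits.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a)); // higher score first
            return byScore != 0 ? byScore : Integer.compare(a, b);       // ties: lower id first
        });
        return new ArrayList<>(hits.subList(0, Math.min(n, hits.size())));
    }

    public static void main(String[] args) {
        IndexingFlowSketch flow = new IndexingFlowSketch();
        flow.addDocument("Lucene is a search library");           // doc 0
        flow.addDocument("Search, search and more search");       // doc 1
        flow.addDocument("Maven resolves the Lucene dependency"); // doc 2
        System.out.println(flow.search("lucene search", 2));      // [1, 0]
    }
}
```

In real Lucene code these roles are played by the library's own classes (Document, Analyzer, a Query object, TopDocs), and the exact calls depend on the Lucene version in use.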

Friday, 7 August 2015
