Apache Lucy search examples

I am investigating search engines, and this time it is Apache Lucy 0.4.2. I am showing a basic indexer and a small search application. See the indexer code below (it takes documents one by one and indexes them). The search script takes the search term as a command-line argument and prints the matching documents.

This is a pure command-line utility, meant only to show how basic indexing and searching work with Apache Lucy.



[code lang="perl"]use strict;
use warnings;
use Lucy::Simple;

# Ensure the index directory is both available and empty.
my $index = "/ppant/LucyTest/index";
system( "rm", "-rf", $index );
system( "mkdir", "-p", $index );

# Create the helper: a new Lucy::Simple object.
my $lucy = Lucy::Simple->new( path => $index, language => 'en', );

# Add the first "document". (We are mainly adding the document's metadata.)
my %one = ( title => "This is a title of first article", body => "some text inside the body we need to test the implementation of lucy", id => 1 );
$lucy->add_doc( \%one );

# Add the second "document".
my %two = ( title => "This is another article", body => "I am putting some basic content, using some words which are also in first document like implementation", id => 2 );
$lucy->add_doc( \%two );

# Both documents are now indexed under $index.[/code]

Once the documents are indexed, we'll write a small search script.



[code lang="perl"]use strict;
use warnings;

use Lucy::Search::IndexSearcher;

# The search term is the first command-line argument.
my $term = shift || die "Usage: $0 search-term\n";

my $searcher = Lucy::Search::IndexSearcher->new( index => '/ppant/LucyTest/index' );

# Look up the term in the index, then print the number of hits
# and the title and ID of each document that contains it.
my $hits = $searcher->hits( query => $term );
print "Total hits: ", $hits->total_hits, "\n";
while ( my $hit = $hits->next ) {
    print "Title: $hit->{title} - ID: $hit->{id}\n";
}
# End of search.cgi[/code]


If you want to explore more, check the full code on GitHub.

Implementing mapper attachment in Elasticsearch

Create a new index
[code lang="bash"]curl -X PUT "" -d '{
"settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 }}
}'[/code]

Mapping the attachment type
[code lang="bash"]curl -X PUT "" -d '{
"attachment" : {
"properties" : {
"file" : {
"type" : "attachment",
"fields" : {
"title" : { "store" : "yes" },
"file" : { "term_vector":"with_positions_offsets", "store":"yes" }
} } } } }'[/code]

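Shell script to convert the content to base64 and index it
[code lang="bash"]#!/bin/sh

# Slurp the whole file (-0777) and emit base64 without line breaks,
# so the result fits inside a single JSON string.
coded=$(perl -MMIME::Base64 -0777 -ne 'print encode_base64($_, "")' TestPDF.pdf)
# Assumed payload shape: one "file" field, matching the attachment mapping.
json="{\"file\":\"${coded}\"}"
echo "$json" > json.file
curl -X POST "" -d @json.file[/code]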
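
The encoding step can also be sketched as a small reusable function. This is only a minimal sketch under a few assumptions: it swaps the Perl `MIME::Base64` one-liner for the `base64` command-line tool, the function name `make_attachment_json` and the input file `sample.txt` are made up for illustration, and the single `file` field is taken from the attachment mapping.

```shell
#!/bin/sh
# Hypothetical helper: wrap a file's base64 encoding in a one-field JSON document.
# The field name "file" follows the attachment mapping; the function name is illustrative.
make_attachment_json() {
    # base64 may wrap its output, so strip newlines to keep the JSON string on one line.
    encoded=$(base64 < "$1" | tr -d '\n')
    printf '{"file":"%s"}\n' "$encoded"
}

# Try it on a small throwaway file (assumed name, standing in for a real PDF).
tmpdir=$(mktemp -d)
printf 'hello attachment' > "$tmpdir/sample.txt"
make_attachment_json "$tmpdir/sample.txt" > "$tmpdir/json.file"
cat "$tmpdir/json.file"
```

The resulting file can then be posted with `curl -X POST ... -d @json.file`, exactly as in the script.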
Query (search results will be highlighted)
[code lang="bash"]curl "" -d '{
"fields" : ["title"],
"query" : {
"query_string" : {
"query" : "Cycling tips"
} },
"highlight" : {
"fields" : {
"file" : {}
} } }'[/code]
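
Inline JSON on the command line is easy to break with a stray or curly quote, so one option is to keep the query body in a file. The sketch below writes it with a quoted heredoc and sends it only if an endpoint is configured. `ES_SEARCH_URL` and `query.json` are hypothetical names, not part of the original setup.

```shell
#!/bin/sh
# Write the query body with a quoted heredoc: the shell performs no expansion
# inside it, so the JSON needs no escaping.
cat > query.json <<'EOF'
{
  "fields": ["title"],
  "query": { "query_string": { "query": "Cycling tips" } },
  "highlight": { "fields": { "file": {} } }
}
EOF

# ES_SEARCH_URL is a hypothetical variable; set it to your own search endpoint.
if [ -n "${ES_SEARCH_URL:-}" ]; then
    curl "$ES_SEARCH_URL" -d @query.json
else
    echo "ES_SEARCH_URL is not set; wrote query.json only"
fi
```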


If you want to explore more, check the full code on GitHub.

Apache Lucy search engine

Apache Lucy is a full-text search engine library written in C and targeted at dynamic languages. The good news is that the inaugural release provides Perl bindings. For more information, you can visit the Apache Lucy website. I have not tried it yet, but you can download it from here. Let's hope for some good results.