Apache Lucene seems to be turning into one of
the most fruitful open source projects managed by the Apache Software
Foundation, short of the original HTTP server that started it all.
A few months ago, I wrote about Cloudera, a firm started to offer
support for the Lucene distributed computing subproject Hadoop (see "Bringing Hadoop to the Enterprise"). I just got off the phone with some of the principals of Lucid Imagination,
which employs several key members of the open source project team and
is more closely aligned with the original Lucene mission of search. As
with most of these companies building a business around an open source
product, Lucid has no particular lock on the software or source code
but is putting itself out there as a native guide for organizations
that aren't comfortable relying entirely on web-based support
structures. Lucid provides consulting, training, and certified,
commercially supported distributions of software.
In particular, Lucid is offering support for Lucene and for Apache
Solr, a Lucene-based search system originally developed for use at CNET
by Yonik Seeley, who is now part of the Lucid technical team.
Lucene itself is a Java software library that you can use to build a
search engine. That is, it provides the basis for the core engine only,
not the user interface or other software components you need to build a
complete application. It's become popular with search engine technology
researchers and inventors as a tool for building other things, but by
itself it requires too much hard-core development expertise for the
average enterprise.
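To give a rough sense of what "the core engine only" means, here is a minimal sketch of an inverted index, the kind of data structure that sits at the heart of a search library. It is written in plain Java and does not use Lucene at all; the class `TinyIndex` and its `add` and `search` methods are hypothetical names invented for this illustration, and a real engine adds ranking, text analysis, and on-disk storage on top of this idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only; this is not Lucene's API or implementation.
class TinyIndex {
    // term -> sorted ids of the documents that contain that term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Stores the text, indexes each lowercased word, and returns the new document id.
    int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Returns the ids of documents that contain every word of the query (AND semantics).
    List<Integer> search(String query) {
        TreeSet<Integer> hits = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            TreeSet<Integer> ids = postings.getOrDefault(term, new TreeSet<>());
            if (hits == null) {
                hits = new TreeSet<>(ids);   // first term: start from its postings
            } else {
                hits.retainAll(ids);         // later terms: keep only shared documents
            }
        }
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("Solr builds on Lucene");
        index.add("Lucene is a Java library");
        index.add("Nutch crawls the web");
        System.out.println(index.search("lucene"));
        System.out.println(index.search("lucene java"));
    }
}
```

Even this toy version has no user interface, no crawler, and no configuration, which is the gap the rest of an application has to fill.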
Solr is a neater package that builds on Lucene and forms the basis for Lucid's entry into the enterprise search market.
"Solr is quite a bit easier to get up and running out of the box.
Lucene is like the engine, but you have to build the rest of the car,"
said Grant Ingersoll, a member of the Lucid technical team and chairman
of the Lucene project management committee. On the other hand, one of
the beauties of working with Solr is that developers always have the
option of tinkering with the underlying Lucene software.
"You can get exactly what you need, but you still save yourself from
having to write the very low-level stuff that makes every search engine
go," Ingersoll said.
There are other search technologies in the Lucene family, such as
Nutch, but one distinction is that Nutch was really intended for very
large-scale tasks like indexing the entire web. That's why the Hadoop
project arose as a spin-off of Nutch. Lucene, Nutch, and Hadoop all
started with Doug Cutting, who has invested years into search
technology and distributed computing research, formerly for Excite and
Xerox PARC and now for Yahoo.
Ingersoll said Solr has emerged as a good choice for indexing and
search challenges that are large, but not necessarily as world-beating
as those Nutch is taking on. Solr is good for searching a website, or
an intranet, or an e-commerce catalog, and it provides features such as
faceted search that can help make the search results more useful.
Faceted search (which also goes by other near-synonymous names such
as guided search) imposes a little more structure on the search
results, rather than relying solely on text indexing and link analysis.
For example, the results from a search of a product catalog might give
the user the option of subdividing the results by price point (for
example, "video cameras under $1,000").
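The price-point example can be sketched in a few lines of plain Java. This is only an illustration of the idea of counting results per facet value; it does not use Solr's faceting feature, and the class `FacetSketch`, the method `priceFacets`, the sample prices, and the choice of a single $1,000 boundary are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this does not use Solr's faceting API.
class FacetSketch {
    // Labels for the two price ranges, in the order they should be displayed.
    static final String UNDER_1000 = "under $1,000";
    static final String FROM_1000 = "$1,000 and up";

    // Counts how many of the matching products fall into each price range.
    static Map<String, Integer> priceFacets(double[] prices) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put(UNDER_1000, 0);
        counts.put(FROM_1000, 0);
        for (double price : prices) {
            String bucket = price < 1000.0 ? UNDER_1000 : FROM_1000;
            counts.merge(bucket, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed prices of the products matched by a search for "video cameras".
        double[] videoCameraPrices = {349.00, 799.99, 1000.00, 2499.00};
        System.out.println(priceFacets(videoCameraPrices));
    }
}
```

A search page would show these counts next to the results, so that clicking "under $1,000" narrows the list to the two cheaper cameras in this sample.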
Although enterprise search is a crowded market with established
players such as Autonomy and Google (with its search appliances),
Ingersoll and Anil Uberoi, Lucid's chief marketing officer, said they
are attracting interest even from some enterprises with established
search vendor relationships. Uberoi said the company is talking to
about two dozen potential customers who had been doing business with
FAST, a search vendor that was recently acquired by Microsoft, partly
because some of them have implemented the Linux version of FAST's
software and are wondering about its future.
If nothing else, you might find Lucid worth talking to for negotiating leverage with your current search vendor.
"When a customer goes back to the vendor and utters the words Lucene and
Solr, it's funny how all of a sudden the price drops," Ingersoll said.