An Open Source Option for Enterprise Search
Written by David F. Carr

Apache Lucene seems to be turning into one of the most fruitful open source projects managed by the Apache Software Foundation, short of the original HTTP server that started it all.


A few months ago, I wrote about Cloudera, a firm started to offer support for the Lucene distributed computing subproject Hadoop (see Bringing Hadoop to the Enterprise). I just got off the phone with some of the principals of Lucid Imagination, which employs several key members of the open source project team and is more closely aligned with the original Lucene mission of search. As with most of these companies building a business around an open source product, Lucid has no particular lock on the software or source code but is putting itself out there as a native guide for organizations that aren't comfortable relying entirely on web-based support structures. Lucid provides consulting, training, and certified, commercially supported distributions of software.


In particular, Lucid is offering support for Lucene and for Apache Solr, a Lucene-based search system originally developed for use at CNET by Yonik Seeley, who is now part of the Lucid technical team.


Lucene itself is a Java software library that you can use to build a search engine. That is, it provides the basis for the core engine only, not the user interface or other software components you need to build a complete application. It has become popular with search engine technology researchers and inventors as a tool for building other things, but by itself it requires too much hard-core development expertise for the average enterprise.
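To give a rough sense of what "the core engine only" means, here is a toy inverted index, the data structure at the heart of engines like Lucene. This is an illustrative sketch in Python, not Lucene's API: the function names `build_index` and `lookup` and the sample documents are invented for the example, and it leaves out everything a real engine adds, such as text analysis, relevance scoring, and on-disk storage.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # .get() avoids creating empty entries for unknown terms.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical documents, made up for illustration.
docs = {
    1: "open source search",
    2: "enterprise search appliance",
    3: "open source server",
}
index = build_index(docs)
print(lookup(index, "open source"))  # {1, 3}
```

Everything around this core, including the query form, result pages, crawling, and administration, is the "rest of the car" that a bare library leaves to the developer.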


Solr is a neater package that builds on Lucene and forms the basis for Lucid's entry into the enterprise search market.


"Solr is quite a bit easier to get up and running out of the box. Lucene is like the engine, but you have to build the rest of the car," said Grant Ingersoll, a member of the Lucid technical team and chairman of the Lucene project management committee. On the other hand, one of the beauties of working with Solr is that developers always have the option of tinkering with the underlying Lucene software.


"You can get exactly what you need, but you still save yourself from having to write the very low-level stuff that makes every search engine go," Ingersoll said.


There are other search technologies in the Lucene family, such as Nutch, but one distinction is that Nutch was really intended for very large-scale tasks like indexing the entire web. That's why the Hadoop project arose as a spin-off of Nutch. Lucene, Nutch, and Hadoop all started with Doug Cutting, who has invested years in search technology and distributed computing research, formerly for Excite and Xerox PARC and now for Yahoo.


Ingersoll said Solr has emerged as a good choice for indexing and search challenges that are large, but not necessarily as world-beating as those Nutch is taking on. Solr is good for searching a website, or an intranet, or an e-commerce catalog, and it provides features such as faceted search that can help make the search results more useful.


Faceted search (which also goes by other near-synonymous names such as guided search) imposes a little more structure on the search results, rather than relying solely on text indexing and link analysis. For example, the results from a search of a product catalog might give the user the option of subdividing the results by price point (for example, "video cameras under $1,000").
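The idea can be sketched in a few lines of Python. This is a minimal in-memory illustration of faceting by price, not how Solr implements it: the catalog, the price buckets, and the function names `search`, `price_facet_counts`, and `narrow` are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical catalog; names and prices are invented for illustration.
CATALOG = [
    {"name": "Compact video camera", "price": 349.0},
    {"name": "Prosumer video camera", "price": 899.0},
    {"name": "Studio video camera", "price": 2499.0},
    {"name": "Camera tripod", "price": 59.0},
]

# Facet values as (label, low inclusive, high exclusive).
PRICE_FACETS = [
    ("under $1,000", 0.0, 1000.0),
    ("$1,000 and up", 1000.0, float("inf")),
]


def search(query, catalog=CATALOG):
    """Plain text match: keep items whose name contains every query term."""
    terms = query.lower().split()
    return [item for item in catalog
            if all(term in item["name"].lower() for term in terms)]


def price_facet_counts(results):
    """Count how many results fall into each price bucket."""
    counts = Counter()
    for item in results:
        for label, low, high in PRICE_FACETS:
            if low <= item["price"] < high:
                counts[label] += 1
    return dict(counts)


def narrow(results, label):
    """Keep only the results that fall in the chosen price bucket."""
    low, high = next((lo, hi) for name, lo, hi in PRICE_FACETS if name == label)
    return [item for item in results if low <= item["price"] < high]


results = search("video camera")
print(price_facet_counts(results))  # {'under $1,000': 2, '$1,000 and up': 1}
print([item["name"] for item in narrow(results, "under $1,000")])
```

The counts are what a results page would display next to each price range, and `narrow` corresponds to the user clicking one of them to subdivide the results.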


Although enterprise search is a crowded market with established players such as Autonomy and Google (with its search appliances), Ingersoll and Anil Uberoi, Lucid's chief marketing officer, said they are attracting interest even from some enterprises with established search vendor relationships. Uberoi said the company is talking to about two dozen potential customers who had been doing business with FAST, a search vendor that was recently acquired by Microsoft. Part of the reason is that some of them have implemented the Linux version of FAST's software and are wondering about its future.


If nothing else, you might find Lucid worth talking to for negotiating leverage with your current search vendor.


"When a customer goes back to the vendor and utters the words Lucene and Solr, it's funny how all of a sudden the price drops," Ingersoll said.





Copyright © 2007-2010 CIOZones. All Rights Reserved. CIOZone is a property of PSN, Inc.