Provides automated, distributed storage for very large amounts of data, with near-real-time search access. It's built on Hadoop and HBase (with some MapReduce), and accepts both SQL and Lucene-style queries, returning results as JSON objects. It can run in public or private clouds.
Sphinx is an open source (GPLv2) search engine that works across platforms and indexes content in SQL databases, NoSQL stores, and files. It does not have a web crawler or robot spider. It's just a code library, so it has no user interface, only API calls and SQL queries. It can scale up dramatically, indexing billions of documents, with search distributed among many machines, easily handling 3,000 queries per second, and it powers search on Craigslist. Its developers offer support and implementation services.
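To give a sense of what searching Sphinx through SQL queries might look like, here is a minimal sketch in Python that builds a full-text query using the `MATCH()` clause of Sphinx's SQL dialect. The function name `build_search_query`, the index name `listings`, and the search terms are all hypothetical, chosen only for illustration. The sketch only constructs the statement; it does not connect to a server.

```python
import re

# Index names cannot be passed as bound parameters, so this sketch
# only accepts plain identifiers (an assumption made for safety).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_search_query(index, terms, limit=10):
    """Return (sql, params) for a full-text search against one index.

    The search terms are left as a %s placeholder so that a MySQL-style
    client library can bind them, rather than being pasted into the SQL.
    """
    if not _IDENTIFIER.match(index):
        raise ValueError(f"invalid index name: {index!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    sql = f"SELECT id FROM {index} WHERE MATCH(%s) LIMIT {int(limit)}"
    return sql, (terms,)


if __name__ == "__main__":
    # "listings" is a hypothetical index name.
    sql, params = build_search_query("listings", "used bicycle", limit=5)
    print(sql)
    print(params)
```

Assuming the Sphinx search daemon has been configured with a MySQL-protocol listener, the returned statement and parameters could be passed to the `execute()` method of an ordinary MySQL client cursor. The exact connection settings depend on the installation and are not shown here.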