December 4th, 2010

searchtools.com

site/commerce/intranet/enterprise search notes from all over

  • Stopwords section - Search User Interfaces | Marti Hearst | Cambridge University Press 2009

    This points out that ignoring stopwords is opaque to users, who expect that if they type "a", "an", or "the", the search engine will find them. The excerpt below recounts the famous "to be or not to be" example from the early days of Web search. Web search engines have since found efficient ways to index and store even stopwords, because they are occasionally very valuable, so smaller search engines should follow their lead.

    tags: stopwords transparency ux ui

    • A classic case of system behavior that is opaque to system users is the elimination of stopwords from user queries. (Stopwords are the most common words in the language, usually what linguists call “closed-class” words in that new ones rarely enter the language. Examples from English are articles such as a, an, the and prepositions such as in, on.) In a famous example in the early days of Web search, a searcher who typed “to be or not to be” in a search engine would be shocked to be served empty results. In 1996, a review of eight major search engines found that only AltaVista could handle the Hamlet quote; all others ignored stopwords (Peterson, 1997, Sherman, 2001). (Stopword elimination is common in statistical ranking systems for which a paragraph-length query is assumed; not indexing stopwords by position results in significant savings in indexing time and disk space.) Today, this problem is solved on all the major Web search engines.
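
    The failure described in the excerpt can be reproduced with a toy inverted index. This is only a rough sketch: the stopword list, the two sample documents, and the function names are all made up for illustration and are not taken from the book or from any real search engine.

```python
# Hypothetical stopword list; real engines use their own, usually longer, lists.
STOPWORDS = {"a", "an", "the", "in", "on", "to", "be", "or", "not"}

def tokenize(text):
    """Lowercase and split on whitespace (deliberately naive)."""
    return text.lower().split()

def strip_stopwords(tokens):
    """Drop every token that appears in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

def build_index(docs, keep_stopwords):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        if not keep_stopwords:
            tokens = strip_stopwords(tokens)
        for token in tokens:
            index.setdefault(token, set()).add(doc_id)
    return index

def search(index, query, keep_stopwords):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not keep_stopwords:
        terms = strip_stopwords(terms)
    if not terms:
        # Nothing is left of the query, so the user sees empty results.
        return set()
    result = None
    for term in terms:
        postings = index.get(term, set())
        result = set(postings) if result is None else result & postings
    return result

# Made-up sample documents.
docs = {
    1: "To be or not to be that is the question",
    2: "The question of search",
}

stripped = build_index(docs, keep_stopwords=False)
full = build_index(docs, keep_stopwords=True)

print(search(stripped, "to be or not to be", keep_stopwords=False))
print(search(full, "to be or not to be", keep_stopwords=True))
```

    With stopwords stripped, the Hamlet query reduces to no terms at all and nothing can match; with stopwords indexed, the first sample document is found. The trade-off is index size, since the most common words have the longest posting lists.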
  • Autonomy's "Put FAST in the Past" Rescue Program

    Microsoft will develop FAST only for Windows in the future and will cut support for the Unix and Linux versions.  Autonomy is aggressively marketing to those customers: "Autonomy will match an organization's Microsoft FAST license implementation with like-for-like capability on all platforms – for 50% of the organization's original license fee for orders placed before December 31. Autonomy will provide conceptual search and a Sharepoint connector free of charge. Autonomy's Microsoft FAST to IDOL migration tool will index an organization's data, enabling a seamless migration. Autonomy IDOL Enterprise Search can be used transparently from within Microsoft applications including Word, SharePoint, etc., providing end users with a seamless and easy transition."

    tags: unix linux enterprise search engines search-vendors

  • Search implementation maturity level

    A set of useful measures to classify the sophistication of a search implementation, and clarify what steps it would take to move up from one level to the next. I have some quibbles about the exact order of steps, but I really like the overall approach.  

    tags: search engines evaluation

  • WSDM2011 conference (2011-2-9)

    Web Search and Data Mining, an international ACM conference in Hong Kong, February 9-12, 2011.

    tags: conferences

  • Celebros - search, navigation & analytics solution for online stores

    Concept-based semantic e-commerce search.  I haven't tested it yet.

    tags: ecommerce site search engine

  • Nextopia e-commerce search

    Offers search for online catalogs, with images and faceted search options.

    tags: site search engine ecommerce

  • Lucene Java 3.0.3 and 2.9.4 (bug fix releases)

    Bugfixes for Lucene Java 2.x (old branch) and 3.x (new trunk).

    tags: open-source search engine java

  • Constellio | Open Source Enterprise Search

    Constellio is transitioning from a closed-source to an open-source search engine, based on Lucene/Solr and compatible with Google Enterprise Connector Manager.

    tags: open-source search-vendors file-formats connectors

  • Contegra Systems | Services

    A systems integrator for information management that works with several search engines, including dtSearch, Exalead and FAST.

  • MaxxCAT - Enterprise Search Appliances

    A hardware-software combination designed for easy connection to data sources, with a lightweight JSON API and scalability to hundreds of millions of items with fast response and high availability.  These are significantly cheaper than the Google Mini and GSA, and the licenses don't expire.

    tags: enterprise search appliances APIs database indexing

  • Fusing Enterprise Search and Social Bookmarking - MIKE2.0

    More practical than most social search proposals, this treats public bookmarks as a form of metadata to be included in relevance and results display.  It also has a note about the value of weak social ties in diffusing information beyond one's normal circle.

    tags: social search engines enterprise folksonomy

Posted from Diigo.