SearchTools Blog (searchtools) wrote,

Product Report: Datapark Search Engine

This is an off-site copy of the corresponding product report page on the SearchTools.com website, provided so that you can comment on the product or the reporting. For more information about search and search tools, visit SearchTools.com, where you can browse many articles, in-depth analyses, and overviews of external resources.

Datapark Search Engine

Product Information

  • Datapark Search is an open source search engine.

Platform: Apache
Price: free

Features (Search and Retrieval)

  • Effective caching significantly reduces search times.
  • Supports http, https (SSL), ftp, nntp, and news URLs, plus htdb virtual URLs for indexing SQL databases. Handles html, xml, plain text, mpeg, and gif MIME types.
  • Option to query with all words, any words, or boolean queries.
  • Supports synonym lists and stopwords.
  • Indexes multilingual sites using content negotiation, and supports ispell affixes and dictionaries.
  • Multiple character sets supported.
  • Accent insensitive search.
  • Phrase segmenting for Chinese, Japanese, Korean, and Thai languages.
  • Open source, web-based search engine released under the GNU General Public License.
  • Includes an indexer and a web CGI front-end.
  • Supports external parsers.
  • Results can be sorted by relevancy, popularity rank, last modified time, or importance (the product of relevancy and popularity rank).
  • Can scale to at least 300,000 pages (based on one example).
  • The DataparkSearch Reference Manual is well done and is also available in Russian.
  • Active and searchable forums in English and Russian.
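
To make a few of the features above concrete, here is a minimal Python sketch of three of them: sorting by importance (the product of relevancy and popularity rank), all-words versus any-words matching, and accent-insensitive comparison. It is an illustration only, not DataparkSearch's code or API. The names `Result`, `sort_results`, `fold_accents`, and `matches` are hypothetical, and the scores are made-up numbers rather than values the engine would compute.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Result:
    """A hypothetical search hit; field names are assumptions for this sketch."""
    url: str
    relevancy: float      # made-up relevancy score
    popularity: float     # made-up popularity rank
    last_modified: str    # ISO 8601 date, so string order equals date order

    @property
    def importance(self) -> float:
        # The report describes importance as relevancy times popularity rank.
        return self.relevancy * self.popularity


def sort_results(results, key="importance"):
    """Return results ordered from highest to lowest by the chosen field."""
    return sorted(results, key=lambda r: getattr(r, key), reverse=True)


def fold_accents(text: str) -> str:
    """Lowercase the text and strip combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def matches(query: str, document: str, mode: str = "all") -> bool:
    """Accent-insensitive word match; mode is 'all' or 'any'."""
    query_words = fold_accents(query).split()
    document_words = set(fold_accents(document).split())
    hits = [word in document_words for word in query_words]
    return all(hits) if mode == "all" else any(hits)
```

Calling `sort_results(hits, key="relevancy")` or `key="last_modified"` gives the other orderings the report mentions. A real engine would also handle boolean queries, stopwords, and synonyms, which this sketch leaves out.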

Examples:

  • Sochi's Internet Search - Sochi is a resort town in southwestern Russia on the Black Sea. Its search engine has about 100,000 pages.
  • 43°N 39°E - Has been implemented to search English, Russian, German, and French sites specifically.
  • News Lookup Service - Crawls news sites on the web and lets you search them by media type, region, and/or different parts of a page (title, body, etc.). Results can be sorted by relevancy or by date last modified. News can also be browsed by region or topic.