June 14th, 2004

searchtools.com

Product Report: Inxight SmartDiscovery

This is an off-site copy of the corresponding product report page on the SearchTools.com website, designed to let you comment on the product and/or the reporting. For more information about search and search tools, visit SearchTools.com, where you can browse many articles, in-depth analyses and overviews of external resources.

Inxight SmartDiscovery

Product Information

  • The overall SmartDiscovery product includes a search engine, taxonomy creation and categorization, entity extraction, document summarization, related documents, etc.
    • MetaText Server automatically extracts and stores topical metadata about documents: summaries, related documents, people, places and things. Includes a metadata repository and a search engine with concept search, full-text, and Boolean queries.
    • VizServer offers visualization technologies for data in relational databases and unstructured data repositories, improving data mining and text mining processes.
    • Categorizer automatically assigns documents to categories from a subject taxonomy using linguistic and statistical algorithms.

Platform: Windows 2000, Unix: Solaris 2.7
Price: contact company

Features (Search and Retrieval)

  • Web robot crawler, file system indexer.
  • Integrates with Documentum, Lotus Notes, and databases via JDBC/ODBC.
  • Uses the security model of the system being indexed.
  • Metadata repository based on entity extraction, identifying companies, locations and key people; the metadata can be stored as XML in a database, or back in the original environment.
  • Languages supported are: Danish, Dutch, English, Finnish, French, German, Italian, Norwegian (Bokmål), Norwegian (Nynorsk), Portuguese, Spanish, Swedish.
  • Additional modules for Simplified and Traditional Chinese, Japanese and Korean.
  • File formats supported include Microsoft Word, Microsoft Excel, PDF, XML, Lotus WordPro, AmiPro, Corel Presentations, WordPerfect, Lotus Freelance, Microsoft Works, Corel Quattro Pro, HTML, Lotus 1-2-3, ASCII and approximately 70 other file formats.
  • Keyword and Boolean searching.
  • Concept searching based on noun-phrase co-occurrence, which finds other topics related to the search words.
  • Browser-based administration and editing tools, defining collections, scheduling.
  • Collection Explorer for results pages, optional visualization.
  • Supports faceted metadata search display and browsing.
  • Includes taxonomy browsing using the Star Tree Java visualization engine, which is also used for taxonomy management.
  • APIs in Java or C, XML access to content.
  • Sample code and extensive documentation.
  • Other Modules
    • LinguistX Library: A collection of components for many languages that provide word and phrase analysis, stemming, tokenization, parts of speech analysis, noun phrase extraction, language identification, summarization, etc.
    • Murax: A natural-language search engine that organizes search results into clusters using co-occurrence analysis. Also shows snippets of documents rather than the entire file.
    • WhizBang Engine - designed to crawl internal and external sources, classify them according to type, and extract key entities. This applies external structure to otherwise unstructured data.
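
As a rough illustration of the co-occurrence idea behind the concept search feature listed above, the following minimal sketch counts how often phrases appear together in the same document and then suggests related topics for a query phrase. It is not Inxight's algorithm: the function names (`build_cooccurrence`, `related_phrases`) and the sample data are hypothetical, and it assumes noun phrases have already been extracted from each document.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(docs):
    """Count how often each pair of phrases appears in the same document."""
    pair_counts = Counter()
    for phrases in docs:
        # Deduplicate and sort so each pair is counted once per document
        for a, b in combinations(sorted(set(phrases)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def related_phrases(pair_counts, query, top_n=3):
    """Return (phrase, count) pairs that co-occur with `query`, most frequent first."""
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a == query:
            scores[b] += n
        elif b == query:
            scores[a] += n
    # Sort by count descending, then alphabetically for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical, already-extracted noun phrases for four documents
docs = [
    ["search engine", "taxonomy", "entity extraction"],
    ["search engine", "taxonomy"],
    ["search engine", "visualization"],
    ["census data", "visualization"],
]

counts = build_cooccurrence(docs)
print(related_phrases(counts, "search engine"))
```

A production system would likely weight the counts (for example, by how rare each phrase is) rather than use raw frequencies; raw counts are used here only to keep the sketch short.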

Articles

  • Unstructured Data Management: the elephant in the corner (guest or customer access required) the(451) Report, November 2002 by Nick Patience and Rachel Chalmers
    Describes the general features of SmartDiscovery, which combines MetaText, VizServer and Categorizer with the taxonomy generation technology bought from WhizBang! Labs. Praises the language recognition and natural-language processing features.

  • Inxight acquires Information extraction technology Content Wire, October 29 2002
    Reports that Inxight acquired the technology assets of WhizBang! Labs.

  • From search to find InfoWorld September 28, 2001 by Mario Apicella
    Describes new search engines seeking to make it easier for buyers to find products. Mentions H5 Technologies' categorization of documents, Ultraseek (then Inktomi Search) Software's access to multiple file formats, and Inxight's Star Tree and Table Lens visualization features.

Examples:

  • NASS website - the USDA's National Agricultural Statistics Service (NASS) has census data for about two million farms. It uses Inxight VizServer and the included Table Lens software to visualize this data.
  • The Inxight site is using Star Tree for visual navigation and the Ultraseek engine for text search.

Applications Using Multilingual Search Features
