SearchTools Blog (searchtools) wrote,

Product Report: Apple e.g.

This is an off-site copy of the corresponding Product Report page on the SearchTools.com website, designed to let you comment on the product and/or the reporting. For more information about search and search tools, visit SearchTools.com, where you can browse many articles, in-depth analyses and overviews of external resources.

Apple e.g.

Price: free
Platforms: Macintosh 68K and PowerPC

Apple e.g. was a demonstration of AIAT (Apple Information Access Technology, also known as Sherlock and VTwin), which provides content awareness, including fast and efficient text indexing and search, to applications. It is no longer supported, even via mailing list.

System Requirements

Macintosh 68K or PowerPC
Mac OS Web Server with CGI Support

Indexing

File Selection

Apple e.g. is a local file indexer and does not crawl URLs. The default selection is the entire web server folder, but in most cases you'll want to choose the folders to include and omit. You can also specify a list of file suffixes either to index or not to index (but not both). The default excludes the suffixes ps, gif, jpeg, hqx and xbm, which are not text files and should not be indexed.
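The either/or suffix rule can be pictured with a short sketch. This is not Apple e.g.'s own code: the function name should_index and its parameters are hypothetical, and only the default exclusion list comes from the report.

```python
from pathlib import PurePosixPath

# Default exclusions named in the report; these are non-text formats.
DEFAULT_EXCLUDED = {"ps", "gif", "jpeg", "hqx", "xbm"}

def should_index(path, include=None, exclude=None):
    """Return True if the file at `path` passes the suffix filter.

    Hypothetical helper: at most one of `include` or `exclude` may be
    given, mirroring the "either index or not index" rule. With
    neither, the default exclusion list applies.
    """
    if include is not None and exclude is not None:
        raise ValueError("specify suffixes to include or to exclude, not both")
    suffix = PurePosixPath(path).suffix.lstrip(".").lower()
    if include is not None:
        return suffix in include
    return suffix not in (DEFAULT_EXCLUDED if exclude is None else exclude)
```

With the defaults, should_index("docs/index.html") is true and should_index("images/logo.gif") is false.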

Updates and Scheduling

  • Incremental update supported (rather than always re-creating from scratch)
  • Automated update: once per day.
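The report does not say how e.g. decides which files need re-indexing. One common approach, sketched here purely as an assumption, is to compare each file's modification time against the time recorded at the last indexing run. The function name and the dictionary layout are hypothetical.

```python
def plan_incremental_update(indexed, current):
    """Compare two {path: modification_time} mappings.

    `indexed` is what the index recorded on the previous run and
    `current` is the state of the folder now (both hypothetical
    structures). Returns (to_index, to_remove) as sorted lists.
    """
    # New files and files modified since the last run get (re)indexed.
    to_index = sorted(
        path for path, mtime in current.items()
        if path not in indexed or indexed[path] < mtime
    )
    # Files that have disappeared get dropped from the index.
    to_remove = sorted(path for path in indexed if path not in current)
    return to_index, to_remove
```

Unchanged files appear in neither list, which is what makes the update cheaper than rebuilding from scratch.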

Other Index Features and Issues

  • The Stopwords list is a text file of words which will not be indexed or searched. This can reduce the size of your index substantially.
  • The Substitutions list lets you do two things:
    • Controlled vocabulary, where the index and search engine automatically substitute one word for a synonym (for example, treating the words flower, blossom and bud as identical for purposes of searching).
    • Word-root truncation, where all forms of a word are treated like the root form (for example run, runs, running).
  • Only one index per copy of e.g. (although you could, theoretically, have two copies of the CGI running on one machine).
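To make the two lists concrete, here is a minimal sketch of how stopwords and substitutions might be applied to text before indexing or searching. The sample word lists and the normalize function are illustrative assumptions, not the file formats or logic e.g. actually uses.

```python
import re

# Hypothetical sample lists; the real ones are user-editable text files.
STOPWORDS = {"the", "a", "of", "and"}
SUBSTITUTIONS = {
    "blossom": "flower", "bud": "flower",   # controlled vocabulary
    "runs": "run", "running": "run",        # word-root truncation
}

def normalize(text):
    """Lowercase, split into words, drop stopwords, apply substitutions."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [SUBSTITUTIONS.get(w, w) for w in words if w not in STOPWORDS]
```

Running the same normalization over both documents and queries is what lets a search for "blossom" find a page that only mentions "bud".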

Search Engine

The vector-mapping retrieval engine takes the query and compares it to the indexed documents. The "best match" ranking comes from a complex algorithm based on the number of words matched between the query and each document. This works best for multi-word searches, including natural-language queries.
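The report does not give AIAT's actual formula. As a rough illustration of the general vector-space idea, the sketch below scores documents by cosine similarity over raw word counts; the weighting scheme and the function names are assumptions, and the real engine is likely more elaborate.

```python
import math
from collections import Counter

def cosine_score(query_terms, doc_terms):
    """Cosine similarity between two bags of words (0.0 to 1.0)."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def rank(query_terms, docs):
    """Return names of matching documents, best match first.

    `docs` maps a document name to its list of indexed words.
    """
    scored = [(cosine_score(query_terms, terms), name)
              for name, terms in docs.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]
```

Note that a document sharing only one of several query words still receives a non-zero score, which is why a vector search can return pages that do not contain every search term.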

Search Form

The default search form has a text field for the query, a popup for number of results per page, and an option for whether the results should be in text format or table format.

The default search uses the vector mapping described above, but you can also use special codes for Boolean operators: And (&), Or (|) and Not (!).
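The three operator codes could be evaluated along the following lines. This is only a sketch: the report does not document precedence or grouping, so the order assumed here (! binds tightest, then &, then |), the absence of parentheses, and the matches function itself are all hypothetical.

```python
import re

def matches(query, doc_words):
    """Evaluate a query such as 'apple & !pie | index' against a set of
    lowercase words. Assumes a well-formed query; an empty one matches
    nothing.
    """
    tokens = re.findall(r"[&|!]|[^\s&|!]+", query.lower())
    if not tokens:
        return False
    pos = 0

    def term():
        # A term is a word, optionally preceded by one or more "!".
        nonlocal pos
        if tokens[pos] == "!":
            pos += 1
            return not term()
        word = tokens[pos]
        pos += 1
        return word in doc_words

    def conjunction():
        # A run of terms joined by "&".
        nonlocal pos
        result = term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            right = term()
            result = result and right
        return result

    # Conjunctions joined by "|".
    result = conjunction()
    while pos < len(tokens) and tokens[pos] == "|":
        pos += 1
        right = conjunction()
        result = result or right
    return result
```

Unlike the default vector search, a Boolean query of this kind either matches a document or does not; it produces no partial matches.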

Results Listings

Data Displayed

  • relevance score as a partially filled bar
  • title of the page
  • URL
  • summary (the meta description data, if any, or the first part of the text)
  • checkbox for "more like this", a special feature of AIAT. You can check the file or files that seem to fit your needs well, and then press the More Like This button to ask the system to find others like them. This powerful feature provides a great way to discover information when you don't really know the terminology.
  • search terms matched in the file (very important, because vector searching may not match all search terms)
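"More like this" can be thought of as a search whose query is built from the checked documents themselves. The sketch below pools the words of the checked files and ranks the remaining files by cosine similarity to that pool. It is a generic illustration, not AIAT's documented method, and the names are hypothetical.

```python
import math
from collections import Counter

def more_like_this(checked, docs):
    """Rank unchecked documents by similarity to the checked ones.

    `docs` maps a document name to its list of indexed words;
    `checked` lists the names the user ticked.
    """
    # Pool the words of every checked document into one query vector.
    profile = Counter(w for name in checked for w in docs[name])
    profile_len = math.sqrt(sum(v * v for v in profile.values()))

    def similarity(words):
        vec = Counter(words)
        dot = sum(profile[w] * vec[w] for w in vec)
        norm = profile_len * math.sqrt(sum(v * v for v in vec.values()))
        return dot / norm if norm else 0.0

    scored = [(similarity(words), name)
              for name, words in docs.items() if name not in checked]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ordered if score > 0]
```

Because the query is made of the documents' own vocabulary, the user never has to guess the right terminology, which is the benefit the report highlights.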

Results Options

  • Location of results list on page.

You can't reorganize the individual elements in each result item.

Summary

Apple e.g. is free, innovative, and interesting, but the vector searching is unusual and sometimes disconcerting.
