This is an off-site copy of the corresponding product report page on the SearchTools.com website, designed to let you comment on the product or the reporting. For more information about search tools, visit SearchTools.com, where you can browse articles, in-depth analyses, and overviews of external resources.
SWISH-E stands for "Simple Web Indexing System for Humans - Enhanced".
Price: Free (open source) under the GNU General Public License (GPL)
Platforms: Linux, Solaris, AIX, VMS, and Windows 95/NT/2000
- Indexes local files directly, or web sites by way of a robot spider
- Indexes and searches data in tags, including Dublin Core meta tags and XML nested fields
- Uses external converters to index binary files, including PDF, Microsoft Word, and compressed files
- Portable indexes can be moved to other machines
- Search allows Boolean And, Or, Not, and parentheses
- Fuzzy matching including truncation and stemming
- Sorts results by relevance, date, size, and other fields
- Code library with API provided
Version 2.2, September 2002:
- External document indexing option that makes it easy to add special indexing gatherers for databases, CMSs, etc.
- New XML parsers: expat or libxml2
- Improved filtering of binary file formats
- Ignores text between <!-- noindex --> and <!-- index --> comments, and follows meta robots instructions
- Special-case indexing for "buzzwords": complex terms that include punctuation, such as C++ and SWISH-E
- Much faster indexing and searching
- Searching, merging, and ranking results from multiple indexes
- Improved security in temporary files and parameter checking
- Search results layout can be edited directly or via the Perl HTML::Template module
- Result page match words highlighted in context
- Windows binary in installer package
- Extended documentation
- Note: be sure to re-index after installing the new version; old indexes are not supported.
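The noindex feature in the list above can be pictured with a small sketch. This is not SWISH-E's own code: the regular expression, the function name strip_noindex_blocks, and the sample page are all hypothetical, and the choice to treat an unclosed noindex marker as running to the end of the page is an assumption made only for this illustration.

```python
import re

# Marker syntax copied from the feature list; everything else here is an
# illustrative assumption, not SWISH-E's actual parser.
NOINDEX_BLOCK = re.compile(
    r"<!--\s*noindex\s*-->"          # opening marker
    r".*?"                           # shortest possible run of skipped text
    r"(?:<!--\s*index\s*-->|\Z)",    # closing marker, or end of input if unclosed
    re.IGNORECASE | re.DOTALL,
)


def strip_noindex_blocks(html: str) -> str:
    """Return html with every noindex ... index span removed."""
    return NOINDEX_BLOCK.sub("", html)


# Hypothetical page: the navigation block should not reach the index.
page = (
    "<p>Product overview</p>"
    "<!-- noindex --><div>Site navigation</div><!-- index -->"
    "<p>Release notes</p>"
)

print(strip_noindex_blocks(page))
```

Only the text outside the marked span is left for the indexer, so repeated page furniture such as navigation menus would not pollute search results.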
Articles & Reviews
Open Source Indexers. Infomotions Musings; May 29, 2001, by Eric Lease
Describes the history and features of eight open-source search engines:
- freeWAIS-sf (aging code and hard to install, but good for searching email and public domain etexts)
- Harvest (powerful gathering features for frequently changing data stores, good with structured documents)
- ht://Dig (tricky to configure, no phrase searching, automatic stemming and match word highlighting)
- Isearch (weak documentation and support, easy to install, dated interface, Z39.50 support)
- MPS Information Server (zippy indexing of both text and structured data, Z39.50 support, Perl API, limited documentation)
- SWISH-E (simple-to-install engine, CGIs in Perl and PHP still beta, good for HTML pages, recognizes new META tags, sorts results by field)
- WebGlimpse (easy to install and configure, requires commercial version for customized output)
- Yaz/Zebra (mainly Z39.50, no Perl API, mainly a toolkit to index and respond to distributed client queries)
The article also points out that chaotic information is less than helpful, and it encourages organization, structure, and vocabulary control.