March 5th, 2009

More about file format parsing tools for indexing

This is a follow-up to my article about file format access tools and Lucene Tika.

More free open-source packages: Charlie Hull directed me to the file converters listed in the Omega overview, part of the Xapian project. Presumably, this lists packages that they've tested for quality.

Corrections to my previous file format access article:

Corrected Tika project link
Outside-In is now owned by Oracle
Corrected Mark Bennett's Filters article link

(If only the Firefox Link Check plugin still worked...)