October 14th, 2010


Search and Business Intelligence: The Humble Inverted Index Wins Again

Infotoday Newsbreak, October 14, 2010, by Avi Rappoport, Search Tools Consulting

In this modern age, big institutions have giant piles of data about all their operations; the question is what to do with all those bits. Extracting the right information can help avoid waste, delays, systems failures, even terrorist threats. For example, look at Toyota’s customer support and repair data: if management had been looking, they would have noticed that something was going terribly wrong. Business intelligence (BI) means mining all that digital data (in legacy systems, databases, and even spreadsheets) and reporting what’s going on. This generally requires creating aggregations that need server farms with big hard disks and lots of memory. But text search engine technology, using sophisticated versions of inverted indexing, can create files that are effectively shadow databases in much less space, optimized for fast retrieval. These search/BI hybrids also provide sophisticated access to the contents of text fields, making customers very happy indeed.
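
To make the idea concrete, here is a minimal sketch in Python of an inverted index over a free-text field, followed by a BI-style count on a structured field. The records, the field names, and the helper functions (`tokenize`, `build_index`, `search`, `count_by`) are all invented for illustration. Real search/BI products add compression, ranking, and far more sophisticated aggregation, so treat this as a toy under those assumptions and not as a description of any vendor's implementation.

```python
from collections import Counter, defaultdict
import re

# Hypothetical support tickets, invented purely for illustration.
RECORDS = [
    {"id": 1, "model": "sedan", "notes": "Pedal sticks after cold start"},
    {"id": 2, "model": "truck", "notes": "Brake noise at low speed"},
    {"id": 3, "model": "sedan", "notes": "Floor mat traps the pedal"},
    {"id": 4, "model": "sedan", "notes": "Radio resets when door opens"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records, text_field):
    """Map each term to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for record in records:
        for term in tokenize(record[text_field]):
            index[term].add(record["id"])
    return index


def search(index, query):
    """Return ids of records containing every term in the query (AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def count_by(records, ids, field):
    """Count the matching records grouped by a structured field."""
    return Counter(r[field] for r in records if r["id"] in ids)


index = build_index(RECORDS, "notes")
hits = search(index, "pedal")             # ids of tickets mentioning "pedal"
report = count_by(RECORDS, hits, "model")  # breakdown of those tickets by model
```

The index is built once, so each query only looks up and intersects small sets of ids without rescanning every record. The final count is the kind of question a BI report asks, answered here straight from the text field.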

I admit to being pleased that inverted indexes turn out to be so good, as per Zobel and Moffat, "Inverted files for text search engines" (2006). But I'd really like to know what the limits of these BI tools are: anyone have any insights?
