April 2nd, 2009

searchtools.com

Two new Search / Information Retrieval textbooks

Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze; July 2008 from Cambridge University Press [disclosure: the link has my amazon affiliate code]

I've been going through this book for a while, and I like it. The ordering of the content is interesting, but the content itself seems much more practical than in previous textbooks, with helpful information about language detection, the issues of index structure and caching, classification (and evaluation thereof), machine learning for interactive search (as opposed to batch), and various algorithms for relevance ranking. They cover practical topics like lowercasing in the index, which I agree with, and there's not much I find maddeningly wrong. However, they postpone citations to chapter reference sections, so it is sometimes not clear whether there are any citations for some practical topics, such as whether excluding stopwords causes more harm than good. I agree that it does, but I sure would like to see the research. And if there isn't any, I'd like to know that too (and hope someone will fill the gap soon!).


I just saw and ordered Search Engines: Information Retrieval in Practice by Bruce Croft, Donald Metzler, Trevor Strohman; March 2009 from Addison-Wesley (Pearson Higher Ed) ISBN-10: 013607782X; ISBN-13: 9780136077824 [disclosure: the link has my amazon affiliate code]

This looks even closer to my approach to practical information retrieval. I'm all for reducing the distance between theoretical IR, which lives in algorithms and equations, and real-life search, which concentrates on handling short queries, providing useful information foraging pathways, and so on. More when I read it.