SearchTools Blog
Two new Search / Information Retrieval textbooks 
2nd-Apr-2009 03:44 pm

Introduction to Information Retrieval by: Christopher D Manning, Prabhakar Raghavan, Hinrich Schütze; July 2008 from Cambridge University Press [disclosure: the link has my amazon affiliate code]

I've been going through this book for a while, and I like it. It's an interesting way of ordering the content, but the content itself seems much more practical than in previous textbooks, with helpful information about language detection, index structure and caching, classification (and evaluation thereof), machine learning for interactive search (as opposed to batch), and various algorithms for relevance ranking. They cover practical topics like lowercasing in the index, which I agree with, and there's not much I find maddeningly wrong. However, they postpone citations to the reference sections at the end of each chapter, so it is sometimes not clear when there are no citations at all for a practical topic, such as whether excluding stopwords causes more harm than good. I agree that it does, but I sure would like to see the research. And if there isn't any, I'd like to know that too (and hope someone will fill the gap soon!).

Table of Contents:
  • Boolean retrieval
  • The term vocabulary and postings lists
  • Dictionaries and tolerant retrieval
  • Index construction
  • Index compression
  • Scoring, term weighting and the vector space model
  • Computing scores in a complete search system
  • Evaluation in information retrieval
  • Relevance feedback and query expansion
  • XML retrieval
  • Probabilistic information retrieval
  • Language models for information retrieval
  • Text classification and Naive Bayes
  • Vector space classification
  • Support vector machines and machine learning on documents
  • Flat clustering
  • Hierarchical clustering
  • Matrix decompositions and latent semantic indexing
  • Web search basics
  • Web crawling and indexes
  • Link analysis

I just saw and ordered Search Engines: Information Retrieval in Practice by Bruce Croft, Donald Metzler, Trevor Strohman; March 2009 from Addison-Wesley (Pearson Higher Ed) ISBN-10: 013607782X; ISBN-13: 9780136077824 [disclosure: the link has my amazon affiliate code]

This looks even closer to my approach to practical information retrieval. I'm all for reducing the distance between theoretical IR, which lives in algorithms and equations, and real-life search, which concentrates on handling short queries, providing useful information foraging pathways, and so on. More when I read it.
