October 17th, 2008


Results Header Accuracy (MediaWiki Search Analysis)

Why MediaWiki's Site Search Stinks, Reason #4

How many actual matches does your search find? It's a mystery...

People often start their search with a very generic word, such as cookie on a recipe site. I see this in log file analysis all the time, and I think it's a quick test of both the contents of the site and the functionality of the search engine. Once they see that it works, they usually either click a category or a results item, or add words to the search. So search engines should be prepared to handle very large result sets. The MediaWiki search engine handles this reasonably well, for example finding all the cookie recipes among nearly 50,000 recipes on recipes.wikia.com (though the same search does not match cookies; see the Limited Query tools analysis).

The result is a set of Page title matches and Page text matches, but how many did the search actually find?

screencap of results header

I happen to know that it's 1518 recipes, but is there any way to tell? There's no indication of that here: it looks like there were only twenty. I've found that people use the results count to get a general idea of what content is on the site. In the recipes case, it looks like this wiki is nearly empty, with only twenty cookie recipes. I notice that Wikipedia itself has added the total number of results to this part of the header.

The results header combines the number of results with the results listing navigation in just about the most confusing way possible. The link (next 20) is a fairly normal results page link, though much less clear than the "Goooogle" page links under Google's results, or the familiar simple number list for navigating results pages from other search engines.

The links (20 | 50 | 100 | 250 | 500) do not mean "skip to 20" or 50 or 100. Rather, it's the MediaWiki way of setting the number of results per page, without a label in sight. To see all the recipes, one has to click on the (next 20) link 75 times, or reset the page to show 500 results per page and click on the next link three times. That is also the only way to find out how many pages were found.
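
To make the arithmetic concrete, here is a small sketch that works out how many results pages, and how many clicks on the next link, it takes to reach the end of the 1518 matches at each page size MediaWiki offers. The function names are my own, for illustration only, and are not part of MediaWiki.

```python
import math

TOTAL_RESULTS = 1518                  # cookie recipes, per the count given above
PAGE_SIZES = (20, 50, 100, 250, 500)  # the per-page choices in the results header


def pages_needed(total_results: int, per_page: int) -> int:
    """Number of results pages needed to list every match."""
    return math.ceil(total_results / per_page)


def next_clicks(total_results: int, per_page: int) -> int:
    """Clicks on the 'next' link to get from the first page to the last."""
    return max(pages_needed(total_results, per_page) - 1, 0)


if __name__ == "__main__":
    for size in PAGE_SIZES:
        print(size, "per page:",
              pages_needed(TOTAL_RESULTS, size), "pages,",
              next_clicks(TOTAL_RESULTS, size), "clicks on next")
```

A searcher should not have to do this calculation, or the clicking, just to learn the size of the results set.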

In contrast, the Exalead example Wikipedia search says precisely how many articles it found, and the footer has page numbers for navigation:

exalead search results header
exalead search results footer
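
The conventional pattern is not hard to produce. The sketch below is a hypothetical illustration, not Exalead's or MediaWiki's actual code: it builds a header line that states the total count, and a short window of page numbers for the footer. The function names, the header wording, and the ten-page window are all assumptions of mine.

```python
def results_header(total: int, start: int, per_page: int) -> str:
    """Header text stating which results are shown and how many exist."""
    if total == 0:
        return "No results found"
    end = min(start + per_page - 1, total)
    return f"Results {start}-{end} of {total:,}"


def page_numbers(total: int, per_page: int, current: int, window: int = 10) -> list[int]:
    """Page numbers to show in the footer: a window around the current page."""
    last = -(-total // per_page)  # ceiling division: number of the last page
    first = max(1, min(current - window // 2, last - window + 1))
    return list(range(first, min(first + window - 1, last) + 1))


if __name__ == "__main__":
    print(results_header(1518, 1, 20))
    print(page_numbers(1518, 20, 1))
    print(page_numbers(1518, 20, 76))
```

With both pieces in place, the searcher sees the size of the results set at a glance and can jump to any nearby page directly.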

Web search conventions exist for a reason. Any search results interface that doesn't follow them should have a justification and usability studies to make sure that it's doing more good than harm. MediaWiki doesn't seem to have either.

Avi Rappoport
Search Tools Consulting