October 21st, 2008


Markup Symbols in Search Results (MediaWiki search)

This is Why MediaWiki's Site Search Stinks, Reason #5

Search Results Show Markup Symbols

MediaWiki's site search does a good thing with search results: for each article, it shows not just the title, date and size, but also the search terms matched in the article, with some extracted text from the area around each term, so searchers can understand the context of the match. This is particularly useful for words with multiple meanings, such as rose, pound or bank. (For more information, see Matching Search Terms In Context.)

However, most search engines remove the page markup (HTML or other) before saving the page, or at least before displaying the results with the match terms in context. MediaWiki Search does not do this, so searching for andy quotes on the Tolkien Gateway will show not just the text from the page, but also hidden text (such as in graphic file names) and markup symbols.

Screenshot: result items show text from the markup that is not visible on the page

Even worse, searching on the MediaWiki site itself for the phrase search extensions brings up results that include parts of the online documentation's internal hierarchy and other confusing extracts from the page markup. It's almost worse than having no text from the page at all. Basic simple search should not look like this:

Again, the point of showing Matching Search Terms In Context is to provide valuable context for users scanning the search results. Showing markup characters doesn't help at all.
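
To make the idea concrete, here is a rough sketch in Python of stripping markup before building the match-in-context extract. This is not how MediaWiki or any particular search engine does it: the function names strip_wiki_markup and snippet, the sample page and the regular expressions are all made up for illustration, and the patterns only handle the most common wiki constructs (no nested templates, no tables).

```python
import re

# Made-up sample page source, for illustration only.
SAMPLE = (
    "== History ==\n"
    "[[File:Bank_front.jpg|thumb|left]]\n"
    "{{Infobox building}}\n"
    "The '''bank''' stands beside the [[River Thames|river]]."
)


def strip_wiki_markup(source: str) -> str:
    """Reduce wikitext to roughly what a reader sees (simplified, not a real parser)."""
    text = re.sub(r"<!--.*?-->", " ", source, flags=re.DOTALL)      # HTML comments
    text = re.sub(r"\{\{[^{}]*\}\}", " ", text)                     # non-nested templates
    text = re.sub(r"\[\[(?:File|Image):[^\[\]]*\]\]", " ", text,
                  flags=re.IGNORECASE)                              # embedded graphic files
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]]*)\]\]", r"\1", text)  # [[target|label]] -> label
    text = re.sub(r"\[\S+\s+([^\]]*)\]", r"\1", text)               # [url label] -> label
    text = re.sub(r"<[^>]+>", " ", text)                            # leftover HTML tags
    text = re.sub(r"'{2,}", "", text)                               # bold and italic quote marks
    text = re.sub(r"^=+[ \t]*(.*?)[ \t]*=+[ \t]*$", r"\1", text,
                  flags=re.MULTILINE)                               # == Heading == -> Heading
    text = re.sub(r"^[*#:;]+[ \t]*", "", text, flags=re.MULTILINE)  # list and indent markers
    return re.sub(r"\s+", " ", text).strip()                        # collapse whitespace


def snippet(source: str, term: str, width: int = 40) -> str:
    """Return the first match of term with up to `width` characters of plain text on each side."""
    plain = strip_wiki_markup(source)
    match = re.search(re.escape(term), plain, flags=re.IGNORECASE)
    if match is None:
        return ""
    start = max(0, match.start() - width)
    end = min(len(plain), match.end() + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(plain) else ""
    return prefix + plain[start:end] + suffix


print(snippet(SAMPLE, "bank", width=20))
```

The important design choice is the order of operations: the extract is cut from the cleaned text, not from the raw page source, so file names, template calls and link brackets never reach the results page.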

In the same vein, showing the exact number of kilobytes and time of day the page was last updated is probably a mistake. That text adds very little to user understanding of the results, and takes up not just space, but web-surfers' fickle attention.
