The classic IR example of an ambiguous search term is "bank": does it mean "financial institution", "to store something", "side of a river", "airplane maneuver", or something else? How should the search engine handle this? It gets harder still once names, slang, acronyms and abbreviations are added to the mix. Does a person searching for "coke" want the cola, the drug, or the form of coal? How about a search for "freddy mac" or "jones" or "ARIA"?
Oddly enough, there seems to be no accepted linguistic term for words which are spelled the same, may or may not sound the same, but mean different things:
- Homonyms sound the same or are spelled the same but mean different things (e.g. bore vs. boar).
- Homophones sound the same but differ in meaning or spelling or both.
- Homographs are spelled the same and may or may not sound the same, but mean different things (e.g. bow, card, swallow). Note: many linguists use this term only for words that are spelled the same but do not sound the same.
- Polysemes have the same etymological word source but multiple meanings (according to some).
- Heteronyms are spelled the same but have different pronunciations (according to some).
For text search purposes, we only care about homographs, because the spelling is what matters.
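To see why spelling is the only thing a text index can tell apart, here is a minimal sketch of a toy inverted index. The documents, the `build_index` helper and the whitespace tokenizer are all made up for illustration; a real engine would do far more, but the collapse of senses happens the same way.

```python
from collections import defaultdict

# Hypothetical toy documents; each uses "bank" in a different sense.
DOCS = {
    1: "the bank approved the loan",
    2: "we fished from the river bank",
    3: "the pilot began to bank the plane",
}

def build_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

index = build_index(DOCS)

# The index keys on spelling alone, so all three senses share one entry.
hits = sorted(index["bank"])
print(hits)
```

A query for "bank" returns every document, whatever sense the writer or the searcher had in mind, which is exactly the homograph problem.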