SearchTools Blog ([info]searchtools) wrote,

Thoughts on Federated and Aggregated Search

Federated vs. Aggregated Search Architectures

Federated search systems accept user queries, convert the query language, and send the queries to one or more remote search engines. They then display the results, sometimes in separate blocks, sometimes merged together. This approach is vital for searching external or un-owned data sources, such as national patent databases or legal archives. Federated search requires a lot of work to translate queries and deal with results. The heavy lifting is done at search time, which is good for absolutely current content and access control.
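
To make the flow concrete, here is a minimal sketch of a federated query in Python. Everything in it is an assumption for illustration: the `Hit` record, the two stand-in connectors and their made-up query syntaxes, and the idea that every source returns a score already normalized to 0..1 (real engines rarely make merging that easy).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str
    title: str
    score: float  # assumption: each connector normalizes scores to 0..1


# Hypothetical connector type: translate the query, call one remote engine,
# and return parsed hits.
Connector = Callable[[str], List[Hit]]


def patents_connector(query: str) -> List[Hit]:
    # Stand-in for a remote call; the field syntax here is invented.
    remote_query = f"abstract:({query})"
    return [Hit("patents", f"Patent matching {remote_query}", 0.8)]


def legal_connector(query: str) -> List[Hit]:
    # Stand-in for a remote call; this engine is assumed to want explicit ANDs.
    remote_query = " AND ".join(query.split())
    return [Hit("legal", f"Case matching {remote_query}", 0.6)]


def federated_search(query: str, connectors: Dict[str, Connector], merge: bool = True):
    # All the work happens at search time: every source is queried in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(connectors))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in connectors.items()}
        blocks = {name: fut.result() for name, fut in futures.items()}
    if not merge:
        return blocks  # one block of results per source
    merged = [hit for hits in blocks.values() for hit in hits]
    return sorted(merged, key=lambda h: h.score, reverse=True)
```

Calling `federated_search("solar panel", {"patents": patents_connector, "legal": legal_connector})` returns one merged list; passing `merge=False` keeps the results in separate blocks per source.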

Aggregated search systems gather and index text from many different data sources. When the user sends a query, it can be handled locally. Aggregated search requires some work to get data from multiple data sources, and the ability to scale the index to very large sizes as sources are added.
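
By contrast, an aggregated system does its work at indexing time. The sketch below is a toy in-memory inverted index, not how any particular product works; the class name, the whitespace tokenizer, and the AND-only matching are all simplifying assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class AggregatedIndex:
    """Toy local index that holds text gathered from several sources."""

    def __init__(self) -> None:
        # term -> set of "source:doc_id" keys
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add_source(self, source: str, docs: Iterable[Tuple[str, str]]) -> None:
        # Gathering step: pull (doc_id, text) pairs from a source and index them.
        for doc_id, text in docs:
            key = f"{source}:{doc_id}"
            for term in text.lower().split():
                self.postings[term].add(key)

    def search(self, query: str) -> List[str]:
        # Query step: handled entirely locally, no remote calls.
        terms = query.lower().split()
        if not terms:
            return []
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return sorted(result)
```

Every source added through `add_source` grows the same index, which is where the scaling cost shows up.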

My research for this presentation indicated that each is useful in specific circumstances (I know, no surprise there). Many data sources are obviously best accessed by one or the other, but it's the corner cases that are tricky. Aspects to consider include:

  • size of the content in the source
  • how often your users need that content
  • content change rate
  • importance of real-time access control permissions changes
  • content licensing rules
  • available tools for indexing / querying
  • difficulty of extracting and indexing
  • quality of the internal search engine
  • difficulty of sending queries and receiving results

Slides (with fish!), presented by Avi Rappoport at ESS, May 2010

Federated and Aggregated Search, Web View (color PDF)

Federated and Aggregated Search, Printable (grayscale 4-up PDF)

Comments? Arguments? Explanations? Please discuss below. Want an analysis of your data sources? I can do that: comment here or send me a message.



  • 3 comments

[info]dp_maxime

June 13 2010, 13:31:21 UTC

Can aggregated search be called universal search, or is it different from what Google does?

[info]searchtools

June 14 2010, 19:52:56 UTC

Universal is more of a marketing term, and it's wrong because it does not deliver anything close to universal search. Aggregated is what I've invented as a more accurate term.

Anonymous

September 7 2010, 14:48:18 UTC

Making connections

A traditional search user interface takes the user's query and connects them to a store of crawled/indexed content which has been populated from a content repository. The process of acquiring content and keeping it updated belongs to the owner of the search engine and can be expensive and time consuming to configure and maintain. The first step in most projects is to implement and configure the correct custom connectors for the repository that you are working with. These connectors work by normalising the content into one form that the internal index can digest.

In the federated case, the remote search engine has done the work of indexing the content, so the federated search user should be getting a freebie. Unfortunately, the remote search engine was designed with people in mind and produces results using an HTTP/HTML interface. This in turn results in a poor computer-to-computer exchange. The enabling connector is now a proxy service that manages the query submission and result translation on a case-by-case basis.
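
As a rough illustration of what such a proxy has to do, here is a small Python sketch that builds a query URL and pulls results out of an HTML page. The `q` parameter name and the `<a class="result">` markup are assumptions made up for the example; every real engine needs its own version, which is exactly the case-by-case problem.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import urlencode


def build_query_url(base_url: str, query: str) -> str:
    # Assumption: the remote engine takes the query in a "q" parameter.
    return f"{base_url}?{urlencode({'q': query})}"


class ResultLinkParser(HTMLParser):
    """Collects (title, href) pairs from <a class="result"> tags (assumed markup)."""

    def __init__(self) -> None:
        super().__init__()
        self.results: List[Tuple[str, str]] = []
        self._href: Optional[str] = None  # set while inside a result link
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("class") == "result":
            self._href = attr_map.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append(("".join(self._text).strip(), self._href))
            self._href = None


def parse_results(html: str) -> List[Tuple[str, str]]:
    # Result translation: turn a page meant for people into structured data.
    parser = ResultLinkParser()
    parser.feed(html)
    parser.close()
    return parser.results
```

A change to the remote site's page layout silently breaks a parser like this, which is why the exchange is so fragile.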

To get the best of both worlds, we need connectors that improve the efficiency of the federated search by allowing the computers at either end to have an open dialogue. We have enjoyed this for years in the database world with ODBC and the various language specific variations. The ODBC connectors that you used didn't have to be acquired from the people selling the database.
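
One way to picture an ODBC-style contract for search is sketched below in Python. This is not an existing standard or any vendor's API; the `SearchConnector` class, its three methods, and the `InMemoryConnector` stand-in are hypothetical names chosen only to show the shape of the idea.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    title: str
    url: str


class SearchConnector(ABC):
    """Hypothetical common contract, in the spirit of a database driver."""

    @abstractmethod
    def translate(self, query: str) -> str:
        """Convert the user's query into the remote engine's syntax."""

    @abstractmethod
    def execute(self, remote_query: str) -> List[dict]:
        """Send the query and return raw records."""

    @abstractmethod
    def normalize(self, raw: dict) -> Result:
        """Map one raw record onto the shared Result shape."""

    def search(self, query: str) -> List[Result]:
        # The federating layer only ever calls this method.
        return [self.normalize(r) for r in self.execute(self.translate(query))]


class InMemoryConnector(SearchConnector):
    """Stand-in 'remote engine' so the sketch runs without a network."""

    def __init__(self, records: List[dict]) -> None:
        self.records = records

    def translate(self, query: str) -> str:
        return query.lower()

    def execute(self, remote_query: str) -> List[dict]:
        return [r for r in self.records if remote_query in r["name"].lower()]

    def normalize(self, raw: dict) -> Result:
        return Result(title=raw["name"], url=raw["link"])
```

With a shared contract like this, anyone could write a connector for a given engine, much as ODBC drivers did not have to come from the database vendor.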

My experience of vendors selling federated solutions is that they do the federating bit but then offer you a kit of parts to write your own connector.