<?xml version='1.0' encoding='utf-8' ?>
<!--  If you are running a bot please visit this policy page outlining rules you must respect. http://www.livejournal.com/bots/  -->
<rss version='2.0' xmlns:lj='http://www.livejournal.org/rss/lj/1.0/' xmlns:media='http://search.yahoo.com/mrss/' xmlns:atom10='http://www.w3.org/2005/Atom'>
<channel>
  <title>SearchTools Blog</title>
  <link>http://searchtools.livejournal.com/</link>
  <description>SearchTools Blog - LiveJournal.com</description>
  <lastBuildDate>Tue, 17 Nov 2009 01:31:32 GMT</lastBuildDate>
  <generator>LiveJournal / LiveJournal.com</generator>
  <lj:journal>searchtools</lj:journal>
  <lj:journalid>1461002</lj:journalid>
  <lj:journaltype>personal</lj:journaltype>
  <atom10:link rel='hub' href='http://pubsubhubbub.appspot.com/' />
  <image>
    <url>http://l-userpic.livejournal.com/90795156/1461002</url>
    <title>SearchTools Blog</title>
    <link>http://searchtools.livejournal.com/</link>
    <width>16</width>
    <height>16</height>
  </image>

<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/90353.html</guid>
  <pubDate>Tue, 17 Nov 2009 01:31:32 GMT</pubDate>
  <title>Fundamentals of Enterprise Search workshop slides</title>
  <link>http://searchtools.livejournal.com/90353.html</link>
  <description>The workshop went really well, with interesting discussions and questions.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/90353.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/90353.html</comments>
  <lj:mood>accomplished</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/90010.html</guid>
  <pubDate>Sat, 14 Nov 2009 00:46:07 GMT</pubDate>
  <title>Enterprise Search Summit West: Nov. 16-19, 2009</title>
  <link>http://searchtools.livejournal.com/90010.html</link>
  <description>The &lt;a href=&quot;http://enterprisesearchsummit.com/west2009/&quot;&gt;Enterprise Search Summit&lt;/a&gt; (west and east) are always great meetings, and very productive for me.  The presentations are less vendor-brainwashing and more valuable insights and case studies.  I learn a lot from those exhibitors who have programmers and product managers for me to talk with, the group lunches, and the hallway conversations.  &lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/90010.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/90010.html</comments>
  <lj:mood>excited</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/89756.html</guid>
  <pubDate>Fri, 13 Nov 2009 23:50:52 GMT</pubDate>
  <title>Apache Solr 1. and Apachecon Meetup notes</title>
  <link>http://searchtools.livejournal.com/89756.html</link>
  <description>Solr is a leading open-source enterprise search package. This new version is built on Apache Lucene 2.9.1 and the Java 1.5 VM.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/89756.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/89756.html</comments>
  <lj:mood>tracking</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/88949.html</guid>
  <pubDate>Tue, 03 Nov 2009 03:20:46 GMT</pubDate>
  <title>Looking at real-time and near-real-time search</title>
  <link>http://searchtools.livejournal.com/88949.html</link>
  <description>&lt;strong&gt;Real-time vs. near-real-time search&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/88949.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/88949.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>11</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/88595.html</guid>
  <pubDate>Fri, 09 Oct 2009 02:30:56 GMT</pubDate>
  <title>open source search levels the playing field</title>
  <link>http://searchtools.livejournal.com/88595.html</link>
  <description>I wrote a &lt;a href=&quot;http://newsbreaks.infotoday.com/NewsBreaks/Lucene--and-the-Power-of-Open-Source-56497.asp&quot;&gt;NewsBreak&lt;/a&gt; for &lt;a href=&quot;http://infotoday.com&quot;&gt;infotoday.com&lt;/a&gt; about the new version of the open source search engine library &lt;a href=&quot;http://lucene.apache.org&quot;&gt;Lucene 2.9&lt;/a&gt; and associated projects.  It steps back a bit to look at the whole thing rather than giving a &lt;a href=&quot;http://searchtools.livejournal.com/88266.html&quot;&gt;feature summary&lt;/a&gt;.  I ended up with an even deeper appreciation of open source search engines in general and, in particular, of the &lt;a href=&quot;http://lucene.apache.org&quot;&gt;Lucene&lt;/a&gt; family of search-related tools (language ports, the &lt;a href=&quot;http://lucene.apache.org/solr/&quot;&gt;Solr&lt;/a&gt; search engine, the &lt;a href=&quot;http://lucene.apache.org/nutch/&quot;&gt;Nutch&lt;/a&gt; web crawler, the &lt;a href=&quot;http://lucene.apache.org/tika/&quot;&gt;Tika&lt;/a&gt; file format converter, and more).  They are as capable and powerful as many commercial enterprise search engines.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/88595.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/88595.html</comments>
  <lj:mood>busy</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/88266.html</guid>
  <pubDate>Thu, 24 Sep 2009 19:07:10 GMT</pubDate>
  <title>Lucene 2.9 to be released very soon!</title>
  <link>http://searchtools.livejournal.com/88266.html</link>
  <description>&lt;p&gt;&lt;a href=&quot;http://lucene.apache.org&quot;&gt;Apache Lucene&lt;/a&gt; is the most prominent open source search engine, and powers search on a lot of really interesting sites.  The new version, 2.9, has internal improvements, re-factoring and new functionality.&lt;/p&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/88266.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/88266.html</comments>
  <lj:mood>encouraged</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/87866.html</guid>
  <pubDate>Thu, 24 Sep 2009 00:23:32 GMT</pubDate>
  <title>good introduction to search analytics article</title>
  <link>http://searchtools.livejournal.com/87866.html</link>
  <description>Very clear and well-written: &lt;a href=&quot;http://www.alistapart.com/articles/internal-site-search-analysis-simple-effective-life-altering/&quot;&gt;Internal Site Search Analysis: Simple, Effective, Life Altering!&lt;/a&gt;.  It covers both search on a site, and search from search engines leading to a site, with very useful examples and screenshots from Google Analytics.</description>
  <comments>http://searchtools.livejournal.com/87866.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/87726.html</guid>
  <pubDate>Thu, 03 Sep 2009 20:36:22 GMT</pubDate>
  <title>Enterprise Search Summit &amp; Infonortics Search Engines Meeting news</title>
  <link>http://searchtools.livejournal.com/87726.html</link>
  <description>&lt;a href=&quot;http://infotoday.com&quot;&gt;Information Today, Inc&lt;/a&gt; (producer of the &lt;a href=&quot;http://enterprisesearchsummit.com&quot;&gt;Enterprise Search Summit&lt;/a&gt;) has announced that it&apos;s acquiring the &lt;a href=&quot;http://www.infonortics.com/searchengines/index.html&quot;&gt;Infonortics Search Engines Meeting&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/87726.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/87726.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/87513.html</guid>
  <pubDate>Fri, 24 Jul 2009 22:36:30 GMT</pubDate>
  <title>Google CSE - search and results interfaces</title>
  <link>http://searchtools.livejournal.com/87513.html</link>
  <description>&lt;p&gt;There are four kinds of Google Custom Search Engine search / results interface:
&lt;ul&gt;&lt;li&gt;Simple form, showing results on a normal Google-hosted page with minimal customization (&lt;a href=&quot;http://searchtools.com/analysis/gsce-compare-interfaces.html&quot; target=&quot;_top&quot;&gt;example&lt;/a&gt;)&lt;/li&gt; &lt;br /&gt;
&lt;li&gt;Form with links to a template page, JavaScript inserts &lt;strong&gt;iframe&lt;/strong&gt; with search results pre-formatted (&lt;a href=&quot;http://searchtools.com/analysis/gsce-compare-interfaces.html#iframe&quot; target=&quot;_top&quot;&gt;iframe example&lt;/a&gt;). Fits into site colors, design and navigation, but has minimal other results customization, including the width of the results list box.&lt;/li&gt;&lt;br /&gt;
&lt;li&gt;Custom Search Element - AJAX object draws a search form, JavaScript can draw result list anywhere (&lt;a href=&quot;http://searchtools.com/analysis/gsce-compare-interfaces.html#ajax&quot; target=&quot;_top&quot;&gt;AJAX example&lt;/a&gt;). This is the new cool programmatic toy, and &lt;a href=&quot;http://www.searchtools.com/analysis/google-cse-ajax-css.html&quot; target=&quot;_top&quot;&gt;with CSS&lt;/a&gt;, it&apos;s very flexible for customizing width, colors, sizes and styles and more.&lt;/li&gt;
&lt;br /&gt;
&lt;li&gt;XML query and result protocol (paid Site Search only) is by far the most comprehensive and flexible. See the &lt;a href=&quot;http://www.google.com/coop/docs/cse/resultsxml.html&quot; target=&quot;_top&quot;&gt;XML Protocol Reference&lt;/a&gt; for extensive documentation.&lt;/li&gt; 
&lt;/ul&gt;

If you have any questions, suggestions or corrections, please comment on this post (non-account comments allowed but screened), send email to nets 9 at searchtools.com, or contact me through the &lt;a href=&quot;http://www.searchtools.com/site/contact.html&quot; target=&quot;_top&quot;&gt;site contact form.&lt;/a&gt;
&lt;img src=&quot;http://stools.icons.ljtoys.org.uk/mi/dot.gif&quot; border=&quot;0&quot; alt=&quot;&quot;&gt;</description>
  <comments>http://searchtools.livejournal.com/87513.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/87190.html</guid>
  <pubDate>Thu, 09 Jul 2009 00:14:52 GMT</pubDate>
  <title>Netflix Recommender Prize Won (probably)</title>
  <link>http://searchtools.livejournal.com/87190.html</link>
  <description>Netflix has posted a &lt;a href=&quot;http://www.netflixprize.com/rules&quot;&gt;contest for improving its own movie recommendation system by at least 10%&lt;/a&gt; and the prize is a million dollars. People have been working on this since 2006, and there have been several Progress prizes.  Finally, four teams merged to create the &lt;a href=&quot;http://www.research.att.com/~volinsky/netflix/bpc.html&quot;&gt;BellKor&apos;s Pragmatic Chaos Team&lt;/a&gt;, which added temporal dynamics to the recommendation weights.  This beat the current Netflix recommendation algorithm by 10.0%.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/87190.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/87190.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/86836.html</guid>
  <pubDate>Wed, 08 Jul 2009 00:09:32 GMT</pubDate>
  <title>thoughts on search engine comparisons</title>
  <link>http://searchtools.livejournal.com/86836.html</link>
  <description>Vik Singh wrote an in-depth post about his &lt;a href=&quot;http://zooie.wordpress.com/2009/07/06/a-comparison-of-open-source-search-engines-and-indexing-twitter/&quot;&gt;comparison of open-source search engines&lt;/a&gt;.  He  tested the default configurations for Lucene, zettair, sphinx, and Xapian, with a nod to sqlite.  &lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/86836.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/86836.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/86572.html</guid>
  <pubDate>Fri, 26 Jun 2009 03:46:17 GMT</pubDate>
  <title>Recommended Book: Search User Interfaces</title>
  <link>http://searchtools.livejournal.com/86572.html</link>
  <description>Marti Hearst, Search User Interfaces, 1st ed. Cambridge University Press, September 2009. [Online: &lt;a href=&quot;http://www.searchuserinterfaces.com&quot;&gt;www.searchuserinterfaces.com&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/86572.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/86572.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/86521.html</guid>
  <pubDate>Thu, 25 Jun 2009 00:20:56 GMT</pubDate>
  <title>Google CSE AJAX API  - Using CSS on the search results</title>
  <link>http://searchtools.livejournal.com/86521.html</link>
  <description>&lt;h2 style=&quot;font-size:medium;&quot;&gt;CSS styling of AJAX search results&lt;/h2&gt;

&lt;p&gt;For customizing the fonts, colors, size, and styles of the Google Custom Search Engine (CSE), there is a many-layered hierarchy of div class names. For example, I found the right name to turn all the result item titles red (even the bold subsections).  I had to use the name of my search results for the browser to render correctly:&lt;/p&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/86521.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/86521.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/86044.html</guid>
  <pubDate>Tue, 23 Jun 2009 03:50:55 GMT</pubDate>
  <title>Decoding the new Google Custom Search API</title>
  <link>http://searchtools.livejournal.com/86044.html</link>
  <description>&lt;p&gt;Google has released a new version of their Custom/Site Search service, and added an &amp;quot;Element&amp;quot; -- a wizard-driven JavaScript that non-technical users can copy and paste to their web sites, even blogs which do not allow uploading. Search Tools has a new &lt;a href=&quot;http://www.searchtools.com/analysis/google-cse-ajax-api-analysis.html&quot;&gt;Analysis of the CSE and AJAX API&lt;/a&gt;.  I also wrote some &lt;a href=&quot;http://www.searchtools.com/analysis/google-cse-ajax-basic-example.html&quot;&gt;fully commented sample code&lt;/a&gt; with a live version on the same page, because this is much harder for non-programmers to customize than the forms or even the Site Search XML interface (paid version only). I&apos;ll be doing more on customization, functionality and display during this week.&lt;/p&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/86044.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/86044.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>3</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/85914.html</guid>
  <pubDate>Fri, 05 Jun 2009 20:50:17 GMT</pubDate>
  <title>Lucene/Solr meetup notes</title>
  <link>http://searchtools.livejournal.com/85914.html</link>
  <description>

&lt;h2&gt;Lucene/Solr Meetup, June 2009&lt;/h2&gt;
&lt;h3&gt;Notes by Avi Rappoport, Search Tools Consulting&lt;/h3&gt;
&lt;a name=&quot;cutid1&quot;&gt;&lt;/a&gt;&lt;h4&gt;Solr 1.4, Near-real-time indexing, Payload efficiency, Trierange, Query parser framework, Zevents, Xoopit, Lucid search, Stopwords are obsolete, OpenRelevance&lt;/h4&gt;

&lt;p&gt;&lt;a href=&quot;http://www.meetup.com/SFBay-Lucene-Solr-Meetup/calendar/10465433/&quot;&gt;Meetup event info&lt;/a&gt;&lt;/p&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/85914.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/85914.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/85749.html</guid>
  <pubDate>Thu, 28 May 2009 23:32:39 GMT</pubDate>
  <title>Best Practices for thumbnails in search results - article</title>
  <link>http://searchtools.livejournal.com/85749.html</link>
  <description>&lt;a href=&quot;http://www.uxmatters.com/mt/archives/2009/05/making-10000-a-pixel-optimizing-thumbnail-images-in-search-results.php&quot;&gt;Making $10,000 a Pixel: Optimizing Thumbnail Images in Search Results :: UXmatters&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/85749.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/85749.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/85390.html</guid>
  <pubDate>Thu, 28 May 2009 22:40:19 GMT</pubDate>
  <title>Twitter Search Has Big Ambitions</title>
  <link>http://searchtools.livejournal.com/85390.html</link>
  <description>My &lt;a href=&quot;http://newsbreaks.infotoday.com/NewsBreaks/Twitter-Search-Has-Big-Ambitions-53868.asp&quot;&gt;overview of the state of Twitter Search&lt;/a&gt; on infotoday.com - it doesn&apos;t even try to do relevance ranking right now, so it&apos;s not exactly a Google killer, despite the hype.&amp;nbsp; &lt;br /&gt;</description>
  <comments>http://searchtools.livejournal.com/85390.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/84737.html</guid>
  <pubDate>Tue, 21 Apr 2009 01:53:05 GMT</pubDate>
  <title>#AmazonFail: garbage in, garbage out</title>
  <link>http://searchtools.livejournal.com/84737.html</link>
  <description>My article on #amazonfail is up at &lt;a href=&quot;http://bit.ly/avirr&quot;&gt;Amazonfail: How Metadata and Sex Broke the Amazon Book Search&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;Amazon failed in a big way on Easter weekend. As the largest bookstore in the world, if a book does not appear in its lists or its search results, the book practically disappears. The event now known as #AmazonFail involves a great cast of characters: books, metadata, sex, search results, traditionally disenfranchised groups, a possible hacker, the Kindle, the absence of institutional response, and the emergence of Twitter for sharing information very quickly on a massive scale.&lt;/blockquote&gt;&lt;br /&gt;So that&apos;s what I was working on last week.&lt;br /&gt;&lt;img src=&quot;http://stools.icons.ljtoys.org.uk/mi/dot.gif&quot; border=&quot;0&quot; alt=&quot;&quot;&gt;</description>
  <comments>http://searchtools.livejournal.com/84737.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>2</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/84562.html</guid>
  <pubDate>Wed, 15 Apr 2009 21:10:43 GMT</pubDate>
  <title>Enterprise Search Summit / NY,  May 12 - 13 2009</title>
  <link>http://searchtools.livejournal.com/84562.html</link>
  <description>The Enterprise Search Summit &lt;a href=&quot;https://secure.infotoday.com/forms/default.aspx?form=ess2009&quot;&gt;early registration deadline (save $100) is this Friday&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/84562.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/84562.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/84476.html</guid>
  <pubDate>Thu, 02 Apr 2009 22:42:38 GMT</pubDate>
  <title>Two new Search / Information Retrieval textbooks</title>
  <link>http://searchtools.livejournal.com/84476.html</link>
  <description>&lt;p&gt;&lt;a href=&quot;http://www.amazon.com/gp/product/0521865719?ie=UTF8&amp;amp;tag=searchtoolscom&quot;&gt;Introduction to Information Retrieval&lt;/a&gt;  by: Christopher D Manning, Prabhakar Raghavan, Hinrich Schütze; July 2008 from Cambridge University Press  &lt;i&gt;[disclosure: the link has my amazon affiliate code]&lt;/i&gt;&lt;/p&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/84476.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/84476.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/84087.html</guid>
  <pubDate>Wed, 25 Mar 2009 22:04:21 GMT</pubDate>
  <title>Openfind Enterprise Search (OES) - New SearchTools Report</title>
  <link>http://searchtools.livejournal.com/84087.html</link>
  <description>&lt;h4&gt;March 25, 2009&lt;/h4&gt;
			&lt;blockquote&gt;
				&lt;h4&gt;&lt;a href=&quot;http://www.searchtools.com/tools/openfind.html&quot;&gt;Openfind Enterprise Search (OES)&lt;/a&gt;&lt;/h4&gt;
				&lt;p&gt;Openfind is a leading enterprise search engine company in Taiwan, providing search to many government departments and corporations since 1998, scalable to over 50 million items in their standard licence. In addition to  documents, it can index text and some numeric content from relational databases, off-loading the search and spreading the server load.&lt;/p&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/84087.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/84087.html</comments>
  <lj:mood>working</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/83857.html</guid>
  <pubDate>Tue, 10 Mar 2009 23:31:02 GMT</pubDate>
  <title>searchtools links</title>
  <link>http://searchtools.livejournal.com/83857.html</link>
  <description>Interesting stuff I found today:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/83857.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/83857.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/83651.html</guid>
  <pubDate>Mon, 09 Mar 2009 21:24:46 GMT</pubDate>
  <title>some thoughts on Wolfram Alpha search</title>
  <link>http://searchtools.livejournal.com/83651.html</link>
  <description>Stephen Wolfram, the guy who made Mathematica into a big software company and then wrote a book, is now plugging his new &lt;a href=&quot;http://www.wolframalpha.com/&quot;&gt;Wolfram | Alpha&lt;/a&gt; search engine, meant to compute knowledge (or at least answers) instead of returning lists of pages which may have answers.  This is not new: &lt;a href=&quot;http://trec.nist.gov&quot;&gt;TREC&lt;/a&gt; has done it for decades.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/83651.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/83651.html</comments>
  <lj:mood>hopeful</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>5</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/83212.html</guid>
  <pubDate>Fri, 06 Mar 2009 01:53:15 GMT</pubDate>
  <title>More about file format parsing tools for indexing</title>
  <link>http://searchtools.livejournal.com/83212.html</link>
  <description>This is a follow-up to my article about &lt;a href=&quot;http://searchtools.livejournal.com/83097.html&quot;&gt; file format access tools and Lucene Tika&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/83212.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/83212.html</comments>
  <lj:security>public</lj:security>
  <lj:reply-count>0</lj:reply-count>
</item>
<item>
  <guid isPermaLink='true'>http://searchtools.livejournal.com/83097.html</guid>
  <pubDate>Fri, 27 Feb 2009 23:05:11 GMT</pubDate>
  <title>Tika: open source access to text in many formats</title>
  <link>http://searchtools.livejournal.com/83097.html</link>
  <description>Search engines need text to index: this may seem obvious, but the devil is in the details.  Extracting text is easy when working with txt, html or xml files, but much more difficult for binary files, including MS Office and archive formats.  So search indexers need to use file format parsers, also called &quot;filters&quot;.  These can access the binary file formats, extracting the text and keeping track of whatever structure is there. Some file parsers are better than others, and all of them may need updating: when Microsoft switched from its proprietary format to its XML-based one (doc -&amp;gt; docx), search indexers needed updated filters to read the new format.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;(&lt;a href=&quot;http://searchtools.livejournal.com/83097.html&quot;&gt;Read more ...&lt;/a&gt;)&lt;/b&gt;</description>
  <comments>http://searchtools.livejournal.com/83097.html</comments>
  <lj:mood>working</lj:mood>
  <lj:security>public</lj:security>
  <lj:reply-count>9</lj:reply-count>
</item>
</channel>
</rss>
