
Ranked Search, Merging Google And Yahoo

Last week I discovered Twingine via Russell. Twingine puts the results of your query from Google and Yahoo! into frames so that you can compare them side by side. This seemed like an interesting idea, but the interface isn’t particularly useful: it’s too much work to visually compare the two. Then I remembered Matt’s announcement of using Yahoo’s search APIs at WordPress.org, which started me thinking about the availability of APIs from Yahoo! and from Google.

It seemed like there should be some way of combining these two resources, taking the search results from both Google and Yahoo! and mixing them in some semi-meaningful way. So last night I started putting together the Ranked Search website. You enter a query and the site requests the top 10 results from both Yahoo! and Google via their search APIs, giving each link a rank. The first link in each result set gets a rank of 10, the second gets 9, and so on down to 1 for the tenth. The idea is that the links with the highest rank are more likely to be what you are looking for. Then I look for links that appear in both sets and merge each pair into one link, with a new rank that is the sum of the original ranks. The links that appear in only one set are then merged in, and the combined set is sorted by rank. The highest possible score is 20 (where Google and Yahoo! both return the same link in the #1 position) and the lowest possible score is 1. It is really basic stuff.
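The merge step described above can be sketched in a few lines of Python. This is only an illustration of the scoring idea, not the code running on the site. The function names rank_results and merge_ranked are made up for this example, and the inputs are assumed to be plain lists of URLs that have already been fetched from each API.

```python
def rank_results(urls, top=10):
    """Give the first URL a rank of `top`, the second `top - 1`, and so on."""
    ranks = {}
    for position, url in enumerate(urls[:top]):
        # Assumption: if an engine repeats a URL, keep its best (first) rank.
        if url not in ranks:
            ranks[url] = top - position
    return ranks


def merge_ranked(google_urls, yahoo_urls, top=10):
    """Sum the ranks of links found in both sets, then sort by total rank."""
    merged = {}
    for result_set in (google_urls, yahoo_urls):
        for url, rank in rank_results(result_set, top).items():
            merged[url] = merged.get(url, 0) + rank
    # Highest combined rank first; ties keep the order in which they were seen.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)
```

Calling merge_ranked(["a", "b", "c"], ["a", "d", "b"]) puts "a" first with a score of 20, because both lists have it in the #1 position, followed by "b" with 9 + 8 = 17.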

Making requests out over the Internet to both Google and Yahoo! isn’t the fastest thing in the west, so I put in some basic caching. The result sets from every query are cached in a PostgreSQL database and are used when an exact query match is found and the results are less than 12 hours old. If the results are more than 12 hours old, the query is sent off to Google and Yahoo! again and the new results replace the old ones in the cache.
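Here is a rough sketch of that caching logic. To keep the example self-contained it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name search_cache, its columns, and the functions open_cache and cached_search are all assumptions made for illustration, and fetch stands for whatever function actually queries Google and Yahoo!.

```python
import json
import sqlite3
import time

CACHE_TTL_SECONDS = 12 * 60 * 60  # cached results are good for 12 hours


def open_cache(path=":memory:"):
    """Open the cache database (sqlite3 here, standing in for PostgreSQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS search_cache ("
        "query TEXT PRIMARY KEY, results TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    return conn


def cached_search(conn, query, fetch, now=None):
    """Return cached results for an exact query match, refetching if stale."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT results, fetched_at FROM search_cache WHERE query = ?", (query,)
    ).fetchone()
    if row is not None and now - row[1] < CACHE_TTL_SECONDS:
        return json.loads(row[0])  # fresh enough, skip the remote requests
    results = fetch(query)  # hypothetical call out to the search APIs
    conn.execute(
        "INSERT OR REPLACE INTO search_cache (query, results, fetched_at) "
        "VALUES (?, ?, ?)",
        (query, json.dumps(results), now),
    )
    conn.commit()
    return results
```

The now parameter exists only so the expiry behaviour can be exercised without waiting 12 hours. Results are stored as JSON text, so anything fetch returns has to be JSON-serializable.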

Everything is very plain and basic right now, so consider it an experiment. If you have any additional thoughts, leave a comment or use my contact form to drop me a note.

Most of this information is also available on the about page for Ranked Search.