Searching For Good News
I spend a lot of effort crawling through the web, looking for items that can extend my understanding of nuclear fission. Usually this consists of jumping from web site to web site based on links that others think are interesting, or I end up doing searches with various tools and scanning through their returns looking for new sites. This got me thinking about the differences between search engines and whether or not one is better than another. I am now trying to measure these differences to answer this question.
I started by running a measured search based on these keywords:
nuclear reactor positive benefit
I selected five search engines for comparison:
S1 - Google - http://www.google.ca
S2 - A9 - http://a9.com
S3 - Scirus - http://www.scirus.com
S4 - Teoma - http://www.teoma.com/
S5 - MSN - http://beta.search.msn.com/
For each search I gave each returned URL a rank from 1 to 5 according to its position in the results; anything beyond fifth place got a rank of 6. With five search engines, if each one returned something different in its first five places I would have 25 URLs to consider. In fact I got 20, so there is some overlap, but not as much as I expected. The ranks are listed in the following table:
Ranks | S1 | S2 | S3 | S4 | S5 |
URL1 | 1 | 1 | 6 | 6 | 6 |
URL2 | 2 | 2 | 6 | 6 | 6 |
URL3 | 3 | 3 | 6 | 6 | 6 |
URL4 | 4 | 5 | 6 | 6 | 6 |
URL5 | 5 | 6 | 6 | 6 | 6 |
URL6 | 6 | 4 | 6 | 6 | 6 |
URL7 | 6 | 6 | 1 | 6 | 6 |
URL8 | 6 | 6 | 2 | 6 | 6 |
URL9 | 6 | 6 | 3 | 6 | 6 |
URL10 | 6 | 6 | 4 | 6 | 6 |
URL11 | 6 | 6 | 5 | 6 | 6 |
URL12 | 6 | 6 | 6 | 1 | 3 |
URL13 | 6 | 6 | 6 | 2 | 6 |
URL14 | 6 | 6 | 6 | 3 | 6 |
URL15 | 6 | 6 | 6 | 4 | 6 |
URL16 | 6 | 6 | 6 | 5 | 6 |
URL17 | 6 | 6 | 6 | 6 | 1 |
URL18 | 6 | 6 | 6 | 6 | 2 |
URL19 | 6 | 6 | 6 | 6 | 4 |
URL20 | 6 | 6 | 6 | 6 | 5 |
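The ranking rule can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for this post: the `top_five` lists are reconstructed from the table above, and the names `top_five`, `rank_of`, `all_urls`, and `ranks` are invented for the example.

```python
# Top-five results per engine, in rank order, reconstructed from the rank table.
top_five = {
    "S1": ["URL1", "URL2", "URL3", "URL4", "URL5"],
    "S2": ["URL1", "URL2", "URL3", "URL6", "URL4"],
    "S3": ["URL7", "URL8", "URL9", "URL10", "URL11"],
    "S4": ["URL12", "URL13", "URL14", "URL15", "URL16"],
    "S5": ["URL17", "URL18", "URL12", "URL19", "URL20"],
}


def rank_of(url, results):
    """Return 1-5 for a URL in the top five, and 6 for anything else."""
    return results.index(url) + 1 if url in results else 6


# Every distinct URL that appeared in at least one top five, in numeric order.
all_urls = sorted(
    {url for results in top_five.values() for url in results},
    key=lambda url: int(url[3:]),
)

# One row per URL, with the ranks given by S1..S5 in that order.
ranks = {
    url: [rank_of(url, top_five[engine]) for engine in sorted(top_five)]
    for url in all_urls
}

print(len(all_urls))   # number of distinct URLs
print(ranks["URL12"])  # ranks for URL12 across S1..S5
```

Giving everything outside the top five the same rank of 6 keeps the table small, at the cost of treating a sixth-place result and a missing result as identical.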
This data allows me to compare one search engine with another by calculating the sum of the squared differences of their ranks for each URL. The result is 0 if both search engines give every URL the same rank, and 110 if their rankings are as different as possible, that is, if their top fives share no URL: 2 × (25 + 16 + 9 + 4 + 1) = 110. So similar search results produce low scores and differing results produce high ones. The calculations gave:
Engine A | Engine B | Comparison Score |
S1 | S2 | 6 |
S1 | S3 | 110 |
S1 | S4 | 110 |
S1 | S5 | 110 |
S2 | S3 | 110 |
S2 | S4 | 110 |
S2 | S5 | 110 |
S3 | S4 | 110 |
S3 | S5 | 110 |
S4 | S5 | 80 |
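For anyone who wants to check the comparison scores, here is a minimal sketch of the calculation in Python. The rank columns are copied from the rank table; the names `ranks`, `comparison_score`, and `scores` are invented for the example and are not the tool used to produce the table.

```python
from itertools import combinations

# Ranks of URL1..URL20 for each engine, copied column by column from the rank table.
ranks = {
    "S1": [1, 2, 3, 4, 5, 6] + [6] * 14,
    "S2": [1, 2, 3, 5, 6, 4] + [6] * 14,
    "S3": [6] * 6 + [1, 2, 3, 4, 5] + [6] * 9,
    "S4": [6] * 11 + [1, 2, 3, 4, 5] + [6] * 4,
    "S5": [6] * 11 + [3, 6, 6, 6, 6, 1, 2, 4, 5],
}


def comparison_score(a, b):
    """Sum of squared rank differences over all URLs."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Score every unordered pair of engines.
scores = {
    (a, b): comparison_score(ranks[a], ranks[b])
    for a, b in combinations(sorted(ranks), 2)
}

for (a, b), score in scores.items():
    print(a, b, score)
```

Because the differences are squared, a disagreement near the top of the list (rank 1 against rank 6) counts for 25, while a swap of neighbouring positions counts for only 1.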
I concluded from this that S1 and S2 (Google and A9) are very similar, with a score of only 6, so there is no need to use A9 if Google has been run.
S4 and S5 (Teoma and MSN) also showed some overlap, but a score of 80 is not low enough to make me think they are similar.
So I think I get my best search power by using a combination of Google, Scirus, Teoma, and MSN. I have a list of about thirty search engines that I use often, and I am going to continue collecting this data to determine the best subset.
The URLs found for this study were:
Identifier | URL | Positive? |
URL1 | http://www.umich.edu/~gs265/society/nuclear.htm | positive |
URL2 | http://www.info.gov.za/speeches/2001/0106281145a1003.htm | positive |
URL3 | http://www.inthenationalinterest.com/Articles/Vol3Issue35/Vol3Issue35Realist.html | neutral |
URL4 | http://www.reactnow.org/about_reactor.html | negative |
URL5 | http://www.american.edu/TED/irannuke.htm | neutral |
URL6 | http://www.uic.com.au/nip29.htm | positive |
URL7 | http://www.nrc.gov/reading-rm/doc-collections/commission/tr/2001/20010117b.html | positive |
URL8 | http://www.lib.ncsu.edu/archives/etext/engineering/reactor/NEfurther010052.html | error ? |
URL9 | http://www.vanderbilt.edu/radsafe/9709/msg00075.html | neutral |
URL10 | http://www.volpe.dot.gov/opsad/risk/risk.pdf | negative |
URL11 | http://www.engr.wisc.edu/alumni/perspective/27.3/Gift01.html | neutral |
URL12 | http://www.neis.org/literature/Reports%26Testimonies/full_terrorist_report_10-22-01.htm | negative |
URL13 | http://www.akaction.net/FTGreely.pdf | negative |
URL14 | http://www.sea-us.org.au/no2reactor/anstomisinfo.html | negative |
URL15 | http://www.msnbc.msn.com/id/5591511/ | negative |
URL16 | http://www.world-nuclear.org/education/ral.htm | positive |
URL17 | http://www.nuclearfaq.ca | positive |
URL18 | http://www-formal.stanford.edu/jmc/progress/nuclear-faq.html | positive |
URL19 | http://neinuclearnotes.blogspot.com | positive |
URL20 | http://positiveenergy.blogspot.com | positive |
These results indicate another problem. I wanted to find articles that discussed the benefits of nuclear reactors, but many of the pages found were decidedly negative. A keyword search does not distinguish between "no benefit" and "benefit", so a simple keyword list does not do the job, and I am looking for a better way to search for relevant articles. In this regard MSN seems to have done better than the others: four of its top five results were positive, against at most three for any other engine.
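The per-engine breakdown of positive and negative pages can be tallied from the two tables. The sketch below assumes the top-five lists reconstructed from the rank table and the labels from the URL table ("error ?" is shortened to "error"); the names `top_five`, `labels`, and `tally` are invented for the example.

```python
from collections import Counter

# URL numbers in each engine's top five, reconstructed from the rank table.
top_five = {
    "S1": [1, 2, 3, 4, 5],
    "S2": [1, 2, 3, 6, 4],
    "S3": [7, 8, 9, 10, 11],
    "S4": [12, 13, 14, 15, 16],
    "S5": [17, 18, 12, 19, 20],
}

# Labels for URL1..URL20, in order, from the "Positive?" column.
labels = (
    "positive positive neutral negative neutral "
    "positive positive error neutral negative "
    "neutral negative negative negative negative "
    "positive positive positive positive positive"
).split()

# Count how many results of each kind every engine returned.
tally = {
    engine: Counter(labels[n - 1] for n in urls)
    for engine, urls in top_five.items()
}

for engine, counts in tally.items():
    print(engine, dict(counts))
```

With only five results per engine and a single query, these counts are a rough indication at best.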
Comments:
Interesting analysis. We have some pretty amazing tools at our disposal in our search for information and our quest to share that information.
I like using Google's new alerts to notify me about articles of interest. I have recently been using the search "new nuclear power plants" with pretty interesting results. Check out all of the new posts on the Atomic Insights Blog.
By Rod Adams, at 12 October, 2005 04:33