Search engine relevance study
Several events motivated us to
analyze various search engines and Internet directories in
greater detail, and this resulted in several short studies. More
details are available in our other articles.
This short study is dedicated to
the relevance of the search results returned by search engines
and Internet directories.
Originally, we chose 8 keyword
combinations (alternating current, direct current, power
electronics, SMPS, power supplies, PSpice, the name of our
Director of Operations, and the name of a known power supply
designer), relevant to the industry in which we do business, and
analyzed the search results from 20+4 search engines and Internet
directories (About, AllTheWeb, AltaVista, AOL, DirectHit,
Euroseek, Excite, Google, GoTo, HotBot, iWon, LookSmart, Lycos,
MSN, Netscape, NorthernLight, OpenDirectory, Sprinks, Vivisimo,
Voila, Yahoo! categories, Yahoo! Web pages, Yahoo! Web Sites-category,
Yahoo! Web Sites-relevance). After we made all the searches,
collected the data, analyzed it, drew the conclusions, and started
writing the article, we realized that the resulting article would
be too long and carry too much data, and would therefore be
difficult to read. We decided to keep the complete data for our
internal use and to shorten the article to a manageable size while
maintaining the conclusions.
Here are the results:
1) The keyword "direct
current" was chosen because we were surprised to find it as
an Open Directory category. Although the keyword may look highly
relevant to the power electronics industry, we consider it the
wrong keyword to target, and definitely improper as a category
in a directory.
- AltaVista - highly relevant-1, some relevance-0, irrelevant-8, spam-1
- AOL - highly relevant-2, some relevance-0, irrelevant-3, spam-1
- Excite - highly relevant-5, some relevance-0, irrelevant-5, spam-1
- Google - highly relevant-9, some relevance-0, irrelevant-1, spam-0
- Lycos - highly relevant-2, some relevance-0, irrelevant-8, spam-3
- MSN - highly relevant-1, some relevance-0, irrelevant-13, spam-1
- OpenDirectory - highly relevant-2, some relevance-2, irrelevant-21, spam-0
- Yahoo! categories - No results! Not bad at all, actually! Better nothing than irrelevant.
- Yahoo! Web pages - highly relevant-18, some relevance-1, irrelevant-1, spam-1
- Yahoo! Web Sites-category - highly relevant-2, some relevance-0, irrelevant-18, spam-0
- Yahoo! Web Sites-relevance - highly relevant-2, some relevance-0, irrelevant-18, spam-0
- Best search results: Yahoo! Web Pages and Google
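The tallies for "direct current" can be compared on a single number, the share of results judged highly relevant. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the "highly relevant", "some relevance", and "irrelevant" counts together cover all results examined, with spam tallied separately. The article does not state how the four counts relate, so a different reading would give different percentages.

```python
# Tallies copied from the "direct current" list above, in the order:
# (highly relevant, some relevance, irrelevant, spam).
# Yahoo! categories is omitted because it returned no results.
TALLIES = {
    "AltaVista": (1, 0, 8, 1),
    "AOL": (2, 0, 3, 1),
    "Excite": (5, 0, 5, 1),
    "Google": (9, 0, 1, 0),
    "Lycos": (2, 0, 8, 3),
    "MSN": (1, 0, 13, 1),
    "OpenDirectory": (2, 2, 21, 0),
    "Yahoo! Web pages": (18, 1, 1, 1),
    "Yahoo! Web Sites-category": (2, 0, 18, 0),
    "Yahoo! Web Sites-relevance": (2, 0, 18, 0),
}


def highly_relevant_share(tally):
    """Return the fraction of rated results judged highly relevant.

    Assumption (not stated in the article): the first three counts
    cover all rated results, and spam is a separate tally.
    """
    high, some, irrelevant, _spam = tally
    rated = high + some + irrelevant
    return high / rated if rated else 0.0


def rank_engines(tallies):
    """Return engine names sorted from highest to lowest share."""
    return sorted(
        tallies,
        key=lambda name: highly_relevant_share(tallies[name]),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_engines(TALLIES):
        share = highly_relevant_share(TALLIES[name])
        print(f"{name:28s} {share:.0%}")
```

Under this assumption, Google and Yahoo! Web pages tie at the top of the ranking, which agrees with the "best search results" line above.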
2) The keyword "power
supplies" was chosen because we consider it to be
highly relevant to the industry in which we do business.
It is a keyword for which we are working hard to earn top
positions in search engines by improving the quality of the site
and by advertising. At this time, our assessment, based on the
content of our website, is that we should be on average in the top
5 positions on all major search engines. Our goal is to earn a
top 3 position.
- AltaVista - highly relevant-8, some relevance-3, irrelevant-1, spam-5
- AOL - highly relevant-5, some relevance-0, irrelevant-0, spam-3
- Excite - highly relevant-8, some relevance-2, irrelevant-0, spam-3
- Google - highly relevant-10, some relevance-0, irrelevant-0, spam-1
- Lycos - highly relevant-10, some relevance-0, irrelevant-1, spam-8
- MSN - highly relevant-5, some relevance-6, irrelevant-3, spam-1
- OpenDirectory - highly relevant-20, some relevance-4, irrelevant-0, spam-1
- Yahoo! categories - highly relevant-1, some relevance-18, irrelevant-1, spam-4
- Yahoo! Web pages - highly relevant-20, some relevance-0, irrelevant-0, spam-4
- Yahoo! Web Sites-category - highly relevant-10, some relevance-6, irrelevant-3, spam-5
- Yahoo! Web Sites-relevance - highly relevant-10, some relevance-6, irrelevant-3, spam-5
- Best search results: Yahoo! Web Pages and Google
3) The keyword "PSpice"
was chosen because we consider it to be relevant and very
important, but somewhat neglected in the industry in which we do
business. At this time, our assessment, based on the content of
our website, is that we should be on average in the top 10
positions on all major search engines. Our goal is to earn a top 5
position. Both our present assessment and our future estimate
are based on the fact that, although we address a relatively
narrow aspect of Spice-PSpice simulation, we have very
interesting models and templates for simulating an entire power
supply, with unique and spectacular results. One example, the
Bode plot simulation for the inner current loop in a PFC, is, to
the best of our knowledge, the first to be published online or on
paper. If there are other articles that we are not aware of,
please send us the URL, and we will place a link on our website.
This is the reason, in our opinion, why reputable universities
have placed a link to our site.
- AltaVista - highly relevant-6, some relevance-1, irrelevant-1, spam-3
- AOL - highly relevant-5, some relevance-0, irrelevant-0, spam-0
- Excite - highly relevant-8, some relevance-2, irrelevant-0, spam-2
- Google - highly relevant-9, some relevance-0, irrelevant-0, spam-0
- Lycos - highly relevant-14, some relevance-0, irrelevant-0, spam-5
- MSN - highly relevant-8, some relevance-3, irrelevant-0, spam-4
- OpenDirectory - highly relevant-6, some relevance-2, irrelevant-0, spam-1
- Yahoo! categories - No results! Not bad at all, actually! Better nothing than irrelevant.
- Yahoo! Web pages - highly relevant-17, some relevance-3, irrelevant-0, spam-0
- Yahoo! Web Sites-category - highly relevant-3, some relevance-0, irrelevant-0, spam-0
- Yahoo! Web Sites-relevance - highly relevant-3, some relevance-0, irrelevant-0, spam-0
- Best search results: all search engines
4) We also chose
to query the search engines for the name of our Director of
Operations. The reason was to evaluate the real search capability
of a search engine within a page or site it has already indexed.
We estimated that the combination of the first and last name is
likely to be unique.
- AltaVista - Cannot find the page containing the keyword (site already indexed)
- AOL - Cannot find the page containing the keyword (site already indexed)
- Excite - Cannot find the page containing the keyword (site already indexed)
- Google - Cannot find the page containing the keyword (site already indexed)
- Lycos - Correctly identified the site AND the page containing the keyword
- MSN - Correctly identified the site containing the keyword
- OpenDirectory - Not indexed
- Yahoo! categories - No results
- Yahoo! Web pages - Cannot find the page containing the keyword (site already indexed)
- Yahoo! Web Sites-category - No results
- Yahoo! Web Sites-relevance - No results
- Best search results: Lycos and MSN
Some preliminary statements, assumptions, and definitions:
- We did not count clearly identifiable ads as search
results, such as those placed to the left or right of the
actual search results. The same applies to ads placed above
or below the results, if there was no possibility of
mistaking them for search results.
- We used the word "spam"
to characterize results similar to: "click here
for hot deals on alternating current", "bid now
for power electronics", etc.
- For the search results
characterized as "highly relevant", we did not
question the actual position of a site in the list.
Some conclusions:
- Google and Yahoo!
Web Pages gave the most satisfying overall results.
- Confusing words, such as
"direct current", confused most search engines and
directories, but not Google or Yahoo! Web Pages. This suggests
that Google and Yahoo! Web Pages have the best search
algorithms with respect to relevance. Open Directory, which
created the "Direct Current" category, returned some of
the most irrelevant results for those words.
- The unique but well-known
word PSpice did not confuse any search engine. All
performed unexpectedly well.
- The results were quite surprising
when we searched for the name of our Director of
Operations. The name is on a page on our website. Only
Lycos and MSN were able to provide correct results.
Lycos identified not only the website containing the
name, but also the webpage itself, which normally opens
within a frame! This suggests that Lycos and MSN
perform the most extensive search within the sites
they have indexed. It also suggests that other search
engines, with big databases, may not perform a full
search of their database. This may be both a strength and
a weakness. By performing a selective search, they
can handle a bigger database and provide the results
faster. But they may also skip sites already indexed, or
miss words, when somebody is looking for a unique piece
of information or searching on a very narrow
subject.
- Only the following search
engines are worth considering (at this time) by
someone looking for information on the Internet:
AltaVista, AOL, Excite, Google, Lycos, MSN, Yahoo! Web
Pages. That is already 7 search engines. Time is very
precious for everybody, and nobody wants to waste it. It is
also unrealistic to believe that other "search
engines" may give you better results that are not
already found in at least one of the above-mentioned
search engines.
- A final word about meta
search engines, those that search other search
engines. They cannot give you better results than
ALL of the above-mentioned search engines combined. What
they can do is "average" the results, so
that if a search engine is "customizing" its
database to fit its objectives (and therefore becoming
less objective), the results of a meta search engine
could be more objective. Also, for a meta search
engine to be worth considering, it should provide, much
like a "distributor", some sort of "value
added service". We found one that we recommend:
Vivisimo. It organizes the search results
and gives the visitor an opportunity to find more
sites related to a keyword within a particular category.
A meta search engine that we valued in the past, Ask/Ask
Jeeves, is becoming less and less objective. It suggests
that its results are what "people have found helpful",
when in fact they are paid results, identified as such on the
real search engine used by Ask Jeeves.
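The "averaging" idea above can be illustrated with a simple rank-merging scheme. The sketch below uses a Borda count, which is our choice for illustration only. The article does not say how any real meta search engine combines results, and the engine names, site names, and function name here are all hypothetical.

```python
from collections import defaultdict


def borda_merge(rankings):
    """Merge several ranked result lists into one list.

    Each list awards len(list) points to its first result, one point
    fewer to the next, and so on. Results missing from a list get no
    points from it. Ties are broken alphabetically so the output is
    deterministic.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        size = len(ranking)
        for position, site in enumerate(ranking):
            scores[site] += size - position
    return sorted(scores, key=lambda site: (-scores[site], site))


# Hypothetical results from three engines for the same query.
# engine_c places a paid listing first.
engine_results = {
    "engine_a": ["site1.example", "site2.example", "site3.example"],
    "engine_b": ["site2.example", "site1.example", "site4.example"],
    "engine_c": ["paid.example", "site2.example", "site1.example"],
}

merged = borda_merge(engine_results.values())

if __name__ == "__main__":
    for rank, site in enumerate(merged, start=1):
        print(rank, site)
```

In this made-up example, the listing that only one engine ranks first falls behind the two sites that all three engines return, which is the sense in which a merged list could be more objective than any single engine.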
SMPS Intellectual Property
- This article contains
information for which SMPS Power Supplies and its
partners may claim Copyright and/or Trademark rights, and
which may be the subject of a Patent application. SMPS Power
Supplies and its partners may also claim the status of "First
to be published" relative to ideas published in
this article. Reasonable parts of this article may be
quoted by any third party without contacting us,
provided that the source is clearly identified and a link
to the full article is included. If you wish to
incorporate information from this article within a
commercial product, you should contact us for permission.
- First Revision: 28 Jul 2001
- Web first published: 5 Aug 2001
- Last Revision: 5 Aug 2001
Comments and suggestions are welcome and encouraged!
Copyright ©
1998-2001 SMPS Power Supplies, Inc. All rights reserved.
Other brand,
product names and words may be trademarks or registered
trademarks of the respective companies. If considered necessary,
a company whose name/trademark is mentioned in this article may
contact us by e-mail, and we will add a special trademark/copyright
notice, linking the company name with the words in question.