Internet Searching

Assignment 5, by Zach Tomaszewski

for ICS 691-3, Fall 2001, taught by Dr. Joan Nordbotten


I searched for mammals that live in a marine environment. My exact query was: (marine OR ocean) AND (mammal OR mammals). My initial expectations were low, since what I really wanted to retrieve was species-specific information. Since search engines are generally best with keyword-matching and known-item searches, I really should be running the search for each species or animal type that interests me. But of course I need to determine what species qualify as "marine mammals" before I can do that.
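
To make the logic of that query concrete, here is a minimal sketch of how a Boolean query of this shape could be checked against a piece of text. It is not how either search engine works internally. The function names and the sample titles are hypothetical and were invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return the set of alphabetic words in it."""
    return set(re.findall(r"[a-z]+", text.lower()))

def matches_query(text):
    """Return True if text satisfies (marine OR ocean) AND (mammal OR mammals)."""
    words = tokenize(text)
    habitat = bool(words & {"marine", "ocean"})    # first OR group
    animal = bool(words & {"mammal", "mammals"})   # second OR group
    return habitat and animal                      # the AND between the groups

# Hypothetical page titles, invented for illustration only.
titles = [
    "Marine Mammals of the Pacific",
    "Ocean Currents and Climate",
    "Mammal Species of the Rainforest",
    "Polar Bears: An Ocean Mammal?",
]
matching = [t for t in titles if matches_query(t)]
```

Only the first and last titles satisfy both OR groups, so only those two end up in matching. Note that a page about seals that never uses the word "mammal" would be missed, which is the limitation of keyword matching described above.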

I searched Google first. [See the first page of results.] I discovered that many people use the phrase "marine mammal." Indeed, the Open Directory (dmoz.org), used by Google, actually has a category for this. So overall, the results were good. The first site on the list was The National Marine Mammal Laboratory. This site had some general information, as well as a number of links to more specific information. I found a great polar bear resource (http://maple.nis.net/~bearwork/polars1/13seawrl/pbindex.html) I can use; I hadn't realized polar bears were considered marine mammals. I had also forgotten about seals and sea lions.

As usual, Google seems to return reputable, official sites first. I understand that the number of links pointing to a site is used in its ranking. This is really helpful, since it favors sites that other site authors considered worth linking to. Also, their behind-the-scenes use of a human-compiled directory of sites (dmoz) surely improves the quality of the sites returned.
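
As a rough illustration of the link-counting idea, the sketch below orders pages by how many other pages link to them. This is only my simplified reading of the idea, not Google's actual ranking method, and the function name, page names, and link graph are all hypothetical.

```python
from collections import Counter

def rank_by_inlinks(link_graph):
    """Order pages by the number of distinct other pages linking to them.

    link_graph maps each page to the list of pages it links to.
    Ties are broken alphabetically so the order is predictable.
    """
    counts = Counter({page: 0 for page in link_graph})
    for source, targets in link_graph.items():
        for target in set(targets):       # count each linking page once
            if target != source:          # ignore links from a page to itself
                counts[target] += 1
    return sorted(counts, key=lambda page: (-counts[page], page))

# Hypothetical link graph, invented for illustration only.
links = {
    "lab": ["bears"],
    "bears": ["lab"],
    "hobby": ["lab", "bears"],
    "blog": ["lab"],
}
ranking = rank_by_inlinks(links)
```

Here "lab" has three inbound links and "bears" has two, so they come first, while the two pages nobody links to fall to the bottom.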

I repeated my search at Northern Light. [See the first page of results.] These results were less relevant, and so less helpful. Their ranking seems much more keyword-based. Certainly my terms appeared in every title returned on the first page, but term count is only one of several document factors that matter. Northern Light does have what they call Custom Search Folders, which can be used to refine a search. Some of the categories produced for my search included Mammals, Marine Mammal Protection Act, Oil Spills, and Marine Biology.
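
To show why term count alone can mislead, here is a small sketch of ranking purely by how often the query terms occur. I do not know how Northern Light actually ranks pages; this is an assumed, deliberately naive scheme, and the function names and sample texts are hypothetical.

```python
import re

QUERY_TERMS = {"marine", "ocean", "mammal", "mammals"}

def term_count(text, terms=QUERY_TERMS):
    """Count how many words in text are query terms (repeats included)."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in terms)

def rank_by_term_count(docs):
    """Order documents from most to fewest query-term occurrences."""
    return sorted(docs, key=lambda doc: -term_count(doc))

# Hypothetical document texts, invented for illustration only.
docs = [
    "The National Marine Mammal Laboratory studies seals",
    "Marine mammal marine mammal ocean mammals for sale",
    "Tide tables for the coast",
]
ranked = rank_by_term_count(docs)
```

The keyword-stuffed second text contains six query terms and so outranks the laboratory page, which contains only two, even though the laboratory page is clearly the more useful result.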

I do not care to offer advice on ranking algorithms or spidering technology. I'm sure the people at these search engines know far more than I do on these topics. The only advice I can proffer is that there needs to be more help in search formulation. Searching the Web is certainly a mixture of both user skill and search engine quality. In my case, what I really wanted was species-specific information. Perhaps I should have at least included the word "species" in order to pull up sites dealing with marine mammal species, rather than just the topic in general. Ideally, I would have browsed enough sites to get a list of animal families or species and then searched for each of these specific categories. Most users are not skilled searchers, however, and have difficulty expanding or refining their searches. Search engines need to promote more search-refining dialog with users. Northern Light's Custom Search Folders are a good step in this direction. Now if this could be combined with Google's highly relevant results, that would be a powerful engine!