Multimedia on the WWW

Assignment 3, by Zach Tomaszewski

for ICS 691-3, Fall 2001, taught by Dr. Joan Nordbotten

First of all, I do not think the Web can be seen as a database, even a "very large, unstructured, but ubiquitous" one. From Nordbotten's ADM draft, a database is "a logically coherent collection of related data, representing some aspect of the real world, designed, built, and populated for some purpose." She also provides the Merriam-Webster Dictionary definition: "A usually large collection of data organized especially for rapid search and retrieval, as by a computer." Well-defined structure, content, and borders are essential for a functional database. The Web as a whole is not logically structured, its data are neither interrelated nor confined to a limited aspect of the real world, and its purposes are as diverse as its users. Nor was it organized for rapid search and retrieval.

It is true that the Web consists of data--bits, characters, and symbols--that can be queried and retrieved. It is also true that every file (object?) on the Web has a unique identifier: its URL. Although the Web is not a database, we try to use these two characteristics to retrieve information as well as possible.

Some search engines, such as AltaVista, are already experimenting with image searching. Such technologies likely assume that images are related to their surrounding text. The image filenames may also be an indication of the image's content. Image recognition is unlikely to be useful on such a large scale anytime in the near future because of its computational overhead.

Restricting results by the date of publication can also be difficult. Frequently, the date the web page was last updated can be used, but this does not necessarily correspond to the date of publication. Likewise, a date found by searching the page text may or may not be the date of publication.
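As a rough illustration of the "last updated" approach, here is a minimal sketch that filters on the HTTP Last-Modified header. The function names and the plain-dictionary representation of the headers are my own assumptions, not how any particular search engine works, and the result is still only the date of last update, not the date of publication.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def last_modified(headers):
    """Return the Last-Modified time as an aware datetime, or None.

    `headers` is assumed to be a plain dict of response headers.
    """
    raw = headers.get("Last-Modified")
    if not raw:
        return None  # many servers simply do not send this header
    try:
        stamp = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None  # header present but not a parseable date
    if stamp.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        stamp = stamp.replace(tzinfo=timezone.utc)
    return stamp


def updated_between(headers, start, end):
    """True if the page reports a last update inside [start, end]."""
    stamp = last_modified(headers)
    return stamp is not None and start <= stamp <= end
```

A page that sends no header, or a malformed one, is excluded by this filter, which is one more reason date restriction tends to be unreliable.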

I believe that metadata may be the best option for increasing search engine performance. However, this needs to be implemented by web publishers, not search engines, and so is likely to be slow in coming. Also, since a majority of current web pages do not even meet basic HTML standards, it is unlikely that metadata implemented by publishers will be perfectly structured or error-free.
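To make the metadata idea concrete, here is a minimal sketch of how an indexer might read publisher-supplied META tags while tolerating sloppy markup. The class and function names are hypothetical, and the choice to keep only the first value for each name is an assumption made for simplicity.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        content = (attr.get("content") or "").strip()
        # Skip incomplete tags rather than failing on them.
        if name and content:
            self.meta.setdefault(name, content)


def extract_meta(html):
    """Return a dict of metadata found in the page, possibly empty."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta
```

For example, extract_meta('<META NAME="Keywords" CONTENT="volcano, lava">') yields a dictionary with the single key "keywords". A tag with a name but no content is silently ignored, which reflects the point above: an indexer has to expect publisher metadata to be incomplete.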

I am certainly glad we have the current search engine technology; without it, we would be completely lost out there. However, I do not think that simply tweaking the current ranking algorithms or spidering more web pages a day is going to dramatically increase retrieval performance, especially when it comes to multimedia objects.

Comment on a posting by Zhen Fan:

>Challenges of searching images in the Web lie in two
>facts. An image itself is an image, practically not
>well documented or well expressed in words. People can
>add HTML tags to images, or use annotations to
>describe them, but the text is not a substitute as the
>image itself. The other challenge is on the user side.
>How can a user use a few words to describe an image he
>needs?

I think Zhen has a good point here about how people actually search for images. They do it by text keywords, and so in order to be found, images need to be indexed by text keywords. Currently, those keywords are best pulled from surrounding text, from the filename of the image, and from the ALT text for the image. In the future, hopefully some sort of additional, structured metadata may be available, though I wonder how long it will take before authors will consistently apply it. Even if the search engine retains thumbnail copies of all the images it has indexed, these are still not the primary way images are matched against queries. (However, when displaying results, a thumbnail image is more helpful than text keywords.)
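As a minimal sketch of this kind of text-based image indexing, the following collects candidate keywords for each image from its ALT text and its filename. The names are hypothetical, and surrounding text is left out to keep the example short; a real indexer would presumably also weight and filter the words.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageKeywordParser(HTMLParser):
    """Collects candidate keywords for each <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []  # one dict per image found

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        alt = attr.get("alt") or ""
        # Filename without directory or extension, e.g. "kilauea_volcano".
        stem = posixpath.splitext(posixpath.basename(urlparse(src).path))[0]
        # Split on anything that is not a letter; drop duplicates.
        words = re.findall(r"[a-z]+", (alt + " " + stem).lower())
        self.images.append({"src": src, "keywords": sorted(set(words))})


def image_keywords(html):
    """Return a list of {"src", "keywords"} dicts, one per image."""
    parser = ImageKeywordParser()
    parser.feed(html)
    return parser.images
```

An image with no ALT text and a filename like "img0042.jpg" yields only the word "img", which shows how little there is to index when authors do not describe their images.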

Submitting an image as a "keyword" may be helpful in some cases. However, I think the image recognition and comparison would carry quite a high overhead. More importantly, this would only be helpful if a user needed more images of the same type they already had. Usually, this is not the case. People search for information or resources they do not already own. So if a person needs a volcano picture, they probably don't already have one to submit as a query. However, perhaps a user could select the images they liked from the results of a text keyword search, and then image recognition could be used to refine the results.

Precision is already rather poor when users need to select the keywords to describe their information need. With multimedia, they need to select the keywords that describe the image that meets their information need. It's going to be tricky to get that to work decently, especially with something as unstructured, dynamic, and unindexed as the WWW.