Finding Web Metaphors

A Proposed Corpus Search for Metaphors of the World Wide Web

A Research Proposal, by Zach Tomaszewski

for CIS 703, Fall 2003, taught by Dr. Elizabeth Davidson

Introduction
Literature Review
Proposed Study
- General Corpus
- Specific Corpus
Contributions and Future Research
Works Cited

Introduction

If our conceptual system is largely metaphorical--that is, if it is normal for us to actually think and talk about one thing in terms of another (Lakoff 1980)--it seems likely that our understanding of the World Wide Web will also be metaphorical. The metaphors we use to think about the Web will shape our interactions with it--how we value it, how we legislate it, how we describe it, and how we use it. If we knew what Web metaphors we were using, we would be able to extend them where appropriate (as in design) or supplement them where they fail to portray a complete picture (as in legal cases).

The goal of this proposed research project is to empirically determine what metaphors we use to describe the Web. To do this, a corpus of literature about the Web will be electronically searched for instances of Web metaphors. This will pave the way for further research dealing with specific metaphors and the contexts of their use.

Literature Review

Conceptual Metaphor

I would first like to review what I mean by metaphor. I am referring to conceptual metaphors, as described by George Lakoff and Mark Johnson in Metaphors We Live By (1980) (as summarized in Tomaszewski 2003).

Lakoff and Johnson postulate that our conceptual system, as evidenced by the language we use, is largely comprised of conceptual metaphors. A metaphor is not simply a figure of speech, but a systematic, partial structuring of one concept in terms of another. These metaphors form a vast but coherent system that shapes our understanding of the world. These metaphors frequently go unnoticed as they are such an integral part of our thinking.

Metaphors are only partial mappings between domains, since a complete mapping would mean that the source and target domain are synonymous. And so, a single metaphor cannot highlight or help us understand all the elements of the target domain. Thus, we frequently have a number of metaphors from different source domains to describe a single target domain. These metaphors will generally overlap in a logical and coherent way, but rarely form a single, consistent image.

Lakoff and Johnson posit that some conceptual metaphors are more basic than others. This is because, ultimately, our conceptual structure is grounded in our interactions with the world. And it seems that experiences of physical processes and objects are more clearly delineated than those experiences of emotions, organizations, or intellectual abstractions. Thus, we use these more basic physical metaphors to understand more abstract concepts.

Our conceptual system of metaphors determines our understanding of new and existing concepts. This understanding in turn determines how we relate to the world, and what we understand to be true. The power of Lakoff and Johnson's findings is that, once we realize we are using conceptual metaphors, we are no longer unwittingly bound by them. We begin to see them everywhere, and are alerted to what features they highlight and what features they hide.

Metaphor in Design

If we think of a metaphor in this way--as a conceptual mapping or model--we see that metaphors play a vital role in interface design.

Donald Norman (1988) describes good interfaces as successfully transferring the system designer's model of the system to the user. This can be done with metaphor. However, the mappings need to be coherent. That is, constraints and affordances of the metaphorical model need to accurately portray how the system really works.

As an example, one pervasive system metaphor in computing is that a COMPUTER IS A DESKTOP WORKSPACE. Tim Rohrer (1995) explores how this metaphor can fail to properly convey the use of the system--primarily, how MacOS's use of the Trash icon to eject a floppy disk is inconsistent with what users expect the Trash to do. Though trash cans are part of a desktop environment, they are used to delete files; users, even those familiar with the system, frequently admit to deep unease in dropping their floppy into the trash. While it could be argued that throwing office documents into the trash is simply a way of removing them from the office environment, the affordances are still not quite right. Trash cans afford destructive removal of documents, not off-site storage. Rohrer suggests that simply having the trash can icon morph into a disk storage box when a floppy is dragged over it would probably help greatly to put users at ease.

Rohrer (1995) also points out that, though designers may pick an initial system metaphor, users can shape or extend this metaphor. An example of this is a computer's clipboard. Traditionally, computer clipboards hold only the last item cut or copied. Yet a real world clipboard can hold a large number of documents or notes. One merely has to flip back through the pages to find a formerly added document. To mend this gap in the mapping for MacOS, the shareware program Scrapbook was written. It records the previous contents of the computer's clipboard. As Rohrer (1995) states:

The point of this example is surprising: sometimes users feel an inconsistency in a metaphor and design a solution without any input from the system designers. But if the users can feel how the user interface should work even better than the designers, the traditional picture of metaphor as the mere conduit of the system model from designer's mind to the user's mind is inadequate.

Rohrer goes on to claim that the notion of a disembodied user is also inadequate. An abstract system model is not transferred from the mind of the designer to the mind of the user, using metaphor only as a vehicle that is discarded once the transfer has been made. (Lawler (1999) makes a very similar point about the words we use to convey our semantic frames.) Instead, as Lakoff and Johnson claim, metaphors are the result of our embodied human experience. As such, they remain a vital, internalized means of structuring our understanding of a system. They do more than simply convey single abstract concepts. They tap into "patterns of feelings" (Rohrer 1995) and connect each concept to a web of others (Lakoff 1980).

As with all metaphors, interface or system design metaphors can also hide or downplay features. One such example is how we think of computer files. They are not simply documents, as most files in a traditional file cabinet are. Many files are executable programs, and most are not even human-readable (Lawler 1999). Deleting a file from a computer (and then emptying the trash) does not completely remove it from the system. It merely tells the computer it can now overwrite that section of the hard disk. With the appropriate tools, the files can still be recovered, possibly to the chagrin of the user (Gold 1997). Nor is a computer file stored as a single contiguous package, as can be seen by anyone who defragments a disk drive (Gold 1997). If a file exists in only one place at a time, as a physical book or document does, it implies that a hierarchical system of filing is required. But this is certainly not the only way computerized data need be stored (Nelson 1998). Yet all these features of actual computer files are downplayed or even hidden by the COMPUTER DATA IS A FILE IN A FILING CABINET metaphor.

And so we see that metaphors play a vital role in how we understand the world. These metaphors shape the design of systems, and they are the medium for communicating those systems to users. As always, this involves some feature hiding and incomplete mappings between domains. But the metaphors we use continue to influence our understanding of a system, even after we become familiar with the workings of that system. In order to understand how people think about the Web, it seems a prerequisite to determine what metaphors they are using.

Studies of Web Metaphors

There has been much discussion and analysis of selected Web metaphors. In the past, I composed a partial list of Web metaphors based on a number of these discussions (Tomaszewski 2002). This list of metaphors includes understanding the Web as a highway, tidal wave, web/net, library, marketplace, village square, ocean, cyberspace, frontier, and layered stack of protocols. More generally, in terms of use, it can be seen as a collection of documents or a built space (or possibly a database). However, none of these discussions or lists were empirically derived.

There are far fewer studies that actually utilize Web metaphors. Of those studies, most include only one or two isolated Web metaphors or only a list of metaphors the authors can think of.

For example, Smilowitz (1994?) looked at labeling browsers based on different metaphors for the Web. She discovered that using a metaphor did affect user performance, that a library metaphor was more effective than a travel metaphor, and that using overlapping or composite metaphors was less helpful than using only one metaphor at a time. However, she looked at only two metaphors.

Palmquist (2001), in a study on how cognitive style affects Web metaphor choice, asked subjects to choose a favorite metaphor from the following list of general metaphors: outer space, highway, frontier, waterscape, political space, marketplace, social space, living organism, other. (The original study allowed users to create their own metaphor for "other" and included sample keywords for each metaphor.) While this is a broad list, it misses some important metaphors, such as the Web as a net/web, a collection of documents, or as a database. Indeed, the researchers found past database experience had high correlations with successful Web use. Perhaps this metaphor should have been included in their study.

It seems a more complete list of metaphors would aid in both discussion and evaluations of Web metaphors in general, as well as allow for more informed empirical research.

Proposed Study

I believe that a broad, empirical search is needed to document the conventional conceptual metaphors used to describe the Web. To date, most metaphor identification has relied on introspection, rather than any empirical review, and has failed to approach anything like a comprehensive search. I believe the best method for this sort of search would be an archival review of selected corpora. This is the most effective way to review metaphors from a large number of utterances from a variety of contexts.

By Web, I mean the collection of HTML and other digital documents written for human viewing and made publicly available through the Internet via HTTP. Many Internet metaphors, being more general, are likely to include and describe the Web. However, I am not here examining metaphors used to describe other specific aspects of the Internet, such as email, FTP, peer-to-peer networks, telnet, etc.

General Corpus

A review of Web metaphors should begin with a general, domain-neutral corpus. One such corpus is the Bank of English (Deignan 1999), which contains over 450 million words of text collected from newspapers, advertisements, magazines, fiction, non-fiction, and conversations. The majority of the content was created since 1990. (The majority of this text is also British-English, which may be misleading if metaphors are culture-specific. However, it is not expected that this should significantly affect the study since additional corpora will be used, and our primary interest here is metaphor existence, rather than frequency.)

The corpus should then be queried for sentences (citations) containing target domain keywords and their various word forms (Martin 1994; Deignan 1999; Izwaini 2003). In cases where this results in over 1000 citations, random sampling will be used to produce a set of 1000 (Deignan 1999). These sentences can then be analyzed by a human reader to determine whether the keyword is being used literally, as part of a conventionalized metaphor, as part of a novel metaphor, or in a homonymous word sense (Martin 1994). In this way, metaphors (that is, source domains) used to describe the Web (as operationalized in terms of the searched keywords) can be determined.

Some sample keywords from the target domain of the Web include: Internet, Web, site, page, link, URL, HTTP, and HTML. More keywords should be generated before the study, if possible. It is important to note how difficult it is to determine literal Web keywords, since metaphor is so pervasive. Even in this short list of terms used to "literally" discuss the Web, we have included a space metaphor (site), a document metaphor (page), and a web metaphor (Web, link).
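
The query-and-sample step described above might be sketched as follows. This is only an illustration under stated assumptions: the corpus is assumed to be already split into sentences, the function name find_citations is hypothetical, and an optional plural "s" is a crude stand-in for proper word-form matching. Real corpus tools such as those provided with the Bank of English would handle word forms differently.

```python
import random
import re

# Sample target-domain keywords taken from the list above.
KEYWORDS = ["Internet", "Web", "site", "page", "link", "URL", "HTTP", "HTML"]


def find_citations(sentences, keywords, max_citations=1000, seed=0):
    """Return sentences containing any keyword, sampled down to max_citations.

    Assumption: `sentences` is an iterable of plain-text sentences.
    """
    # Whole-word, case-insensitive match; the optional "s" only covers
    # simple plurals and is not a full treatment of word forms.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")s?\b",
        re.IGNORECASE,
    )
    hits = [s for s in sentences if pattern.search(s)]
    if len(hits) > max_citations:
        # A fixed seed keeps the random sample reproducible.
        hits = random.Random(seed).sample(hits, max_citations)
    return hits
```

The returned citations would still need to be read by a human, since the search itself cannot tell a literal use from a metaphorical or homonymous one.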

The frequency with which a particular source domain keyword is used metaphorically in the language has often been determined by counting the frequency of its metaphorical use in a large sample of sentences (Martin 1994; Deignan 1999; Izwaini 2003). For example, the phrase "death blow" occurs 83 times in the Bank of English, but only one of these occurrences is literal (Deignan 1999). However, for this study, I do not believe this is particularly useful information compared to simply determining the existence of the metaphors.

On the other hand, I think it would be beneficial to know the frequency of the metaphors (as opposed to the metaphorical use of the keywords) in the language, especially when compared to each other. For example, do people use a document or spatial metaphor more often when describing the Web? However, I do not believe this information can be accurately determined from keyword searching. This is because single keywords do not form an accurate measure of metaphor frequency. That is, simply because follow occurs metaphorically less frequently than browse doesn't necessarily mean that the SPACE metaphor is less frequent than the DOCUMENT metaphor. Both metaphors include a number of different keywords. Thus any measure of metaphor frequency--especially when compared to other metaphors--would have to include a composite of all possible keywords to have any solid accuracy. Determining all possible keywords is likely impossible.

This brings us to another limitation of this study. As with any electronic text retrieval, success depends on determining the correct keywords. Besides the aforementioned difficulty of picking clear target domain keywords, there are issues of recall and precision. In a study like this--aiming to be a very expansive search--recall of as many metaphors as possible is important. Thus, as many target domain keywords should be used as possible, including their numerous word variations and sentence contexts. However, broadening the search in this way will likely lower precision, resulting in many more non-metaphorical results in the returned citations.
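
The trade-off between recall and precision can be illustrated with a small sketch. The function name and the toy numbers below are hypothetical; in an actual corpus study the full set of relevant (metaphorical) citations is not known in advance, so recall in particular could only be estimated.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved citation ids.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Toy example: citations 1-6 are metaphorical (hypothetical data).
relevant = {1, 2, 3, 4, 5, 6}
narrow_search = {1, 2, 3}            # few keywords: precise but incomplete
broad_search = set(range(1, 13))     # many keywords: complete but noisy

narrow = precision_recall(narrow_search, relevant)
broad = precision_recall(broad_search, relevant)
```

In this toy case the narrow search finds only half of the metaphorical citations but returns no false positives, while the broad search finds all of them at the cost of half its results being false positives.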

This issue of many false positive results is a concern because recognition of a metaphor--especially in the case of novel metaphors, infrequent conventional metaphors, or very conventionalized metaphors--relies on an attentive and insightful human researcher. Certainly, a long list of false positive matches is likely to weary anybody. In an attempt to offset any lapses in attention, it is recommended that there be more than one human reader.

Another question of validity is the corpus used. While the Bank of English is large and general, it may contain few Web metaphors. This would not make for an expansive search. This issue is dealt with below.

Related to the choice of a corpus is the question of metaphor authorship. Most corpora are of published or at least printed text dealing with the Web (though some do include conversations as well). It seems likely that most authors writing about the Web will be more familiar with its use than beginning users. Yet experienced Web users may use different metaphors than beginning users. Whether metaphors differ with experience will need to be confirmed or falsified at another time through a slightly different method (see "Contributions and Future Research" below).

Specific Corpus

In the hope of offsetting the issue of whether a particular corpus is really fitting, more than one corpus should be examined. Since there may be metaphors specific to a particular group (such as frequent Web users or Web designers), a corpus where Web metaphors are generally more frequent should be examined.

Such a Web or Internet-related corpus may well already exist. But if it does not, it could be constructed in the following way: do a Google search for "Web". From this returned set of documents (over 269 million), a sample can be taken. Since Google ranks pages by link popularity ("Google Technology" 2003), the returned results are not randomly distributed. Yet popular pages would likely have greater influence on the language than less travelled pages. I would recommend clustering: a sample of the first n returned pages, and then a random sampling of 3n to 5n pages from the remaining results. The advantage of constructing a corpus of Web pages in this manner is discussed below in "Contributions and Future Research".
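
The clustered sampling suggested above might look something like the following sketch. It assumes the ranked search results are already available as a list of URLs (how they would be retrieved is not addressed here), and the function name and the multiplier parameter are hypothetical. The multiplier reflects a reading of the recommendation as "between three and five times n" random pages.

```python
import random


def cluster_sample(ranked_urls, n, multiplier=3, seed=0):
    """Take the top n ranked results plus multiplier * n random others.

    Assumption: `ranked_urls` is ordered from most to least popular.
    Returns (url, selection_method) pairs.
    """
    top = list(ranked_urls[:n])
    rest = list(ranked_urls[n:])
    # Never ask for more random pages than actually remain.
    k = min(multiplier * n, len(rest))
    tail = random.Random(seed).sample(rest, k)
    # Record how each page was selected so the two sets can be told apart.
    return [(url, "top") for url in top] + [(url, "random") for url in tail]
```

For example, with n = 50 and a multiplier of 3, the corpus would contain the 50 highest-ranked pages and 150 pages drawn at random from the rest of the results.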

Using a topic-specific corpus, the same methods should be applied as for the general corpus to determine if further metaphors can be found.

Contributions and Future Research

The primary contribution of this study is to provide a firm foundation for further research by clearly establishing what Web metaphors do exist. This information has a number of uses.

First of all, the metaphors themselves are of interest. There is already a project underway to catalog the diverse metaphors we use (Lakoff 1994). The Web metaphors can be examined for relationships between them. For instance, the FRONTIER and TOWN SQUARE metaphors seem to be sub-classes of the more general SPACE metaphor. Additionally, they can be related to more general metaphor categories. Indeed, some people believe a complete bank of high-level generic English metaphors would number only about 500 metaphors (Martin 1994).

Secondly, a clear analysis of the implications and aspect-hiding of each metaphor would be a great aid in legal issues. The legal ramifications are very different if the Web is likened to a street corner (a public forum where free speech is protected), a shopping mall (where commercial zoning plays a role), a telephone (where privacy is protected), or television (where content is strictly regulated) (Gold 1997).

Thirdly, the metaphors could be extended by Web designers. (This is where my personal interest lies.) If THE WEB IS A BUILT SPACE, then perhaps elements of architecture and techniques of wayfinding can be metaphorically mapped to information structure and Web site navigation (Tomaszewski 2002).

But perhaps before any of these other goals are pursued, it would be helpful to take a closer look at Web metaphors in corpora. In this study, we have merely laid the necessary foundation of a broad look at the area. But with certain constructed corpora as described above in "Specific Corpus", further details of specific metaphors could be determined.

For instance, the advantage of constructing a corpus of Web pages is that details about the particular pages can be coded. For example, those pages selected due to their high page rank and those randomly sampled from the returned results should be marked as such. Then the frequency of a particular metaphor could be compared between the two sets. Additionally, a human cataloger could classify pages by year, and we could then see if certain metaphors are gaining or losing popularity over time. Other possibilities include comparing metaphors between publication media (newsgroups, email, Web pages) or depending on the intended audience's experience level or culture.

Earlier I mentioned that the single keywords used to find metaphors in a corpus do not themselves provide an accurate measure of their metaphor's frequency. I should clarify that they are not an accurate measure of frequency that can be compared between metaphors. That is, based on keywords, we cannot confidently say that the SPACE metaphor is more prevalent than the DOCUMENT metaphor. However, if we are measuring the frequency of the same metaphor in different corpora or contexts, then keywords (especially a composite measure of a number of keywords per metaphor) could indeed give us a good (though not perfect) measure of frequency. That is, if we measure the frequency of DOCUMENT metaphor keywords in sample works from a number of different years, we may learn whether the metaphor is becoming more or less frequent.
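
A composite measure of this kind could be computed along the following lines. This is a rough sketch, not a definitive method: it assumes the citations have already been coded by a human reader as (group, keyword, is_metaphorical) records, that the word count of each group's sample is known, and that the citations were not randomly subsampled. The function name, the keyword set, and the data are all hypothetical.

```python
from collections import Counter


def composite_rate(coded_citations, words_per_group, keywords, per=1_000_000):
    """Metaphorical uses of any keyword in `keywords` per `per` words, by group.

    coded_citations: iterable of (group, keyword, is_metaphorical) records.
    words_per_group: mapping of group (e.g. a year) to its sample's word count.
    """
    counts = Counter(
        group
        for group, keyword, is_metaphorical in coded_citations
        if is_metaphorical and keyword in keywords
    )
    return {
        group: counts[group] * per / total_words
        for group, total_words in words_per_group.items()
    }


# Hypothetical keywords for the DOCUMENT metaphor and hypothetical coded data.
DOCUMENT_KEYWORDS = {"page", "browse", "bookmark"}
coded = [
    (2001, "page", True), (2001, "browse", True), (2001, "page", False),
    (2003, "page", True), (2003, "browse", True), (2003, "bookmark", True),
    (2003, "site", True),  # a SPACE keyword, so not counted here
]
rates = composite_rate(coded, {2001: 2_000_000, 2003: 1_000_000}, DOCUMENT_KEYWORDS)
```

Normalizing by the size of each sample is what makes the rates for the same metaphor comparable across years or corpora; the rates for different metaphors remain incomparable for the reasons given above.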

Though corpus research is a great aid in metaphor research, it is not the only method. For certain questions--such as how people acquire new metaphors or how those metaphors affect their use of the Web--other, more experimental techniques should be pursued.

Yet it does seem that the exploratory research project proposed here is needed. It would provide a strong initial basis for further work in Web design, metaphor analysis, and extended Web metaphor research.

Works Cited

Deignan, Alice. "Linguistic Metaphors and Collocation in Nonliterary Corpus Data." Metaphor & Symbol (1999): 14:1, p19.

"Google Technology" <http://www.google.com/technology/index.html> Last edited: 2003. Accessed: 08 Dec 2003.

Gold, David. "You Can't Surf a Sine Wave: Metaphors and the future of the Internet." <http://ccwf.cc.utexas.edu/~dgold/metaphor.project/metaphor.html> Last Edited: May 1, 1997.

Izwaini, Sattar. "A Corpus-based Study of Metaphor in Information Technology." A corpus linguistics workshop presentation, Birmingham?, England. <http://www.cs.bham.ac.uk/~mgl/cl2003/papers/izwaini.pdf> Last Edited: 2003.

Lakoff, George and Mark Johnson. Metaphors We Live By. Chicago: University of Chicago Press, 1980.

Lawler, John M. "Metaphors We Compute By." <http://www-personal.umich.edu/~jlawler/meta4compute.html> Last Edited: 1999. Accessed: 07 Dec 2003.

Lakoff, George. "Conceptual Metaphor Home Page." <http://cogsci.berkeley.edu/> Last Edited: 22 Mar 1994.

Martin, James H. "Metabank: A Knowledge-Base of Metaphoric Language Conventions." Computational Intelligence (1994): 10:2. <http://www.cs.colorado.edu/~martin/ci.ps>

Martin, James H. "A Corpus-Based Analysis of Context Effects on Metaphor Comprehension." <http://www.cs.colorado.edu/~martin/rat-met-cogsci.ps> Last edited: 1995? Accessed: 07 Dec 2003.

Nelson, Ted. "What's On My Mind." <http://www.xanadu.com.au/ted/zigzag/xybrap.html> Last Edited: 25 June 1998. (See also <http://xanadu.com/zigzag/>)

Norman, Donald. Psychology of Everyday Things. New York: Basic Books, 1988.

Rohrer, Tim. "Feelings Stuck in a GUI Web: Metaphors, image-schemas, and designing the human computer interface, or Metaphors We Compute by: Bringing magic into interface design." <http://philosophy.uoregon.edu/metaphor/gui4web.htm> Last Edited: 1995. Accessed: 06 Dec 2003.

Smilowitz, Elissa D. "Do Metaphors Make Web Browsers Easier to Use?" <http://www.baddesigns.com/mswebcnf.htm> Last Edited: 1994? Accessed: 07 Dec 2003.

Tomaszewski, Zach. "Conceptual Metaphors of the World Wide Web." <http://www2.hawaii.edu/~ztomasze/ling440/webmetaphors.html> Last Edited: 15 May 2002.

Tomaszewski, Zach. "Communication through Architecture." <http://www2.hawaii.edu/~ztomasze/cis701/project.html> Last Edited: 15 May 2003.

Tomaszewski, Zach. "Lakoff and Johnson's Metaphors We Live By." <http://www2.hawaii.edu/~ztomasze/cis701/paper6.html> Last Edited: 24 Apr 2003.