This dissertation concerns interactive narrative, which is essentially a story that changes in response to its reader. The field of interactive narrative can be found at the intersection of various other related fields, with a number of examples already existing. Yet interactive narrative is a relatively new endeavor; most true examples of it have existed for less than fifty years. The advent of the digital computer--a medium that excels at both processing and interactivity--has greatly increased the incidence of interactive narrative.
Many traditional narrative formats--literature, film, theater--allow for some very basic form of interaction. For example, a reader can control how fast she turns the pages of a novel. A viewer can control the volume of a DVD. An audience member can, through her reactions, impact the actors presenting a drama. But real interactive narrative intends to offer the user more significant impact than this. In particular, users should be able to affect the plot of the unfolding story.
One of the best examples of interactive narrative is roleplaying games (RPGs). These come in two general forms. In table-top roleplaying games--such as the classic Dungeons and Dragons--players sit around a table and describe the action of the story to each other. In live-action roleplaying games--such as White Wolf's Mind's Eye Theatre--players tend to be more active, to the point of acting out at least parts of the action.1
In either form, one player assumes the role of the game master, who acts as a kind of referee. The game master directs the story by describing the details of the game world to the other players and then responding to what the other players choose to do as characters in that world. The flexibility of the story then depends on the creativity of the players and the adaptability of the game master.
The game rules constrain the action somewhat by outlining possible game actions and their effects. Roleplaying systems with a lot of detailed rules tend to become more strategic in nature, like the war-gaming and strategic miniatures that preceded modern roleplaying games. Systems that are particularly light on rules become more free-form and dynamic, producing collaborative storytelling experiences more reminiscent of childhood pretend play.
Roleplaying games are live events, guided in part by a human director. On the other hand, the Choose Your Own Adventure (CYOA) series of books is a good example of an interactive narrative that has been completely encoded into an artifact that is then later read by the user. In this series, each book offers a branching storyline. The reader reads a few pages and is then offered a choice to make. Each choice includes a page number that the reader then turns to if they wish to make that choice. CYOA is the most well-known "brand" of this kind of story, though similarly-structured works exist by other publishers.
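The branching structure described above can be sketched as a small graph of pages. This is only a minimal illustration under stated assumptions, not a transcription of any published book: the page numbers, the passage text, and the names `PAGES`, `read_path`, and `count_endings` are all hypothetical.

```python
# Hypothetical branching storybook: each page holds its text and a list of
# (choice label, target page) pairs. A page with no choices is an ending.
PAGES = {
    1: ("You stand at a cave mouth.", [("Enter the cave", 2), ("Walk away", 3)]),
    2: ("A tunnel forks ahead.", [("Go left", 4), ("Go right", 5)]),
    3: ("You return home. The End.", []),
    4: ("You find the treasure. The End.", []),
    5: ("You fall into a pit. The End.", []),
}


def read_path(pages, start, picks):
    """Follow a sequence of choice indices and return the pages visited."""
    visited = [start]
    current = start
    for pick in picks:
        _text, choices = pages[current]
        if not choices:
            break  # reached an ending before all picks were used
        _label, current = choices[pick]
        visited.append(current)
    return visited


def count_endings(pages):
    """Count the pages that offer no further choices."""
    return sum(1 for _text, choices in pages.values() if not choices)
```

Each call to `read_path` corresponds to one reading of the book: a single path from the first page to one of the endings, selected by the reader's choices.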
A variation on branching storybooks is gamebooks. In a gamebook, the reader picks a number of skills or equipment items for their character at the start of the book. The reader makes choices as in a CYOA, but certain options are available only if the character has a certain skill or item. Gamebooks also tend to involve combat sessions where the reader rolls a die to determine the outcome. Combat success is also affected by the character's skills and current health. In this way, gamebooks are a bit more like table-top roleplaying games, as initial character creation and chance have an impact on the story possibilities.
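The two gamebook mechanisms mentioned here, options gated by skills or items and combat resolved by a die roll, might be sketched as follows. The option list, the character fields, and the combat rule (one six-sided die plus a combat score compared against a difficulty, with one point of health lost on failure) are assumptions invented for illustration; actual gamebook series each define their own rules.

```python
import random

# Hypothetical gamebook passage: each option may require a skill or an item.
OPTIONS = [
    {"label": "Pick the lock", "requires": "lockpicking", "goto": 12},
    {"label": "Break down the door", "requires": None, "goto": 30},
    {"label": "Use the brass key", "requires": "brass key", "goto": 45},
]


def available_options(options, character):
    """Return the options this character qualifies for."""
    owned = set(character["skills"]) | set(character["items"])
    return [o for o in options if o["requires"] is None or o["requires"] in owned]


def combat_round(character, difficulty, rng):
    """Resolve one round of combat under the assumed rule.

    Rolls one six-sided die, adds the character's combat score, and
    compares the total to the difficulty. On failure the character
    loses one point of health.
    """
    roll = rng.randint(1, 6)
    success = roll + character["combat"] >= difficulty
    if not success:
        character["health"] -= 1
    return success
```

Passing the random number generator in as `rng` (for example, `random.Random()`) keeps the chance element explicit and makes a given playthrough reproducible when a seed is supplied.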
Gamebooks also tend to be published as a series. Because of this, each book in the series tends to have only one successful ending--unlike a CYOA book, which usually has multiple possible successful conclusions.
Improv theater also offers an example of interactive narrative. When improv actors are performing together for their own entertainment, each exerts an influence--but not directorial control--over the outcome of the story. Improv actors may also elicit suggestions from an external audience, thereby granting the audience a chance to influence the direction of the story.
Computer games include a number of narrative forms. For instance, interactive fiction games--though frequently puzzle-based--can still offer a simple story structure. Because it is text-based, interactive fiction supports experiments with language and literary form.
A successor to early interactive fiction was the adventure game, as represented by Sierra's King's Quest and Space Quest series. These graphical games also presented a linear story revealed through a series of puzzles to solve. Occasionally, these games would provide multiple possible endings--such as in Dynamix's Rise of the Dragon.
Adventure games eventually evolved into computer roleplaying games, which have a greater focus on character abilities, controlling parties of characters, and combat over puzzles. These games have also tended to be more freeform, providing a large simulated world that can be explored. Where there is a storyline or quests, they can be followed in various orders or completely ignored by the player. Good examples of this are the Elder Scrolls and Grand Theft Auto series of games.
First-person shooter games--such as Doom, Quake, and Half-Life--and their third-person variants--such as Tomb Raider--offer detailed control of a character in a fictional world. However, despite a basic context-providing storyline, there are rarely significant story choices to be made during gameplay itself.
With the introduction of networked gaming and shared virtual spaces, massively multiplayer online roleplaying games (MMORPGs) have introduced a new genre. Here, though presenting a massive simulated world and a number of possible quests, some of the more interesting narratives emerge from interactions between human players and the formation of communities within the virtual space.
Finally, some users claim that detailed simulation games such as The Sims exhibit emergent narrative properties. Although The Sims is not intentionally designed to generate stories, story-like events occur as the autonomous characters follow their various inclinations.
Along with the multimedia capabilities of digital media comes the potential for interactivity. Yet many interactive digital works are so story-focused that they do not seem to quite qualify as games. For example, Romp.com's adult-themed series Jake's Booty Call was a graphical version of a CYOA book. Implemented in Flash, each episode presented a number of animated segments. At the end of each segment, the viewer could select which choice the main character, Jake, should pursue. From a single starting point, each play through an episode revealed a single path through a branching tree of possibilities.
Hypertext fiction also presents a kind of interactive narrative. These works can be completely textual, or include various multimedia. In some hypertext works, the path the reader takes through a series of successive choices determines the underlying story--just as in a CYOA book. As an example of even greater interactivity, some sites allow readers to become writers, extending storylines when they reach the end of a branch. In other hypertext works, the underlying story is static. The links the reader follows simply change the order in which a single story is revealed.
Hypertext fiction is an example of a larger digital literature movement, which uses the processing, multimedia, and interactivity capabilities of the digital medium to produce novel literary works. While digital literature nearly always includes some textual component, not all examples are narratives nor are all interactive.
While not interactive, story generation projects also inform interactive narrative. Both endeavors fall under the larger field of narrative intelligence--building artificial intelligence systems that can either understand or generate new stories. However, an interactive user fundamentally changes the problem. Many story generation systems build complete stories at once in a way that does not tolerate new inputs halfway through the generation process.
There are a number of reasons to create a true interactive narrative system.
The first reason is that doing so requires a degree of automated narrative intelligence. Humans tell stories easily and nearly constantly. And yet--partly due to the large amount of real-world knowledge required--getting a computer to understand or generate stories proves incredibly difficult. Such narrative intelligence currently remains a challenging problem for the artificial intelligence community.
Secondly, interactive narratives are potentially very lucrative. Computer games are already a multi-billion dollar industry. Yet most of the games that purport to be story-based are too linear or restrictive, or else the story serves only as a context for combat or puzzle-solving. If games really were capable of adaptive story production--including believable characters and complex social interactions--it could lead to a whole new market of interest to people who don't currently play computer games.
Aside from entertainment, there are serious or practical applications, such as a teaching or training tool. Currently, virtual training simulations can be used to train people to handle or explore complicated situations. For example, software already exists that is used to train soldiers how to interact with civilians in order to achieve certain mission goals. If such software could be extended to produce well-formed stories, it may make the learning experiences more memorable.
However, the most important reason to pursue interactive narrative is that it would be a new form of human expression. Traditional narratives let us explore the Other. While we might empathize greatly with a character in a film or novel, they make their own choices and we, as audience, simply follow along, slightly removed. Interactive narratives, on the other hand, would allow exploration of the Self. Whether following our own inclinations, or those of a character role we assume, we would be able to make our own choices in the story. We would then be able to explore the consequences of those choices as the story unfolds.
Exactly what those consequences might be depends on the author of the interactive narrative. Instead of writing single--or even branching--storylines, authors of interactive narratives will write the rules that produce a number of possible stories. This will be a challenging creative form--writing a world or system of potentials, rather than a concrete tale.
So interactive narrative proves to be an interesting AI challenge, potentially both lucrative and educational. Yet most importantly, it may provide a new experience for users and a new creative form for authors.
Before we can attempt to make a narrative interactive, we must first understand what a narrative is. Furthermore, our model must include a sense of how interaction affects that narrative.
Within the interactive drama2 community, such a model already exists. Aristotle's Poetics has served as the basis for understanding interactive drama since Brenda Laurel proposed her neo-Aristotelian model in 1991. Laurel's model is based largely on the work of Sam Smiley (1971). Michael Mateas (2004) has recently extended Laurel's model to include an explanation of how user interaction fits into the model.
Mateas's current poetics has accumulated a number of valuable additions during its development through four authors. However, the obscuring of key Aristotelian features--such as the distinction between object, manner, and medium--has led to certain tensions. Here, we will trace the evolution of the current poetics in order to examine its strengths. We will then explore an overhauled model that includes most of these benefits while eliminating some of the internal strain.
In the fourth century B.C.E., Aristotle (1961) laid the foundations of narrative theory in his Poetics. Although he focuses primarily on describing the nature of tragic drama, he does refer to other art forms such as epic poetry, comedy, dithyrambic poetry, music, dancing, and painting. He claims that all these forms of "imitation" differ from each other in three defining respects: their objects, medium, and manner (which is sometimes called mode).
The object "imitated" in drama is "men in action" (Chapter I). Tragedy and comedy can be distinguished by the character of the men represented and the nature of the action. The men can be portrayed as "better" or "worse" than they are in real life; the action may or may not be serious, unified, and complete. But the important aspect is "men in action".
Fergusson, in his introduction (Aristotle 1961), explains that our understanding of Aristotle's action should be in light of his writings on ethics. Action here means praxis--an active, rational "movement of spirit", directed outwards. It is action arising from thought, focused to some end. The motivation of a character is essential to this sort of action.
Aristotle thus explains that the three objects of dramatic action are Plot (that is, the "arrangement of the incidents"), and the Character and Thought of its agents (Chapter VI).
Art can represent objects through a variety of different media--color and form, or the voice, or rhythm and harmony (Chapter I). Tragedy, specifically, is conveyed through Diction and Song (Chapter VI). That is, actors speak and sing in order to convey the action to the spectators.
Within the same medium, there may be different manners of presentation. For instance, in poetry conveyed through the media of speech and song, the events can either be narrated through a personality, narrated by the poet himself, or enacted as if the characters were "living and moving before us" (Chapter III). This is a distinction between epic and tragedy--epic is narrated, while tragedy is enacted. Aristotle calls this enactment of tragedy the Spectacle (Chapter VI).
Aristotle lists these six parts in terms of their order of importance to tragedy: Plot, Character, Thought, Diction, Song, Spectacle (Chapter VI). The first three are the objects--we understand the Plot of the action in part because we understand the Character and Thought of the characters. The action is presented through the media of Diction and Song, Diction being the more important for Aristotle. He lists the manner of Spectacle as the least essential to judging tragedy. Though the special effects of the stage may have a certain emotional appeal, they are the least connected with the art of poetry: "For the power of tragedy, we may be sure, is felt even apart from representation and actors" (Chapter VI).
Sam Smiley, in his 1971 Playwriting: The Structure of Action, explores the process of playwriting, using Aristotle's model as a framework. In the first chapter, in arguing that fine arts are artificial (that is, manufactured) objects, he explores Aristotle's four causes for the coming into being of an artificial product.
For those unfamiliar with Aristotle's causes, the material cause of an object is the substance of its construction. The material cause of a house is the wood and concrete used to construct it. The formal cause is the form of the object. For a house, this would correspond roughly to its blueprint design. The efficient cause is the process that constructs the object. This would be the construction workers who build the house. The final cause is the end to which the object is constructed. A house is usually constructed to provide shelter.
Smiley very briefly presents Aristotle's six parts of drama as connected by formal and material causes. (Although the four causes are an Aristotelian concept, Aristotle himself does not state such causes between the six parts in Poetics.) Smiley presents them in the same order as Aristotle--Plot, Character, Thought, Diction, Sounds, and Spectacle--and contends that each element dictates the form of those below, while each provides the material for the element above.
For Smiley, Plot is constructed in terms of the actions of the Characters. The material from which we build Character is Thought3. Thought is itself constructed from words, or Diction. Diction is made up of Sounds. (Note the change here from Aristotle's Song.) Spectacle--"the physical actions that accompany the words" (Smiley 1971, p.12)--is the most basic material of all.
A playwright holds the formal order to be most important. A certain Plot dictates the qualities required of the Characters. Based on these qualities, the characters will espouse certain Thoughts, expressed in certain Diction, and so on. In contrast to the playwright, the actors and production team tend to construct the play working in the material order, beginning with the Spectacle.
Note that Aristotle's distinction between object, medium, and manner has been ignored here. We will soon discover that a number of tensions were introduced by this omission, as these few paragraphs of Smiley's become the foundation of the modern poetics of interactive drama.
Brenda Laurel (1991) begins with Smiley's model, renaming some of the elements to be Action, Character, Thought, Language, Melody, and Spectacle. She describes these elements first in terms of drama, and then expands their meanings to be suitable for all human-computer activities (such as computer-based interactive narratives).
Starting at the lowest levels, Laurel begins by defining Spectacle as "everything that is seen" and Melody as "everything that is heard". However, this does not fit cleanly into the causal hierarchy since Spectacle does not form the basis of Melody. Also, this emerging neo-Aristotelian model does not seem to allow for visual signals to travel "up" the hierarchy to become the basis of Language and the understanding of the drama.
So instead, Laurel renames Spectacle to Enactment and redefines it to mean all the sensory dimensions of the represented action--visual, auditory, tactile, and any others. From these sensations, the user constructs Patterns. Language now does not mean only spoken human language, but any "selection and arrangement of signs, including verbal, visual, auditory, and other nonverbal phenomena when used semiotically" (Laurel 1991, p.50). Thought and Character remain largely unchanged, though for Laurel they may arise from computer-based, rather than solely human, origins.
Though Laurel has overhauled the bottom half of the hierarchy in an attempt to fit the demands of the causal connections, we shall see that a number of inadequacies still remain.
Michael Mateas (2004) follows Laurel's model, both in the terms used and the material and formal causes. He adds to the model Janet Murray's (1997) notion of agency, which Mateas defines as "the feeling of empowerment that comes from being able to take actions in the world whose effects relate to the player's intention" (Mateas 2004, p.21).
In an interactive drama, the story is enacted with the player taking the role of one of the characters. To support interaction at this Character level, Mateas adds two new causal chains--a Material for Action and a User Intention. When a user is interacting within a virtual world, the objects and the characters in that world afford certain user actions (from below). In turn, the story provides some narrative constraints, or at least direction (from above). When the user acts upon other characters in the story, her intention becomes a formal cause in much the same way the requirements of the action shape characters in traditional narratives. "A player will experience agency when there is a balance between the material and formal constraints" (Mateas 2004, p.25).
We can thus see that Aristotle's model has come a long way through these additions and reformulations. However, due to the introduction of material and formal causes, a number of omissions and tensions have been introduced.
First of all, we have lost Aristotle's sense of manner. Rather than differentiating between whether a narrative is enacted or presented, Spectacle has come to mean "all that is experienced by the audience."
Secondly, we have lost the idea that the medium is variable, yet still specific. When defining medium, Aristotle admits that medium may be color, harmony, rhythm, etc. (Chapter I). Only in describing tragedy (and other drama such as epic) does he limit himself to Diction and Song. We have since come to assume that all dramas are presented only through Diction and Song, which are primarily auditory channels.
However, as dramas become computer-based, we have, through Laurel, attempted to regain the flexibility of different possible media. In order to allow for visual signs, we have rather ungracefully expanded Song and Diction to Pattern and Language. We now speak only generally of Patterns, rather than specifically of medium-specific modes. Patterns must then be assembled into something as well-defined as a Language in order to serve as the basis of Thought, Character, and Action.
Most importantly, the causal hierarchy implies sequential and exclusive links between the levels. That is, it seems only the level directly below should form the basis for the level above. For example, we certainly construct our understanding of a Character in terms of her Thoughts, which are understood in terms of her spoken Language. However, her physical features, expressions, gestures, costume, and theme music also contribute to our understanding of who a character is. Yet these attributes seem to serve as the material for Character without conveying Thought or using Language.
Mateas runs into this problem when he describes interaction with objects as existing "somewhere between spectacle and pattern" (2004, p.25). Yet what affordances are granted by the raw sensory experiences of Spectacle? What sort of constraints are provided by Patterns such as a purple jacket and an ominous musical chord? Yet it does not seem right to move objects to the level of Character, as objects are not assembled from Language-encoded Thought.
Aristotle does not mention setting or props, probably because plays of his time had limited scenery. However, objects in the world play an increasingly important role in computer-based interactive drama since they are often the means through which the player can affect the action.
Aristotle provides us the basis for describing practically any art form in terms of its object(s), medium, and manner, while exploring tragedy as a specific example. Smiley gives us the idea of formal and material causes between these elements. Laurel explores this process that Smiley only sketches, and expands this framework to describe computer-based drama. Mateas tackles the problem of how interaction and the experience of agency can fit into this model. I believe that we can keep all these contributions, yet remove many of the tensions introduced during this model's evolution.
Returning to Aristotle, we can say that an object of "imitation" is presented through some medium. We can think of this medium as our experience of a "text", whether this be reading a script, watching a movie, or playing a narrative game. The object of imitation does not formally dictate the choice of a certain medium as a whole--a story could be presented as either a novel or as a play. However, the object does formally dictate its construction within a specific, chosen medium--a story's dialog is written within quotes in a novel or spoken by the actors in a play. As an audience, we construct a sense of the object from our experience of its instantiation in a particular medium.
Our experience of the medium can be described at different levels of detail. At its most basic, our experience of a medium may utilize a number of sensory channels--the visual, auditory, tactile, etc. This raw sensory experience corresponds to Laurel's (and Mateas's) definition of Enactment.
At a higher level, as Laurel suggests, we discern patterns based on this sensory experience. From various sounds, we may differentiate music or speech. From our visual experience, we may differentiate text, diagrams, photographs, animation, or live action. We might call these differentiated sensory patterns the modalities4 of the medium. They are essentially what Robert Stam refers to as "tracks". As Stam describes it: "The novel has a single material of expression, the written word, whereas the film has at least five tracks: moving photographic image, phonetic sound, music, noises, and written materials" (Stam 2000, p. 59).
We may also want to consider that, also as Laurel describes, there are certain conventions (something like a proto-language) that develop for these different modalities. For instance, a shot-reverse-shot with a fade can signify a reminiscent flashback in film. Comic books use different "word balloon" conventions to show whether a character is speaking, whispering, or thinking.
The specific sensations, modalities, and conventions depend on the particular medium used. Each level provides the material necessary for constructing those above it, while formally constraining those below it. This is a different, broader notion of medium than Aristotle's, which would correspond only to what we are calling the modality.
This reformulation of medium opens the way for a media-specific analysis of works, as called for by N. Katherine Hayles (2002). Whether a story is conveyed as a live performance, a film, or a novel, we should be able to explore the particular details of its material embodiment in a medium and how that specific embodiment affects our conception of the work as a whole.
In an interactive medium, the medium also provides interface controls that affect the imitated objects5. When experiencing a drama, the user moves through the stages of material causality: from their sensory experience, they discern separate modalities, each with their various conventions for relaying the objects of drama. When providing user inputs, the drama system must make this same transition. The system may allow for various channels for input, such as haptic or auditory. Haptic input might involve different modalities, such as movement of the mouse or the pressing of keyboard keys. Mouse use has a number of conventions concerning the difference between left-clicking, right-clicking, and double-clicking. When discussing interaction, we are mostly concerned with the user's agency within the narrative (object) context--how the user can affect the world of the story. However, it is useful to remember that both their understanding of the story world and their attempts to control it must pass through the medium.
The object of drama is "characters-in-action". This action6 has characters as its material cause; in turn, the action determines what sort of characters are needed to produce it. As held by Aristotle, a character's motivation, or thought7, is essential to understanding that character. However, as we have seen, it is not the only material from which characters are formed. They have a number of other physical attributes, and often thought can only be inferred by the audience from these outward appearances. While essential to character, thought does not cleanly fit within the exclusive formal/material cause hierarchy.
The notion of setting is missing from Aristotle. Yet, action is partly constructed in terms of where things happen and what objects are used. This is particularly true in an interactive drama, in which the user assumes the role of a character and, through this character, interacts in a virtual story world. Though some of this interaction means affecting other characters, the user often spends time manipulating objects and their character's current location. We might refer to the characters and setting together as the story-world.
The user's actions at the story-world level serve as partial material for furthering the action, while the narrative context of the action so far provides some constraints on the user. This is just as Mateas describes agency, though in this reformulated model, the world's objects are placed within the same narrative context as characters. Like characters, modeled story objects often have an internal state that must be inferred by the user from the objects' outward appearances (as conveyed through the medium). Though we are usually most concerned with the affordances for interaction offered by the story-world, it is helpful to remember that the medium itself must also successfully afford the interaction controls needed to affect those objects.
We have now restructured the current poetics of Laurel and Mateas. Although we have changed some of the relationships and labels, we have tried to maintain their basic concepts--particularly the formal and material causes, the description of "patterns" and "languages" at work in a medium, and the mechanism of user agency. Most importantly, we have reinstated the Aristotelian distinction between medium and object.
So far, we have limited ourselves to those elements that have been carried through the evolution of this poetics to Mateas. However, the notion of manner was dropped relatively early in this development. Since we have been implicitly restricting ourselves to interactive drama, this has been easy to ignore. We can simply assume that, as the player is assuming the role of a character, the manner is one of enactment.
Aristotle's definition of manner is very brief:
For the medium being the same, and the objects the same, the poet may imitate by narration--in which case he can either take another personality as Homer does, or speak in his own person, unchanged --or he may present all his characters as living and moving before us (Aristotle 1961, Chapter III).
We can see here that the manner is how the story is presented, but it is independent of both the medium and the object. The main distinction Aristotle makes is between narrated and enacted manners. For example, the play Romeo and Juliet has an enacted manner: it presents the characters "as living and moving before us". However, this play can be embodied in different media: as a script, as a live performance, as a film. If the same Romeo and Juliet story were written as a novel told from Juliet's nurse's point of view, then the story would have a narrated manner. This novel could be made into a film, as film can present both enacted and narrated manners (as explored in more detail below). Thus, the same object (the story--including events, characters, and settings--of Romeo and Juliet's ill-fated romance) and the same medium (film) can have a different manner (enacted or narrated).
Aristotle also distinguishes between two kinds of narration--an omniscient narration versus a limited, character-filtered narration. In either form, we find our experience of the action is provided only through the particular point of view of the narrator. This narrator may be unreliable; the filtering character may be fallible. The narration may be very overt, in which case the narrator constantly evaluates or comments on the events and characters of the story.
Narration is not itself an event in the story it conveys, even when the narrator is a character in the story. This is essentially the difference narratologists make between the story--the chronological events of the action--and the discourse--the telling or presentation of those events (Chatman 1978). The discourse conveys the story to us, but it may comment on that story, focus our attention on certain events over others, omit events, foreshadow or flash back to previous events, provide "backstory" information, etc. There is usually a difference between the discourse and story timeframes--it might take an hour to read about the events that happen to the characters in seconds, or a ninety-minute film might show story events that occur years apart.
Although this difference between the story and the discourse is less obvious for enacted narratives, the distinction can still be made (Chatman 1990). For example, film can present things from a particular point of view. A narrator can be established through such conventions as using voice-overs, point-of-view shots, and having the narrating character present in all scenes. Two different directors can tell the same story in different manners using the same film medium depending on how they use scene cuts, pacing, staging, lighting, etc. in order to reveal, highlight, or comment upon the action. Although easily overlooked, the discourse--how the story is presented--is clearly important. And it is, in fact, present in all forms of narrative. Mark Stephen Meadows (2003) goes so far as to argue that the perspective from which the events are relayed is even more essential to narrative than the events themselves.
We have seen that Aristotle's manner is, essentially, narratology's discourse. This discourse is what we experience by materially constructing the conventions of the medium. And from the material of the discourse, we construct the world and events of the story based on the perspective we are provided. In reverse, the events and the world of the story determine the form of the discourse, which in turn formally determines how it is embodied in a particular medium.
In an interactive drama, where the user assumes the role of a character, the level in the model at which user action occurs is clearly meant to be that of the story-world. The player interacts by affecting the setting and other characters. The story-world provides the material for what interactions are possible; the user's actions should then become part of the action. However, other kinds of interactive narratives have user interaction at levels other than the story-world.
At its most basic level, most recorded narratives offer some control over their medium--particularly its timing. For example, the reader of a book can skim parts of the text, reread others, or put the book down and come back to it later. A museum-goer can glance at a painting as they walk by or study it for half an hour.
Some narratives offer the user control over the details of their discourse or presentation. For instance, the user might be able to control the camera viewpoint or might select different hypertext links, changing the order in which the underlying story is experienced.
Or the user might be able to specify the kind of high-order action she would like to see, either as an interactive "director" or as input to a story generator, which would then determine the details of the characters and setting.
So it could be argued that a DVD player, a hypertext novella, an interactive drama game, and a story generator are all interactive narratives. They differ only at the level at which user action is intended to occur. If this is the only sort of interaction afforded and supported by the narrative, the user may still feel some sense of agency as long as those affordances and constraints are balanced. However, the user's interactions become more significant--that is, they have a greater impact on the action of the story--the higher the level at which those interactions occur.
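The ordering of interaction levels described above can be sketched as an ordered enumeration. The names `Level`, `EXAMPLES`, and `by_significance` are hypothetical, and the numeric values carry no meaning beyond their relative order; the pairing of each example with a level follows the preceding paragraphs.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels at which user action may occur, lowest to highest."""
    MEDIUM = 1        # e.g. timing, volume, page turning
    MANNER = 2        # discourse: camera viewpoint, link order
    STORY_WORLD = 3   # acting as a character on setting and characters
    ACTION = 4        # specifying the high-order action itself


# Intended level of user action for each example named in the text.
EXAMPLES = {
    "DVD player": Level.MEDIUM,
    "hypertext novella": Level.MANNER,
    "interactive drama game": Level.STORY_WORLD,
    "story generator": Level.ACTION,
}


def by_significance(examples):
    """Sort example names from most to least impact on the story's action."""
    return sorted(examples, key=lambda name: examples[name], reverse=True)
```

Sorting by level reproduces the claim that interactions at higher levels have greater impact on the action; it says nothing about agency, which depends on the balance of affordances and constraints at whichever level is offered.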
This reformulated poetics more closely mirrors many existing interactive drama system architectures than its predecessor. Such architectures tend to share the same (generally implied) medium of computer software. Within this software medium, interactive drama systems tend to include one or more separate modules for handling action, character, setting, and manner, though often by different names.
Action--the events of the story--is usually guided or produced by an agent called a director, drama manager, game master, narrator, scene manager, or story generator. Characters include agents or cast members under the control of the drama system. An avatar or player character is the character under the control of the user. The setting--those locations and props of the dramatic world--might be collectively called the story world, environment, stage, or theater. Finally, the manner, or discourse, layer of the system handles the details of presentation to the user, as well as receiving interaction from the user. This component may be called the narrator, presenter, renderer, theater, or interface.
Certain terms, such as narrator and theater, appear under more than one category. Additionally, some projects subdivide or combine certain elements depending on their needs.
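The terminology overlap can be made concrete with a small lookup. The sketch below simply transcribes the terms listed above into a dictionary keyed by poetics level; the names `COMPONENT_TERMS`, `levels_for`, and `ambiguous_terms` are hypothetical and belong to no particular system.

```python
from collections import defaultdict

# Component names used by various systems, grouped by poetics level.
COMPONENT_TERMS = {
    "action": ["director", "drama manager", "game master",
               "narrator", "scene manager", "story generator"],
    "character": ["agent", "cast member", "avatar", "player character"],
    "setting": ["story world", "environment", "stage", "theater"],
    "manner": ["narrator", "presenter", "renderer", "theater", "interface"],
}


def levels_for(term):
    """Return every poetics level under which a component name appears."""
    return [level for level, terms in COMPONENT_TERMS.items() if term in terms]


def ambiguous_terms():
    """Return the component names that appear under more than one level."""
    levels_by_term = defaultdict(set)
    for level, terms in COMPONENT_TERMS.items():
        for term in terms:
            levels_by_term[term].add(level)
    return sorted(t for t, levels in levels_by_term.items() if len(levels) > 1)
```

Here `ambiguous_terms()` returns `['narrator', 'theater']`, the two terms that need disambiguating when one architecture is compared with another.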
With this proposed poetics framework as a foundation, we can examine existing interactive drama architectures for similarities.
The Oz Project has followed a practically identical model (Mateas 1997). This is also the model used by Chris Fairclough's (2005) OPIATE project.
These system components map to the poetics levels as follows:
Poetics | Oz Project |
---|---|
Action | Drama Manager |
Story World | World (including Characters) |
Manner | Presentation |
Not all architectures map as cleanly to the new poetics as does the Oz architecture. For instance, Mateas and Stern's Facade uses the following architecture:
At first glance, it may seem that Facade's Natural Language Processing module provides the manner layer. However, this processing is done only for textual input, not output. When a player enters a text string8, the system translates this surface text into a discourse act. These discourse acts have nothing to do with manner or discourse in terms of the poetics; rather, they are events in the story world that formally prompt character reactions.
Facade breaks action management into beat selection and modeling the story so far.
Poetics | Facade |
---|---|
Action | Drama Manager + Story Memory |
Story World | Story World (including Characters) |
Manner | [not shown] |
The Virtual Storyteller project aims at developing an animated character that narrates generated stories.
Though they mention characters' knowledge of their "virtual environment", they do not include this virtual world in their model. The characters' actions are directed by the "director" to produce a story. This project's "narrator" then rearranges the events of the story, and the "presenter" relates them to the user. This places these two components at the level of manner.
Poetics | Virtual Storyteller |
---|---|
Action | Director |
Story World | Characters; ["virtual environment" (setting) not shown] |
Manner | Narrator + Presenter |
Spierling et al. (2002) propose an architecture comprised of four hierarchical levels. The output of each level provides input for the one below it--much as the poetics describes formal cause working downward through the levels.
Spierling et al. have divided story management into two levels: the management of the overall general (functional/morphological) structure and the construction of specific, instantiating scenes. The Character Conversation Engines also hint at a model of setting to support stage direction.
Poetics | Spierling et al. |
---|---|
Action | Story Engine + Scene Action Engine |
Story World | Character Conversation Engines |
Manner | Actor Avatar Engines |
As we have seen, the revised poetics provides a good overview of interactive narrative elements. Each project here uses different terms and different divisions between architecture components. Often a project may include multiple components for a single level of the poetics. This generally indicates what aspects the project is most interested in exploring. For instance, Facade is focused on story management and so has two components at the action level. On the other hand, the Virtual Storyteller is particularly interested in the narration of stories and so it has two components at the manner level.
A project may also combine two elements into a single architecture component or fail to mention an element in their overview if it plays a marginal role in that project's particular focus.
Overall, it seems the revised poetics can at least inform the design of system architectures--which is not something that can be easily claimed about the previous version.
Besides applying to cutting-edge interactive narrative systems, the new poetics also shares striking similarities to the definitions found in established narrative theory. For instance, Seymour Chatman gives the following conceptual division of a narrative text:
These concepts match those of the proposed poetics, with the translation of terms shown as follows:
This simple translation shows how interactive narratives share much of their structure with traditional narrative forms, and so they can be understood in the same terms. While this translation allows us to apply the bulk of narratology to interactive narrative, it also adds to narratology--specifically, the notion of formal and material causes at work between the different levels, which are the foundations of a definition of interactor agency.
The intent here has not been to return to Aristotle, but to clarify the existing poetics model based on its evolution through four authors. To do this, it is important to return to Aristotle's distinction between the object, medium, and manner. A story (or any creative work) is always embodied in some particular medium. Renaming Laurel's bottom three levels, we have clarified the different aspects inherent to all such media. We have added setting to the world of the story, as action does not progress in terms of character alone, especially in an interactive drama. We have resurrected manner--that is, the presentation of the story--within the model.
Applying Mateas's definition of user action and agency at different levels, this new model can apply to other forms of interactive narrative besides dramas. The model also mirrors many existing interactive drama system architectures much more closely than the previous poetics model, thereby connecting theory to design. Finally, this model has evolved striking similarities to a standard model of narratology. This lends the new poetics credibility and also allows for the application of traditional narratology to interactive narratives.
Now that we have established a framework for understanding the structure of narrative, we can examine the details of its components. With our aim of directing a story that incorporates a user's actions, it is the structure of a story's action that is of the most interest to us here. Among others, the important features of a story's action have been examined by Aristotle, Freytag, Propp, and Chatman.
Besides laying out six formal components of tragic narrative, Aristotle explores the requirements for each component. In examining the action, which he calls Plot, Aristotle states that it should be whole, complete, and of a certain magnitude. More specifically:
plot, being an imitation of an action, must imitate one action and that a whole, the structural union of the parts being such that, if any one of them is displaced or removed, the whole will be disjointed and disturbed. For a thing whose presence or absence makes no visible difference is not an organic part of the whole (Aristotle 1961, Chapter VIII).
Aristotle continually stresses this unity of the action. The events comprising the action must be connected to one another by necessary or probable cause to make a single whole. By the nature of these causal connections, a story has a beginning, middle, and end:
A beginning is that which does not itself follow anything by causal necessity, but after which something naturally is or comes to be. An end, on the contrary, is that which itself naturally follows some other thing, either by necessity, or as a rule, but has nothing following it. A middle is that which follows something as some other thing follows it (Aristotle 1961, Chapter VII).
A story is complete when it includes all these necessary events; it does not start or end haphazardly.
Aristotle also held that a plot should be of a certain magnitude. It should be at least long enough to admit a change in fortune (Chapter VII). Generally, he considered larger, longer plots to be more beautiful, but only so long as they did not exceed the audience's memory. Once a plot becomes too long or complex to hold all in mind at once, the audience will lose any sense of its unity (Chapter VII).
A tragic plot can be divided into two parts. The events before the change in fortune are the Complication; those after are the Unraveling (or Denouement) (Chapter XVIII).
Aristotle held that the worst plots are episodic, where episodes succeed each other without necessary or probable causes. He stressed that the unity of a single protagonist throughout does not guarantee a unity of action (Chapter VIII). This low opinion of episodic narratives seems to have survived to today, as we generally consider TV shows and serial fiction to be of lower quality or less "literary" than movies and novels. However, even these episodic narratives tend to exhibit Aristotle's unity within an episode.
Aristotle describes additional requirements for a good plot, but most of them are specific to tragedy: the Recognition, the Reversal of Situation, and the Scene of Suffering, as well as the structural parts in relation to the choric song.
Yet despite Aristotle's focus on tragedy, he does leave us with some rules applicable to any well-formed story. It should concern itself with a single, unified action. It should be long enough to admit some change in state. It should have a beginning, a middle, and an end.
Gustav Freytag, a 19th century German novelist and playwright, further explores the requirements of a well-formed drama. His focus--while wider than Aristotle's concern with classical Greek tragedy--is still limited to "serious" drama. Though Freytag follows Aristotle's lead quite closely and still tends to tragedy over other dramatic forms, he does offer some additional insights.
Freytag holds that a drama is based on an Idea of the author's. This central Idea provides the unity of the drama, as it should serve to structure the action and determine the significance of the characters.
Similar to Aristotle's concept of praxis, Freytag holds that a character's emotions, thoughts and motivations are essential to serious drama. The emotions or actions themselves are not as interesting as how a character's emotions serve to bring about a will to action. A character's motivation can arise from within, or it can be produced through external influences upon the character.
The action should be unified, with a clear beginning and end. The end should bring a termination to any strife within the play. The events should follow each other as necessary or probable. The playwright must do more than simply show the events: he must make the events believable by exploring the characters' motivations and reasons, which should be consistent and credible.
Yet, for all this unity, Freytag does admit that occasional episodes not completely essential to the central plot or Idea may highlight or clarify a character, provide an interesting contrast to the main action, or otherwise enhance the overall effect of the play. If done correctly, these ornamental embellishments cannot again be easily "unclasped" from the main work.
Freytag is best known for his pyramidal depiction of dramatic structure. He begins by defining two states of the action: the play and the counterplay. During the play, the hero is predominantly proactive, working outwards, striving, turning a desire into action. During the counterplay, external forces or opponents are affecting and directing the hero; the hero is primarily passive, his motivations and actions arising in response to outside forces rather than from within. A serious drama will contain both a play and a counterplay, though either one can come first. The point where one becomes the other--when the passive hero finally resolves to action, or else when the active hero begins to be subjected to the circumstances his actions have wrought--is the climax. This definition implies that a serious drama will always contain some sort of struggle or conflict involving the hero.
Including the climax, Freytag defines five "parts" of drama (see diagram), as well as three scenic effects, or "crises".
These components occur in the following order:
The Introduction (a) explains the background of the drama by establishing time and place, noting the nationality and life relations of the hero, and briefly characterizing the environment. This may be presented as a narrator's call for attention (as is more common in older dramas) or as a short scene of action itself.
The Exciting Force is a scenic effect that occurs between the Introduction and the Rising Movement. It may be a whole scene, or only a few words. It marks when the volition arises in the hero that will lead to the action of the play; or, if the counterplay occurs first, it is when external forces resolve to affect the hero.
The Rising Movement (b) includes those events that further the action, introduce all major characters, and awaken the audience's interest.
As previously defined, the Climax (c) is the moment when the play becomes the counterplay, or vice versa. It should be inseparably connected to the previous action.
The Tragic Force (or Moment) is a scenic effect that may not occur in all dramas. Closely tied to the climax, it marks the beginning of counterplay.
The Return (d) begins to resolve the action. It should not introduce new characters or material, but build on what has already been established.
The Force (or Moment) of Final Suspense is a scenic effect that may not occur in all dramas. It seals the conclusion of the drama such that the audience feels "the compelling force of what has proceeded", for the Catastrophe should not come as a surprise.
The Catastrophe (e) completes the action. It should be brief, and provide a fitting end for the hero.
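The ordering of Freytag's five parts and three scenic effects can be summarized as data. This sketch rests on assumptions: the names `Component`, `FREYTAG_ORDER`, and `follows_freytag` are hypothetical, and only the two effects described above as not occurring in all dramas are marked optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    name: str
    label: Optional[str]     # diagram letter for a "part"; None for a scenic effect
    optional: bool = False   # True if it may not occur in all dramas


FREYTAG_ORDER = [
    Component("Introduction", "a"),
    Component("Exciting Force", None),
    Component("Rising Movement", "b"),
    Component("Climax", "c"),
    Component("Tragic Force", None, optional=True),
    Component("Return", "d"),
    Component("Force of Final Suspense", None, optional=True),
    Component("Catastrophe", "e"),
]

parts = [c for c in FREYTAG_ORDER if c.label is not None]
effects = [c for c in FREYTAG_ORDER if c.label is None]


def follows_freytag(names):
    """Check that the named components appear in Freytag's order and that
    every non-optional component is present."""
    order = [c.name for c in FREYTAG_ORDER]
    if any(n not in order for n in names):
        return False
    positions = [order.index(n) for n in names]
    in_order = all(a < b for a, b in zip(positions, positions[1:]))
    required = {c.name for c in FREYTAG_ORDER if not c.optional}
    return in_order and required <= set(names)
```

A drama that omits the Tragic Force and the Force of Final Suspense still passes this check, while one that places the Return before the Climax does not.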
Freytag uses two dimensions for his diagram. Presumably, the horizontal axis is story time. Freytag does not define the vertical axis nor does he specify what exactly is "rising", "returning", or "falling" through the plot.
Besides exploring structure, Freytag also puts forth strong opinions on the content of serious drama which may not apply to many modern narratives. For instance, the hero's "force and worth shall exceed the measure of the average man" (Freytag 1895, p.63). The action should not be based on lamentable or common motives--such as thieving, cowardice or stupidity--leading to dishonest actions. The details of serious drama should not contradict reality. Modern narratives frequently violate these rules.
Vladimir Propp authored Morphology of the Folktale in 1928, although it was thirty years before it was translated into English (Propp 1968). In it, Propp explores how to classify folktales. Most previous attempts at classification were based on the contents of tales--such as a tale's general category (fantastic, everyday life, or animal tale), a tale's themes, or a tale's motifs (such as the presence of a dragon, witch, or magic ring). Instead of content, Propp looked at each tale's structure--specifically, the effects of the actions of the characters and agents.
For example, consider these parts of three tales:
Rather than classify these depending on whether they include magical items, or a tsar, or whether the hero is an animal, Propp recognized the function present in them all: a "donor" is providing the hero with a special item, which is then followed by the spatial transference of the hero to a new land.
As Propp defines it, a function "is understood as an act of a character, defined from the point of view of its significance for the course of the action" (Propp 1968, p.21). These functions "constitute the fundamental components of a tale", and they are "independent of how or by whom they are fulfilled" (Propp 1968, p.21). That is, while the specific details provide the color unique to each tale, the underlying structures are constant. Furthermore, the number of known functions is limited; and, though not all functions are present in all tales, those functions that do appear always occur in the same sequence (Propp 1968).
A function can be seen as a genus, with a number of specific events (and even variations on those events) serving as examples of that function. Indeed, Propp provides a number of example "species" events for each "genus" function. However, the same events can fill different functions depending on their place in the story. For instance, a man may marry a widow with two children, thereby setting up the action of the story. Or the hero may receive the hand of the princess, thereby achieving his reward and ending the story. Here, the event of marriage is filling different functions at the beginning and the end of the tale.
Propp studied one hundred Russian folktales from the Aarne-Thompson index, and discovered 31 functions and a handful of other morphological features, such as character roles. As illustration, here are some of the more prevalent functions:
Symbol | Function | Description |
---|---|---|
α | Initial Situation | Introduction to hero by name or status, enumeration of family members, etc. (Though this is a morphological element, it is not considered a function.) |
β | Absentation | A member of the family leaves home. |
γ | Interdiction | The hero is forbidden to do something. |
δ | Violation | The hero violates the interdiction. |
ε | Reconnaissance | The villain attempts to gain information. |
ζ | Delivery | The villain succeeds in learning something about his victim. |
A | Villainy | The villain causes harm, injury, or misfortune to a family member. (This serves as the complication, ending the "preparatory part" and beginning the actual movement of the tale.) |
a | Lack | The hero or a family member lacks something. (Serves as an alternative to A.) |
B | Mediation | The hero is made aware of the misfortune or lack. (Distinguishes between seeker/voluntary heroes and victimized/involuntary heroes.) |
C | Counteraction | A seeking hero agrees to go. |
↑ | Departure | The hero leaves home. |
D | First Function of the Donor | The hero encounters a donor (who greets or otherwise tests the hero). |
E | Hero's Reaction | The hero reacts to the donor. |
F | Provision or Receipt of Magical Agent | The hero receives some item, animal, or other assistance from the donor. |
G | Spatial Transference | The hero is guided or transferred in the direction of the object of his search. |
H | Struggle | The hero and villain engage in combat or competition. |
I | Victory | The villain is defeated. |
K | Liquidation of Misfortune or Lack | The initial villainy is undone, or the initial lack is fulfilled. This function is paired with A or a and forms the peak of the narrative. |
↓ | Return | The hero returns home. |
W | Wedding | The hero marries, ascends the throne, or receives some other reward. |
Propp comments that the functions all "belong to a single axis" (1968, p.64); that is, they follow each other sequentially in a tale. Furthermore, "one function develops out of another with logical and artistic necessity" (Propp 1968, p.64). A number of the functions come in pairs. For instance, an Interdiction (γ) is always Violated (δ). As mentioned, the details of a Liquidation (K) depend on the nature of the initial Villainy (A) or Lack (a). Other logical groupings are noticeable, such as the complication (ABC↑) and the interaction with the donor (DEF).
Propp explains other variations within functions. For instance, sometimes they have a negative form, such as a command rather than an interdiction (γ), which must then be fulfilled rather than violated. Or the hero may react negatively (E) to the donor's request (D). Trebling of a function or sequence is common--the hero meeting three donors before receiving the agent, or three heroes meeting a donor before the last hero is successful.
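The sequencing and pairing rules can be sketched as simple checks over a tale encoded as a list of function symbols. This is an illustration only: it covers just the subset of functions in the table above, it treats A and a as alternatives at the same position, and it deliberately ignores trebling and negative forms. The names `CANONICAL`, `RANK`, `in_canonical_order`, and `interdictions_violated` are hypothetical.

```python
# Canonical order of the functions listed in the table; each tuple is one
# position, and ("A", "a") holds the two alternative openings of the complication.
CANONICAL = [
    ("α",), ("β",), ("γ",), ("δ",), ("ε",), ("ζ",),
    ("A", "a"),
    ("B",), ("C",), ("↑",), ("D",), ("E",), ("F",), ("G",),
    ("H",), ("I",), ("K",), ("↓",), ("W",),
]

# Map each symbol to its position in the canonical sequence.
RANK = {symbol: i for i, group in enumerate(CANONICAL) for symbol in group}


def in_canonical_order(tale):
    """True if the functions present occur in canonical order.

    Functions may be skipped, but none may repeat or appear out of order.
    Raises KeyError for a symbol outside the tabulated subset.
    """
    ranks = [RANK[symbol] for symbol in tale]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def interdictions_violated(tale):
    """True if every Interdiction (γ) is later followed by a Violation (δ)."""
    return all("δ" in tale[i + 1:] for i, s in enumerate(tale) if s == "γ")
```

For example, the sequence `["α", "γ", "δ", "A", "B", "C", "↑", "D", "E", "F", "G", "H", "I", "K", "↓", "W"]` passes both checks, while a tale that places a Villainy before its Interdiction fails the first.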
Because it provides specific classes of events that linearly comprise a well-formed tale, Propp's morphology has long been an inspiration for automatic story generation and interactive narrative systems. However, Propp's model applies to a small set of simple tales--Russian folktales. His functions do not always apply cleanly to fairy tales from other cultures. It is certainly not a universal model of narrative structure. The hope is that similar rules of structure will be found for other genres of tales. However, despite the work of other structuralists, such as Tzvetan Todorov's analysis of the Decameron stories (Chatman 1978), Propp remains the most widely cited model in interactive narrative work to date.
In examining the details of literary theory and narrative structure, Seymour Chatman (1978) explores the nature of a narrative's events, which corresponds to our meaning of action.
Chatman notes that a story's events are "radically correlative, enchaining, entailing", forming a sequence that is "not simply linear but causative" (1978, p. 45). As we have seen, this has been a view held since Aristotle. Chatman suggests that readers will often infer a causality between events, even if such a relationship is not explicitly stated in the narrative itself.
More recent critics have denied such a strict causal view of narrative. A more relaxed view is that later events are simply contingent upon earlier events. That is, the later events depend on earlier events for their existence or occurrence, even if those earlier events did not specifically cause the later events.
However, Chatman notes that not all narratives are concerned primarily with events, changes, or consequences. In a modern plot of revelation, the point is to simply reveal a state of affairs or to explore the details of a character.
The connections between events play a part in a story's verisimilitude. Part of a story's "believability" stems from how early events lead to later events; that is, it depends on the degree to which the story relies on outrageous coincidences or completely unforeshadowed solutions. But another part of a story's verisimilitude involves the characters--particularly whether their reactions and motivations are understandable. However, it should be noted that verisimilitude is largely a matter of convention established by other texts in the same genre. For example, shooting a cheater over a game of cards is relatively normal behavior in a western, requiring little explanation. The same act is less usual in a Victorian society novel, which means the killer's motivations may need to be presented in more detail for a reader to accept the action.
Chatman also argues that not all events are equal--some are more important than others. These important events, which he calls kernels, are those narrative moments where the course of events is decided, where one path is chosen from the various possibilities. It is these kernels that are connected by causality or contingency; as such, these kernels cannot be removed without destroying the logic of the story.
However, there also exist minor events, which Chatman calls satellites. Satellites do not entail any choice, but serve to flesh out the consequences and details of the kernels. Therefore, satellites could be removed from the story and leave the logic intact, although the resulting story would be impoverished.
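The kernel/satellite distinction can be illustrated with a flag on each event. This is a hypothetical sketch: the class `StoryEvent`, the function `kernel_skeleton`, and the sample events are invented for illustration and are not drawn from Chatman.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoryEvent:
    description: str
    kernel: bool   # True for a kernel, False for a satellite


# Invented example: two decision points, each elaborated by a satellite.
events = [
    StoryEvent("The hero chooses to leave home", kernel=True),
    StoryEvent("The hero packs a bag and says goodbye", kernel=False),
    StoryEvent("The hero spares the trapped wolf", kernel=True),
    StoryEvent("The wolf limps off into the forest", kernel=False),
]


def kernel_skeleton(story_events):
    """Drop the satellites, keeping only the events that carry the story's logic."""
    return [e for e in story_events if e.kernel]
```

The skeleton keeps the causal or contingent chain of choices intact, but it loses the detail that the satellites supplied, which is the impoverishment described above.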
These kernels and satellites define a microstructure of events. However, we can also discern a macrostructure--a general overarching story structure--which allows us to group stories together based on structural similarities. The works of Propp and Todorov are examples of this. But Chatman points out the limits of this approach. These macrostructures are not generic or universal, but very specific to a narrow, particular genre--such as Russian folktales or Decameron stories. These are simple, well-structured tales, and their macrostructure does not apply to more general stories.
Nor can we classify stories simply by indexing their kernel events, as an event can only be understood in terms of its greater story context. "A killing may not be a murder but an act of mercy, or a sacrifice, or a patriotic deed, or an accident, or one or more of a dozen other things. No battery of preestablished categories can characterize it independently of and prior to a reading of the whole" (Chatman 1978, p. 94).
Based on this review of four authors, we can now draw some general conclusions concerning the structure of narrative action.
First of all, the action should be a single, unified whole. This is either because it concerns a single action (Aristotle 1961) or revolves around a single central Idea of the author's (Freytag 1895).
Yet the action is made up of a series of separate events. What unifies these events is that they are connected by necessary or probable cause (Aristotle 1961, Freytag 1895), "logical or artistic necessity" (Propp 1968), or some other connective contingency (Chatman 1978). Speaking generally, we might simply call this necessity. An event is necessary to the tale if its displacement or removal would leave the whole "disjointed and disturbed" (Aristotle 1961).
An essential part of necessity is characters' motivation or praxis--how their emotions or thoughts lead to a will for action (Aristotle 1961, Freytag 1895). Believability of character motivations and the credibility of necessary connections contribute to the verisimilitude of the tale (Freytag 1895, Chatman 1978).
Besides being a unified whole, a story must be complete. Specifically, it must have a beginning, middle, and end. This usually implies some change of state--an initial condition, some change or problem, and then a final condition. This change of state often corresponds to a change of fortune for the protagonist (Aristotle 1961, Freytag 1895).
But more than a simple change, stories often involve some conflict concerning the protagonist. Freytag defines his climax in terms of the play and the counterplay, which refer to the protagonist actively working towards some end or else being the subject of external forces. In order to be complete, this conflict must be resolved by the end of the story.
The length and detail of a story--Aristotle's "certain magnitude"--varies greatly. Size has an impact on completeness, however, since generally there are more events and details whose various effects must be resolved before the conclusion of a long tale.
Every story also has a macrostructure or formal morphology. At its simplest, story events occur during either the beginning, middle, or end of the tale (Aristotle 1961), or else within the bounds of Freytag's eight parts and crises. According to the conventions of a certain genre, we might determine a more detailed morphology, as Propp did for Russian folktales.
Chatman's point concerning an event's greater story context is also important to consider here. For instance, we may have an event in the world of the story, such as a killing. But it is only in terms of the story context that we can determine the importance or meaning of this event--whether it is a murder, a sacrifice, or act of mercy (Chatman 1978). Furthermore, these different story-contextualized events can fill different functions in advancing the story. For example, within Propp's functions, a murder might serve as an initial villainy, or the hero might murder the donor in order to receive the magical agent.
This example reveals three views of a story's action. We have the world-level: the event itself in the story world. We have the story-level: what that event means in terms of the greater story context. And we have the morphological-level: what abstract function, if any, the event serves in advancing the story.9
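These three views can be sketched as three fields on a single record. The names `ActionView` and `functions_for` are hypothetical, and treating the act of mercy as serving no morphological function is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionView:
    world_event: str          # world-level: what happens in the story world
    story_meaning: str        # story-level: what it means in story context
    function: Optional[str]   # morphological-level: abstract function, if any


# The same world-level event read in three different story contexts.
views = [
    ActionView("killing", "murder of a family member", "A: Villainy"),
    ActionView("killing", "murder of the donor",
               "F: Provision or Receipt of Magical Agent"),
    ActionView("killing", "act of mercy", None),  # assumed: no function served
]


def functions_for(world_event, all_views):
    """Collect the distinct functions one world-level event has served."""
    return {v.function for v in all_views
            if v.world_event == world_event and v.function is not None}
```

Keeping the three fields separate makes explicit that a world-level event cannot be mapped to a function without first passing through its story-level interpretation.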
A story does not concern only form, but content as well. Indeed it is the content--the quality of the characters, the details of the world, the specific flavor of the events--that makes each narrative truly unique.
Finally, we must remember that not all narratives adhere strictly to these rules. Aristotle admits the existence of episodic plots that are neither unified nor complete. Chatman points out the existence of "antistories"--narratives that deny any linear sequence of events--as well as revelatory stories, where the focus is on the existents of the world rather than on the events.
Even within a relatively well-structured story, some events do not need to be strictly necessary, as both Freytag's "ornamental episodes" and Chatman's satellites illustrate.
Still, if a story is well-formed, it should demonstrate these basic formal features: a unity of events through necessity, forming a complete whole with a beginning, middle, and end, and exhibiting a general macrostructure story-form.
We now have a working model of narrative sufficient for our purposes here. A narrative is more than just a series of events. Instead, it involves action, characters, setting, manner, and medium.
The action should be both unified and complete. Character and setting are essential material for this action, because "one cannot account for events without recognizing the existence of things causing or being affected by those events" (Chatman 1978, p.34).
This story-world and its events are then presented or narrated from a certain point of view. And this presentation is encoded in some medium. Whether told live in spoken words or encoded in a static artifact to be viewed later, the medium affects and constrains the transmission of the story.
A narrative becomes interactive when it offers the audience some means of affecting one of these aspects. This interaction will offer the user a sense of agency when the formal and material causes involved are balanced.
The Marlinspike architecture is designed to produce a directed interactive drama. An interactive drama is a system in which the player assumes the role of a character in an unfolding story and, through their actions within the virtual story world, can influence the outcome of that story. By directed we mean that the system has some form of centralized control over the story world. This centralized drama manager component works to direct the story, thus generating a well-formed, coherent plot that incorporates the user's actions. (This is in contrast to an emergent approach, in which the story is meant to emerge solely from the player's interactions with autonomous character agents or the static rules of the story world.)
Here we explore the basic design of Marlinspike. We will then see how this builds upon the poetics discussed previously, as well as other existing interactive drama architectures.
In brief, Marlinspike translates user-entered, world-level verbs into story-level actions. It then responds to these actions and advances the story by selecting a scene from a pre-authored collection. This dialog between system and user produces a thread--a series of causally-related scenes and actions. Threads are eventually woven together to form a complete story. The details of this process are as follows.
A Marlinspike drama takes place within a particular story world--a collection of simulated objects, including locations, props, and characters. Each object supports certain relevant world-level behaviors, such as being picked up, opened, dropped, etc.
Marlinspike characters are not autonomous agents, but simple puppets of the drama manager. Characters are comprised of a number of attributes, including morality (how altruistic or self-interested that character is) and affinity (how much they like or dislike another particular character). These attributes affect how certain verbs are translated to actions and how characters are cast into roles (see below). Characters also contain their own set of potential conversation responses.
Marlinspike includes a number of predefined verbs, which are used by the player to interact with the objects of the story world. Example input from the player might include "take tennis ball", "go north", "talk to Alice", or "kill Fred". These illustrate the verbs Take, Go, Talk, and Attack (for which "kill" is a synonym). When we include the direct object each verb refers to in the story world, we produce a number of commands: Take(ball), Talk(Alice), Go(n_obj), Attack(Fred). Occasionally, a command requires a second object, as in: Show(Alice, ball).
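The step from typed input to a command can be illustrated with a minimal sketch. It assumes small hand-written lookup tables (VERB_WORDS and OBJECT_WORDS are hypothetical names, not part of Marlinspike) and handles only single-object commands; the planned prototype would rely on Inform's parser instead.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical lookup tables for illustration only.
VERB_WORDS = {"take": "Take", "go": "Go", "talk to": "Talk",
              "attack": "Attack", "kill": "Attack"}  # "kill" is a synonym
OBJECT_WORDS = {"tennis ball": "ball", "ball": "ball", "north": "n_obj",
                "alice": "Alice", "fred": "Fred"}


@dataclass(frozen=True)
class Command:
    verb: str
    objects: Tuple[str, ...]

    def __str__(self) -> str:
        return f"{self.verb}({', '.join(self.objects)})"


def parse(text: str) -> Optional[Command]:
    """Turn player input into a Command, or None if it is not understood."""
    words = text.lower().strip()
    # Try longer verb phrases first so that "talk to" is matched as a unit.
    for phrase in sorted(VERB_WORDS, key=len, reverse=True):
        if words.startswith(phrase + " "):
            noun = words[len(phrase):].strip()
            if noun in OBJECT_WORDS:
                return Command(VERB_WORDS[phrase], (OBJECT_WORDS[noun],))
            return None
    return None
```

Under these assumptions, parse("kill Fred") yields the command Attack(Fred), and unrecognized input yields None.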
When processing user input, the system first determines whether the resulting command is possible. For example, the tennis ball might be nailed to the floor or Alice might not be in the room, making the input "take ball" or "talk to Alice" impossible to complete. Every valid command (and its component verb) is then translated to one or more events (each with its component action; see below) based on the current story context.
Actions represent story-level actions. Every defined action includes an import value which suggests how dramatically exciting it is. For instance, the MANIPULATE action (which represents such verbs as Take, Drop, and Push) has a very low import. On the other hand, the MURDER action (which might be produced by the verb Attack) has a very high import.
Actions are the primary components of events, which have a sentence-like data structure similar to commands. That is, an event may include the object that receives the effects of the action, as well as any second object necessary to the event. So an event is simply the occurrence of a particular action, including any object(s) directly affected by that action.
Every verb translates, by default, to an action. For example, Drop generally translates to MANIPULATE. The default translation may take the current world-state into account. So the command Kiss(Alice) should translate to the event ROMANCE(Alice) only if Alice already has a high affinity for the player. If Alice detests the player, this same command would be better translated to ASSAULT(Alice).
However, within the context of a particular scene, this default translation can be overridden. For instance, if Alice has been transformed into a frog, then Kiss(Alice) should instead become RESCUE(Alice).
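A minimal sketch of this translation step follows. The World class, the AFFINITY_THRESHOLD cutoff, and the frog_scene override function are all assumptions made for illustration; the source does not specify how affinity is measured or how a scene encodes its overrides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set

AFFINITY_THRESHOLD = 5  # assumed cutoff for "high affinity"; not from the source


@dataclass
class World:
    affinity: Dict[str, int] = field(default_factory=dict)  # character -> affinity for player
    frogs: Set[str] = field(default_factory=set)             # characters currently enchanted


Override = Callable[[str, str, World], Optional[str]]


def default_action(verb: str, target: str, world: World) -> str:
    """Default verb-to-action translation, consulting the world state."""
    if verb == "Drop":
        return "MANIPULATE"
    if verb == "Kiss":
        liked = world.affinity.get(target, 0) >= AFFINITY_THRESHOLD
        return "ROMANCE" if liked else "ASSAULT"
    raise ValueError(f"no default translation for verb {verb!r}")


def frog_scene(verb: str, target: str, world: World) -> Optional[str]:
    """Hypothetical scene context: kissing an enchanted character rescues them."""
    if verb == "Kiss" and target in world.frogs:
        return "RESCUE"
    return None


def translate(verb: str, target: str, world: World,
              override: Optional[Override] = None) -> str:
    """Use the scene's override if it applies, otherwise the default."""
    action = override(verb, target, world) if override else None
    if action is None:
        action = default_action(verb, target, world)
    return f"{action}({target})"
```

The override is consulted first and falls through to the default when it returns None, so a scene only has to state its exceptions.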
A command can also be translated into more than one event. In this case, the extra events form a tree of sub-events. So, if the player has already established a Girlfriend relationship with the character Betty, Betty might take offense at this kissing of Alice. This secondary effect would be appended as a sub-event, producing the following structure:
RESCUE(Alice)
|-- OFFEND(Betty)
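One way such an event tree might be represented is sketched below. The Event class and the render helper are illustrative assumptions rather than the system's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    action: str
    target: str
    sub_events: List["Event"] = field(default_factory=list)

    def label(self) -> str:
        return f"{self.action}({self.target})"


def render(event: Event, depth: int = 0) -> List[str]:
    """Return the event tree as indented text lines, one event per line."""
    prefix = "    " * (depth - 1) + "|-- " if depth else ""
    lines = [prefix + event.label()]
    for sub in event.sub_events:
        lines.extend(render(sub, depth + 1))
    return lines


rescue = Event("RESCUE", "Alice", [Event("OFFEND", "Betty")])
print("\n".join(render(rescue)))
```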
Actions also have effects associated with them. This can include world or character state changes. So RESCUE(Alice) will probably increase Alice's affinity for the player, and possibly increase the player's morality value. Actions also provide some textual narration or graphical presentation, so the player can be made aware of the action's effects.
After each event has occurred, it is appended to an event history list, which forms a transcript of story-level actions that have occurred so far.
Like verbs, actions are player-centric. That is, the subject of an event--the character that undertakes the action--is always the player's character. The actions of non-player characters are instead handled solely through scenes.
Scenes are pre-authored components that serve two purposes in Marlinspike. The first is to advance the story. Since characters are not autonomous and all story control is centralized in the Marlinspike drama manager, it is only through scenes that characters respond to player actions and the story is advanced by the introduction of new material. Secondly, as we've seen, scenes provide a story context that may override some of the default verb-to-action translations.
Every scene has a list of preconditions that determine whether it can currently be appended to the story-so-far. Preconditions might include the current story-world time, character locations, prop states, character attributes, or previous story actions.
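As a rough sketch, preconditions can be modeled as a list of predicates over the current story state, all of which must hold before the scene may play. The StoryState fields and the example scene are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StoryState:
    time: int = 0                                            # story-world time
    locations: Dict[str, str] = field(default_factory=dict)  # character -> location
    history: List[str] = field(default_factory=list)         # past story actions


Precondition = Callable[[StoryState], bool]


@dataclass
class Scene:
    name: str
    preconditions: List[Precondition] = field(default_factory=list)

    def is_playable(self, state: StoryState) -> bool:
        # A scene with no preconditions is always playable.
        return all(check(state) for check in self.preconditions)


def playable_scenes(scenes: List[Scene], state: StoryState) -> List[Scene]:
    return [scene for scene in scenes if scene.is_playable(state)]


# Hypothetical scene: Alice thanks the player once rescued and in the same room.
alice_thanks = Scene("alice_thanks", [
    lambda s: "RESCUE(Alice)" in s.history,
    lambda s: s.locations.get("Alice") == s.locations.get("player"),
])
```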
Scenes can be either durative or instant. Durative scenes introduce some event in the story and then remain active, preventing further scenes from occurring until their own ending conditions have been met. This is handy for introducing complex choices that have many possible player responses. For example, if a vampire offers the player immortal undeath, the scene may want to remain active so it can determine which actions correspond to the player accepting, fleeing, or fighting the vampire.
Instant scenes introduce some event and then end. This is best for providing character reactions to previous events or for story elements that don't need much control of the story context.
Instant scenes can still add to the story context by introducing a trigger. A trigger watches for the later occurrence of a certain verb and can then interrupt or append to its translation into an action. For instance, in an interactive Bluebeard fairy tale, an instant scene could have Bluebeard tell the player not to open a certain closet. The scene then ends (giving other appropriate scenes a chance to play), but it would also add a single trigger to watch for the Open(closet) command. Should this command occur (even within another scene's context), the trigger can then append the DEFY(Bluebeard) sub-event to the resulting default MANIPULATE(closet) event.
Alternately, a trigger can be set for an action instead of a verb. So, in a previous example, the player's jealous friend Betty took offense at the Kiss(Alice) command, even though Alice was an icky frog that needed disenchanting at the time. If we wanted to make Betty more forgiving, we could set her trigger for the ROMANCE action instead of the Kiss verb.
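The difference between verb triggers and action triggers can be sketched as follows. The Trigger fields and the apply_triggers function are hypothetical, and only the appending behavior is modeled here, not interruption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trigger:
    kind: str       # "verb" or "action": which level the trigger watches
    name: str       # e.g. "Open" or "ROMANCE"
    target: str     # the object the watched verb or action is applied to
    sub_event: str  # sub-event appended when the trigger fires


def apply_triggers(verb: str, action: str, target: str,
                   triggers: List[Trigger]) -> List[str]:
    """Return the main event followed by any sub-events appended by triggers."""
    events = [f"{action}({target})"]
    for trigger in triggers:
        watched = verb if trigger.kind == "verb" else action
        if watched == trigger.name and target == trigger.target:
            events.append(trigger.sub_event)
    return events


bluebeard = Trigger("verb", "Open", "closet", "DEFY(Bluebeard)")
jealous_betty = Trigger("verb", "Kiss", "Alice", "OFFEND(Betty)")
forgiving_betty = Trigger("action", "ROMANCE", "Alice", "OFFEND(Betty)")
```

With the verb trigger, Betty is offended by any kiss; with the action trigger, she is offended only when the kiss has been translated to ROMANCE.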
Aside from being durative or instant, scenes also belong to one of three functions. Beginning scenes have no preconditions and are selected by the drama manager to start a new story. The bulk of scenes are middle scenes, which serve to advance the story in some way. Ending scenes have preconditions, but provide a conclusion to the current story.
Scenes are represented by a SCENE action, and so every scene becomes an event (with no subject or direct object) that can be appended to the event history. If a scene is durative, its record may include, as sub-events, the player actions that occurred within its context.
A scene or action can also cast characters into certain roles specific to the particular story. For instance, a scene might establish that a princess has been imprisoned by an evil wizard. If the player then rescues the princess, the RESCUE(princess) action may cast the princess into a Friend role and the evil wizard as an Enemy.
In many ways, roles simply serve as a short-hand notation for all the possible actions or scenes that could produce a certain relationship or state. For instance, there might be a number of ways to make a Friend in the game--only one of which is RESCUE-ing them. But once the player has a Friend (by some means), that relationship can serve as a precondition for a number of later scenes that require a Friend.
Additionally, roles serve to separate relationships from character attributes. For example, the player might be greatly esteemed by a number of characters in the story world; that is, they all have high affinity for the player. Yet a Friend role can only be cast through a significant story event. Therefore, if a later scene has misfortune befall a Friend character, it should be more significant and relevant to the story than if that same misfortune simply befell one of the player's many fans.
Like scene triggers, roles can also interrupt verb-to-action translation, as we've seen with Betty's jealous Girlfriend role in previous examples.
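A minimal sketch of role casting is given below. It assumes, as a simplification not stated in the source, that each role is held by at most one character at a time; RoleTable and its methods are hypothetical names. Recording the casting event alongside the character keeps the link back to the story event that produced the relationship.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Casting:
    character: str
    cast_by: str  # the event that cast this character into the role


class RoleTable:
    def __init__(self) -> None:
        self._castings: Dict[str, Casting] = {}

    def cast(self, role: str, character: str, cast_by: str) -> None:
        self._castings[role] = Casting(character, cast_by)

    def holder(self, role: str) -> Optional[str]:
        casting = self._castings.get(role)
        return casting.character if casting else None

    def origin(self, role: str) -> Optional[str]:
        casting = self._castings.get(role)
        return casting.cast_by if casting else None


def requires_friend(roles: RoleTable) -> bool:
    """Example precondition: the player must already have a Friend."""
    return roles.holder("Friend") is not None


roles = RoleTable()
roles.cast("Friend", "princess", cast_by="RESCUE(princess)")
roles.cast("Enemy", "wizard", cast_by="RESCUE(princess)")
```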
Events--arising from both user-entered commands and system-selected scenes--can be connected in terms of necessity. That is, if one event must precede another, we can say that the first is necessary for the second.
In terms of the Marlinspike system, these connections of necessity can be equated with preconditions, scene context, and the casting of roles. So, if a scene lists a certain action in its preconditions, then that action is necessary for the scene. Similarly, if a verb is translated into an action based on the currently-active durative scene or based on a trigger from a previous instant scene, then that scene's context is necessary for the action. Finally, if an action or scene changes, ends, or otherwise requires a role, then that role (and whichever scene or action originally cast that role) is necessary for the current action. For instance, if a scene requires that the player already have a Friend, the earlier action that resulted in making that Friend is necessary for the scene to occur.
These three simple rules allow us to plot threads of necessity through the events of the story. So, to return to the Bluebeard example, a scene may establish that the player is not to open a certain closet and then set an appropriate trigger to enforce this context. When that trigger produces the DEFY(Bluebeard) event, we can say that the earlier scene was necessary for the DEFY-ing. In turn, the DEFY(Bluebeard) action can now serve as a precondition for an enraged Bluebeard scene.
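The bookkeeping behind such threads can be sketched as an event history that also records, for each event, which earlier events were necessary for it. The History class is an illustrative assumption; for brevity it identifies events by unique label strings, which a real system would likely replace with event objects.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class History:
    def __init__(self) -> None:
        self.events: List[str] = []                          # in story order
        self.needs: Dict[str, Set[str]] = defaultdict(set)   # event -> necessary events

    def add(self, event: str, necessary: Iterable[str] = ()) -> None:
        self.events.append(event)
        self.needs[event].update(necessary)

    def thread_to(self, event: str) -> List[str]:
        """Return, in story order, every event that `event` depends on
        directly or indirectly, ending with `event` itself."""
        seen: Set[str] = set()
        stack = [event]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.needs[current])
        return [e for e in self.events if e in seen]


history = History()
history.add("SCENE(warning)")
history.add("MANIPULATE(ball)")  # unrelated low-import action
history.add("DEFY(Bluebeard)", necessary=["SCENE(warning)"])
history.add("SCENE(enraged)", necessary=["DEFY(Bluebeard)"])
print(history.thread_to("SCENE(enraged)"))
```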
Threads serve to incorporate events into a coherent storyline. As mentioned, a storyline starts with a beginning scene. This thread can be continued by some action building on a role or context established by this scene. However, the player might also start a new thread by performing some unrelated action of high import.
The drama manager works to extend an existing thread with the next selected scene. If the last event of more than one thread can serve as preconditions for the next scene, then the drama manager can effectively splice two (or more) threads together into one thread. Contrarily, if the system cannot currently extend any existing thread, it may be forced to fork an earlier thread event, thereby creating a new thread which it will hopefully be able to splice back in later. This thread-extending behavior is the source of the Marlinspike name, which refers to the splicing and fancy rope work of marlinspike seamanship.
In the example to the right, SCENE6 is still pending. If its thread (Thread 1) is not extended, then the story is poorly-formed, as it contains unnecessary events. However, this evaluation is tempered by the rule that not every action or scene needs to be spliced into a thread, but only those of high import.
Marlinspike is based on the neo-Aristotelean poetics discussed previously, which specifies that an interactive drama can be defined in terms of multiple levels. Its highest level--Action--represents the significant events of the story. These events materially rely upon the underlying level of the story world--comprised of Characters and Setting--while at the same time also formally specifying that story world's nature.
As a character in the story world, the player interacts through verbs and commands. The alternative--letting players specify actions and events directly--would make the user more of a director than an actor in the drama. It is hoped that using verbs will increase the player's sense of playing a character within the story. It also allows multiple possible ways to achieve the same story-level action. For example, the player might RESCUE(princess) by breaking down her door, picking the lock, or magically teleporting into her cell.
Marlinspike scenes are partly inspired by the encounters of tabletop role-playing games (Dungeons & Dragons 2003). An encounter is an event structured in if/then terms--if the players do this, then this will happen. Sometimes this can be as simple as "if the players enter this room, this monster will attack them." Just as all encounters are linked together to form an adventure, Marlinspike's scenes are strung together as threads.
These threads, when taken together, meet our poetics definition of a story being unified and complete, such that no part can be removed without leaving the whole "disjointed and disturbed" (Aristotle 1961). Each thread is a series of events unified by necessity. Yet, making allowances for the fact that some events may be more important or key to a story, we have required that only actions of high import have to be incorporated into a thread. Marlinspike stories are complete when they start with a beginning scene, followed by a number of middle scenes, and then an ending scene.
The way Marlinspike selects current scenes that build on past events is also inspired by Keith Johnstone's (1979) work in improvisational theatre. Specifically, he claims that if one focuses on the structure produced by reincorporating previous events, the content of stories tends to take care of itself.
The design of Marlinspike also owes much to certain existing interactive drama systems.
Marlinspike's actions and events are much like Chris Crawford's verbs and events as used in his Erasmatron system (Crawford 2004). However, unlike Crawford, we distinguish between world-level and story-level events, whereas Crawford concerns himself only with the story-level. (For instance, his verbs take only characters as subject or direct object.) Additionally, any character can be the subject of Crawford's verbs, while all Marlinspike actions are player-centric. Finally, Crawford's system is largely decentralized, with the rules of the system (specifically, inclination formulae) determining how characters respond to the player's actions. As such, his system has no significant model of the plot beyond an event history and certain global variables.
The process of stringing pre-authored scenes together at run-time based on scene preconditions has also been used, among others, by Grasbon and Braun in GEIST (Grasbon & Braun 2001; Spierling et al. 2002) and by Mateas and Stern in Facade (Mateas 2002). Neither approach includes user actions as atoms in the story model, however. Instead, user actions within a scene (or within a beat, as Mateas and Stern call their story atoms) simply determine the outcome of that scene.
Marlinspike uses a much more flexible story model than either Grasbon and Braun's Propp-based morphological approach or Mateas and Stern's tension-based story arc model. While this will hopefully mean a system more adaptive to user action, it also means that Marlinspike dramas may not be as well-structured in terms of rising and falling tension.
Fairclough's (2005) OPIATE system includes a similar dynamic casting of characters into roles based on affinity for the player's character. However, OPIATE used case-based reasoning to build complete storylines at once. If the player failed three times to cooperate with a storyline's required choices, that storyline would be put on hold and a new one generated. Instead, Marlinspike strives to incorporate any important user actions into a single (albeit multi-threaded) story-line generated piece-by-piece.
Building on my own previous work with a scene-based approach (Tomaszewski & Binsted 2007), the Marlinspike design includes both scenes and user-entered actions as equivalent story atoms. Additionally, in the process of reincorporating past actions when selecting the next scene, it will be possible to reveal to the player exactly how their actions are impacting the story. For example, characters may refer to the previous event that resulted in their assumption of a role that is now relevant to the current scene. This is intended to offer the player a greater sense of story-level agency than existing systems.
Another step towards greater user agency is that Marlinspike always gives the player a chance to act before starting the next scene. This is meant to avoid long strings of non-interactive instant scenes, a problem that plagued our last design (Tomaszewski & Binsted 2007).
Finally, it is hoped that Marlinspike dramas will be faster to author than current scene-based approaches. While still dependent on a large number of pre-authored scenes, with Marlinspike not all significant action must be within the bounds of a scene. This can potentially limit the number of scenes required for a successful story. Additionally, a scene needs only specify its exceptions to the underlying default verb-to-action translation rules, rather than directly providing the regular means of user interaction.
Marlinspike's design strives to better merge the strong story control of scene-based approaches with the high user agency of Crawford's verb-based approach. In this system, the player interacts as a character at the story-world level of verbs. The translation from these verbs to story-level actions can then be affected or overridden by the story context established by earlier story events. Marlinspike responds to user actions by selecting the next pre-authored scene that will further the story. The primary feature of Marlinspike is that, beyond just finding the next scene it can play, it strives to play scenes that actually make earlier user actions integral and necessary to the resulting plot. Thus, the system offers the user a wide range of possible actions at any point in the story, but also works to then incorporate those actions into its own model of the story.
Marlinspike has not yet been implemented, but a prototype is currently under development. While the Marlinspike architecture would also work in a mouse-based, graphical environment (though the range of verbs may be smaller), the first prototype game will be text-based. Specifically, it will use the interactive fiction system, Inform, to define the story world objects and to process user input. Therefore, the playing experience will be very much like that of interactive fiction, where the user types in commands in natural language and the system responds with text describing the results of their actions. The exception to this will be conversations with characters, which will be menu-based. This is meant to improve system affordances, since no natural language processing will be implemented beyond Inform's existing simple parsing capabilities.
The prototype game will be set in an alternate 1921 aboard the zeppelin airship Demeter. When seven wealthy passengers--one of whom is the player's character--awaken in their luxurious passenger gondola on the first morning of their trans-Atlantic flight, they find that the zeppelin crew has been slain. Interpersonal tensions mount as it seems something dark and thirsty still lurks in the shadowy corridors of the airship above.
The number of verbs required will be comparable to interactive fiction games--about fifty to sixty. These will translate to approximately thirty different actions. Scenes will vary in size, but most will result in about a half-screen of text. It is expected that about sixty scenes will be required to produce an average story of twenty scenes in length; such a story should take about thirty minutes to play. Stories will end as soon as the drama manager can meet all the preconditions of one of about seven ending scenes.
The key feature of Marlinspike is its threading behavior. Because it reincorporates important user actions into a single unfolding story, Marlinspike's threading should lead to a better interactive narrative experience. In evaluating this claim, we need to look at three variables: narrative, interactivity, and user experience.
In an interactive drama, we are most concerned with the action level of narrative--particularly the story's event structure. We have already established our working definition of narrative structure: a unified, complete series of events with a beginning, middle and end, in which all events are connected by necessity such that none can be removed without leaving the whole disjointed or disturbed.
Admittedly, Marlinspike applies a slightly relaxed version of this definition in that only user actions of high import need to be made necessary to the unfolding story. Still, because the system tracks all the events--whether scenes or actions--and the threads of necessity between them, it can provide a number of measures of how well-formed the resulting story structure is.
The measures of well-formed story structure include:
We care about more here than just the extent or frequency to which the player can make inputs to the system. Rather, we are interested in the degree to which the player can exert agency by bringing about significant and meaningful changes in the system. The player's world-level agency is important, as this represents the degree to which the player can successfully move her character through the story-world, manipulate objects, or affect other characters. However, in determining the success of threading, we are much more interested in the player's story-level agency--the degree to which her actions significantly impact the later events of the story.
The measures of story-level agency include:
The purpose of an interactive drama is to provide a novel and meaningful experience to the user. Indeed, research has shown that interacting participants in a live-action interactive drama reported the plot to be an intense and significant experience, even while a passive audience found the same drama slow and poorly-formed (Kelso, Weyhrauch, & Bates 1992). Another significant aspect of an interactive drama is its replay value--that the player can experience the same story-world again but experience the consequences of making different choices.
While it is hoped that Marlinspike will provide a satisfying experience, there are a great many factors involved, such as the story's subject matter, the characters, the story's genre, the dialog, the presentation style, the text-based medium, the interface mechanisms, the player's surroundings, etc. Yet, with regard to threading, we care only whether the user feels that the story is better-formed and that they experience greater story-level agency when threading is used.
The measures for the player's experience of story structure and story-level agency will be the player's responses to a questionnaire administered after playing a Marlinspike session.
Therefore, our hypothesis is that Marlinspike's threading feature will produce narratives with a better-formed story structure, provide greater user agency at the story-level, and that users will be able to detect this difference and report a corresponding difference in experience.
The Marlinspike prototype game will be developed so that threading can be turned on or off. Normally when Marlinspike wishes to play a scene, it first establishes the set of scenes that have their preconditions met. From this set of possible scenes, it then selects the one that will splice or extend the ends of the most threads.
With threading turned off, Marlinspike would simply select a scene randomly from those that it can play at the current story point. Even when not using threads as a selection criterion, Marlinspike still tracks the state of threads, and so it can still report the same measures of story structure and agency as when threading is turned on.
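The two selection modes can be sketched as a single function with a threading switch. This is a simplified illustration, not the prototype's algorithm: it assumes each playable scene is described by the set of earlier events it would build on, scores a scene by how many current thread ends that set contains, and breaks ties randomly (the tie-breaking rule is an assumption).

```python
import random
from typing import Dict, Optional, Set


def select_scene(playable: Dict[str, Set[str]], thread_ends: Set[str],
                 threading: bool,
                 rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick the next scene from those whose preconditions are met.

    `playable` maps each scene name to the events it would build on;
    `thread_ends` holds the last event of every open thread.
    """
    if not playable:
        return None
    rng = rng or random.Random()
    names = sorted(playable)
    if not threading:
        # Threading off: any playable scene is as good as another.
        return rng.choice(names)
    # Threading on: prefer scenes that extend or splice the most threads.
    best = max(len(playable[name] & thread_ends) for name in names)
    candidates = [n for n in names if len(playable[n] & thread_ends) == best]
    return rng.choice(candidates)


playable = {
    "reunion": {"RESCUE(princess)", "DEFY(Bluebeard)"},  # would splice two threads
    "rage": {"DEFY(Bluebeard)"},                         # would extend one thread
    "storm": set(),                                      # builds on nothing
}
ends = {"RESCUE(princess)", "DEFY(Bluebeard)"}
```

Because the thread state is passed in rather than consulted only when threading is on, the same structure-related measures could still be computed in either mode.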
Threading is significant as most existing scene-based interactive drama architectures do not use it. Instead, they simply determine which scenes can be played, and then select one at random--as Marlinspike does with threading turned off. For instance, our previous system design, Eudaemon, worked precisely this way (Tomaszewski & Binsted 2007). In particular, scene pre-conditions included the player's current location, recent actions, and whether the scene filled the next Propp function in the story. The perceived limitations of the Eudaemon system were due to adhering to Propp's requirements, not to the random selection of next scenes.
GEIST, which shares much in common with Eudaemon, also works very similarly to Marlinspike with threading turned off (Grasbon & Braun 2001).
Before involving real players, it is possible to test the bounds of the Marlinspike system with experimenter-generated input. That is, we can generate data sets that represent the extremes of user agreeableness, where agreeableness is the degree to which the player conforms to the author's conception of the ideal player.
To generate a hypothetical agreeable player data set, we can have the author of the game play through, making choices that would lead to the best-formed story.
The hypothetical disagreeable player comes in three flavors. The first is the inactive player, which is one who chooses to Wait every turn, or otherwise fails to perform any actions of high import. The second is the random player, who can be simulated by generating random commands each turn until one is accepted as possible. Finally, there is the obtuse player, who completely ignores the story and strives to follow his own extreme goals. The obtuse player is usually notably violent, randy, or suicidal.
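The inactive and random players lend themselves to simple simulation, as in the sketch below. The command representation, the is_possible callback, and the max_tries cap with its fallback to Wait are assumptions added for illustration. The obtuse player is omitted, since it depends on goals specific to a given story world.

```python
import random
from typing import Callable, Sequence, Tuple

Command = Tuple[str, str]  # (verb, object); an empty object means none


def inactive_player() -> Command:
    """The inactive player simply waits every turn."""
    return ("Wait", "")


def random_player(verbs: Sequence[str], objects: Sequence[str],
                  is_possible: Callable[[Command], bool],
                  rng: random.Random, max_tries: int = 1000) -> Command:
    """Generate random commands until the system accepts one as possible.

    Falls back to waiting if nothing is accepted within max_tries
    (an assumed safeguard against an endless loop).
    """
    for _ in range(max_tries):
        command = (rng.choice(verbs), rng.choice(objects))
        if is_possible(command):
            return command
    return inactive_player()
```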
These generated data sets can test the ranges of Marlinspike's performance, with and without threading. However, these are extreme cases, which makes it important to consider real player data--including their reported experience.
The experiment to test the effect of Marlinspike's threading will proceed as follows. First, the generated baseline cases described above will be used to test the capabilities of the system. Then, the questionnaires to be used will be pilot-tested with 5 to 10 participants.
For the main study, 20 to 30 participants will be recruited from the University of Hawaii community, including both students and faculty. Each participant will be requested to fill out an informed consent form (see Appendix 1) and an initial questionnaire (see Appendix 2). This initial questionnaire will record the participant's gender, age group, education level, previous computer experience, computer game experience, and roleplaying game experience. This should let us control for these possibly confounding factors.
Each participant will then be asked to play through the prototype Marlinspike game twice. One session will have threading turned on; the other session will have threading turned off. To control for the effects of initial exposure to the system, half of the participants will have threading on for the first session; the other half will first play with threading off.
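This counterbalanced assignment could be generated along the following lines. The random shuffling and the handling of an odd number of participants (the extra participant starts with threading off) are assumptions, as the source states only that the two orders are split evenly.

```python
import random
from typing import Dict, Sequence, Tuple


def assign_orders(participants: Sequence[str],
                  rng: random.Random) -> Dict[str, Tuple[str, str]]:
    """Randomly give half the participants threading on first, half off first."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for index, participant in enumerate(shuffled):
        orders[participant] = ("on", "off") if index < half else ("off", "on")
    return orders


orders = assign_orders([f"P{n:02d}" for n in range(1, 21)], random.Random(2007))
```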
After each of the two play sessions, the participants will be asked to complete a questionnaire (see Appendix 3) rating their experience in terms of story structure and story-level agency.
The stories generated by the participants' play sessions will then be examined by the measures described previously to determine the effect of threading on an interactive narrative experience.
Although examples of interactive narrative exist in such forms as roleplaying games and improv theater, the development of robust computer-based interactive narratives is a relatively new endeavor. This development holds great potential, both in producing a new form of creative human expression and as a challenging exploration of artificial narrative intelligence.
To guide this development, we must first have a theory--or poetics--of what comprises an interactive narrative. Although the foundations of such a poetics have already been laid, it needs to be closely examined and then reformulated to correct the tensions introduced during its evolution. The overhauled poetics model proposed here closely mirrors traditional narratology, as well as better informing system designs.
In particular, we are interested in a subset of interactive narrative: the interactive drama, in which the user assumes the role of a character within the story and, through their actions in the story-world, is capable of affecting the outcome of the generated tale. The Marlinspike architecture has been designed to support just such an experience. Its drama manager component works to incorporate user actions into a number of story threads, thereby producing a story that is both unified and complete.
Through the evaluation proposed here, we shall determine whether this threading behavior offers a significant improvement over existing scene-based interactive drama systems. Specifically, threading should produce a more well-formed story structure, greater story-level user agency, and an improved user experience.
Argax Project: Dissertation: A Rough Draft Node
http://www2.hawaii.edu/~ztomasze/argax
Last Edited: 28 Dec 2007. ©2007 by Z. Tomaszewski.