Argax Project

Skald

Skald was designed in 2012 following the interaction design process described by Cooper, Reimann, and Cronin (Cooper 2007; Tomaszewski 2013). My goal was to design a user interface (UI) for interactive fiction that:

A web-based or mobile-friendly interface was also an important, although secondary, design goal.

Considered Designs

I considered a number of alternative designs.

Action and Object Lists

Similar to Legend Entertainment's UI, this design would present two always-open menus: one of all the game-supported verbs and another of all the currently available objects. Clicking the words would add them to the input field.

Two menus, text description of location, and entry field.
Prototype of an always-open menu UI

This design clearly reveals all interactive objects and all interactions. However, because not every verb-object combination is logically possible, it does not clearly afford only the currently-possible interactions. In addition, based on my own experience with Legend's interface, such menus or lists are tedious to use to construct commands due to their length and the scrolling required in larger games. They do still provide useful training wheels to reveal available options. When playing Legend games, I would refer to the menus for hints when stuck, but I still typed at the command prompt to input most commands.

Real-Time Parsing

It is possible to keep the command line interface yet still prevent syntactic and even illogical input errors. In its simplest form, this design would parse the user's input while they type. While the command is unparsable or represents an action that is not currently possible, the input field could be highlighted red and the input prevented. Once the command is valid, the input field could switch to green and allow the user to submit the command.

Text description of location, and entry field repeated in different colors with different amount of text in each.
Prototype of a real-time parsing UI, showing different states of the input field over time

Although this design prevents invalid input, it does not afford possible commands. The user may still only discover the edges of the game through trial and error.
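The validation step in this design can be sketched as follows. This is only a minimal illustration, not Skald code: the `RealTimeValidator` class, the `InputState` enum, and the idea of holding the currently possible commands as a plain set of strings are all assumptions made for the example. A real implementation would ask the game's parser rather than match strings.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical visual states of the input field: INVALID = red, VALID = green. */
enum InputState { INVALID, VALID }

/** Hypothetical validator that is re-run on every keystroke. */
final class RealTimeValidator {
    // Assumption: the commands possible this turn are known as plain strings.
    private final Set<String> possibleCommands;

    RealTimeValidator(Set<String> possibleCommands) {
        this.possibleCommands = possibleCommands;
    }

    /** Returns VALID only when the text typed so far is a currently possible command. */
    InputState stateFor(String typedSoFar) {
        // Normalize case and whitespace so "Open  door" matches "open door".
        String normalized = typedSoFar.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("\\s+", " ");
        return possibleCommands.contains(normalized)
                ? InputState.VALID
                : InputState.INVALID;
    }
}
```

The UI would enable the submit action only while `stateFor` returns `VALID`. Note that the sketch shares the weakness described above: it can reject input, but it says nothing about which commands would be accepted.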

Inverse Parser

As an extension of real-time parsing, it would be possible to recommend a list of permissible tokens at each point of command construction. For example, the input field could start with a drop-down menu of all currently-possible verbs in the game. Then, based on the initial verb chosen, another menu could appear that contains only valid direct objects for that verb.

Text description of location, and entry field repeated showing different menu states of command construction.
Prototype of an inverse parser UI, showing different states of the input field over time

This technique prevents invalid input and affords only valid commands. However, it seems that it would not be any faster than typing. Rather than a series of separate menus, it seems that it would be faster and smoother to provide a single verb menu with sub-menus for any direct objects relevant to that verb. A single final click would then produce a valid command.
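The token-by-token recommendation can be sketched as a lookup from each verb to the direct objects that are valid for it. The `InverseParser` class and its method names are hypothetical, and the sketch assumes commands of at most one verb and one direct object.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical inverse parser: offers only the tokens that can legally come next. */
final class InverseParser {
    // Maps each currently possible verb to its valid direct objects.
    // An empty list means the verb takes no direct object.
    private final Map<String, List<String>> affordances = new LinkedHashMap<>();

    void afford(String verb, List<String> directObjects) {
        affordances.put(verb, List.copyOf(directObjects));
    }

    /** Returns the tokens to offer next, given the tokens already chosen. */
    List<String> nextTokens(List<String> chosen) {
        if (chosen.isEmpty()) {
            // Start of command: offer every currently possible verb.
            return List.copyOf(affordances.keySet());
        }
        if (chosen.size() == 1) {
            // Verb chosen: offer only the objects valid for that verb.
            return affordances.getOrDefault(chosen.get(0), List.of());
        }
        // Verb and object chosen: the command is complete.
        return List.of();
    }
}
```

The same map could drive the single-menu variant: each verb becomes a top-level menu entry and its object list becomes the sub-menu.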

Object List

Users may not always begin their thought process knowing what action they want to take. Instead, they may wonder what objects are currently around them, or they may be intrigued by a particular object and then wonder what they might possibly do with it. In such situations, it may be more natural to have a menu of currently available objects. The user can then select an interesting object and pull up a list of the actions possible on that particular object.

Text description of location, and objects menu, with fly-out menu with verbs related to that object.
Prototype of an object-menu-based UI, showing different states of the menu over time

However, this design alone does not easily support actions that take no direct object, such as Look, Sleep, or Wait.

In-text Pop-up Menus

All of the previous designs have provided a means of input that is divorced from the primary output of IF: the text descriptions of the world. To blend these two, I could highlight or underline the names of interactive objects within the text itself, very much like hypertext links in a webpage. The user could then click on objects in context and select actions only relevant to that object.

Text description of location, with inline links and resulting pop-up menu.
Prototype of an inline pop-up menu UI, showing different states of the menu over time

However, as with the Object menu design, this approach does not support verbs that take no direct object. These verbs could be afforded through a different means, such as a row of buttons along the bottom of the screen.

Text description of location, with inline links and resulting pop-up menu, plus row of three verb buttons along the bottom.
Revised prototype of an inline pop-up menu UI including extra verb buttons

Additionally, because the state of the game world potentially changes with each user action, it may be necessary to change previous object menus if an object changes or even disable the menu entirely if the object disappears. For example, suppose a scene describes a small cake. The user then clicks on the "small cake" and selects an Eat cake action. This produces a system reply that describes the outcome: "You eat the small cake." If the cake is now consumed, it should not be possible to click on the cake and examine it or try to eat it again. Indeed, any past description of the room that even mentions the cake is now only historical context. Similarly, if the player closes a door, the door's action menu needs to be updated to drop the Close door action and insert an Open door action. The simplest way to handle these various discrepancies between the state of the world as described in past turns and the current state of the world is to simply remove the text of past turns or else disable it in a way that indicates that it is no longer relevant.
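The "disable past turns" approach can be sketched as a transcript in which only the most recent turn keeps live links. The `Transcript` and `Turn` classes are hypothetical names used for illustration. How disabled text is actually rendered (greyed out, removed, and so on) is left to the UI.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical transcript in which only the latest turn has clickable object links. */
final class Transcript {
    /** One turn of game output. */
    static final class Turn {
        final String text;
        boolean linksEnabled = true;

        Turn(String text) {
            this.text = text;
        }
    }

    private final List<Turn> turns = new ArrayList<>();

    /** Adds the newest turn and marks every earlier turn as historical context. */
    void append(String text) {
        for (Turn earlier : turns) {
            earlier.linksEnabled = false;
        }
        turns.add(new Turn(text));
    }

    List<Turn> turns() {
        return turns;
    }
}
```

After the player eats the cake, the turn that first described it stays readable, but its links no longer open a menu.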

Paper Prototype

Of these various designs, the in-text pop-up menus seemed most natural and unobtrusive. I constructed a paper prototype that described a simple Sleeping Beauty-like scenario. The player finds himself, clothed in a torn cape and carrying a used sword, in a tower room with a sleeping princess and a songbird in a golden cage. The goal of this short scenario was to wake the princess... although it turns out that the solution is not the traditional kiss. I asked a handful of users to walk through this scenario, selecting underlined "links" on the page, and waiting for me to provide the resulting in-text pop-up menus as Post-It notes.

Photograph of paper prototype, including background sheet, various menus built from post-it notes, and resulting output on slips of paper.
Skald paper prototype

This prototype revealed two interesting consequences of the in-text pop-up menu design. The first revelation was that not all of the currently-accessible objects will always be mentioned in the most-recent turn's output. Even if the description of a new room exhaustively lists all of the interactive objects it contains, it would be unusual and tedious to also repeatedly mention the player and all of the items she might be carrying in her inventory.

Furthermore, even if all room descriptions were exhaustive, most turns produce descriptions of only specific objects. For example, suppose the player enters a tower room that contains a number of different objects, including a cage. If the player then selects to Examine the cage, this would disable the current room description and produce a response containing only a description of the cage and its contents, without mention of most of the other objects in the room. Similarly, other actions may not provide much description at all. For example, Close door often results in a message as simple as "The door is now closed."

System responses like these that contain no object links--or at least no objects that interest the player--result in "dead ends" of interaction. If previous turns' links have been disabled (for the game-state reasons described above), the user is left with no in-text means for further interaction. All is not lost: the user must simply Look in order to re-examine the current room and all of its described contents and exits.

There are work-arounds that would prevent this need for the user to break out of interactive dead-ends by pressing a Look button. For example, it would be possible to add two additional panes to the user interface: one Location pane that lists the complete description of the current room and its objects and another Inventory pane that lists the player and all of the items she is currently carrying. These additional panes would need to be automatically refreshed after every turn. The drawback here is that this design change would complicate both the user interface--which now has three separate panes of text to show the current location description, an inventory list, and the result of the most recent action--and the technical implementation, since each turn now requires three different responses from the system rather than only one.

While this redesign would ensure that the user always has an in-text object to interact with somewhere on the screen, it still does not provide for all possible interaction as in-text links. As mentioned previously, some game verbs such as Look, Wait or Sleep do not take direct objects. These actions must either be provided through a separate means--such as a row of buttons beneath the screen--or else be rather awkwardly associated with the player character object mentioned in the inventory list. This second approach would make these simple commands hard to find for new users.

Therefore, the prototype tests suggested that some supplemental means beyond only in-text pop-up menus would be needed to effectively reveal all of the currently possible actions and interactive objects at all times.

The second revelation provided by the prototype was that a new design may significantly change how users explore and play an IF game. In traditional IF, it is often necessary to closely read the text to determine what objects are present. In the prototype, object names were clearly underlined. In traditional IF, it is then necessary for the player to consider what the current puzzle or obstacle is and how she might overcome it. The player's conception of the current puzzle or goal may not be particularly clear; the player might simply explore different actions as they come to mind to see what is possible. In the prototype, these possible actions have already been delimited. Rather than visualizing and exploring the virtual world, prototype users appeared to be exploring the menus. They would click on an interesting object, scan the resulting menu, and then select the most interesting action without much thought given to their current goal in the game. At least one user was surprised to discover that the prototype game was over, as if she had simply stumbled upon the solution.

Many traditional IF games are puzzle-based. By analogy, those puzzles are often like riddles or short-answer questions on an exam: they pose a problem to the player and then provide an open space for the player to explore a possible solution. A user interface that affords all possible actions effectively turns those problems into multiple choice questions, with a corresponding drop in challenge level. While this may be an advantage for authors that would like to move toward less puzzle-based IF games, it may be a detriment to the traditional puzzle-based form.

Refined Design

My refined design retained in-text object links with associated pop-up action menus.

Skald screenshot.
Skald interface, showing the pop-up menu resulting from clicking on red door

In future, I would like to provide users the option to control the formatting of those links, or even turn them off completely if they find them distracting. Also, one of the suggestions taken from the paper prototype tests but not yet implemented is that, when the user hovers over an object link, all other links to that same object should simultaneously turn the same color. This clarifies that these different links in fact refer to the same single object in the game world.

In order to clearly afford all possible actions at any time and to overcome the "interaction dead-end" problem, I supplemented the in-text object links with two menus. The first menu lists all possible action verbs, with sub-menus for direct objects and possible indirect objects.

Skald screenshot.
Skald interface, showing a sample Actions pull-down menu

The second menu gives a list of all currently-available objects, and then the set of actions that is currently possible to perform on each.

Skald screenshot.
Skald interface, showing a sample Objects pull-down menu

To make these menus more obvious, it is also possible to view them as a left sidebar. This is the default start view in Skald. In this view, it is possible to switch between the Actions or Objects menus as tabs.

Skald screenshot.
Skald interface, showing a sample sidebar menu

Other pull-down menus include an option to see the Skald system error log or a short About Skald message.

Implementation

In order to run in a browser and thus be accessible from both mobile and desktop devices, Skald is written using Google Web Toolkit (GWT). GWT provides a number of cross-platform UI widgets and a compiler that converts Java to cross-browser-compatible JavaScript. I chose GWT because I am comfortable coding in Java, but I do not know much JavaScript and have done little dynamic webpage development. GWT's compiler allowed me to write the Java code I'm comfortable with while shielding me from most of the technical issues of dealing with different JavaScript incompatibilities across different browsers.

However, Skald only provides a user interface front-end. It does not model the game world. For that, I wanted to use one of the two most popular IF game engines: Inform or TADS. Inform is the more popular, and there are a number of projects that allow the entire Z-Machine or Glulx virtual machine (VM) to run in the browser itself. However, the user interface and backend are quite closely linked in these virtual machines.

TADS, on the other hand, recently introduced a built-in web server to its VM. TADS then provides its own HTML and JavaScript-based user interface, but the server was designed such that these UI files could be swapped out with only a small amount of work. In addition, TADS has a more detailed system for parsing and handling user-entered actions than Inform. Each command must go through three phases: a verify phase to test how logical the command is, a check phase to see whether a logical command is in fact possible given the world state, and an action phase where the world state is actually affected. These separate phases made it easier to reuse the verify phase to generate the affordance lists--all of the logically afforded combinations of verb and objects--that Skald needs every turn to update its menus. The TADS programming language is also more robust and has a syntax similar to traditional programming languages. For the low-level work I needed to do for Skald, TADS seemed like a better starting point than Inform.
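To illustrate why the separation into phases is useful, here is a small sketch of the verify, check, and action phases for a single Open door command. It is written in Java purely for illustration and is not TADS code or the TADS API. The `Door` and `OpenDoorAction` classes, the rule that a locked door fails the check phase, and the reply messages are all assumptions.

```java
/** Hypothetical world object with just enough state for the example. */
final class Door {
    boolean open = false;
    boolean locked = false;
}

/** Hypothetical three-phase handler for the command "open door". */
final class OpenDoorAction {
    /** Verify phase: is opening this door a logical thing to attempt? No side effects. */
    static boolean verify(Door door) {
        return !door.open;
    }

    /** Check phase: is the logical command actually possible in the current world state? */
    static boolean check(Door door) {
        return !door.locked;
    }

    /** Action phase: the only phase that changes the world state. */
    static void action(Door door) {
        door.open = true;
    }

    /** Runs the three phases in order and returns the reply text. */
    static String execute(Door door) {
        if (!verify(door)) {
            return "The door is already open.";
        }
        if (!check(door)) {
            return "The door seems to be locked.";
        }
        action(door);
        return "You open the door.";
    }
}
```

Because `verify` has no side effects, a front-end can call it on its own to decide whether Open door belongs in a menu, without running the command. In this sketch a locked door would still be listed, since locking is only detected in the check phase.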

When a TADS game uses the Skald interface, it performs as follows. When the game starts up, it also starts the TADS internal web server. The server collects the initial game text, which consists of all the output before the first prompt for input.

Then, the server computes the current set of affordances. It does this by using an author-provided list of the verbs that might be used somewhere in this particular game. For each verb on the list, the server reuses the TADS parser's normal verify routines to poll each of the objects currently accessible to the player character to see if it presently supports that verb. For verbs that take two objects, it checks all combinations of currently-present objects for successful matches. The full list of all currently-afforded actions is then formatted into JavaScript Object Notation (JSON).
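The polling loop described above might look roughly like the following sketch. The real implementation runs inside the TADS VM, so this Java version only mirrors its logic. The `Verifier` interface stands in for the engine's verify routines, and the JSON field names (`verb`, `dobj`, `iobj`) are assumptions, not the format Skald actually uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical affordance builder: polls every verb against every accessible object. */
final class Affordances {
    /** Stand-in for the engine's verify routine; iobj is null for one-object verbs. */
    interface Verifier {
        boolean verify(String verb, String dobj, String iobj);
    }

    /** Returns all currently afforded actions as a JSON array string. */
    static String compute(List<String> oneObjectVerbs, List<String> twoObjectVerbs,
                          List<String> accessible, Verifier verifier) {
        List<String> entries = new ArrayList<>();
        // One-object verbs: poll each accessible object.
        for (String verb : oneObjectVerbs) {
            for (String dobj : accessible) {
                if (verifier.verify(verb, dobj, null)) {
                    entries.add("{\"verb\":" + quote(verb)
                            + ",\"dobj\":" + quote(dobj) + "}");
                }
            }
        }
        // Two-object verbs: poll every combination of accessible objects.
        for (String verb : twoObjectVerbs) {
            for (String dobj : accessible) {
                for (String iobj : accessible) {
                    if (verifier.verify(verb, dobj, iobj)) {
                        entries.add("{\"verb\":" + quote(verb)
                                + ",\"dobj\":" + quote(dobj)
                                + ",\"iobj\":" + quote(iobj) + "}");
                    }
                }
            }
        }
        return "[" + String.join(",", entries) + "]";
    }

    /** Minimal JSON string quoting (backslash and double quote only). */
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

The two-object loop is quadratic in the number of accessible objects, which is one reason to restrict polling to an author-provided verb list rather than every verb the engine knows.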

The server can then send to the player's browser the initial HTML and JavaScript code needed to display the Skald UI, followed by the initial text of the game and the first turn's JSON affordances data that Skald uses to build its menus. The links for each object in the game text are provided by the game author, who provides a unique name that is used by the system as an identifier to match the corresponding object in the JSON affordance data.

On each subsequent turn, the user's selected command is sent back to the server as an HTTP POST request. The server then replies with the resulting game text and updated affordance JSON data.
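The per-turn request can be sketched with the standard `java.net.http` API. Skald itself runs as GWT-compiled JavaScript in the browser, so it would not use this API. The sketch only shows the shape of the exchange. The endpoint URL, the plain-text content type, and the `CommandRequests` class are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the HTTP POST request for one player command. */
final class CommandRequests {
    /** Wraps the selected command text as the body of a POST to the game server. */
    static HttpRequest forCommand(URI server, String command) {
        return HttpRequest.newBuilder(server)
                // Assumption: the command is sent as plain UTF-8 text.
                .header("Content-Type", "text/plain; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(command, StandardCharsets.UTF_8))
                .build();
    }
}
```

A caller could send the request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and then split the reply into game text and affordance JSON, in whatever format the server uses.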

This client-server architecture means that Skald could be used as a front-end to other IF systems. All that is required is an HTTP server that can send game text and JSON affordance data in the format required by Skald and reply to HTTP commands as sent by Skald. Generating the necessary affordance data will generally require an intimate view of the underlying world model, however. This will generally require a custom plugin for each backing IF system, although this plugin could then be reused for any game built using that same IF system.
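The contract that such a plugin would need to satisfy can be summarized as a small interface. The `IfBackend` and `TurnResult` names are hypothetical. The actual exchange happens over HTTP, so this is only a sketch of the two operations a backing IF system must support.

```java
/** Hypothetical result of one turn: the game text plus the affordance data for the next turn. */
final class TurnResult {
    final String gameText;
    final String affordancesJson;

    TurnResult(String gameText, String affordancesJson) {
        this.gameText = gameText;
        this.affordancesJson = affordancesJson;
    }
}

/** Hypothetical contract for any IF system acting as a Skald back-end. */
interface IfBackend {
    /** Returns the initial game text and the first turn's affordances. */
    TurnResult start();

    /** Executes one selected command and returns the reply and updated affordances. */
    TurnResult submit(String command);
}
```

Implementing `submit` is the easy part for most engines. Producing `affordancesJson` is what requires access to the world model.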

The Skald code and TADS plugin are available online (Tomaszewski 2013).

Potential Consequences

By doing away with IF's natural language parser, Skald has modified one of the key features of the medium. While it is hoped that this change will eliminate most user error and improve the clarity of what is currently possible in the game, this is not without side-effects. Without the command line prompt, there is no longer an illusion that any input might be supported. The boundaries of the game thus become clear, potentially reducing the challenge and eliminating the open exploration that provides some of the joy of the IF form.

An evaluation with human users was necessary to determine the significance of these changes.

Works Cited