Argax Project

Node Status: COMPLETE

Participant Responses

The following is a summary of participant responses.

I looked primarily at the difference between responses given after playing Demeter with reincorporation on and those given after playing with reincorporation off. I also tested for order effects--whether responses differed after the first game session versus after the second--and for group effects--whether one group answered differently than the other. Finally, I looked for interaction effects. For example, did reincorporation produce a significant effect only when it occurred on the first or on the second play? [1]

With one exception, none of these differences were statistically significant at the p < 0.05 level. [2] In particular, the fact that the two groups differed significantly in their previous game experience did not produce any significant differences in their responses here. The exception was a difference in the frequency of input errors reported after the first and second game sessions. This finding is examined more closely below.

Results are presented here for each survey question. These results include a mean response score, a distribution of response scores, and total number of responses (n) for each question. (In the original survey, it was possible to answer N/A for some questions. Any N/A responses are not included in the count of total responses or in the means for that question.) The results also include the separate means for those responses given after playing with the reincorporation feature either on or off, although these two means are never significantly different.

Story Structure

Participants rated the following questions as Strongly Agree (4), Agree (3), Neutral (2), Disagree (1), or Strongly Disagree (0).

Survey Question | Mean | SA | A | N | D | SD | n | Mean: Reinc ON | Mean: Reinc OFF
The events of the game had a story-like structure. | 3.27 | 21 | 21 | 4 | 2 | 0 | 48 | 3.25 | 3.29
The game session had a clear beginning, middle, and end. | 2.48 | 10 | 15 | 15 | 4 | 4 | 48 | 2.67 | 2.29
The events of the game were logically related to each other. | 2.94 | 15 | 21 | 7 | 4 | 1 | 48 | 3.04 | 2.83
Earlier events led to later events in a coherent and understandable way. | 2.73 | 12 | 18 | 12 | 5 | 1 | 48 | 2.83 | 2.63
The other characters' actions seemed to be consistent with their apparent goals and personalities. | 2.70 | 9 | 21 | 11 | 3 | 2 | 46 | 2.78 | 2.61
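
Each mean in the table is the count-weighted average of the scale values. The short Python sketch below recomputes two of them from the reported response counts; the function name `likert_mean` is an illustrative choice, not something from the original analysis.

```python
# Scale values in table order: Strongly Agree (4) ... Strongly Disagree (0).
SCALE = (4, 3, 2, 1, 0)

def likert_mean(counts, scale=SCALE):
    """Mean score from per-category response counts (N/A responses excluded)."""
    n = sum(counts)
    if n == 0:
        raise ValueError("no responses to average")
    return sum(value * count for value, count in zip(scale, counts)) / n

# Counts (SA, A, N, D, SD) as reported in the Story Structure table.
story_like = (21, 21, 4, 2, 0)       # n = 48
npc_consistency = (9, 21, 11, 3, 2)  # n = 46 (two N/A responses excluded)

print(round(likert_mean(story_like), 2))       # 3.27
print(round(likert_mean(npc_consistency), 2))  # 2.7
```

The same function applies to the frequency questions later in this section, since they also use a 0 to 4 scale.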

Overall, ratings were favorable. That the experience was story-like received the highest rating of any response in the survey.

The rating for the story's completeness--having a beginning, middle, and end--was fairly low with a relatively wide distribution of responses. This may have been due to the fact that not everyone finds the possible Demeter story endings satisfactory. If the PC does not die, then the Landfall scene intentionally ends the story without answering the question of whether the passengers will be able to safely get down from the Zeppelin.


Survey Question | Yes | No | Yes: Reinc ON | No: Reinc ON | Yes: Reinc OFF | No: Reinc OFF
Were there any important events that seemed irrelevant to the main storyline? | 4 | 44 | 2 | 22 | 2 | 22

In agreement with the solid rating given above for the story's logical structure, only 4 of 48 responses (about 8%) indicated that the story contained irrelevant events. All of the reports of irrelevant events came from Group 0, and two of the reports came from the same participant after both games. The following are the explanations given by the participants:

User Agency

World-level Agency

These questions aim to measure a positive sense of world-level agency. Participants rated the following questions as Strongly Agree (4), Agree (3), Neutral (2), Disagree (1), or Strongly Disagree (0).

Survey Question | Mean | SA | A | N | D | SD | n | Mean: Reinc ON | Mean: Reinc OFF
I knew what actions were possible to perform within the game. | 2.77 | 15 | 20 | 3 | 7 | 3 | 48 | 2.46 | 3.08
I was able to construct commands that the game understood. | 3.00 | 13 | 26 | 5 | 4 | 0 | 48 | 2.83 | 3.17
I was sufficiently able to direct my character's actions in the game world, such as move from place to place, manipulate objects, talk to other characters, etc. | 2.90 | 14 | 22 | 6 | 5 | 1 | 48 | 2.88 | 2.92

Participants were generally able to successfully control their characters in the world. However, given that the focus of an interactive drama is on story-level interactions, world-level input should fade into the background of the experience. These scores do not quite reflect such an ease of input. There was also a small secondary group that did not feel they knew what actions were even possible in the game. This is noticeable in the distribution of scores for the first question here.

The difference between reincorporation and non-reincorporation means for the first question in this section is comparatively large. However, this result--that turning reincorporation on makes it harder for players to know what verbs they can execute--makes very little sense. [3] Also, as mentioned, responses to this question are not normally distributed. A more robust test of statistical significance, the Mann-Whitney U test (U = 373, n1 = n2 = 24, p = 0.08), shows that this difference is not in fact significant.
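
As a rough check on the Mann-Whitney figures above, the Python sketch below computes a U statistic by pair counting and converts a given U to a two-sided p-value with the normal approximation. It assumes no tie or continuity correction, which may differ from the procedure actually used in the analysis, and the function names are illustrative only.

```python
import math

def mann_whitney_u(x, y):
    """U for sample x: number of pairs (xi, yj) with xi > yj; ties count 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )

def normal_approx_p(u, n1, n2):
    """Two-sided p-value for U via the normal approximation.

    Assumption: no tie correction and no continuity correction.
    """
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported in the text: U = 373 with 24 responses in each condition.
print(round(normal_approx_p(373, 24, 24), 2))  # 0.08
```

With n1 = n2 = 24, U can range from 0 to 576 and is expected to be 288 if the two conditions do not differ, so U = 373 sits a little under two standard deviations from that expectation.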

These measures all showed significant correlation with the system measure of world agency--the percent of commands that translated to valid deeds--that was described earlier. Knowing what actions were possible, r(44) = 0.45, p = .002, and being able to sufficiently direct one's character, r(44) = 0.46, p = .001, correlated more highly with this system measure than being able to construct valid commands, r(44) = 0.36, p = .01. Presumably, players were capable of constructing some valid commands while still producing many invalid ones.
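
The reported p-values can be checked with the standard conversion of a Pearson correlation to a t statistic. The notation below is conventional rather than taken from the original analysis. As a worked example, take the weakest of the three correlations, r = 0.36 with 44 degrees of freedom:

```latex
t = r\sqrt{\frac{df}{1 - r^{2}}}
  = 0.36\sqrt{\frac{44}{1 - 0.36^{2}}}
  \approx 0.36 \times 7.11
  \approx 2.56
```

A two-tailed t of about 2.56 at 44 degrees of freedom lies beyond the 0.05 critical value of roughly 2.02, which is consistent with the reported p = .01.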


The next set of questions explores how frequently participants encountered different kinds of world-level agency problems. Responses include Never (0), Rarely (1), Occasionally (2), Frequently (3), Most of the time (4).

Survey Question | Mean | N | R | O | F | M | n | Mean: Reinc ON | Mean: Reinc OFF
I entered a command that caused an error message or that the game obviously did not understand. | 1.71 | 6 | 15 | 15 | 11 | 1 | 48 | 1.88 | 1.54
I entered a command that the game seemed to understand but that did not have the effect I intended in the story world. | 1.32 | 14 | 10 | 17 | 6 | 0 | 47 | 1.29 | 1.35

It seems the majority of participants encountered some occasional trouble entering commands, but generally not excessively so.

There was a significant difference between the mean reported frequency of error messages depending on play order. The average response was 1.96 after the first game and 1.46 after the second game, t(23) = 2.63, p = 0.02. The meaning of this is fairly obvious: After learning what commands are possible during the first game, players are less likely to report entering invalid commands during the second game. This finding matches that found earlier in the game data by looking at the percentage of input commands that produced valid deeds.
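
The play-order comparison above is a paired test: each of the 24 participants contributes one rating after the first game and one after the second, hence the 23 degrees of freedom. A minimal Python sketch of that computation follows. The function name `paired_t` and the four-participant ratings are hypothetical, chosen only for illustration; they are not the study's data.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# HYPOTHETICAL error-frequency ratings (0 = Never ... 4 = Most of the time).
first_game = [2, 3, 1, 2]
second_game = [1, 2, 1, 1]

t, df = paired_t(first_game, second_game)
print(f"t({df}) = {t:.2f}")  # t(3) = 3.00
```

Because the test works on within-participant differences, stable individual tendencies to report many or few errors cancel out, which makes a half-point shift easier to detect than it would be in a between-groups comparison.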

As would be expected, these responses exhibited a significant negative correlation with the system measure of world agency, with reports of entering a command that caused an error, r(44) = -0.68, p < .001, correlating more strongly than reports of entering a command that did not have the intended effect, r(43) = -0.45, p = .002.

Story-level Agency

These questions examine the participants' sense of story-level agency. Participants rated the following questions as Strongly Agree (4), Agree (3), Neutral (2), Disagree (1), or Strongly Disagree (0).

Survey Question | Mean | SA | A | N | D | SD | n | Mean: Reinc ON | Mean: Reinc OFF
My actions seemed to have a significant impact on the course of the story. | 2.21 | 8 | 12 | 14 | 10 | 4 | 48 | 2.25 | 2.17
I believe the story would have been different had I performed different actions. | 2.83 | 14 | 19 | 8 | 7 | 0 | 48 | 2.96 | 2.71
I believe the story would have been better had I performed different actions. | 2.63 | 11 | 16 | 14 | 6 | 1 | 48 | 2.71 | 2.54

On average, participants seemed fairly neutral regarding their impact on the story. However, note the wide distribution of opinions in this area, with almost as many responses indicating no significant impact as those that did. There was a more positive consensus that player actions were influencing the course of the story in at least some way, however.

Feeling that one's actions had a significant impact on the story correlated fairly strongly with the world-level agency responses regarding knowing what actions were possible, r(46) = 0.47, p = .001, and being able to sufficiently direct one's character's actions in the world, r(46) = 0.58, p < .001. This suggests that world-level agency is indeed a prerequisite for story-level agency. At the very least, those who feel high agency at one level are more likely to feel high agency at the other. However, reports of being able to construct valid commands did not significantly correlate with having a significant impact on the story, which suggests that a sense of agency--especially at the story level--requires more than simply providing successful inputs to the system.


The next set of questions explores how frequently participants encountered story-level agency problems. Responses include Never (0), Rarely (1), Occasionally (2), Frequently (3), Most of the time (4).

Survey Question | Mean | N | R | O | F | M | n | Mean: Reinc ON | Mean: Reinc OFF
I entered a command that did something significant in the story world. | 2.04 | 4 | 12 | 15 | 10 | 6 | 47 | 2.13 | 1.96
I entered a command that did something significant in the story world, but this action then failed to influence the other characters or subsequent events to the degree that I think it should have. | 1.40 | 5 | 20 | 14 | 4 | 0 | 43 | 1.32 | 1.48

Participants felt they performed significant actions in the world slightly more often than they reported producing input errors, which is good. In the context of an interactive drama, not every action performed will be significant. So an average rating of only occasional significant actions is not too disappointing, although it could still be improved. Also, most significant world effects were perceived to successfully affect the story (though, in retrospect, this second question could have been worded a little more clearly).

None of the question responses for either world-level or story-level agency correlated significantly with the system measures of story agency. Responses indicating that one's actions seemed to have a significant impact on the course of the story did show the strongest correlation with both Mean Action Import, r(46) = -0.24, p = .1, and Significant Action Count, r(46) = 0.24, p = .1. But these are not significant. Furthermore, performing actions with a higher mean import overall actually has a negative correlation with a sense of agency. Unlike the system measure of world agency, the system measures of story agency do not match the experience of users. Simply performing high-import actions was insufficient to grant a sense of story agency.


The next few questions measure participants' perceptions of their own story-level agency in a specific context. Participants were asked to identify the single most memorable or notable event of the session they just played. (However, as shown in the following section, a handful of participants did not describe an event in answer to this question. For example, some described the mood of the game as a whole as the most notable "event".)

They were then asked the following questions with regard to that event. Possible responses include Definitely (3), Very likely (2), Likely (1), Neutral / I don't know (0), Unlikely (-1), Very unlikely (-2), and Definitely not (-3).

Survey Question | Mean | D | VL | L | N | U | VU | DN | n | Mean: Reinc ON | Mean: Reinc OFF
This event occurred as the direct result of an action I performed in the game. | 0.85 | 19 | 0 | 7 | 7 | 9 | 4 | 2 | 48 | 1.08 | 0.63
If I played the game again, I could cause this event to happen again. | 1.72 | 21 | 4 | 13 | 7 | 1 | 1 | 0 | 47 | 1.74 | 1.71
If I played the game again, I could avoid this event or prevent it from happening again. | 0.91 | 14 | 3 | 10 | 11 | 4 | 4 | 1 | 47 | 0.78 | 1.04

Looking at the distribution of responses, there seem to be two groups of participants here: those who felt a strong, clear ability to control the story and those who felt they had only a slight influence on the story. However, closer inspection showed that over 75% of those who answered Definitely to each of these three questions were referring to an action they performed. Other participants were instead answering these questions with regard to an event that happened to them. Thus, these questions were not as useful a measure of story-level agency as intended.

Satisfaction

Although a great many factors go into producing a satisfying game, I was interested, as author, in roughly how enjoyable this experience was as a whole. Possible responses include Strongly Agree (4), Agree (3), Neutral (2), Disagree (1), or Strongly Disagree (0).

Survey Question | Mean | SA | A | N | D | SD | n | Mean: Reinc ON | Mean: Reinc OFF
I enjoyed playing this game. | 2.54 | 12 | 12 | 15 | 8 | 1 | 48 | 2.50 | 2.58

Responses were very stable between the participants' two game sessions, r(22) = .83, p < .001. Four participants dropped their rating by 1 point on the second game, two participants increased their rating by 1 point, and one participant increased by 2 points.

As author, I think a **½ rating is a pretty fair review for this game. That is about what I would give it too. I am glad that 1/4 of the participants enjoyed it very much and that the one person who despised the game increased his rating to simply disliking it on the second play.

Interestingly, satisfaction correlated with a number of other responses. The most significant correlations (r(46) > 0.5, p < .001) were with responses regarding whether events were logically related to each other, r = 0.53, whether earlier events led to later events in a coherent and understandable way, r = 0.65, whether the NPCs' actions seemed to be consistent with their apparent goals and personalities, r = 0.68, and whether the player's actions had a significant impact on the course of the story, r = 0.57. Although no claim of causation can be made from this, satisfaction is at least correlated with a sense of agency within a well-formed story with believable characters. Thus, it is encouraging to learn that those who enjoyed the game tended to have exactly the experience Demeter was designed to provide.

Notes

  1. Actually, finding that reincorporation only has an effect on the first or second play would be the same result as reincorporation having a different effect depending on the participant's group.
  2. Or even p < 0.1, for that matter.
  3. This finding would make sense if comparing the means for first play session versus second play session: people would learn what actions are possible in the game during the first play session. However, the difference when looking at play order was even less pronounced. Overall, I believe the relatively wide difference between the reincorporation and non-reincorporation means found here is simply a random fluke.
