
Pose presentation for a dance-based massively multiplayer online exergame


Entertainment Computing 2 (2011) 89–96

Contents lists available at ScienceDirect

Entertainment Computing

journal homepage: ees.elsevier.com/entcom


Hannah Johnston⇑, Anthony Whitehead
School of Information Technology, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada

Article info

Article history:
Received 19 August 2010
Revised 20 November 2010
Accepted 6 December 2010
Available online 6 January 2011

Keywords:
Exergaming
Massively multiplayer online game
Game design
Sensor network
Human–computer interaction

Abstract

A sedentary lifestyle is linked to many health problems, including diabetes, heart disease, and obesity. Active games attempt to offer a solution by encouraging players to be more physically active through the use of entertaining media. We present a framework for a massively multiplayer online exergame (MMOE) that combines elements of persuasive technology and massively multiplayer online games to provide players with a customized, social gaming experience with the potential for long-term engagement and measurable physical benefits. We then examine our own exergaming system, sensor network for active play (SNAP), to assess its suitability in an MMOE context. Finally, we address several technical and usability challenges in the development of an MMOE, including pose selection, training, recognition, and presentation methods.

© 2011 International Federation for Information Processing. Published by Elsevier B.V. All rights reserved.

1875-9521/$ - see front matter
doi:10.1016/j.entcom.2010.12.007

⇑ Corresponding author.
E-mail addresses: [email protected] (H. Johnston), awhitehe@connect.carleton.ca (A. Whitehead).

1. Introduction

There are growing concerns over the increasingly sedentary lifestyle of many industrialized societies, as it is a major underlying cause of disease, disability, and death [1,2]. There has been a dramatic rise in the prevalence of type II diabetes, heart disease, and some cancers [1–3], with problems like depression, anxiety and mental illness on the rise [1,2]. An active lifestyle reduces risks of disease and makes the body stronger, fitter, and more flexible [1,2,4], but many people still fail to meet recommended levels of moderate physical activity [5]. Children and adolescents are particularly vulnerable, devoting more time to video games and other sedentary entertainment. These activities are often addictive and self-reinforcing [6], making it increasingly difficult to adopt a more active lifestyle, a difficulty that is only compounded by economic, psychological, and social challenges [7].

These health concerns and the popularity of the Nintendo Wii have sparked interest in physically active games, commonly referred to as exergames [8]. While few people dispute the potential of exergames to encourage physical activity, little research has investigated their long-term or measurable benefits. Many exergame prototypes have been developed, including our own full-body accelerometer sensor network for active play (SNAP) system [9], but exergames cannot succeed on novelty alone. Massively multiplayer online games (MMOGs) may provide a viable solution, through their strong social strategies and constant influx of player-generated content. We propose a massively multiplayer online exergame (MMOE) that combines a dance-based exergame using the SNAP system with persuasive strategies and an online social community, encouraging players to dance in the real world while engaging with other players virtually. However, moving a dance-based exergame online introduces a new set of challenges. Capturing and presenting full-body poses is most successful when facilitated in-person [10], something that is not possible in the MMOG context. This work focuses on these design problems and solutions. We examine customization components and provide suggestions for effective pose selection, capture, pose presentation and training. Finally, we offer our conclusions and a discussion of opportunities for future work.

2. Framework for a massively multiplayer online exergame

We propose combining exergames, persuasive technology, and massively multiplayer online games to deliver a more entertaining gaming experience that will consequently be a more successful long-term solution to physical inactivity. Tables 1 and 2 provide a summary of the framework for a massively multiplayer online exergame, divided into requirements for the physical input system and motivation and enjoyment aspects of the game.

3. Using SNAP in a massively multiplayer online exergame

We developed the SNAP system, inspired primarily by the novelty introduced by the Nintendo Wii and our displeasure with its inability to enforce full-body activity. Once players realize that a quick flick of the wrist will suffice rather than a full and complete tennis swing, it becomes apparent that the system is not as sophisticated as players initially thought, and they tend to return to a less active form of play. Proper form and technique are not significant factors in Wii game play. Regardless of additional sensors (for example, the Wii Fit balance board), the system cannot determine whether the person is performing yoga poses in the right way or if the participant is doing exercises correctly.

Table 1
MMOE framework summary: physical input system.

Physical exertion
• Provide a flexible level of challenge
• Monitor and increase heart rate
• Incorporate leg or full-body movement

Measurement accuracy
• Provide an appropriate level of accuracy for the application
• Supplement with additional sensors if additional accuracy is necessary

Comfort of wearable sensors and markers
• Limit the size, weight, and quantity of sensors and on-the-body devices
• Make all devices wireless
• Make input systems as inconspicuous as possible to limit embarrassment
• Plan for wear and durability through the use of replicable components
• Make components that come in contact with skin washable

We believe that the next logical progression in exergame system development is a multi-sensor network, with sensors placed on strategic areas of the body to allow for an even more immersive gaming experience. Such a network allows users to replicate elaborate dances, yoga, and tai chi poses, and can verify the accuracy of each pose. The sensor network for active play (SNAP) system is a wearable sensor network that uses body position as input for video game applications. Tri-axis accelerometers, placed on players' wrists and above their knees, infer full-body poses. Several games have been developed for the SNAP system, in which players replicate full-body dance poses to music [9].

Although it is desirable to have a system that easily interprets human input, with practice users can develop and perfect the application-specific skills. Four sensors are sufficient to enforce form and require the player to engage in designed activities; however, for training applications such as yoga, Pilates, and tai chi, we believe a larger sensor network is preferable as the goal is to improve form and train rather than provide fun and entertainment. The more sensors in the network, the more accurately we can define the body pose and motion [10].
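As a minimal sketch of how four tri-axis accelerometers could describe a static pose: when a limb is held still, each sensor mainly measures gravity, so the direction of its reading indicates how that limb is oriented. The sensor names, the 12-value layout, and the function `pose_vector` below are assumptions made for illustration only; the processing actually used by SNAP is described in [9,10].

```python
import math

# Hypothetical sensor order; the text only says wrists and above the knees.
SENSORS = ("left_wrist", "right_wrist", "left_knee", "right_knee")


def unit(v):
    """Scale a 3-axis reading to unit length so only its direction remains."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        raise ValueError("zero-length accelerometer reading")
    return tuple(c / norm for c in v)


def pose_vector(readings):
    """Concatenate the four unit gravity directions into one 12-value vector."""
    vec = []
    for name in SENSORS:
        vec.extend(unit(readings[name]))
    return vec


# Illustrative readings only, not measured data.
sample = {
    "left_wrist": (0.0, 0.0, 2.0),
    "right_wrist": (3.0, 0.0, 4.0),
    "left_knee": (0.0, -1.0, 0.0),
    "right_knee": (0.0, -1.0, 0.0),
}
print(pose_vector(sample))
```

Under this assumed representation, adding sensors simply lengthens the vector, which is one way to read the remark that more sensors define the pose more accurately.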

3.1. Physical input system

The SNAP system has been shown to provide players with the requisite physical exertion and measurement accuracy [11]. One important limitation of the accelerometer sensor network is that it is most effective for static pose recognition, requiring players to momentarily freeze in place. A more complicated system for movement and gesture recognition is possible using the SNAP system, but results suggest that it is less accurate than static pose recognition [12]. Thus, depending on the requirements of the game interaction, a homogeneous accelerometer network may or may not be appropriate, but it has been shown to be extremely well suited to dance-based games. The adjustable, comfortable, and durable design of the sensor pods and straps supports multiple players and diverse play environments. The sensors are visible, but it is reasonable to assume that as sensors decrease in size, players could wear the SNAP system unnoticed, or even embedded into regular clothing [13].

Table 2
MMOE framework summary: motivation and enjoyment requirements.

Individual factors

Feedback, credit and recognition
• Provide on-going feedback
• Recognize positive behaviours and give appropriate credit for accomplishments

Challenge and flow
• Dynamically adjust level of difficulty to match player skill level
• Leverage the flow state to maximize enjoyment and satisfaction

Tailoring and customization
• Tailor the experience to individual needs and preferences
• Allow for creativity and customization of activity, avatars, music, and other game content

Connectivity
• Connect players to the Internet and other players to provide targeted encouragement

Novelty
• Leverage game novelty and ensure opportunities are available to renew interest

Social factors

Friendship and social facilitation
• Leverage the power of social facilitation to encourage physical activity and positive behaviour change
• Foster positive social interaction and friendship building in a safe online environment

Competition
• Provide opportunities for competition among friends and strangers in distant proximity
• Provide compensation mechanisms for fairness

Privacy
• Use an online format to provide players with both interactivity and anonymity
• Give users control over what they share online:
  o Provide an option to hide their image
  o Let users select who they interact with in the game

3.2. Motivation and enjoyment

The detailed acceleration information collected from the SNAP system makes it possible to provide players with targeted feedback. The SNAP system encourages flow [14] in several ways, including dynamic speed, pose classification threshold, and pose selection. All of these elements are easily modifiable and contribute to the level of difficulty of the games. With the SNAP system, challenge may be introduced through deliberate pose selection. Game designers can select poses that require varying amounts of body movement and muscle use. By collecting some additional user information, such as age, weight, and heart rate, the SNAP system could better tailor the level of difficulty to individual users in order to optimize player flow and enjoyment. SNAP also provides many opportunities for player creativity and customization. The most obvious is through the creation of dances and individual poses. Players can also add music of their choosing. SNAP has very little impact on the social experience for players online beyond providing the technology necessary for players to create and share their own dance poses. Players already compete against each other in all of the games created for the SNAP system, but social influence could play a huge role in motivating players to continue playing once the initial novelty of the system wears off. A networked version using SNAP as input would allow friends, or even distant strangers, to compete against one another. These virtual dance-offs could occur across the globe, from the convenience of players' own homes. One of the most promising features of the MMOE is that it can provide both interactivity and anonymity. The physical appearance of the player using the SNAP system remains unimportant, giving some users, who might otherwise be shy or embarrassed, the opportunity to participate actively. Using the SNAP system, players can choose to play with real world friends or complete strangers online.
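The difficulty controls named in Section 3.2 (game speed and pose classification threshold) could be tuned automatically. The following is a rough sketch under stated assumptions: the function name `adjust_difficulty`, the target success rate, and the step size are all hypothetical choices made for illustration, not parameters reported for SNAP.

```python
def adjust_difficulty(threshold, seconds_per_pose, recent_hits,
                      target_rate=0.7, step=0.1):
    """Return an updated (threshold, seconds_per_pose) pair.

    threshold: largest pose distance still accepted as a match.
    seconds_per_pose: time the player is given to reach each pose.
    recent_hits: list of booleans, True where the player matched the pose.
    target_rate and step are illustrative defaults, not values from the paper.
    """
    if not recent_hits:
        return threshold, seconds_per_pose
    rate = sum(recent_hits) / len(recent_hits)
    if rate > target_rate:
        # Player is succeeding often: require closer matches and play faster.
        threshold *= (1 - step)
        seconds_per_pose *= (1 - step)
    elif rate < target_rate:
        # Player is struggling: accept looser matches and slow the game down.
        threshold *= (1 + step)
        seconds_per_pose *= (1 + step)
    return threshold, seconds_per_pose


print(adjust_difficulty(10.0, 4.0, [True] * 10))
print(adjust_difficulty(10.0, 4.0, [False] * 10))
```

A multiplicative step keeps both values positive; a real game would likely also clamp them to sensible minimum and maximum values.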


4. Designing a customizable dance pose creation system

Pose selection, capture, training, and presentation are critical to the success of an online dancing game. Approaches to pose selection are described and evaluated. Pose capture methods are then discussed. Next, we investigate pose presentation methods and conclude with an overview of pose training, specifically in the context of an MMOE.

4.1. Pose selection

The first issue in developing a dance pose system is selection. Automated and user-controlled pose selection strategies are compared in Table 3.

Given the importance of customization to the overall success of the MMOE, a user-controlled approach to pose selection would likely be more successful in most use scenarios as it allows players to express themselves more directly. However, in situations involving less computer-savvy users, or dancers with more improvised styles, an automated system could be beneficial. Both methods could be offered to users, giving players even more control over pose selection.

Table 3
Comparison of automated and user-controlled pose selection methods.

Automated pose selection
Advantages:
• Captures representative poses
• Captures spontaneity, surprise, serendipity
• Users could select from many possibilities
• Many possible selection options (extracting poses at timed intervals, or based on tempo, expression, position)
• Limited effort from users
Disadvantages:
• Continuous motion makes the system more vulnerable to motion blur or sensor disruption in capture
• Results in more disorganized or messy poses
• Freestyle dancing may be awkward or embarrassing for some users
• Some amount of user intervention for filtering is necessary

User-controlled pose selection
Advantages:
• Controlled, less embarrassing environment
• Limited motion interference with camera images or sensors
• May be possible to combine pose selection with pose capture and training
Disadvantages:
• Requires more user planning
• Requires more system interaction; users cannot "set-it-and-forget-it"

4.2. Pose capture for display

Once poses have been selected, players must share them with other players. In the context of an MMOE, players do not have the luxury of in-person explanations and demonstrations. To capture the poses, one approach is to use a 2- or 3-dimensional avatar to represent the full-body dance poses. A simple version could provide full joint rotations and limb movements, while a more sophisticated model might constrain the joints and limbs based on physiological limitations of the human body. Another solution uses a network of web cameras to capture real images of the poses. Multiple cameras may be used together to simulate a wide angle lens, capturing full-body images. We outline these methods in Table 4.

Table 4
Comparison of poseable avatar and image-based pose selection methods.

Poseable avatar
Advantages:
• Provides full anonymity and privacy
• Players can express themselves without providing any personal information
• Could offer a range of customizable features
• The avatar system could be supplemented with sensor data, or use a one-to-one mapping combining accelerometers and gyroscopes, requiring players to simply verify and correct a pose, alleviating much of the work for users
Disadvantages:
• Model must be reasonably accurate and detailed in order to move realistically
• An intelligent system requires more complex development
• Different models are necessary to accommodate different genders and body types
• Players may be more likely to generate impossible poses
• Subtleties of pose and expression may be lost
• Requires some amount of manual input, for which a more complex interface is necessary to allow users to manipulate the avatar

Image-based recording system
Advantages:
• Provides an accurate representation of full body poses, with limited effort on the part of the user
• Once the cameras are registered, recording multiple poses is quick and easy
• The system constrains players to feasible body positions
Disadvantages:
• A large, bright, cleared space is required
• Players are confined to a specific distance from the cameras
• The addition of cameras to the gaming system increases the cost, setup, and calibration for players
• Using real player images introduces privacy concerns, particularly when intended for online use

4.3. Pose training

We have shown that 5–8 unique participant data sets contributing to a pose bank provide sufficient generalization [10]. To accommodate the widest range of player shapes and sizes, more pose data could be collected by requiring members of the online community to train poses. System intervention would only be required to remove data from cheaters and other outliers. Over time, the data would become an increasingly accurate representation of the dance poses provided. It may be desirable to give players with especially unusual body sizes or shapes the option of always re-training the poses.

4.4. Pose presentation

Players should be able to mimic the pose quickly and accurately from whatever presentation method is provided. Several different presentation strategies were developed to determine the most useful visual description for users. In the game, pose presentation serves more as a reminder or visual stimulus to action, and with practice, players could likely adapt to any presentation method. However, by providing players with the most intuitive presentation method, we help players focus on more entertaining and enjoyable aspects of the game as quickly as possible.

4.4.1. Procedure

Participants attached the SNAP sensors and were then asked to replicate six unique poses, one at a time. Participants indicated to the experimenter when they believed they were in the correct pose. Accelerometer data and time measurements were recorded.

4.4.2. Conditions

Five separate presentation conditions were compared. The direct (D) view requires participants to put themselves in the place of the person in the image. The photo is of a person raising her right arm, so the participant also raises his or her right arm. In the mirrored (M) view, participants act as though they are looking at themselves in a mirror. The left arm of the person in the photo is raised, so the participant raises his or her right arm. The mirrored and back (MB) presentation gives players an additional view from behind. The back view remains non-mirrored, as though the player is watching him or herself. The back view is comparable to many 3rd person views in games. In the mirrored and side (MS) view, players are given an additional view from the side. The side view remains non-mirrored, as though the player is watching him or herself from the side. In the mirrored with feedback (MF) view, the mirrored image is overlaid with four coloured quadrants. The further the player is from the ideal pose, the more the image is overlaid with the colour red. This distance or error is presented separately for each limb sensor. As the player approaches the ideal pose, the picture gets closer to greyscale (Fig. 1).

Fig. 1. Presentation conditions (in order): direct, mirrored, mirrored and back, mirrored and side, and mirrored with feedback.

4.4.3. Poses

Five poses and one practice pose were used for testing. They vary in symmetry, complexity, difficulty, and side-dominance. The six poses are shown below from the direct view (Fig. 2).

Fig. 2. Practice pose (pose 0) and the 5 poses used for presentation method testing.

4.4.4. Pose training bank

A pose training bank was created for testing, using data from three females and two males. They ranged in age from 23 to 37, with an average age of 26.8. Participants in the training set had an average height of 169.8 cm with a range of 162–175 cm. Their average weight was 75.7 kg, ranging from 62 to 118 kg.
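To make the idea of a pose bank concrete, the sketch below pools one feature vector per contributor for a single pose and summarizes it by per-dimension mean and spread, then filters samples that sit far from the rest (the kind of outlier removal mentioned in Section 4.3). The function names, the z-score rule, and the `max_z` default are assumptions for illustration; the paper does not specify how its bank is stored or filtered.

```python
from statistics import mean, stdev


def build_pose_bank(samples):
    """Summarize pooled training samples for one pose.

    samples: list of equal-length feature vectors, one per contributor.
    Returns per-dimension means and sample standard deviations.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    columns = list(zip(*samples))
    return {
        "mean": [mean(col) for col in columns],
        "stdev": [stdev(col) for col in columns],
        "count": len(samples),
    }


def drop_outliers(samples, bank, max_z=2.0):
    """Keep samples whose every dimension lies within max_z deviations.

    Note: with n samples a z-score cannot exceed (n - 1) / sqrt(n), so a
    small bank needs a modest max_z for this filter to have any effect.
    """
    kept = []
    for s in samples:
        within = all(
            sd == 0 or abs(x - m) / sd <= max_z
            for x, m, sd in zip(s, bank["mean"], bank["stdev"])
        )
        if within:
            kept.append(s)
    return kept


# Illustrative two-dimensional samples, not data from the study.
samples = [[1.0, 0.0]] * 7 + [[10.0, 0.0]]
bank = build_pose_bank(samples)
print(bank["mean"], len(drop_outliers(samples, bank)))
```

With only five contributors, as in the bank described above, the largest attainable z-score is about 1.79, so a community-trained bank with many more samples is where such filtering would become useful.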

4.4.5. Participants

Two separate tests were conducted, one of young adults and the other of older adult women.

4.4.5.1. Young adult participants. The population tested consisted of 30 female and 30 male participants. Participants were distributed evenly between the five presentation conditions (direct, mirrored, mirrored and back, mirrored and side, and mirrored with feedback), with 12 participants in each. The male to female ratio of each condition was approximately equal, ranging from five to seven males and five to seven females per presentation method. Table 5 provides an overview of participant age, height, and weight information, as reported by the participants.


Table 5
Young adult participant characteristics.

| | Average | Low | High |
|---|---|---|---|
| Age | 20.5 | 18 | 27 |
| Height | 170.1 cm | 152 cm | 193 cm |
| Weight (a) | 65.4 kg | 43 kg | 100 kg |
| Fitness level (1–5) | 3.2 | 1 | 5 |
| Video game experience (1–5) | 3.7 | 1 | 5 |

(a) Three participants declined to report their weight and were consequently not included.

Table 6
Older adult female participant characteristics.

| | Average | Low | High |
|---|---|---|---|
| Age | 62.8 | 49 | 73 |
| Height | 162.9 cm | 155 cm | 175 cm |
| Weight (a) | 68.7 kg | 59 kg | 85 kg |
| Fitness level (1–5) | 2.8 | 2 | 4 |
| Video game experience (1–5) | 1.5 | 1 | 2 |

(a) One participant declined to report her weight and was consequently not included.

Table 7
Average time to pose (s), broken down by individual pose.

| Pose | D | M | MB | MS | MF |
|---|---|---|---|---|---|
| 1 | 3.014 | 3.138 | 2.938 | 3.091 | 13.139 |
| 2 | 5.813 | 5.029 | 4.742 | 5.855 | 9.324 |
| 3 | 6.067 | 4.005 | 3.755 | 5.145 | 12.012 |
| 4 | 7.487 | 4.875 | 4.266 | 5.233 | 10.034 |
| 5 | 5.780 | 3.792 | 3.441 | 4.190 | 9.394 |


4.4.5.2. Older female participants. A test was conducted with eight female participants comparing the mirrored and mirrored with feedback presentation methods. Table 6 provides an overview of participant age, height, and weight information, as reported by the participants.

4.4.6. Results

We summarize a comparison of the five presentation conditions, including differences in speed required to complete the pose, accuracy of the pose completed, and perceived difficulty of pose replication for the young adult participants. Pose 0 was used for practice only and was thus omitted.

4.4.6.1. Effect of presentation method on speed. Speeds varied significantly between the presentation methods. The fastest average time was for the mirrored and back view, but it was not significantly faster than the mirrored or mirrored and side views. The mirrored with feedback method was significantly slower than all other conditions (P values of 0.008, 0.001, 0.001, 0.004 for direct, mirrored, mirrored and back, and mirrored and side, respectively). The mirrored and mirrored and back views had significantly faster times than the direct method (P values 0.005 and 0.001, respectively). The direct presentation method seems to require more mental processing time. Supplementing the straight-on mirrored image with additional viewpoints, such as the back and side views, does not appear to significantly improve the response times of the participants. Confusion may arise from the juxtaposition of a mirrored and a non-mirrored view. Fig. 3 illustrates the differences in speed of the presentation methods.

Fig. 3. Speed by presentation method. [Box plot of time (s) for the direct, mirrored, mirrored & back, mirrored & side, and mirrored with feedback conditions, showing min, Q1 (0.25), median, Q3 (0.75), and max.]

Additional information can be gleaned by breaking down speed information by individual pose (Table 7).

The mirrored and back method had the fastest times for all poses. Pose 1 is a symmetrical pose, which further supports the theory that differences in time are the result of processing time required to mentally flip the images. The other four poses follow a fairly consistent pattern for each of the presentation methods.
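The per-pose pattern can be checked directly from the Table 7 averages. The short sketch below finds the fastest condition for each pose and the ratio of direct to mirrored time; the data are copied from Table 7, while the helper name `fastest_condition` and the choice of ratio are ours, added only for illustration.

```python
# Average time to pose in seconds, copied from Table 7.
TABLE_7 = {
    1: {"D": 3.014, "M": 3.138, "MB": 2.938, "MS": 3.091, "MF": 13.139},
    2: {"D": 5.813, "M": 5.029, "MB": 4.742, "MS": 5.855, "MF": 9.324},
    3: {"D": 6.067, "M": 4.005, "MB": 3.755, "MS": 5.145, "MF": 12.012},
    4: {"D": 7.487, "M": 4.875, "MB": 4.266, "MS": 5.233, "MF": 10.034},
    5: {"D": 5.780, "M": 3.792, "MB": 3.441, "MS": 4.190, "MF": 9.394},
}


def fastest_condition(times):
    """Return the condition label with the smallest average time."""
    return min(times, key=times.get)


for pose, times in TABLE_7.items():
    # A ratio near 1 means the direct view cost little extra time.
    ratio = times["D"] / times["M"]
    print(f"pose {pose}: fastest={fastest_condition(times)}, D/M={ratio:.2f}")
```

For the symmetrical pose 1 the direct view is not slower than the mirrored view, whereas for poses 3 to 5 it takes roughly one and a half times as long, which is consistent with the mental-flip explanation above.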

4.4.6.2. Effect of presentation method on accuracy. Table 8 includes a summary of the statistics describing accuracy for each of the five conditions. The lower the Mahalanobis distance [15], the more accurately participants were able to replicate the poses. Fig. 4 presents the accuracy of each presentation method. The wide variation for the mirrored and side method suggests that while many participants found the view helpful, there were several others who misinterpreted the view and replicated poses incorrectly. The mirrored view, mirrored and back view, and mirrored with feedback view all yield comparable levels of accuracy, but it is worth noting that the addition of a back view or feedback reduces the range of values. Providing this additional information seems to help limit the possibility of misinterpretation.
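For readers unfamiliar with the measure, the Mahalanobis distance between a sample x and a distribution with mean μ and covariance Σ is the square root of (x − μ)ᵀ Σ⁻¹ (x − μ). The sketch below is a plain-Python illustration of that standard definition; how the pose features and covariance are actually formed in the SNAP system is not stated here, so the inputs in the example are purely hypothetical.

```python
import math


def solve(a, b):
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back-substitution on the upper-triangular system.
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (m[i][n] - s) / m[i][i]
    return y


def mahalanobis(x, mean, cov):
    """Distance of x from a distribution with the given mean and covariance."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, diff)  # y = cov^-1 @ diff without forming the inverse
    return math.sqrt(sum(d * yi for d, yi in zip(diff, y)))


# With an identity covariance the result reduces to Euclidean distance.
print(mahalanobis([3.0, 4.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Unlike Euclidean distance, this measure discounts deviation along directions in which the training data already vary widely, so a pose is not penalized for differences that are normal across body types.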


Fig. 4. Mahalanobis distance by presentation method. [Box plot titled "Accuracy by Presentation Method": Mahalanobis distance for the direct, mirrored, mirrored & back, mirrored & side, and mirrored with feedback conditions, showing min, Q1 (0.25), median, Q3 (0.75), and max.]

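Table 8
Average Mahalanobis distance, broken down by individual pose.

| Pose | D | M | MB | MS | MF |
|---|---|---|---|---|---|
| 1 | 6.0 | **4.9** | 5.6 | 5.3 | 7.4 |
| 2 | 6.4 | 5.4 | 6.0 | **5.3** | 5.8 |
| 3 | **5.0** | 5.6 | 5.8 | 8.9 | 6.7 |
| 4 | 8.7 | 6.3 | 4.9 | 5.8 | **4.8** |
| 5 | 8.3 | 6.8 | **5.6** | **5.6** | 6.0 |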


Table 8 provides a direct comparison of the individual presentation methods on a per-pose basis. The bold text indicates the lowest distance per pose.

Pose 1 is symmetrical and unchallenging physically. Interestingly, the feedback method resulted in worse accuracy for this pose. Because the function used to provide feedback was relatively loose, participants were provided with information that they were close to the correct pose, before they had made an effort to closely match the pose on screen. Pose 2 confused many participants as the foot and leg positions were unclear from the straight-on view. For this pose, the additional back and side views were helpful in providing important depth cues. Pose 2 is shown below from a mirrored, back, and side view to show the importance of an alternate view when the pose is not strictly planar (Fig. 5).

For pose 2 in particular, the side view provides the largest cross-section, and consequently the most information about the pose, but it seems to be especially difficult for players to replicate poses from a sideways perspective. Poses 4 and 5 are more technically and physically complex. Additional information about the proper pose, from either the real-time feedback presentation method or additional views, helps users replicate the poses more accurately.

Fig. 5. Pose 2 from the front (mirrored), side, and back views.

4.4.6.3. Effect of presentation method on perceived difficulty. The re-sults of the survey for perceived mental difficulty support the dif-ferences in speed among presentation methods. Participants wereable to respond more quickly to the mirrored and mirrored and backviews, and also reported fairly low perceived mental difficulty. Theopposite is true for the direct and mirrored and side views. Theanomaly seems to be the mirrored with feedback view, which de-spite extremely slow response times, scored lowest on mental dif-ficulty. Our hypothesis is that rather than spending additional timeattempting to understand the presentation view, as is the case fordirect and mirrored and side views, players are instead spendingadditional time perfecting their pose. The feedback provides play-ers with valuable additional information that they subsequentlyuse to improve (Table 9).

The mirrored with feedback view had the highest reported difficulty physically replicating the poses. It was easier for players who did not receive feedback to make an assumption that the poses they were performing were perfectly adequate, and thus, easy. The mirrored with feedback condition pointed out minor errors in body position, making the pose replication process seem more difficult. As the feedback mechanisms add to the perceived difficulty of the system, they should be adjusted to suit the accuracy requirements of the game. Where accuracy is less important, players should receive less restrictive feedback.

Table 9. Average perceived mental and physical difficulty by presentation method (D = direct, M = mirrored, MB = mirrored and back, MS = mirrored and side, MF = mirrored with feedback).

| Measure | D | M | MB | MS | MF |
| --- | --- | --- | --- | --- | --- |
| Average perceived mental difficulty (1–5) | 2.5 | 1.8 | 1.8 | 2.4 | 1.6 |
| Average perceived physical difficulty (1–5) | 1.8 | 1.5 | 1.6 | 2.3 | 2.7 |

Table 10. Speed of female participants of different ages.

| Age group | Speed (s), Mirrored | Speed (s), Mirrored with Feedback |
| --- | --- | --- |
| Females 18–27 | 4.383 | 8.554 |
| Females 49–73 | 5.196 | 10.046 |

Table 11. Accuracy of female participants of different ages.

| Age group | Accuracy (Mahalanobis distance), Mirrored | Accuracy (Mahalanobis distance), Mirrored with Feedback |
| --- | --- | --- |
| Females 18–27 | 5.4 | 6.1 |
| Females 49–73 | 9.8 | 7.5 |

H. Johnston, A. Whitehead / Entertainment Computing 2 (2011) 89–96

4.4.6.4. Pose presentation feedback functions. The mirrored with feedback view may be modified to provide different information, according to the needs of the system. A system emphasizing precision, for physiotherapy or training applications for example, may display more error, overlaying more red colour for smaller Mahalanobis distances. Alternatively, for a casual gaming application, participants may be presented with red feedback only where a limb is in significant error. Any mathematical function may be used to provide users with precisely the kind of feedback necessary. It may be desirable to provide several feedback function options.
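The idea of interchangeable feedback functions can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SNAP implementation: the function names, the limb labels, and the numeric parameters (`full_red_at`, `threshold`) are hypothetical values chosen only for the example.

```python
def strict_feedback(distance, full_red_at=3.0):
    """Red-overlay intensity in [0, 1] for a precision-oriented application.

    Intensity rises linearly from zero and saturates at a small distance,
    so even minor errors are visible. `full_red_at` is a hypothetical value.
    """
    if full_red_at <= 0:
        raise ValueError("full_red_at must be positive")
    return min(max(distance, 0.0) / full_red_at, 1.0)


def casual_feedback(distance, threshold=6.0):
    """Red-overlay intensity for casual play.

    A limb is coloured red only when its error exceeds a (hypothetical)
    threshold; smaller errors produce no overlay at all.
    """
    return 1.0 if distance > threshold else 0.0


def limb_overlay(limb_distances, feedback):
    """Apply the chosen feedback function to every limb's distance."""
    return {limb: feedback(d) for limb, d in limb_distances.items()}


# Hypothetical per-limb Mahalanobis distances for one attempt at a pose.
distances = {"left_wrist": 1.5, "right_wrist": 7.0}
print(limb_overlay(distances, strict_feedback))
print(limb_overlay(distances, casual_feedback))
```

Because the feedback function is passed in as an argument, a game could offer several options and switch between them without changing the rest of the presentation code.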

4.4.6.5. Baseline sensor error. There is a certain amount of error that arises as a result of accelerometer noise and imprecise accelerometer placement on the body. Accelerometers may sit differently on different people's limbs. They also can shift around during, or even before, use. While measures can be taken to help participants place sensors accurately, it is unreasonable to expect the sensors will be in the exact same location, at the exact same angle, for different participants. In fact, it is even unreasonable to expect that the same participant will place sensors in exactly the same location every time.

While the standard deviations used in the Mahalanobis distance calculations allow for greater playability, they still do not allow players to reach certain levels of accuracy. The lowest Mahalanobis distance recorded in the presentation method test was 3.27 (for the mirrored method), with averages across presentation methods in the 5–7 range. When used with the feedback presentation method, players become frustrated if the system requires extremely low Mahalanobis distance to remove coloured feedback. To address this issue, we propose a feedback function that first subtracts a baseline error for each sensor. A value of 1.0 per sensor is consistent with our earlier findings [9] and still requires an extremely high degree of pose precision, but could easily be modified based on the needs of the particular application.
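A baseline-corrected feedback function of this kind might look like the following Python sketch. Several details are assumptions made for illustration rather than taken from the system: the per-sensor distance uses a diagonal covariance built from trained standard deviations, negative results are clamped to zero, and the per-sensor values are summed into a single pose error. All function names are hypothetical.

```python
import math


def sensor_distance(reading, mean, std):
    """Mahalanobis distance of one accelerometer reading from the trained
    pose, assuming independent axes (diagonal covariance)."""
    return math.sqrt(
        sum(((r - m) / s) ** 2 for r, m, s in zip(reading, mean, std))
    )


def adjusted_distance(raw_distance, baseline=1.0):
    """Subtract the baseline sensor error, clamping at zero (assumption)."""
    return max(raw_distance - baseline, 0.0)


def pose_error(raw_distances, baseline=1.0):
    """Sum of baseline-corrected distances over all sensors (assumption)."""
    return sum(adjusted_distance(d, baseline) for d in raw_distances)


# One sensor: reading vs. trained mean, with unit standard deviations.
d = sensor_distance((3.0, 0.0, 4.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
print(d, adjusted_distance(d))

# Sensors within the baseline contribute nothing to the displayed error.
print(pose_error([0.4, 1.0, 2.5]))
```

With this correction, a sensor whose distance is at or below the baseline produces no coloured feedback, so players are not penalised for noise and placement error they cannot control.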

4.4.6.6. Dynamic pose presentation. Each of the presentation methods examined provides users with different benefits. A solution may be to provide players with a presentation view that dynamically adjusts based on the unique characteristics of individual poses. For example, the user could be presented with a single view of the pose, from whatever angle captures the largest cross-section. Another possibility may be to provide players with an animation, illustrating the movement required to get into the desired pose. This would limit the potential for occlusion and provide users with the information necessary to interpret the pose. Further testing is needed to determine whether the benefits of the additional information outweigh the mental challenges associated with re-orientation.
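Choosing the view with the largest cross-section could be approximated as in the Python sketch below. It is a rough illustration under stated assumptions: the pose is given as hypothetical 3D joint positions (x left-right, y up, z depth), the camera rotates only about the vertical axis, and the bounding-box area of the projected joints stands in for the true cross-section. The function names and candidate angles are illustrative only.

```python
import math


def projected_area(joints, yaw_degrees):
    """Bounding-box area of the joints as seen by a camera rotated
    `yaw_degrees` about the vertical (y) axis; 0 is the front view."""
    yaw = math.radians(yaw_degrees)
    # Horizontal screen coordinate after rotating about the vertical axis.
    xs = [x * math.cos(yaw) + z * math.sin(yaw) for x, _, z in joints]
    ys = [y for _, y, _ in joints]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def best_view(joints, candidate_yaws=(0, 90, 180)):
    """Return the candidate yaw with the largest projected area.
    Ties go to the earliest candidate."""
    return max(candidate_yaws, key=lambda yaw: projected_area(joints, yaw))


# Hypothetical non-planar pose: one foot extended well forward.
pose = [
    (0.0, 1.7, 0.0),   # head
    (-0.2, 1.0, 0.0),  # left hand
    (0.2, 1.0, 0.0),   # right hand
    (0.0, 0.0, 0.0),   # left foot
    (0.0, 0.3, 0.8),   # right foot, extended forward
]
print(best_view(pose))
```

For this pose the side view wins because the extended foot is only visible in depth. Whether the widest view is also the easiest to replicate is a separate question that, as noted above, would need testing.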

The concept of dynamic pose presentation could be extended further to provide players with even more information about specific hand or foot positions. In some poses, a wrist rotation may go unnoticed in a small, full-body image, while it may be critical to the correct interpretation of the pose. An image overlay, superimposed arrows, or other image processing could be used to highlight specific pose features such as angles or positions. These features would, again, have to be tested to determine their benefit to players versus any possible distraction they might cause.

While it is evident that a mirrored view is more easily interpreted than a direct image of the pose, the issue becomes more complicated with pose presentations that change over time, such as in the MMOE context, where poses may be viewed from different player positions. The mirrored presentation is helpful when players are observing the pose straight-on, but it may be extremely confusing from the side or back. In a system that relies on a moving view point, it might be more effective to abandon the mirrored method entirely, as switching between the two types of view interpretation would likely be a huge source of confusion.

4.4.7. Impact of other factors on pose success

No significant correlations were found between fitness level, video game experience, height, or weight and pose speed or accuracy, suggesting that pose understanding is a purely mental activity. No correlation was found between age and success within the 18–27 group. These results suggest that system training is suitable for diverse participants. It is possible, however, that fitness level and gaming experience may affect player success in actual game play. For example, endurance may help players replicate many poses quickly and accurately when they are displayed in rapid succession. Age appears to be inversely correlated with both pose speed and accuracy among the older adult female group. The sample size is too small to make definitive claims on the exact impact of age on pose accuracy or speed, but a comparison can be made between the younger and older groups as a whole. Not surprisingly, there are differences in average pose speed and accuracy (Mahalanobis distance) for female participants of the two age groups. These are illustrated in Tables 10 and 11.
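The size of the gap between the two age groups can be read directly from Tables 10 and 11. The short Python sketch below computes the relative increase for the older group under each presentation method; the figures are the table values, while the dictionary layout and the helper name `relative_increase` are illustrative choices.

```python
# Values from Tables 10 and 11 (M = mirrored, MF = mirrored with feedback).
speed_s = {
    "18-27": {"M": 4.383, "MF": 8.554},
    "49-73": {"M": 5.196, "MF": 10.046},
}
mahalanobis = {
    "18-27": {"M": 5.4, "MF": 6.1},
    "49-73": {"M": 9.8, "MF": 7.5},
}


def relative_increase(younger, older):
    """Percentage by which the older group's value exceeds the younger's."""
    return (older - younger) / younger * 100.0


for method in ("M", "MF"):
    slower = relative_increase(speed_s["18-27"][method], speed_s["49-73"][method])
    farther = relative_increase(
        mahalanobis["18-27"][method], mahalanobis["49-73"][method]
    )
    print(f"{method}: {slower:.1f}% slower, {farther:.1f}% larger distance")
```

On these averages the older group is roughly 18.5% (mirrored) and 17.4% (mirrored with feedback) slower, and its Mahalanobis distance is about 81.5% larger for the mirrored view but only about 23.0% larger with feedback. Given the small sample, these figures are descriptive only.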

4.4.8. Challenges for older participants

Testing with older adult female participants highlighted several issues which may impact players of that demographic. Several participants had difficulty seeing the computer screen and positioned themselves closer to the computer than younger participants. While it did not appear to impact their overall success rates, consideration should be given to placement and setup of systems intended for older players. Two of the participants had to put the sensors on while seated. This in and of itself is not problematic for game play, but due to the elasticity of the sensor straps, the sensor pods required more adjustment later. While these and other issues may have an impact on player performance, the participants were all enthusiastic. If the women were nervous or cautious about computer systems, sensors, or gaming systems, they did not show it.

5. Conclusions and future work

We introduced our SNAP exergame system and proposed a massively multiplayer online exergame (MMOE) for a customizable, social gaming experience, providing players with measurable, long-term engagement and health benefits. We outlined some of the technical and usability challenges involved in the development of an MMOE that involves player-generated dance poses. Consideration was given to how players may best select poses, comparing two methods: one that aims to capture more serendipitous and spontaneous dance poses, and another that provides users with greater control over pose selection. We subsequently proposed two methods to capture poses: a poseable avatar and a camera–network system. Pose training in the context of an MMOE is not significantly different from the current dance-based SNAP game prototypes. The MMO system provides the added benefit of access to a large volume of user pose data. When aggregate data sets fail to meet the needs of players, individuals can always re-train poses themselves.


To determine the most effective method of pose presentation, we conducted tests comparing direct, mirrored, mirrored and back, mirrored and side, and mirrored with feedback views. In general, participants responded more quickly to mirrored than to direct presentation methods. The mirrored with feedback method resulted in the slowest response times and the greatest perception of physical difficulty, which is likely a result of participants' increased awareness of their own inaccuracy. Accuracy results did not vary significantly among presentation methods; however, the direct and the mirrored and side views seemed to cause greater confusion, as indicated by a self-reported measure of perceived mental difficulty. Feedback is a promising strategy when accuracy is of greater importance than speed (such as in training applications). It offers many possibilities for refinement and customization. Dynamic methods of pose presentation may provide an even more refined method of communicating poses in an MMOE. Factors such as height, weight, fitness level, and gaming experience do not seem to impact the users' ability to replicate poses. Testing with older adult females highlighted additional usability issues, emphasizing a need for easy system setup and accommodation of varying levels of visual acuity. In general, the older female participants performed the poses more slowly and less accurately than the 18–27 age group, but not outside the range of usability and playability.

Future research will involve the implementation and testing of an MMOE. Design refinements may be made to the SNAP system for more effective use within the MMOE, such as a switch to wireless sensors and variable sensor pod strap sizes. More sophisticated methods of pose presentation could be developed and tested, including overlay screens, zoom-in details, or 3-dimensional views with dynamic camera angles. Future work could also lead to new navigation strategies using only the SNAP system for input and control, eliminating the need for keyboard and mouse input for the MMOE. Long-term study of the physical and social effects of the MMOE system would also be an extremely valuable avenue for future work.

Acknowledgement

This work is partially funded by the Natural Sciences and Engineering Research Council of Canada.

References

[1] R. de Oliveira, N. Oliver, TripleBeat: enhancing exercise performance with persuasion, in: Proceedings of the 10th International Conference On Human–Computer Interaction With Mobile Devices and Services, Amsterdam, The Netherlands, ACM, New York, NY, 2008, pp. 255–264.

[2] C. Garcia Wylie, C. Wylie, P. Coulton, Persuasive mobile health applications, in: Electronic Healthcare: First International Conference, Ehealth 2008, London, September 8–9, 2008, Revised Selected Papers, Springer, 2009, p. 90.

[3] S. Arteaga, M. Kudeki, A. Woodworth, Combating obesity trends in teenagers through persuasive mobile technology, ACM SIGACCESS Accessibility and Computing (2009) 17–25.

[4] W. Zhu, Promoting physical activity through internet: a persuasive technology view, in: Proceedings of the Second International Conference On Persuasive Technology, Palo Alto, California, USA, Springer, 2007, p. 12.

[5] J. Lacroix, P. Saini, A. Goris, Understanding user cognitions to guide the tailoring of persuasive technology-based physical activity interventions, in: Proceedings of the 4th International Conference On Persuasive Technology, Claremont, California, USA, April, ACM, New York, NY, 2009, pp. 1–8.

[6] S. Berkovsky, D. Bhandari, S. Kimani, N. Colineau, C. Paris, Designing games to motivate physical activity, in: Proceedings of the 4th International Conference On Persuasive Technology, Claremont, California, ACM, New York, NY, 2009, pp. 1–4.

[7] J. Lin, L. Mamykina, S. Lindtner, G. Delajoux, H. Strub, Fish'n'Steps: encouraging physical activity with an interactive computer game, in: International Conference On Ubiquitous Computing, Springer, 2006, pp. 261–278.

[8] Wii at Nintendo, Nintendo of America Inc., 2009.

[9] A. Whitehead, N. Crampton, K. Fox, H. Johnston, Sensor networks as video game input devices, in: Proceedings of the 2007 Conference On Future Play – Future Play '07, New York, New York, USA, ACM, New York, NY, 2007, pp. 38–45.

[10] N. Crampton, K. Fox, H. Johnston, A. Whitehead, Dance, dance evolution: accelerometer sensor networks as input to video games, in: HAVE 2007 – IEEE International Workshop On Haptic Audio Visual Environments and Their Applications, Ottawa, Canada, IEEE, 2007, pp. 107–112.

[11] A. Whitehead, H. Johnston, N. Nixon, J. Welch, Exergame effectiveness: what the numbers can tell us, in: Proceedings of the 2010 ACM SIGGRAPH Symposium on Video Games, 2010.

[12] A. Whitehead, K. Fox, Device agnostic 3D gesture recognition using hidden Markov models, in: Proceedings of the 2009 Conference on Future Play on @ GDC Canada, Vancouver, British Columbia, Canada, ACM, 2009, pp. 29–30.

[13] R. Slyper, J. Hodgins, Action capture with accelerometers, in: Proceedings of the 2008 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, Eurographics Association, 2008, pp. 193–199.

[14] M. Csikszentmihalyi, Flow: The Psychology of Optimal Experience, HarperPerennial, 1991.

[15] P. Mahalanobis, On the generalised distance in statistics, Proceedings of the National Institute of Sciences of India 2 (1936) 49–55.