Motion to Play: Designing a Large-Scale Motion Capture Game

Katie Salen, Christoph Bregler, Clothilde Castiglia, Jessica DeVincezo, Roger Luke DuBois, Kevin
Feeley, Tom Igoe, Jonathan Meyer, Michael Naimark, Alexandru Postelnicu, Michael
Rabinovich, Sally Rosenthal, Jeremi Sudol, Bo Wright

http://Squidball.net

Abstract. This paper describes Squidball, a new large-scale motion capture-based game. It was tested on audiences of up to 4,000 players last summer at SIGGRAPH 2004. It required the construction of the world’s largest motion capture space at the time, and posed many other challenges in technology, production, game play, and the study of group behavior. Our aim was to entertain the SIGGRAPH Electronic Theater audience with a cooperative and energetic game played collaboratively by the entire audience, who controlled real-time graphics and audio by bouncing and batting multiple large helium-filled balloons across the theater space. In this paper we detail the lessons learned in producing such a system and game.

 


Fig.1.  Electronic Theater audience playing Squidball

1 Introduction
The scenario was simple but challenging: transform a large-scale public space through play, and do so using motion capture technology. The interactivity had to be simple but clear, engaging yet casual, and had to happen in under 15 minutes, for an audience of up to 4,000. Squidball, a game developed for SIGGRAPH’s 2004 Electronic Theater Pre-Show, was the answer to this challenge. Using a calibrated motion capture volume triggering real-time computer graphics, the game debuted on August 12th, 2004, at the Los Angeles Convention Center. It was a game that attempted to discover the framing requirements for the creation of large-scale games within entertainment contexts, which offered low thresholds of entry for players and demanded collaborative, rather than competitive, play. What are the requirements of such game interfaces, what role do players play in discovering and sharing rulesets, and how can technology be used to transform activity within non-traditional game spaces?

This paper describes the design criteria and technology behind this venture and attempts to answer some of these questions. It also explores the adventures and challenges that the production team had to overcome as well as the lessons learned.

 

2 Squidball
SIGGRAPH audiences experienced a similarly interactive Electronic Theater pre-show over a decade ago when the Cinematrix System was introduced in 1991 [1]. Cinematrix was an interactive entertainment system that allowed members of the audience to control an onscreen game using red and green reflective paddles. Other interactive entertainment systems have been tested on audiences in the hundreds to thousands; these are described in greater detail in Section 5.

The success of Cinematrix was the original inspiration for our work, and we initiated our project to bring back this style of pre-show entertainment to SIGGRAPH. Although the 2004 Electronic Theater was our first public test, we envision Squidball being deployed in other large audience events for entertainment, social studies, team building exercises and other potential applications.

Developing and testing such a system was a unique and high-risk venture with many challenges. Other games, graphics and interactive systems are usually designed for a single user or a small group, and go through several test cycles. For the Squidball project, however, we were dealing with many factors orders of magnitude larger than in standard environments, including a gathering of 4,000 people, the construction of a system using a 240 x 240 x 40 feet motion capture volume, and a large projection screen. Furthermore, the system had to work the first time, without the benefit of any full-scale testing.

The design process began by establishing a set of constraints, which would inform the game’s final design and drive its technical innovation. It was determined that the game would rely on the use of 22 cameras mounted on the catwalks above the auditorium, activating the entire volume, and that we would introduce a set of twelve retro-reflective “game controllers” into the space, which players would use to take action in the game. These controllers would act as wireless joystick/mouse inputs, much like the paddles of the Cinematrix, or players’ bodies in games like Ghost in the Cave by Roberto Bresin. We ultimately decided to use helium-filled balls that players could hit around the space as the controllers, and to track the balls using 3D motion capture technology. This data would drive real-time graphics and an audio engine powered through Jitter, which provided feedback to players as to the state of the game through visual and audio cues.

Game Description
The rules for the game were simple, and had to be discovered by participants through game play. The 12 weather balloons (input devices) in physical space were represented within the digital game space as green spheres on the screen. Players moved the weather balloons around the auditorium (whose space corresponded to a 3D space onscreen), in order to destroy changing grids, populated by 3D target spheres. The game had three levels of increasing complexity, and each level could be replayed three times before a loss condition was reached. The second level introduced an element of time pressure, so players had to complete the game challenge within the allotted time period. This was the level that taught players how the game worked, as most audiences failed to clear the level on the first try. The final level required players to uncover a number of special red dots that when connected, revealed the image of a teapot, a reference to the first 3D computer graphic. This small reference to the history of the conference and its membership was well appreciated and provided a nice visual culmination to the game.

Through repetition and the existence of a loss condition, the players eventually discovered the victory condition, as well as the correspondence between the balls and their representation within the virtual game space. Players quickly discovered a range of social strategies that emerged from their physical proximity with other players; in each instance of the game (the game was played 6 times over the course of 4 days) the 4,000 or so players came together organically to collaborate in the play as they discovered what the game play required of them. It was an interesting first step in designing a kind of game that was extremely simple in its rules and interaction but relatively complex in the forms of social dynamics it spawned.

 

3 Game Design
Any interactive experience designed to be played by a mass audience within a specific venue comes with a host of contextual considerations. For Squidball, these considerations ranged from a need to transform an audience unprepared for game play into willing participants, to the need to create a game that could be played casually, and collaboratively, by an auditorium of strangers. The following is an overview of the specific design challenges that emerged from the constraints peculiar to the context for which we were designing. These challenges informed the creation of both the game concept and the game engine, which was developed in parallel and used to prototype game concepts during the development process.


Fig.2.  Example Screen shot of Squidball game. Please see video (on squidball.net) for game in action.

3.1 Rules to be Discovered
In the case of Squidball, we knew we needed to design a game that required no explanation of the rules: players would find themselves “at play” when the balls were introduced into the audience, and had to use their discovery of the rules of interactivity to understand how the game was to be played. This element was crucial to the success of the game, for if players never understood the relationship between their actions in hitting the balls around the space, and the implications of those actions within the game itself, then the design would fail.

Several specific design choices were made to facilitate the discovery of rules. First, we had to establish a one-to-one connection between the balls in the physical space, and their representation as virtual objects on screen. In order to make this relationship clear and as easy to ascertain as possible, we chose to represent the balls as colored spheres on screen, and to attach visual and audio trails to the objects to help communicate the spatial trajectories of the balls. This visual one-to-one correspondence was intended to help players understand the link between the physical and virtual representations of the balls, and to lead them to understand that action taken with the balls in the real world space translated into action taken within the virtual game space.

Second, we relied on a time pressure component (levels were timed) and the need to replay levels as an educator of game rules. Once the group failed to clear a level, the goal became clear, and players quickly came to understand what was required of them from an interactive perspective. This transformation from audience member to game player happened consistently within the first four minutes of the game, and players that made the connection began to shout out this understanding to other players, acting as agents of the game rules. While we have no accurate measurements of the time taken to make this transition, or the rate of conversion, it was observed that it took as little as 10% of the audience making the connection to drive the transparency of the interactivity. It was remarkable to watch chaos transformed into coordinated action, with players yelling out game instructions and strategies to others, and the balls being intentionally moved to specific areas of the auditorium in order to clear the level. One player even came up with a degenerate strategy, which involved him running around the game space holding a ball above his head, clearing the board Pac-Man style. This strategy, as well as others that emerged through play, was a direct consequence of the discovery of the rules through play.

3.2 Make Balls More Fun
One of the more significant interactive challenges we encountered had to do with our choice of input device. How could we make hitting around a giant helium ball more fun than it already is? There is a well-documented history of the fun of beach balls in large auditorium spaces (rock concerts, football games, etc.); our challenge was to turn the pleasure of random interaction into intentional and meaningful action with consequence within a larger system. We certainly did not succeed with all members of the audience on this point: many players we spoke with were only interested in the fun of hitting the ball around, and did not engage with the game directly. This is certainly one potential failing of the game’s design overall, but fortunately for us most members of the audience chose intentional action over random interaction once they were initiated into the game’s objectives.

So how did we make hitting balls more fun? One simple way was to make the game a bona fide game, one that could be won or lost. Establishing a quantifiable outcome, or victory condition, for the game was critical, as it layered intention onto interaction with the balls. Without a win/loss condition, the players would have had less motivation to discover the consequences of moving the balls around the space. With a win/loss condition, players were driven to see the effects of their actions, and to test out this relationship through coordinated movement. The addition of intent transformed the fun of hitting balls into a strategy of play. We know that players derive great satisfaction from acting strategically within a game; when they can be made to feel clever, this satisfaction becomes even richer. By using an already established play mechanic (hitting balls) within an interactive system where the consequence of interaction was translated into success or failure toward the meeting of the game’s goal, we were able to make balls more fun.

3.3 Motion Capture Centric
The decision to use 3D motion capture technology had a large impact on the game design. While we had many ideas about games that could be played by 4,000 people in a large public space, making a game that had the capabilities of motion capture at its heart was deeply challenging. Add to this the constraint of a limited number of controllers (we determined 20 balls to be the maximum number of objects that could be simultaneously tracked) and the need to present the computer graphics version of the game space on a screen positioned in the front of the auditorium, which spanned less than 1/3 of the auditorium’s width. Each of these constraints had its own consequences for the game’s final design. The need to make the game motion capture specific led us to reject game mechanics focused on small, detailed interactions between players (players exchanging cards, for example) or games with verbal exchange at their core (trivia games, for example). We had to find a game that was essentially about large motion-based movements, whether these were articulated by the body or by another form of input device. Section 6, Large-Scale Motion Capture, gives a detailed overview of some of the technical considerations we encountered.

3.4 The Casual Gamer
While the term “casual gamer” currently refers to a new demographic of gamer within the videogame industry (players looking for games that can be played in less than the traditional 50-70 hours demanded of console games), in the case of Squidball it refers to a game that lasts for a short period of time, has an extremely low threshold of entry for players, requires casual interaction (physical, intellectual, social), and in the case of the SIGGRAPH participants, can be played with a beer in one hand. We were limited to a small number of input devices (approximately 24 players out of 4,000 could touch a ball at any one time), which worked to our advantage in this respect. A player’s participation was, in some sense, casual by default, as the game did not require a player to invest heavily in every single moment of play. Similarly, the game took advantage of the fact that many players could take on roles beyond “ball hitter.” The role of spectator was engaging in and of itself, due to the spectacular nature of the balls moving around the large auditorium and the energetic pandemonium that ensued once the balls were introduced into the space. Players also took on roles as strategists, yelling out directions to other players, as well as enforcers, corralling groups of players to move the balls to specific areas of the auditorium. The intersection of these roles led to any number of ways that a player could participate in Squidball, allowing play to span a continuum from pure spectator to hard-core gamer.

3.5 Eye Hand Screen Coordination
An additional issue arose from the need for players to divide their attention between the projection screen where the game board was visualized and the balls themselves. It requires some ergonomic ingenuity to look up and forward at the same time, which was one of the weaknesses of the current design. As a result, some players decided to watch only the screen. Others ignored the screen and simply pushed the balls towards the center of the room. Initially, a relatively small percentage of “aware” players actually watched both and drove the game play forward. The number of “aware” players increased dramatically towards the midpoint of the game, demonstrating that the game design principles were working, despite the challenge of eye-hand-screen coordination. Split attention remains an issue for any game that involves thousands of people, balls, and a single screen.

One solution might be to place multiple screens on all sides of the audience. However, this introduces another difficulty: coordination. Even with a single screen, players had difficulty coordinating the balls and target locations: the player must face one direction, look at a screen over their shoulder, and then punch a ball in a third direction towards a target. Since few people have much practice at this activity, balls were popped left when they should have been popped right, or forward rather than back. Adding more screens would compound this issue. The problem of having to watch the balls and the screen is fundamental. Originally we also considered audio-only games, or games in which balls hitting each other formed the primary play mechanic. We plan to reconsider those ideas in future experiments, and to do more detailed evaluations of the audience interaction.

3.6 Dynamic Difficulty Adjustment
Many games have something known as DDA, or dynamic difficulty adjustment, built into their system. DDA allows the game system to adjust to a player’s performance, allowing them more opportunity for success within the system. Racing games often use DDA to give a player a sense that they have a chance to win, even when they are doing poorly. In a game like Super Monkey Ball for the Nintendo GameCube, for example, the number and kind of powerups will change, depending on how a player is doing. If they are moving too slowly, the system will place more speed powerups in the path of the player, increasing his or her chances of improving this metric.

Squidball used DDA as well, because we wanted to create as engaging an experience for our players as possible, adjusting the system on occasion to ensure players had maximum opportunity for success. In the control booth, we had a control to alter the sensitivity of the game grid. Turning up the sensitivity made the virtual target sizes larger, making game play easier. Turning down the sensitivity made the virtual target sizes smaller and game play harder.
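The effect of such a sensitivity control can be illustrated with a minimal sketch. It assumes that each virtual target is a sphere and that sensitivity simply scales its hit radius; the names `Target`, `BASE_RADIUS_FT`, `hit_radius` and `update_targets`, and the numeric values, are hypothetical and are not taken from the Squidball engine.

```python
import math
from dataclasses import dataclass

BASE_RADIUS_FT = 5.0  # hypothetical default hit radius of a virtual target


@dataclass
class Target:
    x: float
    y: float
    z: float
    cleared: bool = False


def hit_radius(sensitivity: float) -> float:
    """Higher sensitivity means a larger virtual target (assumed linear scaling)."""
    return BASE_RADIUS_FT * sensitivity


def update_targets(targets, balls, sensitivity):
    """Mark targets lying within the hit radius of any tracked ball.

    Returns the total number of cleared targets.
    """
    radius = hit_radius(sensitivity)
    for target in targets:
        for ball in balls:
            if math.dist((target.x, target.y, target.z), ball) <= radius:
                target.cleared = True
                break
    return sum(t.cleared for t in targets)
```

With this kind of mapping, an operator who raises the sensitivity lets the same ball positions clear targets that were previously just out of reach, which is the "boost" behavior described in this section.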


Fig.3.  Squidball Gamers

For successful game play, we felt it was essential that the players be able to make mistakes and learn from them. So, by default, we set the sensitivity fairly low. However, in some circumstances we increased the sensitivity to temporarily make the game play easier, giving the audience a little “boost”. A person in the control booth was responsible for watching the progress of the game and making these “group mind” decisions regarding when to adjust game play. We believe that similar controls were included in Cinematrix.

Even with a sensitivity control, there were some issues. One problem was that the audience size at the start of several of the shows was less than 4,000 people, which we had not anticipated in our game design. Because of this, some of the game levels proved hard to clear, because virtual targets were located in places where few audience members could reach them. This problem could be addressed in the future by creating multiple configurations for different audience sizes.

A second issue was uneven audience distribution. Some people were in sparse sections of the audience, and did not get to participate as much as others. To address this concern, we enlisted student helpers to help move balls around.

 

4 Large-Scale Game Design: Prototyping and Playtesting Challenges
The design of Squidball used an iterative design process in both the creation of the game and the technology that drove it. During the design phase, we brainstormed the basic concept for the game, including the core mechanic and some first thoughts about technology constraints, visual and audio direction, etc. Once the idea of the core mechanic was settled, we refined it for the purpose of prototype production, reducing it to its most basic, easily implementable form.

In the case of Squidball, this meant establishing a rapid prototyping environment connected with the real-time motion capture input, one that generated real-time computer graphics and sound effects. This phase involved the software integration of motion capture and the game engine. The real-time visualization system and game engine were written using the Max/MSP/Jitter development environment distributed by Cycling ’74. The system consisted of the following main components:

  • A TCP socket communication system, which distributed motion capture data from the Vicon system to the tracking, game-engine and audio subsystems in real-time.
  • A real-time tracking module that would take the raw motion capture data, filter it using Kalman filters and then extract useful metrics such as object velocity and collision detection. (This was necessary to produce consistent sound effects with the balls flying high up in the air.)
  • A game engine written partly in Java and partly as Max patches, which drove the game simulation. Sub modules of this system included components that handled the basic game narrative, the media files, and the interface to the graphics engine. The graphics engine used Jitter to render the game using OpenGL commands.
  • An audio subsystem resident as a Max patch on a second computer (and receiving forwarded motion capture information as well as scene control from the main Max computer). It produced sound for collision, bounces, flying noises, etc.
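The filtering step in the tracking module can be illustrated with a minimal sketch of a constant-velocity Kalman filter for one coordinate axis of one ball, which smooths position measurements and yields a velocity estimate. This is an illustration under stated assumptions, not the filter used in the show: the class name `ConstantVelocityKalman1D`, the white-acceleration process model, and all noise parameters are hypothetical.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state (position, velocity) and position-only measurements."""

    def __init__(self, dt, accel_var=1.0, meas_var=0.25):
        self.dt = dt
        self.q = accel_var   # assumed variance of unmodeled acceleration
        self.r = meas_var    # assumed variance of the position measurement
        self.p = 0.0         # position estimate
        self.v = 0.0         # velocity estimate
        # Large initial covariance: the initial state is essentially unknown.
        self.P = [[1000.0, 0.0], [0.0, 1000.0]]

    def update(self, z):
        """Predict one time step ahead, then correct with measurement z."""
        dt, q, P = self.dt, self.q, self.P
        # Predict: x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        p_pred = self.p + self.v * dt
        v_pred = self.v
        p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q * dt ** 4 / 4
        p01 = P[0][1] + dt * P[1][1] + q * dt ** 3 / 2
        p10 = P[1][0] + dt * P[1][1] + q * dt ** 3 / 2
        p11 = P[1][1] + q * dt * dt
        # Correct: measurement matrix H = [1, 0].
        s = p00 + self.r                # innovation variance
        k0, k1 = p00 / s, p10 / s       # Kalman gain
        residual = z - p_pred
        self.p = p_pred + k0 * residual
        self.v = v_pred + k1 * residual
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.p, self.v
```

One such filter per axis and per ball would give smoothed positions and velocities at the capture rate; metrics such as speed can then be derived from the velocity estimates rather than from noisy frame-to-frame differences.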

It is crucial to point out that all the decisions made during the design phase were understood as conditional. As the game developed and playtesting of basic interactive mechanics revealed strengths and weaknesses of the current prototype, it became necessary to make often radical changes to the game concept. The process of spec, build, and test continued until we established that the core mechanic was fun. The prototyping environment described above proved robust and fast enough to use in the final show, which was critical, as we continued to tweak the game until the night before the opening.

During the final production phase, we ran two duplicate sets of the game and audio computers for redundancy, with a switch, but fortunately this was never needed. The system required five human operators during the shows: one on sound, one on the game system, one monitoring the Vicon PC, one watching the audience, and a master show controller who coordinated the team.

Playtesting was a major challenge. Gathering a 4,000-player audience is difficult and expensive, so we had limited opportunities to test the game at full scale. Though we were able to use spaces of approximately the same size as the final space for testing, we were not able to test with a full-scale audience. This led to a number of creative solutions in testing, including clumping groups in various locations in the testing space and organizing the clumps strategically to make them appear to be a larger audience. However, the first test of the game with a full audience was not until its premiere at SIGGRAPH.

 

5 Related Work
As mentioned, the inspiration for Squidball came from the Cinematrix system [1] shown at the SIGGRAPH 1991 Electronic Theater and several other events. With this system, every audience member had red and green reflective paddles that controlled on-screen games, including a voting system, Pong, and a flight simulator. In the voting schema, the system counted how many red vs. green paddles were shown. In Pong, the left side of the audience played against the right side, and the ratio of red to green paddles on each side of the audience controlled the position of that side’s on-screen paddle. It was surprising how quickly the audience learned to control the games and to jointly coordinate the mix between red and green paddles. Of course, the yelling and excitement of a large audience was also part of the show. Another set of similar interactive techniques was studied at student theater screenings at CMU [2]. Several computer-vision-based input techniques were tested on large audiences. The input technique most closely related to Squidball is the 2D beach ball shadow tracking, where the location of the shadow could be used as a cursor in several 2D games. D’CuCKOO (a band that uses various kinds of new technological instruments) designed a gigantic beach ball that creates music as the audience bats it around. The MIDI-Ball [3], a wireless 5-foot sphere, converts radio signals into MIDI commands that trigger audio samples and real-time 3-D graphics with every blow. Other systems have been reported that track small groups of people as they perform interactive music and dance activities [4], or that are used for home video games [5], but none of them have been tested on thousands of players.

6 Large-Scale Motion Capture
In the following section we describe the challenges and experiments of building a large-scale motion capture space and how this ties into the Squidball game engine and game testing.

Our target venue was Hall K in the Los Angeles Convention Center, which was converted into a 4,000-seat presentation environment to screen the Electronic Theater for the 2004 SIGGRAPH conference. The total space was 240 x 240 feet. We needed to build a motion capture volume that covered the entire seating area and allowed enough height above it to throw the balls up in the air: a capture volume of 190 x 180 x 40 feet. To the best of our knowledge, no motion capture space of this size had been built before. One of the larger reported spaces was built for the Nike commercial by Motion Analysis Corp and Digital Domain [6]. It had dimensions of 50 x 50 x 10 feet and used 50 cameras to track six football players.

6.1 Interactive Motion Capture Space
One of our design constraints was tracking multiple (up to 20) balls in 3D in real-time. We had a Vicon motion capture system [7] with 22 MCAM2 cameras, and each camera had a field of view of 60 degrees (12.5mm lens) and 1280 x 1024 pixel resolution. In its intended use, the system can track standard motion capture markers (0.5 inch) in a capture distance of up to 25 feet. The markers are made of retro-reflective material. Visible light illuminators placed around the camera lens shine light out, and almost all light energy is reflected back into the camera. This makes the retro-reflective markers appear significantly brighter than any other object in the camera view, and image processing (thresholding and circle fitting) is used to track those markers in each view. Triangulation of multiple camera views results in very accurate and robust 3D marker tracking. (We are currently also considering vision-based techniques on non-retro-reflective objects for future game experiments.)
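The triangulation step can be illustrated with a minimal two-camera sketch. It assumes each calibrated camera turns a detected marker into a viewing ray (an origin and a direction), and estimates the 3D position as the midpoint of the shortest segment between the two rays. The function name `triangulate_two_rays` is hypothetical, and a production tracker such as Vicon's combines many views and handles calibration and noise in ways not shown here.

```python
def triangulate_two_rays(o1, d1, o2, d2):
    """Return the midpoint of the closest approach of two 3D rays.

    o1, o2: camera centers; d1, d2: viewing directions (need not be unit length).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom   # parameter along ray 1
    t = (a * e - b * d) / denom   # parameter along ray 2
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

For example, two cameras 40 feet up and 100 feet apart whose rays both pass through a ball recover that ball's position; with noisy detections the rays no longer intersect exactly, and the midpoint serves as a compromise estimate.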

We determined that the only way to utilize the Vicon motion capture system in a significantly larger space with the same number of cameras was to scale up each aspect of the system. The cameras’ views scale up in an approximately linear fashion; in other words, a marker 100 times larger in diameter and 100 times further away looks the same to the camera. Of course, because light intensity falls off with the square of the distance traveled, much greater illumination is necessary. With experimentation, we found that halogen stage lighting provided sufficient illumination for the Vicon tracker.
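The scaling argument can be checked with a short calculation. The sketch below assumes an ideal pinhole camera with the stated 60 degree field of view across 1280 pixels and ignores lens distortion and the reduced usable field of view noted later; the function names are hypothetical.

```python
import math


def focal_length_px(fov_deg=60.0, image_width_px=1280):
    """Focal length in pixels of an ideal pinhole camera."""
    return (image_width_px / 2) / math.tan(math.radians(fov_deg) / 2)


def projected_diameter_px(marker_diameter_ft, distance_ft,
                          fov_deg=60.0, image_width_px=1280):
    """Approximate on-sensor diameter of a marker near the optical axis."""
    return focal_length_px(fov_deg, image_width_px) * marker_diameter_ft / distance_ft


def relative_illumination(distance_ft, reference_distance_ft):
    """Inverse-square falloff of light intensity relative to a reference distance."""
    return (reference_distance_ft / distance_ft) ** 2


standard = projected_diameter_px(0.5 / 12, 25.0)   # 0.5 inch marker at 25 feet
large = projected_diameter_px(16.0 / 12, 250.0)    # 16 inch marker at 250 feet
print(f"{standard:.2f} px, {large:.2f} px")
print(relative_illumination(250.0, 25.0))          # fraction of light left at 10x the distance
```

Under these assumptions a standard marker at its maximum working distance covers just under 2 pixels, and a 16-inch marker at 250 feet covers roughly 6 pixels, while light travelling ten times as far is spread about a hundred times more thinly. This is consistent with the need for both larger markers and much stronger illumination.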

Three other challenges in scaling up the system were 1) producing the larger markers, 2) dealing with camera placement constraints, and 3) calibrating the space. All of them appear simple in theory but, in practice, these became critical production issues.

 

6.2 How to Produce the Balls (Markers)
The motion capture system requires spherical markers for effective tracking. Normal markers are 0.5 inch in diameter and are covered with 3M retro-reflective tape. In order to stage our event in a radically larger-than-normal space, with larger-than-normal camera view distances (max. 250 ft.) and using the balls as real-time inputs, we had to increase the size of those markers significantly. We determined that a 16-inch diameter marker was the smallest marker that could be robustly detected at 250 feet. In the final game, for dramatic effect and game play, we opted for larger markers: 8-foot chloroprene bladders (weather balloons). In order to achieve the right “bounciness”, we under-inflated the balloons.

Each marker requires a retro-reflective coating in order to be tracked by the Vicon motion capture system. Experiments using retro-reflective spray-paint failed: the reflective intensity of the sprayed paint was 70% to 90% lower than that of 3M retro-reflective tape. After dozens of tests with tapes and fabrics of varying color, reflectivity and weight, we settled on a specific 3M retro-reflective fabric (model # 8910). Figure 4b shows the results of these tests: the first test (the “lemon”), the second iteration (the “tomato”), and the final version (the “orange”), which ultimately produced a perfectly round spherical shape.


Fig.4a. 0.5 inch markers and the 3M retro-reflective tape with which the markers are coated.


Fig.4b. The evolution of our retro-reflective balls.

In order to achieve a perfect spherical shape and to spread the force evenly throughout the surface, the fabric was cut on the bias, in panels like those of a beach ball. At this large scale, any of these shapes was adequate for the Vicon system to track. The advantages of a perfect sphere were both aesthetic and functional. With the force evenly distributed, one spot is no more likely to rip the fabric than any other spot. Similarly, hitting the ball anywhere has the same predictable result. The balls were inflated with helium to reduce their weight. Fortunately, because the fabric was heavy, they did not float away when filled with helium. In future generations of this game, we plan to further reduce the weight of the balls by using a different material or smaller balls.

 

6.3 Camera Placement
Standard camera placement for a motion capture system is an “iterative refinement process” dependent on several site-specific aspects. For standard motion capture, cameras are usually placed on a rectangle around the ceiling, all facing into the capture space. Sample motion capture markers are distributed over the capture space, and cameras are adjusted so that each marker is seen by as many cameras as possible from as many directions as possible. Additionally, the tracking software is checked for each camera during placement.

In our scaled-up system, camera placement was a significant challenge. We could not afford as many trial-and-error cycles in camera placement as would be possible in a standard-sized motion capture lab, since our time in the final space was limited and each adjustment took a significant amount of time. Other logistical constraints affecting camera adjustment included: A) cooperating with the schedule of the LACC union workers to get access to the ceiling and catwalks; B) coordinating by radio between people on the 40-foot-high ceiling and people on the floor for each re-mount and re-alignment of a camera; C) getting live feedback from the Vicon PC station in the control booth to people on the ceiling so they could see the effects of their adjustments; D) camera view limitations: the 60 degree wide-angle lenses did not actually see a full 60 degree angle of view, and even with the extra heavy studio lights mounted next to the cameras the visibility of the weather balloons dropped off after 250 feet in the center (and at even shorter distances at the perimeter of the camera view); and E) scale issues: moving balls on the ground took much longer because of their large size and the distance to be covered. In a standard mo-cap studio, you pick up a marker and lay it down a few seconds later; in this space, we had to move a shopping cart with a ball or drive an electric car across the hall.

3D Simulation in Maya
In anticipation of all those problems, we designed a 3D model in Maya for each of the target spaces, including one for the campus theater (our first test), one for the campus sports center (our second, third and fourth tests), and one for Hall K at the LACC (Figure 3), our final show. This final model was derived from blueprints we obtained from the building maintenance team.

We also built a 3D model of the “visibility area” to determine the sight lines of the cameras. We experimented in the campus sports center and determined that the cameras could not “see” the balls beyond 250 feet at the center of their view. The further off-center we moved the balls, the shorter the visible distance became. To measure this, we placed four cameras at one end of the court facing the other end. Moving a ball around the capture space, we marked the 3D locations where the ball vanished from the view of each camera. Given this data, we built a 3D Maya model of the camera visibility volume (Figure 5, green). This volume was then used in our Maya building model to simulate several camera placement alternatives. Our goal was that each point in the capture volume should be seen by at least 3 cameras, given the constraints on the lengths of the video cables. The final configuration we used was very close to our simulation. We settled on mounting all 22 cameras evenly along the left and right catwalks, the back-end catwalk and the center catwalk, but not the front catwalk. We did not want to mount any cameras or high-intensity lights above the screen, so that the audience would not be distracted. (Consult our website to see the final camera placement in detail.)
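The coverage goal described above (every point seen by at least 3 cameras) can be illustrated with a minimal Python sketch. This is not the Maya simulation itself: the cone shape, the 30-degree half angle, the linear fall-off of range towards the edge of the view, and the function names `sees` and `coverage` are all assumptions made for illustration. Only the 250-foot on-axis range and the three-camera criterion come from the text.

```python
import math

def sees(cam_pos, cam_dir, point, max_range=250.0, half_fov_deg=30.0):
    """Return True if `point` lies inside a simplified visibility cone.

    Assumption: the usable range is `max_range` feet on the optical axis
    and shrinks linearly to half of that at the edge of the cone.
    """
    v = [p - c for p, c in zip(point, cam_pos)]
    dist = math.sqrt(sum(x * x for x in v))
    if dist == 0.0:
        return True
    norm = math.sqrt(sum(x * x for x in cam_dir))
    cos_angle = sum(a * b for a, b in zip(v, cam_dir)) / (dist * norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    angle = math.degrees(math.acos(cos_angle))
    if angle > half_fov_deg:
        return False
    reach = max_range * (1.0 - 0.5 * angle / half_fov_deg)
    return dist <= reach

def coverage(cameras, points, min_cameras=3):
    """Fraction of `points` seen by at least `min_cameras` cameras.

    `cameras` is a list of (position, direction) pairs in feet.
    """
    covered = 0
    for p in points:
        visible = sum(1 for pos, direction in cameras if sees(pos, direction, p))
        if visible >= min_cameras:
            covered += 1
    return covered / len(points)
```

Candidate camera layouts can then be compared by sampling a grid of points over the capture volume and keeping the layout with the highest `coverage` value.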

Mounting and Networking
We knew we had to set up the system at the LACC very quickly, so we ran multiple practice sessions for camera mounting in New York, first with 10 cameras and then with 22.

The setup required careful cabling. The Vicon system sends camera data first through analog wires to a Datastation, which thresholds the video frames and compresses the resulting binary images. The Datastation then sends all 22 video streams at 120 Hz over gigabit Ethernet to the Vicon PC, which does the real-time 3D tracking. To keep the video cables as short as possible, the Datastation had to be close to the cameras. In Hall K at the LACC, the Datastation was placed 40 feet above the audience on one of the catwalks. The compressed video data was then sent via gigabit Ethernet down to the “control booth” on the floor behind the audience. The control booth contained all the workstations, including the real-time 3D tracker and the game system.
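To make the threshold-then-compress step concrete, here is a minimal Python sketch of the idea. The Datastation's actual threshold level and compression format are not described here, so the brightness level of 200, the use of run-length encoding, and the function names are assumptions chosen purely for illustration.

```python
def threshold_row(row, level=200):
    """Turn one row of grayscale pixels (0-255) into a binary row.

    Bright pixels (retro-reflective markers) become 1, the rest 0.
    The level of 200 is an assumed value, not a Vicon setting.
    """
    return [1 if px >= level else 0 for px in row]

def run_length_encode(bits):
    """Compress a binary row into (value, run_length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [(value, count) for value, count in runs]

row = [10, 12, 250, 255, 240, 9, 8, 7]
encoded = run_length_encode(threshold_row(row))
# encoded == [(0, 2), (1, 3), (0, 3)]
```

Because a thresholded marker image is mostly dark background with a few bright blobs, a scheme of this kind reduces each row to a handful of runs, which is one plausible reason the binary images are compact enough to stream from many cameras over a single network link.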


Fig.5.  This shows our Maya model of Hall K and the visibility cone (green) for one of the cameras.

During camera placement we operated the Vicon PC in the control booth through a wireless laptop and Remote Desktop. This allowed us to walk from camera to camera on the catwalk, and do all adjustments, while remotely monitoring what the camera “sees” and how the tracking software performs.

The wireless bandwidth was high enough to do this with acceptably low latency. Of course, we had to use radio communication and lots of yelling from the ceiling down to people on the floor (who were moving target marker balls around) in order to adjust the cameras properly. In our NYC locations, this usually worked very well, and we had the cameras up in 1 to 2 hours. Unfortunately, at the LACC we encountered quite a few surprises. As mentioned above, complying with the union workers' schedule was a challenge, because they always had to stand next to us while we were mounting cameras on the catwalk or keeping the motion capture space free of “occlusion.” Also, many other parties needed access to Hall K during our setup. Finally, we ran into cross-talk on the wireless network for our laptops from the LACC building antennas and from the exhibition space next door. Nevertheless, we were able to mount all 22 cameras and cables within one day.

6.4 Calibrating the Space
The final challenge for the motion capture setup was camera calibration. With the Vicon software, calibration in a standard space is done by waving a calibration object throughout the entire capture volume. Usually, this is a T-shaped wand that has 2 or 3 retro-reflective markers placed on a straight line (Figure 6). The 2D tracking data for the calibration object from each camera is then used to compute the exact 3D locations, directions and lens properties of the cameras. The result is called the calibration data, and it is crucial for accurate 3D tracking.

Of course, the standard T-wand would not be seen by any camera in such a large target space, since its markers would fall below pixel resolution. We determined that a 16-inch marker was the smallest marker that could reliably be seen and tracked from 250 feet. We therefore built several large “calibration T-wand” versions. Figure 6 shows one version that allowed us to “wave” the calibration object as high as 30 feet. On the roof of our lab, we conducted initial tests of how much time an “exhaustive volume coverage” would take and how physically exhausting it would be. In the campus theater space and the campus sports center, we either walked the wand around, holding it at several heights, or skateboarded through the space. In the final test at Hall K, we first used a crane and ropes. Ultimately, we ended up using a T-wand constructed out of lightweight bamboo sticks lashed together using a traditional Japanese method, which we then drove around at several heights on an electric car. A calibration run took around 30 minutes. Tensions ran high during calibration in Hall K, but the process was a success (see the video on squidball.net) and we had a spot-on reading of Hall K right up to the periphery of the seating areas. In fact, we were able to track the balls beyond the boundaries of the game “board”.
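The pixel-resolution argument above can be checked with a rough back-of-the-envelope calculation, sketched here in Python. It assumes an ideal pinhole camera with a full 60-degree horizontal field of view (the text notes that the real lenses saw somewhat less), a hypothetical sensor width of 1280 pixels, and a hypothetical 1-inch marker for a standard wand; none of these three values is taken from the system described here.

```python
import math

def marker_width_px(marker_in, distance_ft, fov_deg=60.0, sensor_px=1280):
    """Approximate on-sensor width, in pixels, of a marker.

    marker_in   -- marker diameter in inches
    distance_ft -- distance from the camera in feet
    fov_deg     -- assumed horizontal field of view (ideal pinhole model)
    sensor_px   -- assumed horizontal sensor resolution (hypothetical)
    """
    distance_in = distance_ft * 12.0
    view_width_in = 2.0 * distance_in * math.tan(math.radians(fov_deg / 2.0))
    return marker_in / view_width_in * sensor_px

large = marker_width_px(16.0, 250.0)  # 16-inch marker at 250 feet
small = marker_width_px(1.0, 250.0)   # hypothetical 1-inch wand marker
```

Under these assumptions the 16-inch marker spans several pixels at 250 feet, while a 1-inch marker covers well under one pixel, which is consistent with the observation that a standard wand would be invisible at this scale.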


Fig.6. Left: The standard-sized calibration object, 15 inches long. Right: The large, 15-foot-high wand.


Fig.7. Calibration in Hall K using a crane.

 

7 Conclusions
After all the hard work to create and set up Squidball for SIGGRAPH 2004, the roar of the crowd at the end of each level was a gratifying validation of our efforts.

Since Squidball, we have been discussing possible iterations for future games. One option is to use spotlights shone onto the crowd as targets, rather than targets on a virtual screen. This would address some of the game play issues we encountered. The audience would have a physical cue showing where they are trying to get the balls to, rather than a virtual cue shown on a screen over their shoulder. With spotlights, it would be possible to create roving patterns that ensure everyone gets a chance to participate, taking into account the audience density and distribution. Of course, spotlights introduce a whole new set of technical challenges, though none that are insurmountable. We are considering this and other game design changes for Squidball 2.

Fig.8. More pictures of Squidball players.

Acknowledgments
This work has been partially supported by NYU, NSF, ACM SIGGRAPH, Vicon Motion Systems Ltd, Apple Computer Inc, Advanced Micro Devices Inc, Alienware, Cycling 74, David Rokeby: very nervous systems, NVIDIA Corp., and Segway Los Angeles. The audio was supplied by Skott (“Insert Coin” and “Re-Atari”), Max Nix (“Superkid”), and The Jesse Styles 3000 (“Watson Songs”). Furthermore, we would especially like to thank our friends from AVW-TELAV, specifically Jim Irwin, Gary Clark, John Kennedy, Tom Popielski, Gerry Lusk, Mark Podany and Mike Gilstrap. Without their production efforts this would have been impossible. Special thanks also go to Debbi Baum, Robb Bifano, Alyssa Lees, Jared Silver, Lorenzo Torresani, Gene Alexander, Boo Wong, Damon Ciarrelli, Jason Hunter, Gloaria Sed, Toe Morris, Scott Fitzgerald, Ted Warburton, Chris Ross, Cindy Stark, Brian Mecca, Carl Villanueva, and Pete Wexler for all their great help. Finally, we thank the SIGGRAPH chair Dena Slothower, who let us use their venue for the first big test.

 

References

  • CARPENTER, L., 1993. Video imaging method and apparatus for audience participation. US Patent #5210604, #5365266.
  • MAYNES-AMINZADE, D., PAUSCH, R., AND SEITZ, S. 2002. Techniques for interactive audience participation. In IEEE Int. Conf. on Multimodal Interfaces, Pittsburgh, Pennsylvania.
  • BLAINE, T. 2000. The outer limits: A survey of unconventional musical input devices. In Electronic Musician.
  • ULYATE, R., AND BIANCIARDI, D. 2004. The interactive dance club: Avoiding chaos in a multi participant environment. In Int. Conf. on New Interfaces for Musical Expression.
  • FREEMAN, W. T., TANAKA, K., OHTA, J., AND KYUMA, K. 1996. Computer vision for computer games. In 2nd International Conference on Automatic Face and Gesture Recognition, Killington, VT, USA, IEEE.
 
       

Copyright 2006 Squidball Team
Squidball Contact: info@movement.nyu.edu
Web Design by Sean Patrick Henry