Locomotion
Issue: Volume 27, Issue 12 (December 2004)

When filmmakers think about bringing a storybook to life, they usually imagine re-creating the fantasy world by using animation or by putting actors into the roles of the storybook characters. Shrek is a good example of the former; last year's Peter Pan is a good example of the latter. This year's The Polar Express is an example of neither...or, perhaps, both.

Based on Chris Van Allsburg's illustrated book of the same name, The Polar Express tells the story of 8-year-old Chris's train ride to the North Pole on Christmas Eve. It looks like an animated feature, and it was created entirely with 3D computer graphics, including the IMAX release. But Robert Zemeckis directed it like a live-action film, with actors performing nearly all the human characters.

"Bob wanted to direct actors, not a bunch of animators," says Sony Pictures Imageworks' Jerome Chen, co-visual effects supervisor. What he wanted, to be precise, was Tom Hanks—whose production company had optioned the book, and with whom he worked on Cast Away and Forrest Gump—to play the part of the boy...and the train conductor, the hobo, Santa, and the boy's father. "He didn't want Tom to just voice the characters," says Chen. "He wanted him to act them." And he wanted adults to play all the other children, too. Imageworks made it possible.
Before The Polar Express train finishes its journey to the North Pole, it takes its passengers on a wild ride that becomes even more exciting in the 3D IMAX version thanks to 3D graphics.

The actors—Hanks, other adult actors who played the parts of children and adults, child actors, acrobats, small people, and even musicians—performed on three stages. They were all fitted with costumes and some wore makeup. The dialogue was recorded while they were filmed, much as it is for any live-action film. But it wasn't like any other live-action film. The stages were motion-capture stages. The sets and props were built out of blue chicken wire, and when the actors wore costumes on stage, the costumes were made of mesh so they wouldn't occlude the motion-capture cameras. Rather than one cameraman working with the director, 12 cameramen using off-the-shelf video cameras filmed the actors in the round. And motion-capture equipment recorded the actors' every movement.

"The set looked like the holodeck from Star Trek," says Chen. "Everything was gray and black with all these high-tech looking lights. And 20 feet away, inside the "widow maker" [a separate room], people were making sure the computers were working and the network cards weren't failing. It was so weird. I never thought I would see Tom Hanks in a motion-capture suit with Bob Zemeckis directing him."

Once the crew members knew they would be capturing Hanks while Zemeckis was directing him, they began looking at existing motion-capture technology. Until then, body and facial performances had been captured separately, and often the facial performance was hand-animated rather than captured—as it was for Gollum in The Lord of the Rings, for example. Animators then pasted the two separate performances together.

That was not what The Polar Express crew had in mind. "When you have an actor like Tom Hanks, you don't want to split him apart," says Chen. "We needed a place where four actors could look at one another and move in any direction, and we could record their faces and bodies."

As a result, Imageworks, with the help of Vicon, developed a performance-capture system to do exactly that. With this system, based on Vicon motion-capture equipment, they captured the faces and bodies of up to four actors working on a 10- by 10- by 8-foot stage using 64 motion-capture cameras linked together. Demian Gordon, motion-capture supervisor, designed the camera placement: each camera had a 3- by 5-foot field of view. Vicon's IQ software was upgraded to handle the multiple cameras and dense facial capture; the upgrades were incorporated in version 1.5 and later into the company's new MX system. House of Moves' Diva software was also tweaked to handle the heavy load.
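Vicon has not published its reconstruction code, but optical motion capture generally recovers each marker position by triangulating its 2D image across several calibrated cameras. A minimal sketch of that standard technique follows; the projection matrices and pixel coordinates are hypothetical inputs, not production data.

```python
import numpy as np

def triangulate_marker(projections, observations):
    """Recover one 3D marker position from its 2D observations in several
    calibrated cameras by linear least squares (standard multi-view
    triangulation, not Vicon's actual algorithm).

    projections  -- list of 3x4 numpy camera projection matrices
    observations -- list of (u, v) pixel coordinates, one per camera
    """
    rows = []
    for P, (u, v) in zip(projections, observations):
        # Each camera view contributes two linear constraints on the
        # homogeneous marker position X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # The best estimate is the right singular vector of the constraint
    # matrix with the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize to (x, y, z)
```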
Animation for the train conductor, one of five characters acted by Tom Hanks, was derived from performances directed on stage by Robert Zemeckis and captured with a 360-degree facial- and body-performance capture system developed at Sony Pictures Imageworks.

The actors each wore 151 markers on their faces and 32 markers on their bodies. The markers were the size of a pencil lead, 2.7mm in diameter. Fifteen makeup artists applied these dots to the actors' faces using a map drawn by Alberto Menache, senior CG supervisor, who developed the facial muscle system for the digital models. "The markers had to be painstakingly glued onto the same places every day," says Gordon. "After every scene, a host of people with flashlights inspected each marker and searched the ground for any that fell off."

The crew also applied markers to the sets and props, which were built in two sizes—an adult size and a larger size so that the adults would look child-sized next to them; the markers helped the crew place matching objects in the CG environment more precisely. The sets and props were moved between three stages—the performance-capture stage and two larger stages for body capture only—which were managed by Giant Studios.

One sequence supervised by Menache takes place in a small passenger car on the train and has 30 characters, including waiters and chefs serving hot chocolate to the children. "But they're dancing and doing cartwheels and walking on the walls and ceiling, and the conductor is dancing and doing the moon walk," he says. For this, the second-unit actors and acrobats playing the waiters, chefs, and children were captured on the large stage, while Hanks's performance was captured on the performance-capture stage.
Most of the characters in the film were motion-captured (body only) or performance-captured (face and body simultaneously), including the waiters. Giant Studios handled the body capture of multiple actors on large motion-capture stages.

Because Hanks performing as a child and as an adult would often be in a shot with adult actors playing children, a team of "triangulators" kept track of all the different scales for eyelines, props, and sets. They also blocked out how many steps an adult actor should take to walk across the set when the action was destined to be scaled down to child size. Each scene with children was filmed and motion-captured twice—once with the adult actors and once with child actors; however, the data from the child actors was primarily used for reference.

After each take, a team of 18 people from Imageworks and Giant Studios would check the data—massive amounts of data. "We shot 50 gigabytes a day," says Gordon. By way of comparison, Gordon, who was also motion-capture supervisor for the last two Matrix films, says that number is five times the total amount of data captured during six months of working on those films.

At the end of each day, Zemeckis picked performances from videos shot during the day. The motion-capture team, using time codes from the video, selected motion-capture data to match. The selected body performances for each character were then assembled into scenes in Alias's MotionBuilder, and video clips of matching facial performances were placed around the perimeter of the screen.
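That matching step comes down to timecode arithmetic: convert the selected clip's in and out points to frame indices, then pull the corresponding span of capture data. A hypothetical sketch—the function names and the 24-fps, non-drop-frame assumption are illustrative, not the production's actual pipeline:

```python
def timecode_to_frame(tc, fps=24):
    """Convert an 'HH:MM:SS:FF' SMPTE timecode to an absolute frame index.
    Non-drop-frame timing and 24 fps are assumptions for illustration."""
    h, m, s, f = (int(part) for part in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * fps + f

def select_take(capture_frames, tc_in, tc_out, fps=24):
    """Slice the stored motion-capture frames that correspond to a video
    clip chosen by the director, identified by its in/out timecodes."""
    start = timecode_to_frame(tc_in, fps)
    end = timecode_to_frame(tc_out, fps)
    return capture_frames[start:end]
```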

Zemeckis then "shot" the scenes with Robert Presley, a camera operator, who used a camera head as an input device. When he turned the wheels on the camera head, he moved the virtual camera in MotionBuilder—one wheel would tilt the camera; the other would pan it. The system, dubbed Wheels, was developed by Imageworks.
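Imageworks has not detailed how Wheels translates wheel rotations into camera moves, but the basic mapping is straightforward: each encoder wheel accumulates into one rotation axis of the virtual camera. A hypothetical sketch, with the tick resolution invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class VirtualCamera:
    pan: float = 0.0   # degrees of rotation about the vertical axis
    tilt: float = 0.0  # degrees of rotation about the horizontal axis

def apply_wheel_input(cam, pan_ticks, tilt_ticks, degrees_per_tick=0.05):
    """Accumulate encoder ticks from the camera head's two wheels into
    virtual-camera rotations. The tick resolution is invented; the real
    Wheels device is not publicly documented."""
    cam.pan += pan_ticks * degrees_per_tick
    # Clamp tilt so the operator cannot flip the camera past vertical.
    cam.tilt = max(-90.0, min(90.0, cam.tilt + tilt_ticks * degrees_per_tick))
    return cam
```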
Performances from videotaped capture sessions were applied to 3D proxies (top) and put into simple sets (middle) so Zemeckis could design camera moves interactively. Later, facial animation was applied and sets added (bottom) based on the camera moves.

"He was working in real time with the motion capture applied to the low-res characters that were in CG sets," says Chen. "The sets were in the same place as the chicken wire on the stage. It was a new experience for Presley, and yet the device was familiar." Using this system, Zemeckis created master shots, close-ups, over-the-shoulder shots—just as he would have done with a live-action film. "The only difference was that he was filming a digital character that did exactly the same thing on every take," says Chen.

Meanwhile, a separate crew applied the facial-animation data to the digital models. The data, once mapped onto 300 digital muscles, drove the facial performance, although animators fine-tuned the expressions. The eyes and eyelids were entirely hand animated.

To create the digital muscle system anatomy for characters performed by Hanks, Menache started with 80 photographs of the actor doing different facial expressions. "I traced where I thought his muscles would be on waxed paper," he says. "The only thing I guessed about was his jaw pivot." The map he derived was used in two ways—to place the motion-capture markers on Hanks and, similarly, to create the digital muscles. "We put markers on the digital face just like on the actor," Menache says, "and then ran a program that created all the muscles."

Character technical directors used the same system for other characters, all of which had the same number of facial muscles. Thus, even though the markers roughly matched a performer's anatomy, the data could be applied to other characters. "The data is converted to the anatomy of the character," Menache says. "We tried to keep the rigging as straightforward as possible so it could be duplicated." Similarly, the motion data captured from one performer could drive disparate characters. By scaling the data, adult actors could perform child-sized characters.
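A simple version of that retargeting idea keeps the captured joint rotations unchanged and scales only the translations. Assuming that simplification (the production rig was surely more elaborate), it might look like this:

```python
def retarget_motion(frames, scale):
    """Apply one performer's capture data to a differently sized character
    by scaling translations and copying joint rotations unchanged (a
    common retargeting simplification, not the production's actual rig).

    frames -- list of dicts: {'root_pos': (x, y, z), 'rotations': {...}}
    scale  -- target height / performer height, e.g. 0.6 for a child
    """
    retargeted = []
    for frame in frames:
        x, y, z = frame['root_pos']
        retargeted.append({
            'root_pos': (x * scale, y * scale, z * scale),
            'rotations': frame['rotations'],  # joint angles transfer as-is
        })
    return retargeted
```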

All the models were created in Maya using subdivision surfaces for the bodies and skin; the characters' costumes were created with NURBS. To create the models, actors playing the main characters were scanned and photographed. "For Santa and the hobo, we scanned Tom [Hanks] wearing makeup, prosthetics, and a fat suit," says Sean Phillips, senior CG supervisor. "We also scanned 24 kids, but ended up using five body types mixed and matched to reduce the load on the modelers."

The costumes were simulated using the same Maya cloth system that was used for Spider-Man 2 (see "Another Big Leap," July, pg. 22), developed by Imageworks and Alias. For hair, the crew used a variety of methods, including dynamics in Maya and a proprietary simulation engine. "If the character was a hero, we'd use one system, but if it was in the background, we used simpler hair that approximated the right motion," says Rob Bredow, senior CG supervisor. "We simulated subsets of the hair in a way that is similar to the way guide hairs are used, but we don't like to go into a lot of detail about our system." The studio's patented work on clumping, however, is documented in SIGGRAPH papers.
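The guide-hair approach Bredow alludes to is well known in general terms, even if Imageworks' own system is proprietary: simulate a sparse set of guide curves, then build each rendered hair as a weighted blend of nearby guides. A generic sketch of that idea, not the studio's system:

```python
import numpy as np

def interpolate_hairs(guide_curves, weights):
    """Build a full head of hair as weighted blends of simulated guide
    curves -- the generic guide-hair technique, not Imageworks'
    proprietary clumping system.

    guide_curves -- array, shape (num_guides, points_per_curve, 3)
    weights      -- array, shape (num_hairs, num_guides); rows sum to 1
    """
    # Output hair h, point p: sum over guides g of weights[h, g] * guide[g, p].
    return np.einsum('hg,gpc->hpc', weights, guide_curves)
```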
The hobo and the child were both performed by Hanks. To help actors playing children, the crew made props out of blue chicken wire and wire mesh in two sizes—normal and large enough for an adult to seem child-sized.

"The challenge for this film was not whether we could do the simulations, but how to set them up for 40 characters with often three or four characters per shot for the hour-and-a-half film," says Bredow. The most extreme setup was required for the scene in Santa Square, where 30,000 elves mingle with the children, Santa Claus himself, musicians, and reindeer.

As with the other characters in the film, the performances for the elves were based on motion-capture data, in their case from Cirque du Soleil acrobats and little people. To manage 30,000 elves, the crew used crowd-simulation software. The same program also helped herd thousands of digital caribou in one of the many environments the train passed through on its journey north.
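Crowd systems of this kind typically assign each agent one of the captured cycles plus a random phase offset, so a handful of performances reads as thousands of distinct elves. A generic sketch, not Imageworks' actual software:

```python
import random

def populate_crowd(num_agents, cycles, positions, seed=0):
    """Assign each background character a captured motion cycle and a
    random phase offset so a few cycles read as a varied crowd (a
    generic crowd-population sketch, not the production's software).

    cycles    -- list of (cycle_name, length_in_frames) tuples
    positions -- one (x, z) ground placement per agent
    """
    rng = random.Random(seed)  # seeded so a render can be reproduced
    agents = []
    for i in range(num_agents):
        name, length = rng.choice(cycles)
        agents.append({
            'pos': positions[i],
            'cycle': name,
            'offset': rng.randrange(length),  # desynchronize the loops
        })
    return agents
```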

"One of the hardest things about the environments was that we didn't know where the camera would be when we were building them," says Phillips. "We had blueprints, but until we got the camera move, we didn't know how much detail we'd need. So, especially for Santa Square, we rolled the dice."
The 30,000 elves in Santa Square were animated using motion-capture cycles applied through crowd-simulation software. Tom Hanks performed Santa, the child Chris (seen here with Santa), and the train conductor.

Much of the film takes place in outdoor environments or inside the train. "The environment changed as the train got closer to the North Pole, from semi-urban to rural to forested, and then into tundra in a great arctic wasteland," says James Williams, layout supervisor. "Because we built the environments in a one-to-one scale with the real world and because the motion data is one-to-one, we were very creative in how we dressed the sets so we could reuse a lot of them."

Although many sets were made with 3D objects, some were matte paintings created with Maxon Computer's Cinema 4D. To handle textured 3D objects and sets, the crew used a new environment-assembly system. "We've got thousands and thousands of objects—from a marble that rolls across the floor to the train itself—and everything was painted in our texture paint department," says Mark Lambert, senior CG supervisor. "We had to have a system to keep up with what texture goes on which object and provide it when it was needed."
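At its core, such an assembly system is bookkeeping: a lookup that pairs every object a shot needs with the texture painted for it, and that fails loudly when something is missing. A hypothetical sketch, with invented names; the article does not describe Imageworks' system in detail:

```python
def assemble_environment(object_names, texture_db):
    """Pair every object a shot needs with the texture painted for it,
    failing loudly on gaps. The dictionary-based database and the names
    are inventions for illustration.

    object_names -- names of objects in the set, e.g. ['marble_01']
    texture_db   -- mapping from object name to painted-texture file path
    """
    missing = [name for name in object_names if name not in texture_db]
    if missing:
        # Better to catch a missing paint job at assembly than at render.
        raise KeyError("no painted texture for: " + ", ".join(missing))
    return {name: texture_db[name] for name in object_names}
```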

In one shot, for example, the camera moves through four miles in two and a half minutes. "We follow a train ticket through the moonlight in a forest where we see wolves, an eagle, and waterfalls," says Lambert.

The animals were all hand animated, as was the train, which is an enormous model. "It's based on an actual train," says Phillips. "The hydraulics and the steam vents, for example, are accurate almost to a fault. It is so heavy that we had fairly elaborate scripts for stripping out the undercarriage."
The camera follows a train ticket that flies out of Chris's hand into a magical CG winter wonderland complete with CG wolves and lit only by moonlight and the train windows.

The effects department added the smoke and steam. "When Chris first walks out into the street, the train looks just like it does in the book, with smoke and steam pouring off," says Bredow. "It had to hold up for many minutes because it established the scene and the mood of the film."

For these effects and others, including atmospherics and cracking ice, the crew used Side Effects Software's Houdini with a new custom tool called SPLAT (Sony Pictures Layered Art Technology). "SPLAT uses a painter's algorithm to draw all of our smoke and steam," says Bredow. "It sorts the particles from back to front, draws the back one first and the next one on top of that." Because the algorithm is accelerated in hardware (Nvidia cards using OpenGL), volumetric renders that might have taken 20 hours were output in three to four minutes per frame. "The nice thing is that an artist can move lights around almost in real time and see the effect on the smoke," he says.
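The painter's algorithm Bredow describes is easy to state in code: sort the particles by depth, farthest first, and composite each over the image with the standard "over" blend. A minimal software sketch; each particle covers a single pixel here for brevity, whereas the real tool draws textured sprites with hardware (OpenGL) blending:

```python
def composite_splats(particles, background):
    """Painter's-algorithm pass as the article describes it: sort the
    smoke particles back to front, then alpha-blend each over the image.
    A simplified software stand-in for SPLAT's GPU-accelerated drawing.

    particles  -- dicts: {'depth', 'pixel': (x, y), 'color': (r, g, b),
                          'alpha': a}
    background -- dict mapping (x, y) pixels to (r, g, b) colors
    """
    frame = dict(background)
    # Farthest particles first, so nearer ones end up drawn on top.
    for p in sorted(particles, key=lambda q: q['depth'], reverse=True):
        dr, dg, db = frame.get(p['pixel'], (0.0, 0.0, 0.0))
        sr, sg, sb = p['color']
        a = p['alpha']
        # Standard 'over' blend of the translucent splat onto the frame.
        frame[p['pixel']] = (sr * a + dr * (1 - a),
                            sg * a + dg * (1 - a),
                            sb * a + db * (1 - a))
    return frame
```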

The lights helped keep the imagery from becoming too photorealistic. The film takes place entirely at night, inside the train, in Santa Square, and outside in the moonlight, and the darkness encouraged dramatic lighting. "I wasn't lighting like I would for a live-action movie," says Lambert. "It's Christmas Eve and time is magical. We had beams of light cutting through the dark with fog in the air. We used color in ways that wouldn't happen in the real world. If I wanted something moodier, I could try it. It was a great thing to play with."
Because the motion-capture data was most easily applied to CG characters on a one-to-one basis, the train was built on a one-to-one scale as well. Smoke and steam were added using a new custom drawing tool called SPLAT.

The crew members who worked on this film believe they have taken part in filmmaking magic and helped create a new filmmaking medium.

"This is a medium where we get the best of both worlds," says Williams. "We get the exactness of the animation process in which you can go back again and again and refine each take, but at the same time, we get the spontaneity of live-action performances and live-action tools. It breaks through a glass ceiling we never knew was there. It frees us to redefine the way we make movies."

Barbara Robertson is an award-winning journalist and a contributing editor for Computer Graphics World.