Director Peter Jackson filmed all the battles in Five Armies in virtual production.
When Weta Digital won a visual effects Oscar in 2002 for its first blockbuster film, The Lord of the Rings, it was a fledgling studio. Since then, Weta Digital artists have brought home four more Oscars, five BAFTAs, and numerous other awards and nominations. This year, the well-established studio shouldered the effects for two blockbuster films, Dawn of the Planet of the Apes and The Hobbit: The Battle of the Five Armies, both sequels in successful franchises.
Each film represents a culmination of state-of-the-art tools and techniques developed at the successful studio: Dawn (see "Evolution," July/August 2014) brought character creation and performance to a new level, and the recently released Five Armies provided a battleground for efficient and exacting methods of virtual production. Both received BAFTA nominations for Best Visual Effects, and Dawn has received an Oscar nomination.
With Five Armies, Director Peter Jackson ended his journey into Middle-earth that started with The Lord of the Rings. So, too, did Visual Effects Supervisor Eric Saindon, who joined Weta Digital in 1999.
For Five Armies, Saindon cast his eye on all shots the studio submitted to Jackson, but shared visual effects supervision with two other longtime Weta supervisors: R. Christopher White and Matt Aitken.
"We split the show so Matt could take shots in Dol Guldur, several in Erebor, shots where Bolg and Azog meet for the first time in this movie - 10 or 15 shots in each," Saindon says. "Chris did Lake-town - the first 20 minutes of the show - and then all the Ravenhill sequences."
In addition to providing overall guidance, Saindon, with the help of Animation Supervisor Aaron Gilman, handled the big battles in Dale and Erebor Valley, the Beorn attack, and the Gundabad attack. These sequences spurred the studio to roll out new technology for virtual production.
"I asked Peter [Jackson] if he minded being a guinea pig," says Joe Letteri, senior VFX supervisor and Weta Digital director. "He didn't, so we did a lot of Five Armies with Peter using a virtual camera on the virtual production stage with new technology that was breaking every day. We worked through it."
For the last film in The Lord of the Rings series, the studio had created battles by shooting some actors on location and then filling in the rest using Massive software to manage CG characters in the background. "For this film, everything [for the battles] was on stage," Letteri says. "We had hero actors on greenscreen and used virtual production for everything else. Hopefully, it looks like we were shooting on location, but we were shooting on stages."
New Tech: Gazebo
The new tools implemented for the film included the real-time lighting software Gazebo, technology in development two years ago (see "Shaping Middle-earth," January/February 2013), new rendering software called Manuka, and a new virtual production pipeline. The scale of shots with the armies made their implementation necessary.
"Gazebo came first because we were trying to do a real-time renderer for the stage," Letteri says. "It grew out of thinking how to improve virtual production post-Avatar. The thing that was missing was a lighting tool that mimicked the result we'd get with our existing production-renderer, RenderMan. We wanted to know if we could render equivalent lights and shaders in hardware that would approximate a final render. Gazebo gave us that and speed optimization. It was robust for the virtual stage."
Gazebo also had a secondary, equally important role as a pre-lighting tool for technical directors. "We knew that once we had that equivalency, we could use the fast engine for blocking in lighting," Letteri says. "That's when we started to bridge that gap."
New Tech: Manuka
A new approach to rendering became an important consideration with Tintin, in which lightbulbs rather than the sun illuminated the CG characters and sets in many shots.
"The spherical harmonics pipeline we had written for Avatar wasn't ideal for interiors," Letteri says. "You really want to use ray tracing, and the kind of distribution ray tracing we were working with in old versions of [Pixar's] RenderMan were expensive to do. You fire a ray, hit something, [fire] a bunch more rays and hit more things, and you end up with a dense, heavy tree structure that's accurate but hard to work with."
So, the Weta researchers began investigating path tracers.
"With this method, you construct bidirectional paths of light from the camera to the light, or vice versa, and solve the integral equations to compute the light transport," Letteri explains. "It's more promising, and there was more research into the mathematics for solving those problems."
The goal was to load a complex scene, do the path tracing, and then use that as a guide to how they might break down lighting to push shots through as usual.
Before writing their own software, the R&D team looked at PBRT and Mitsuba. Matt Pharr, Greg Humphreys, and Pat Hanrahan received a Sci-Tech Award for PBRT, which is now detailed in Pharr and Humphreys' book "Physically Based Rendering." Dr. Wenzel Jakob's Mitsuba is a research-oriented rendering system in the style of PBRT.
"They do path tracing, and they allowed us to test those algorithms," Letteri says. "But we needed something more production friendly. We needed to test path tracing with real shots and real scenes. So, we wrote a path-tracing engine that is RenderMan-compliant so we could feed it RIB files and our RenderMan shaders."
And that made it possible to test the new engine using real production assets.
"Once we started doing that and saw where we could go, it became apparent that we were getting enough speed," Letteri says. "I kind of knew it would happen, that it would become viable for production because it was a two- to four-year project. By the time we got all the elements, I was hoping the hardware would be faster and it would all come together. And, it did. We have a full bidirectional path tracer now, and we can put any sampling algorithm into it."
And that means, with Manuka, the path tracer they developed, the experiment can continue alongside its use in shot production. "We wrote Manuka with two goals," Letteri says. "One was to be a research test bed in which we can do whatever we want. We can do brute-force sampling and see an exact answer, even if takes days and days. That's called an unbiased renderer. But then, also, we can bias the renderer and trade accuracy for speed. We know what [the image] is supposed to look like, but we're willing to do something different to get the shots done. We can switch between."
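As a rough illustration of that unbiased-versus-biased trade-off (a generic Monte Carlo example, not Manuka's code), clamping rare, very bright samples reduces noise at the cost of a small systematic error:

```python
import random

# Toy illustration of the unbiased-vs-biased trade-off Letteri describes.
# We estimate a simple integral by Monte Carlo sampling; the "biased" variant
# clamps rare, very bright samples (fireflies), trading a small systematic
# error for much lower noise. Generic example only, not Manuka's internals.

def radiance_sample():
    """Toy 'light transport' sample: usually dim, occasionally very bright."""
    u = random.random()
    return 100.0 if u < 0.001 else 0.5   # rare caustic-like spike

def estimate(n_samples, clamp=None):
    total = 0.0
    for _ in range(n_samples):
        s = radiance_sample()
        if clamp is not None:
            s = min(s, clamp)            # biased: discards energy above the clamp
        total += s
    return total / n_samples

random.seed(1)
print("unbiased:", estimate(10_000))           # converges to ~0.5995, but noisy
print("biased  :", estimate(10_000, clamp=5))  # smoother, but systematically low
```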
Because the team designed Gazebo with RenderMan and Manuka in mind, it could output a scene to RenderMan or the fully RenderMan-compliant Manuka. "That allowed us to do quick comparisons and to migrate quickly," Letteri says. "We could test a shot in Manuka and if we ran into problems, fall back to RenderMan. We didn't want to have to rewrite all our RenderMan shaders. Every time we looked at new rendering algorithms when something new came along, it would mean we'd have to rewrite shaders to tease apart surface descriptions and light transport. We did that early on, which allowed us to migrate between renderers more easily. So, now, we can use our PRMan shaders in Manuka."
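A minimal sketch of that decoupling idea, with invented names rather than Weta's actual shader API: the surface description lives in one place, and each backend, whether an offline RIB stream or a real-time preview, derives what it needs from it.

```python
# Hedged sketch of separating surface description from light transport so one
# material definition can drive multiple renderers. All names here (Material,
# "weta_standard", export_preview) are hypothetical, not Weta's API.

from dataclasses import dataclass

@dataclass
class Material:
    name: str
    base_color: tuple        # surface description only; the light-transport
    roughness: float         # code lives in each renderer backend
    specular: float = 0.5

def export_rib(mat: Material) -> str:
    """Emit a RenderMan-style surface binding for an offline render."""
    r, g, b = mat.base_color
    return (f'Surface "weta_standard" "color baseColor" [{r} {g} {b}] '
            f'"float roughness" [{mat.roughness}] "float specular" [{mat.specular}]')

def export_preview(mat: Material) -> dict:
    """Emit uniforms for a real-time approximation of the same surface."""
    return {"uBaseColor": mat.base_color,
            "uRoughness": mat.roughness,
            "uSpecular": mat.specular}

armor = Material("dwarf_armor", (0.35, 0.33, 0.3), roughness=0.4)
print(export_rib(armor))
print(export_preview(armor))
```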
The studio used Manuka for the first time on the big crowd shots in Dawn of the Planet of the Apes and discovered it could render the fur on large groups of apes efficiently. Thus, when it was Five Armies' turn a few months later, they rendered 90 percent of the film in Manuka, relying on RenderMan only for the fire elements in Lake-town because they hadn't yet optimized Manuka for volumetrics.
"As of Five Armies, it has become our renderer," Letteri says. "We're working on both in parallel, but Manuka proved its worth so much, that's what we used wherever we could. Because it handles big scenes, the TDs spend less time managing the scene and more time on lighting. Big scenes are Manuka's backbone. We have to create so many populated environments and worlds, we designed it with that in mind."
Weta's Gazebo provided real-time lighting for virtual production; Manuka rendered scenes using path tracing.
Big scenes also mean big data, particularly when filmmakers use virtual production rather than live-action photography.
New Tech: Pipeline
"We had been developing the pipeline for the next Avatar, but we decided to roll it out for this project," Saindon says. "It's a different way to organize our assets and keep track of them. We had more CG shots and more assets and characters to look after than ever before."
Letteri chronicles the difficulties inherent in tracking assets for virtual production by comparing that method of filmmaking with live action.
"In a live-action shoot, there are costumes, lighting, and so forth, but in the end all that's boiled down to a piece of film," Letteri says. "That's what we start from. We have to know what's in the plate, and we can measure everything in the plate, but we don't have to worry about how it got there. We don't have to track the costumes and so forth while they're shooting. But when we're doing virtual production, all that becomes our problem and part of our pipeline. What costume the character was wearing and whether the director changed the costume from April to May, for example. We have to track history back and forth. We tracked all those complexities for Avatar mostly by hand."
The new pipeline manages the assets for the artists, but change, even welcome change, doesn't always come easily.
"Our underlying pipeline was almost 100 percent different from the new pipeline," Saindon says. "The way we send things to the render wall with new queuing software, the way we see assets, and our lighting tools and renderer are all different. The first month was difficult, really rough, shocking. No one knew where to get anything. But I'm not sure we would have finished this movie without it. Peter was still editing and writing until August. We had a lot to do in a short time."
In fact, when Jackson filmed on location for Five Armies two years ago, the battles were not planned.
"Peter wanted to go onto the motion-capture stage, load in a scene, and film the shots the way he wanted," Saindon says.
Before the new tools, that would have been difficult: the scenes were too large and the amount of information too great. Taking motion-capture data through motion editing, lighting, and texturing, and then putting it back onto the stage for Jackson to film, was a problem.
The new pipeline made it possible. "We didn't get greenscreen shots for the battle," says Saindon. "A lot of it was developed on the motion-capture stage with Peter, Terry Notary [stunt performer, movement choreographer], and Glenn Boswell [stunt coordinator]. Then Peter would shoot it with his mocap cameras."
Next-Gen Virtual Production
"We can open a scene on a motion-capture stage that anyone else would be working on at the same time. I can load the same scene in Nuke, Maya, and MotionBuilder, have the same data, see everything going on, and do all kinds of conversions," Saindon says. "And, it was very lightweight."
While he was on the motion-capture stage, Jackson could see thousands of CG characters running around and fighting through his virtual camera.
"He could film the battle as it happened," Saindon says. "I think he loved it. We have to account for more elements when a director can put the camera anywhere. So that means a lot of work in animation. But the advantage is, that once the shot gets past a director, it's almost ready. We can almost go directly into shots and get through the pipeline quickly."
Because all the software works within the same pipeline and because Gazebo is so fast, animators can drop a lighting setup into a scene for a presentation.
"We can see final lighting on an animation scene as it's going along, and we can send Peter our lighting setup in animation when it's being blocked out," Saindon says. "Because lighting done by the TDs is dropped into an animation scene, and because our puppets are so high res now, Peter couldn't tell if he was looking at the scene for animation or lighting."
Orchestrating the Battle
To give Jackson the most freedom possible, Gilman's team met early with the director to map out critical moments in the battles - which Gilman defines as any fight involving more than 30 characters.
"We identified moments when the tide turns, moments of conflict, moments of transition to determine when we would need the largest amount of motion capture," Gilman says.
They also defined how each race of combatants would march and charge.
"Instead of having each race march or charge the same, they needed to move in fundamentally different ways," Gilman says. "One of the first things I learned as a creature animator is that a performance with monotone timing is boring. So, we thought of each race with almost musical concepts. Orcs do formations in beats of five. When the dwarves march, they always march off-step relative to the rank behind them, so we get oscillation between their shoulders. And the elves move in beats of three."
Notary and Boswell spent six weeks shooting 15 motion-capture performers to provide animators with enough data. "You could hear it on stage when Terry [Notary] was choreographing - he'd call out beats of five for the Orcs, or play music," Gilman says.
The elves, who moved to a waltzing rhythm of threes, also performed in groups of three. "We'd choreograph two elf swordsmen protecting a single archer," Gilman says. "The great thing about the elves is that they are delicate, with a finesse in their movements - agile and delicate, but at the same time, their way of attacking needed to be vicious. They were whirling dervishes of death, with arcing motion and quick stabs. Always spinning."
Jackson could see thousands of CG characters fighting through his virtual camera while he was on the mocap stage, and could film the battle as it happened.
Orcs, by contrast, moved like sewer rats.
"They'd move in formation, but we tried to find a point in space where there was a weakness," Gilman says. "They'd swarm to that weakness and then pour out the other side. They had little regard for the fallen; they'd step on the dead."
The stout dwarves, on the other hand, moved more like tanks and formed shield walls.
In addition to the stylistic differences in the way the races moved, each carried different types of flags and banners and wore different types of armor. "There was always a worry that the viewer would see an ocean of people, with no idea what was going on," Gilman says.
These battles did not center on hand-to-hand fighting among heroes.
"You see formations," Saindon says. "Peter wanted to see a sea of spears and ranks of people. He didn't want a whole battle with one dwarf fighting an Orc. He wanted groups of dwarves taking down an Orc. He wanted battles more designed than the typical melee we've done before."
Battle Management
Notary and Boswell diagrammed the formations for the performers on the motion-capture stage. "We had a huge number of diagrams, so we could do a limited number of captures," Gilman says. "We'd capture a series of movements, variation upon variation, and then we'd compile them to create the sense that there were thousands."
Animators amped up motion-captured actions and reactions to make the fighting more violent.
The data from the capture sessions went to the motion editors and animators, who created master vignettes by relying on the diagrams and drawings from the art department. A custom tool dubbed Army Manager, which works within Autodesk's Maya, helped the animators create the vignettes.
"Army Manager is basically a previs tool that allows animators to put thousands of characters in a scene and play back a pre-existing animation, whether keyframe cycles or choreographed animation shot previously," Gilman says. "The animators work with low-polygon bakes, not puppets."
Using Army Manager, animators could create large formations with any race, equipment, and formation. "We used any animation we had, placed it into artwork approved by Peter for a scene, and sent it to the motion-capture stage," Gilman says. "Then we could play it back so Peter could shoot what he wanted."
In other words, the animators could plug motion-capture performances, applied to thousands of characters, into a terrain and send the digital battle within the digital location to the motion-capture stage, so that Jackson could view it with a virtual camera.
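In spirit, laying out such a block in Maya can be as simple as instancing a baked, low-resolution character group across a grid. The sketch below is a hedged approximation of the idea, not Weta's Army Manager code, and the node names are hypothetical.

```python
# Hedged sketch (not Weta's Army Manager) of the general idea: instance
# low-polygon baked characters into a formation inside Maya so thousands of
# agents can be laid out and played back cheaply. Node names are hypothetical.

import maya.cmds as cmds

def place_formation(source_group, rows, cols, spacing=2.0):
    """Instance a baked low-res character group into a rows x cols block."""
    placed = []
    for r in range(rows):
        for c in range(cols):
            inst = cmds.instance(source_group)[0]   # cheap copy, shares geometry
            cmds.xform(inst, translation=(c * spacing, 0, r * spacing))
            placed.append(inst)
    return cmds.group(placed, name="orc_block_01")

# Example: a 20 x 50 block of a baked orc march cycle
# place_formation("orc_march_bake_GRP", rows=20, cols=50)
```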
"Peter would play with the camera on stage," Gilman says. "He could see all these characters we'd built in the master vignette, and do dozens of takes. Aerial shots, dollies, tracked cameras, whatever he wanted. It was important for Peter to make the battle feel violent and not staged. You can storyboard as much as you want, but the great thing about this technology is that it allows Peter to be as organic with the process as he chooses to be. It isn't until you place a camera inside a group of 3,000 guys trying to kill each other that you get visceral, organic performances."
After shooting the battle on the motion-capture stage, Jackson reviewed the takes with the editor and selected the ones he wanted for the shots. Meanwhile, the motion-edit team cleaned up the capture data, put feet on the ground, weapons in hands, scaled the data appropriately for the various races, and made sure the CG characters didn't intersect. When Jackson picked the takes he wanted, the motion-edit team put the corresponding animated characters into the shots.
"And then, we made it feel violent," Gilman says.
Making It Hurt
"Army Manager had to be fast or it would be pointless," Gilman says. "The animators had to move, copy, delete, and position armies quickly so we could send the vignettes to Peter in a few hours. The trade-off is that the motion was non-modifiable. We could get thousands of characters into Maya because we used low-res model bakes. The purpose was to have Peter give us shots. But, we couldn't raise an arm, stretch a leg, or make the battle feel more violent. The motion editors and animators needed to polish the motion."
During motion-capture sessions, the stunt actors did their best to make the performances feel as though they were in the midst of a battle, but they could only go so far.
"If you made the motion capture super-violent, you'd hurt people," Gilman says. "If I were to swing a sword at you and you knew it would gently touch you, you would amp up your reaction to have it look violent. But you're not going to really be hit, so there is an inherent anticipation that preempts the reaction. The animators had to pull all that out and speed up hits. Characters would drop weapons. They wouldn't spin in reaction to the angle of a weapon. We had a huge amount of work."
Moreover, it was impossible to anticipate, during the six-week motion-capture sessions with Notary and Boswell, everything Jackson might want later.
"They did a great job establishing a foundation, but Peter wanted to invent new stories," Gilman says. "And as the battle developed, we wanted more cool things. So I spent two to three times a week getting more material. It was important to have constant access to the motion-capture stage, and we had motion-capture actors on call. It wasn't that we wanted more cycles. It was working with Peter's vision of the battle. We'd do re-shoots with weights on the actors' ankles and hands, and we weighted the weapons. We had hundreds of thousands of frames of motion-capture footage."
All that added more complexity as the motion-edit team would recompile shots into masters or send data to the Army Manager and then on to the motion-capture stage.
"Wherever we had close mid-ground to foreground action, there was a great deal of tweaking," Gilman says. "For far to mid-background, where the human eye can't assess the violence of each hit, we used Massive."
In addition to the battle, Gilman's team worked on the fight between Azog and Thorin on the ice and scenes of fighting inside Dale.
"It was six long, each-day-full months of overtime for me," Gilman says. "For us at Weta, motion capture never exists in a vacuum. It always goes through our motion-edit team and our animation team. There's always the risk that the data captured on stage will look choreographed or feel too light or tame."
Battleground
To help the audience understand what was happening in the battles, carefully designed environments provided landmarks. For example: "The position of the wormholes relative to the dwarves and elves was carefully thought out," Gilman explains. "Based on landmarks, you can tell where everyone is."
One advantage of using virtual production for the battles was that the battlefields were all-digital, which meant the studio could control the placement of landmarks based on the action in the battle. The disadvantage was that they didn't know where the battles would be until Jackson filmed them on the stage.
"Digital environments always seem to be the hardest thing to sell," Saindon says. "You can put a character on a turntable to get them to work. But, the environment is tricky to R&D. It isn't hard to do a digital environment. The challenge is making everything work well for the whole film; getting the detail right."
Saindon continues: "That's true for a lot of films, but it was more so for this one because we didn't know where the fighting was going to happen, where the battle would take place. We didn't know where Peter would point the camera. We didn't know if we needed to go into detail on a 10-foot section or do a broad scale of the two-kilometer by two-kilometer area. So, we had to develop a 3D environment at real scale for a two-kilometer-square area. The environments group had a huge task."
The team at Weta called on RenderMan to generate volumetric fires.
Moving Forward
Virtual production might be difficult, but with tools such as those evolving at Weta, it is becoming easier.
"When I came to Weta, one of the first things I had to do for the first Lord of the Rings was help get the troll animation for the cave troll onto the motion-capture stage," Saindon says. "It was a horrible nightmare. We had gray-shaded models. It was slow. The camera wasn't accurate."
As for lighting and rendering, creating final shots for the first films was difficult enough. "We did everything with shadow maps and spotlights," Letteri says. "We spent a lot of time hand-tuning subsurface scattering parameters to get the right look for Gollum in Two Towers. And, to integrate the CG characters into the ground, we used shadow passes. We had to do so much by hand. But now, we can't integrate everything by hand. Lighting has to be completely integrated to make it work."
Spherical harmonics helped the studio move from point lights to area lights, but when they encountered big interiors, they needed to ray-trace everything. Now, with real-time lighting and rendering in Gazebo and path tracing in Manuka, they can work with larger, more complex scenes than was feasible with distribution ray tracing, and provide more photorealistic, real-time scenes to directors who want to use virtual production for filming.
"I can imagine visual effects in the future will become even more integrated into the movie process and thought about earlier on," Saindon says. "It won't be a post process. I think the more that visual effects is integrated into the process, the more it will disappear into the process and the more it will become an unseen thing, which is what our goal is if we do our job well."
For their part, with Dawn and Hobbit's BAFTA nominations for best visual effects and Dawn's Oscar nomination, the effects crew at Weta can be proud of having done their jobs very well this year.