To paraphrase Einstein: Space is what keeps everything from happening at the same place, and stereoscopic imaging helps us visualize space. The concept of space, or distance, is so fundamental that even a tentative understanding of our perception of it has taken the human race centuries. Stereoscopic imaging heightens our perception of space, and space is so fundamental to our existence that any notion that the stereoscopic cinema is a passing fad is ridiculous.
Looking at a projected stereoscopic image is not the same as looking at the real world, or what psychologists call the “visual field.” I don’t think anybody working in the field would quarrel with that statement. You might modify it and say, “In some ways it is like looking at the visual world, and in some ways it isn’t.” You don’t look at the visual world through a rectangle unless you’re looking through a window; that’s one obvious difference. Your field of view in the real world is unbounded, you might say. But there are other differences that aren’t obvious, and the language people use to describe the cinematic experience tells us that there is a lot to be discovered, a lot to be thought through, about the differences between looking at a projected image and looking at the visual field.
The two words that are used most often when discussing the stereoscopic cinema are “realism” and “immersion.” The commonsense definition of “realism,” or its variant “realistic,” in the context of the stereoscopic cinema is that an image that possesses this quality looks real – like life. “Immersion,” or “immersive,” implies that you are in the image, not outside of it looking in. Does that mean that you are drawn into the screen or that the screen emerges outward to enfold you? Or does it matter?
Throughout the history of movies, photography, and painting, most of the time you are outside the image looking in at it. An artist, Robert Barker, used the word panorama to describe his eighteenth-century paintings of Edinburgh. The paintings, an attempt to immerse the viewer, were done on the inside of a large cylindrical surface. One hundred fifty years later Fred Waller’s Cinerama opened on Broadway and projected images onto the surface of a large cylindrical screen. Cinerama’s successor, IMAX, uses a large, more or less flat screen to achieve much the same effect.
Photographers and cinematographers make their living by understanding the difference between the perception of the visual field and the perception of an image. The images they make of the visual world tell a story, but these effigies are not the same as looking at the world. The departures lead to creative expression. The departures define the art.
Stereoscopic images have additional aspects that don’t apply to conventional three-dimensional cinematography. You may be taken aback by my use of the phrase “conventional three-dimensional cinematography.” But cinematography has always been three-dimensional. It has all the usual three-dimensional monocular depth cues, but without the cue of binocular stereopsis – the only two-eyed cue. Stereoscopic cinematography adds one additional depth cue, and that additional depth cue has (no pun intended) profound implications for how the image must be created and displayed.
Once the stereoscopic depth cue is added to a projected image certain interesting things happen, especially for distant objects. When you look off in the distance the hills don’t look as flat as a painted backdrop. Yet with the projection of stereoscopic cinematography that’s exactly what happens. If photographed with the usual interaxial separation, the stereoscopic projection of distant hills can make them appear flat in a way that doesn’t correspond to their perception in the visual world.
The stereoscopic depth cue works best in close, because the distance between the eyes – the human interocular – is two-and-a-half inches or so, and simple trigonometry can be used to demonstrate that past, say, a few hundred feet the stereoscopic depth cue doesn’t count for much. Stereoscopic acuity falls off rapidly with distance. But even though the stereoscopic depth cue rolls off with distance, objects in the visual world don’t flatten. Yet when you add the stereoscopic element to projected images, objects in the distance do start to noticeably flatten out, especially very distant objects like mountains or cityscapes. When perceived in the visual world one doesn’t have that sense of collapse of depth.
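The trigonometry is simple enough to sketch. This little Python snippet (a nominal 63.5 mm interocular; the numbers are illustrative, not clinical) computes the angle a point subtends at the two eyes. The angle – and with it the strength of the binocular cue – falls off roughly as the inverse of the distance, and by a kilometer or so it approaches the threshold of stereoscopic acuity, which is on the order of ten arcseconds.

```python
import math

EYE_SEPARATION_M = 0.0635  # the human interocular, roughly two and a half inches

def convergence_angle_arcsec(distance_m):
    """Angle subtended at the two eyes by a point at the given distance."""
    angle_rad = 2.0 * math.atan(EYE_SEPARATION_M / (2.0 * distance_m))
    return math.degrees(angle_rad) * 3600.0  # convert to arcseconds

# The angle shrinks roughly tenfold for every tenfold increase in distance.
for d in (1, 10, 100, 1000):
    print(f"{d:5d} m: {convergence_angle_arcsec(d):9.1f} arcsec")
```

Run it and you can see why a few hundred feet is about where the cue stops counting for much.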
This brings to mind stereo cards of distant vistas, like Niagara Falls, which when viewed in a stereoscope look as flat as a board because the interaxial separation for the camera was only two to three inches. Little parallax information can be recorded of such a distant vista with such a lens separation. I’ve asked myself, “What’s the point of a stereo card with no stereo information?”
The people who are producing computer-generated images at the major animation houses (Rob Engle at Sony Pictures Imageworks, Robert Neuman at Disney, Phil McNally at DreamWorks, and Jayme Wilkinson at Blue Sky) have become masters at the manipulation of stereoscopic space. That’s because a computer-generated world allows for complete control. They can control the distance between their virtual cameras’ lenses, which controls the strength of the stereoscopic effect, but they can also control it differentially for different objects and different distances from the camera. So a background that would ordinarily appear as a flat backdrop can have stereoscopic depth. By the same token distant characters can be molded by being “shot” with an interaxial separation that is different from the interaxial separation that’s used for the close characters. This manipulation of space allows for the maximization of the stereoscopic effect in a composition. Varying the interaxial differentially allows for amazing freedom of stereoscopic composition and modulation of depth effects.
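A rough sketch shows why the differential interaxial matters. Under the common parallel-camera, thin-lens approximation, the recorded parallax is p = f·t·(1/c − 1/d) for focal length f, interaxial t, convergence distance c, and subject distance d. The function and numbers below are my own illustration, not any studio’s pipeline: giving a distant layer its own, much wider virtual interaxial restores parallax that the human interocular cannot record.

```python
def image_parallax_mm(focal_mm, interaxial_mm, convergence_m, distance_m):
    # Recorded parallax for a point at distance_m, with the zero-parallax
    # plane at convergence_m (parallel cameras, thin-lens approximation):
    #   p = f * t * (1/c - 1/d)
    c_mm = convergence_m * 1000.0
    d_mm = distance_m * 1000.0
    return focal_mm * interaxial_mm * (1.0 / c_mm - 1.0 / d_mm)

# Hills 2 km away, "shot" with the human interocular: almost no parallax
# relative to a convergence plane 20 m out.
near_rig = image_parallax_mm(35.0, 63.5, 20.0, 2000.0)

# The same hills rendered with their own, much wider virtual interaxial.
wide_rig = image_parallax_mm(35.0, 1500.0, 20.0, 2000.0)
```

The parallax scales linearly with the interaxial, which is exactly the knob the CG stereo supervisors turn per object and per layer.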
Recent examples of these works are the films Bolt, Beowulf, Meet the Robinsons, and Monsters vs. Aliens. These films are outstanding examples of stereoscopic cinema. They are so well modulated and controlled – the shot-to-shot flow is so perfect – that there is nothing jarring, there’s nothing disturbing, they do no harm. But they tell the story, and they tell the story with beautiful images – beautiful images that aren’t the way you see the world, because of the departures that I’ve discussed and the means used to plastically manipulate space. These filmmakers have created a modulated three-dimensional universe – a perfected three-dimensional universe suitable for projection on a big screen.
But you can’t accomplish the same effect today with live-action capture. Today’s technology does not allow it. Live-action photography is accomplished with a fixed interaxial: the backgrounds are not going to have a different interaxial separation from the foreground. Today, for live action, we cannot engage in the manipulations possible within a computer-graphics universe, and that shouldn’t be a surprise. A major exception is bluescreen or greenscreen work. If you’re adding CG backgrounds, then you have the ability to change the depth effect – produced, let’s say, by one interaxial separation for the foreground and another for the background.
Stereoscopic cinematography has another limitation. Once the photography has been accomplished and an interaxial separation has been chosen, which controls the depth strength of the stereoscopic image, it’s hard to change. We don’t have a flexible technology that allows the manipulation of stereoscopic space in post, with this exception: the zero-parallax plane can be altered. But that’s trivial; it just means that by horizontally shifting the images you can set where the scene sits relative to the screen. All you can do by changing the zero-parallax point is establish the boundary condition – that which appears in the plane of the screen, and thus that which appears in audience space or in theater space. What we are missing with live-action capture is the ability to modulate the strength of the image to make it appropriate to telling the story, and you certainly can’t control the stereo strength differentially.
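The triviality of the zero-parallax adjustment can be made concrete. A horizontal image translation adds the same offset to every object’s screen parallax, so it moves the whole scene forward or back relative to the screen plane, but leaves the relative separations – the actual depth content – untouched. A toy sketch (the object names and pixel values are invented for illustration):

```python
# Per-object screen parallax in pixels (negative values sit in audience
# space, in front of the screen).
parallax = {"actor": -8, "tree": 0, "hills": 12}

def shift_zero_parallax(parallax_px, shift_px):
    # Horizontal image translation adds the same offset to every object,
    # re-establishing which plane coincides with the screen.
    return {name: p + shift_px for name, p in parallax_px.items()}

# Put the hills at the screen plane:
shifted = shift_zero_parallax(parallax, -12)
# The relative separations -- the stereo strength -- are unchanged.
```

Every object moves by the same twelve pixels; the depth between the actor and the hills is exactly what it was before the shift.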
It’s important to be able to control the image and its stereoscopic aspects in terms of creating a sequence. A sequence or a scene in a movie is made up of many shots cut together. These shots have to be stereo timed, just as shots are color timed in post. You can’t time one shot in a sequence to look like midday on the beach and the next to look like night. The same goes for stereo timing: one shot ought not to follow another with unmatched stereo strength.
We’re now at a time when we’re learning to look at stereoscopic images. What works today may seem conservative tomorrow, but right now we have to be concerned with audiences accepting the images that are produced, and with doing no harm. Do no harm, so that the audience feels comfortable when looking at a sequence. Yet the only manipulation we can use right now for cutting sequences is control of the zero-parallax position (and the addition of floating windows – but that’s another story). It’s not enough.
I know that the stereoscopic cinema will flourish creatively once the kinds of creative controls I’m talking about can be fully manipulated by live-action filmmakers. It will be a better stereoscopic cinema, as the CG stereo supervisors have demonstrated. They are leading the way, and there is no reason why we should accept the limitations of live capture in the long run. In the short run there are technology problems, but in the long run what the stereoscopic supervisors of CG animated films have shown us is that it is possible to create a beautiful stereoscopic image flow.
What’s the answer? The answer may be the process of “conversion,” or, as I call it, “synthesis.” There’s a trade name for it that In-Three uses: Dimensionalization. In the Los Angeles area there are three companies that can take planar images and turn them into stereoscopic images: Sony Imageworks, In-Three, and Sassoon. There may be others, and if I’ve left you out, forgive me. The essence of these techniques – and the technology varies from house to house – is to synthesize one of the perspective views from the other. The process places the work, for editing or postproduction, in a world similar to computer-graphics space, where one can manipulate the strength of the stereoscopic image, effectively vary the interaxial separation between foreground and background, and produce beautiful images and a beautiful flow.
The conversion process involves outlining objects in the shot, which in the movie business is called rotoscoping; filling in missing background material; and adding molding – in some cases by means of a depth map or some kind of a mesh – so that the objects are not simply cardboard cutouts. There are other things that need to be addressed, such as transparency and reflections. Once you’ve created a database for a shot it’s possible to manipulate the image – and, in some facilities, to manipulate it on the fly, so the image can be tweaked just as you can tweak an image in a color timing session.
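A drastically simplified sketch of the depth-map step – not any particular house’s technology – shifts each pixel horizontally in proportion to its depth value and then crudely fills the disoccluded holes. Real conversion pipelines do far more: rotoscoping, proper background reconstruction, transparency, reflections.

```python
import numpy as np

def synthesize_view(image, depth, max_shift_px=8):
    """Synthesize the other perspective from one view plus a depth map.

    Each pixel is shifted horizontally in proportion to its depth value
    (0.0 = far, 1.0 = near); disoccluded holes are then crudely filled
    from the left. A toy sketch of depth-image-based rendering only.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):  # later (right-hand) writes overwrite earlier ones
            nx = x + int(round(depth[y, x] * max_shift_px))
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
                filled[y, nx] = True
    for y in range(h):  # naive hole fill: copy the nearest pixel to the left
        for x in range(1, w):
            if not filled[y, x]:
                out[y, x] = out[y, x - 1]
    return out
```

Once a shot has a depth map like this, the strength of the synthesized parallax is just a parameter – which is exactly what makes the image tweakable on the fly, like a shot in a color timing session.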
This is very important because visualizing a stereoscopic image is difficult at the time of cinematography. It’s difficult because you’re not, generally speaking, able to look at the image on a theater-size screen in real time. A theater-size screen is anyplace from 20 to 70 feet across, and it’s hard to tell what a stereoscopic image is going to look like if you’re looking at a small monitor. Now we have five- and six-foot and larger stereoscopic monitors. Those help a lot, and make it easier to compose stereoscopic images on the set.
Why is it hard to understand what a stereoscopic image looks like? For one thing, stereoscopic images are weighted and scaled by extrastereoscopic cues, and therefore the choice of a focal length or an interaxial alone doesn’t determine the effect. It is a difficult art. During photography it’s hard to visualize what stereoscopic images are going to look like when projected, and next to impossible to visualize how the shots are going to cut together. That’s because even if you’ve storyboarded the film, there are going to be changes. So when it comes to producing sequences it may become difficult to live with what you’ve shot. But synthesis can overcome these difficulties.