Everybody knows about Virtual Reality (VR) nowadays; most of us have played one video game or another. Compared to traditional representation techniques, VR maps the geometry of a 3D body into a non-existent 3D space, so we can capture (in this case, render) any view from any point of view at will, whereas in traditional 2D games we were bound to watch the game as the designer had originally planned. For a better example, when we watch TV we are forced to see the film from the director's or camera's perspective. VR-based TV would allow us to change the point of view at will, not because there are more cameras, but because we know the 3D shape of all objects in the field of view and, hence, we can project them onto the screen plane by simple geometric transformations. We could, for example, move the camera around the main character to see if the monster is already lurking instead of waiting for the traditional scare. Indeed, Johnny Chung Lee did something like this using a Wiimote to track your position with respect to the TV and change the projected image accordingly.
Cool as it sounds, VR promised more than it delivered, and was limited to a few applications, mostly progressively more visually complex video games, like the ones on Xbox or PS3 today. Maybe because they are so close to the general public, VR applications are not fascinating anymore. However, Augmented Reality (AR) might do the trick.
Unlike Virtual Reality, people in Augmented Reality environments are not in different, faraway places, but exactly where they are. Sounds boring, huh? The key to AR is that some of the things you see are, in fact, not there: they are computer-generated and superimposed on your vision. For example, the T-800's view in Terminator displayed information about whoever the robot was looking at, so it could properly shoot them in the face. Similarly, Predators also had their own goggles. And the screen where Tom Cruise messed with present and future in Minority Report was not really there, floating in the air. And I’d say that Gollum was not there with Frodo and Sam, but then, who knows :P.
AR just requires a camera, a PC or any other processing device, and some sort of viewing device, like a screen, goggles, a mobile phone or whatever. The camera sees objects in the real world, the PC calculates their position and generates extra information, and then the viewing device presents the real view with just that little extra. Probably the best-known AR application, though, is the weather person, who waves at nothing while the computer overlays a map with suns and clouds on thin air in real time. The trick here is that the person is actually standing in front of a flat surface colored in something you would not be caught dead wearing, like radioactive green. The camera knows that color well, so whenever it finds it in a pixel, it replaces that pixel with its equivalent in a computer-generated image. In the end, only the computer-generated image and the non-green pixels (i.e. the person) remain. This process is usually called chroma keying, a form of (static) background subtraction. Indeed, the PlayStation EyeToy works in a similar way, only instead of assuming the background is homogeneously colored, it assumes the only thing moving in the field of view is the game player (motion-based background subtraction).
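To make the green-screen trick concrete, here is a minimal sketch in plain Python. Images are just lists of rows of (R, G, B) tuples, and the key color, the tolerance, the motion threshold and the function names are all my own assumptions for illustration, not what any broadcaster or the EyeToy actually uses.

```python
# Minimal chroma-key sketch. Images are lists of rows of (R, G, B) tuples.
# KEY_COLOR, TOLERANCE and the threshold are illustrative assumptions.

KEY_COLOR = (0, 255, 0)   # the "radioactive green" backdrop
TOLERANCE = 100           # max color distance still counted as backdrop


def is_backdrop(pixel, key=KEY_COLOR, tolerance=TOLERANCE):
    """True if the pixel is close enough to the key color."""
    distance_sq = sum((p - k) ** 2 for p, k in zip(pixel, key))
    return distance_sq <= tolerance ** 2


def chroma_key(camera_frame, generated_frame):
    """Replace backdrop pixels with the same pixel of the generated image."""
    return [
        [gen if is_backdrop(cam) else cam
         for cam, gen in zip(cam_row, gen_row)]
        for cam_row, gen_row in zip(camera_frame, generated_frame)
    ]


def moving_mask(previous_frame, current_frame, threshold=30):
    """Motion-based variant: mark pixels that changed between two frames."""
    return [
        [max(abs(a - b) for a, b in zip(prev, cur)) > threshold
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous_frame, current_frame)
    ]


GREEN, SKIN, SKY = (10, 250, 20), (220, 170, 140), (90, 160, 255)
camera = [[GREEN, SKIN, GREEN],
          [GREEN, SKIN, SKIN]]
weather_map = [[SKY] * 3 for _ in range(2)]

# Green pixels become sky; the "person" pixels are kept as they are.
composite = chroma_key(camera, weather_map)
```

The `moving_mask` function is the crudest possible take on the motion-based idea: anything that changed a lot since the previous frame is assumed to be the player. Real systems need a much more robust background model than this.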
AR techniques are based on knowing where things are. The key idea is to align the virtual and real worlds, so that virtual information stays tied to real objects. In VR applications, if we move ahead, the whole world moves with us. In AR, things tend to remain where they are: if there is a key on top of our dining room table, a virtual label may be overlaid on it, specifying that it is our storeroom key. However, if we step ahead and leave the key behind, the label stays on top of it.
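The difference between a label tied to the world and one glued to the screen can be sketched with a toy pinhole camera. This assumes a camera that looks straight along +Z and never rotates; the focal length, image center and positions are made-up numbers, and `project` is a hypothetical helper, not part of any AR library.

```python
# Sketch: a label anchored in the world vs. one glued to the screen.
# Pinhole camera looking along +Z, no rotation; all numbers are illustrative.

FOCAL_PX = 500.0          # assumed focal length, in pixels
CENTER = (320.0, 240.0)   # assumed image center of a 640x480 view


def project(world_point, camera_position):
    """Project a world point to pixels, or None if it is behind the camera."""
    x, y, z = (w - c for w, c in zip(world_point, camera_position))
    if z <= 0:
        return None
    return (CENTER[0] + FOCAL_PX * x / z, CENTER[1] + FOCAL_PX * y / z)


key_on_table = (0.5, 0.2, 2.0)   # meters, fixed in the room
hud_label = (400.0, 300.0)       # screen-fixed overlay: never moves

# Walk forward along Z: the AR label follows the key across the image and
# disappears once the key is behind us; the screen-fixed label stays put.
for camera in [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 3.0)]:
    print(camera, "AR label at", project(key_on_table, camera),
          "| screen-fixed label at", hud_label)
```

Recomputing the label's pixel position from the key's world position on every frame is what keeps it "on top of" the key as we move.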
If we want to include more complex 3D objects, like a Gollum to guide us to some place, the alignment problem is a bit harder, because the object has to seem to be within the real world. Basically, we take some object whose size and position we know and use it as an anchor. For example, a black square might do the trick, although it is possible to use anything in the room if our computer is smart enough and we have enough computing power. Distortion due to perspective (small means far, large means close, and so on) lets us position the person’s point of view (POV) in an empty virtual world. We can model our object there with typical VR techniques and then render it from the calculated POV. The resulting view is combined with the real one to create an augmented frame. If the process is fast enough, the virtual object changes shape according to our position in the real world.
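The "small means far, large means close" idea boils down to similar triangles. The sketch below is a heavily simplified version: it assumes a pinhole camera facing the black square head-on, with no tilt, which real marker trackers do not need. The focal length, image center, marker size and function names are all assumptions chosen for the example.

```python
# Sketch: recover where a known-size square marker is relative to the camera.
# Assumes a pinhole camera facing the marker head-on (no tilt).
# FOCAL_PX, CENTER and MARKER_SIDE_M are made-up values.

FOCAL_PX = 800.0         # assumed focal length, in pixels
CENTER = (320.0, 240.0)  # assumed image center
MARKER_SIDE_M = 0.10     # known real side of the black square, in meters


def marker_position(side_px, center_px):
    """Return the marker's (x, y, z) in meters relative to the camera."""
    z = FOCAL_PX * MARKER_SIDE_M / side_px          # small means far
    x = (center_px[0] - CENTER[0]) * z / FOCAL_PX   # undo the projection
    y = (center_px[1] - CENTER[1]) * z / FOCAL_PX
    return (x, y, z)


def project(point):
    """Project a camera-space point back to pixel coordinates."""
    x, y, z = point
    return (CENTER[0] + FOCAL_PX * x / z, CENTER[1] + FOCAL_PX * y / z)


# The marker appears 40 px wide, centered at pixel (400, 240).
anchor = marker_position(40.0, (400.0, 240.0))

# Stand a virtual object 5 cm to the right of the marker and find the pixel
# where it has to be drawn in the augmented frame.
virtual_object = (anchor[0] + 0.05, anchor[1], anchor[2])
pixel = project(virtual_object)
```

With these numbers the marker comes out roughly 2 m away, and the virtual object lands about 20 px to the right of the marker's center, which is consistent with a 10 cm square spanning 40 px at that distance. Handling a tilted marker requires estimating a full rotation as well, which this sketch deliberately leaves out.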
What do we want AR for, besides FX and computer gaming? I guess you have not seen Iron Man (the first one) yet: go to the viewing facility closest to your convenience and take a peek 🙂