I’ve lost count of the number of TV series and movies where someone grabs a satellite image, a surveillance video or even an ATM shot and asks the local geek to “enhance” the picture. Suddenly they zoom into the eye of someone who was actually hard to spot in the original picture and get the shot of his killer reflected in the retina … all of it thanks to the miracles of image processing and the magic fingers of the local nerd.
True or false? Really …
In fact, there are several flaws here that make the process unbelievable to anyone who has worked in image processing for a while. In brief, the (real) resolution of an image depends mostly on a single factor: optical resolution. Resolution is the minimum distance at which two radiating points can still be told apart, and it’s a measure of how well you (or a camera) really see. Trying to improve the optical resolution of an already captured image is about as feasible as trying to spot a black cat in a pitch-dark coal mine: the details are there, but you don’t get to see them.
To simplify, just grab a book from your shelf and move away until you can no longer read the title. The title is still there, but your eyes can no longer clearly separate the characters from the background. Of course, you can just step closer and read it again, but the motion is not improving your resolution; it’s just equivalent to changing your “zoom”. Indeed, optical zoom does for cameras what binoculars do for our eyes: it brings things closer so you can see them better with your (fixed) optical resolution. However, once the image is (digitally) captured, you can’t change the zoom anymore, because you can no longer manipulate the zoom lenses. So what exactly do they (or any average image processing program) do in the movies with the captured footage? The answer is simple: they change the digital zoom.
You have probably heard of digital zoom if you’ve bought a decent digital camera: they usually advertise figures like x15 optical, x30 digital zoom and such. And if you know a bit about photography, you know that digital zoom doesn’t matter; in fact, you can apply it later at your home computer. The only figure that counts is the optical zoom. The reason is fairly simple: digital zoom simply takes two neighbouring pixels and puts a third between them, whose color is the average of its neighbours. Hence, you can double the (pixel) size of the pic, even though you are not adding any detail that wasn’t already there: if I give you a stamp with the Parthenon and ask you to draw it on an A3 sheet, you can probably do it, but you won’t get to see where the columns are most worn or what details are depicted in the frontispiece (in part, because those are in the British Museum, but then again …); you just get the general details, only bigger.
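The averaging trick behind digital zoom can be sketched in a few lines of Python. This is a toy one-dimensional version with made-up pixel values, not what any particular camera does, but the principle is the same: every “new” pixel is just an average of pixels you already had.

```python
# A minimal sketch of "digital zoom": roughly double the size of a
# (grayscale) row of pixels by inserting, between every two neighbours,
# a new pixel whose value is their average. No new detail is created.

def digital_zoom_row(row):
    """Upscale a 1-D list of pixel intensities by ~2x via averaging."""
    out = []
    for a, b in zip(row, row[1:]):
        out.append(a)
        out.append((a + b) // 2)  # the invented pixel: just an average
    out.append(row[-1])
    return out

row = [10, 20, 30, 40]
print(digital_zoom_row(row))  # [10, 15, 20, 25, 30, 35, 40]
```

Note that every value in the output is fully determined by the input: running this after the fact on your PC gives exactly the kind of result the camera’s digital zoom would, which is why only optical zoom counts.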
One could argue that, in fact, when they use the digital zoom in their cameras they get better results than when they try to work the zoom later at home with PC software, and they are probably right, but only because pictures in your camera are compressed before they are stored. If you apply digital zoom AFTER the compression, some details of the original image are already lost; if you apply it BEFORE, you still won’t see anything that wasn’t already there thanks to good ol’ optical zoom. All in all, the truth behind superzoom is that you can’t see what’s not there.
You might have heard about superresolution techniques, but don’t be fooled: most methods rely on combining many low-resolution images into a single high-resolution one by extracting from each picture what the next one lacks and putting it all together, much like our eyes usually do. Although there are methods that do superresolution with a single image, at most you can get 2x-3x zoomed images with acceptable detail, and only if the original capture conditions were nice enough (see this example). In the image below, you can actually sharpen the shape of the windows, but it’s unlikely you’ll recognize the face of someone in a window unless you use magic.
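To see why combining several captures can genuinely add detail while a single one cannot, here is a deliberately idealized Python cartoon of the multi-frame idea (toy numbers I made up; real superresolution also has to register the frames, deal with noise, and much more):

```python
# A toy sketch of the multi-frame idea behind superresolution: each
# low-resolution capture samples the same scene at a slightly shifted
# position, so together they contain detail that no single one has.

scene = [5, 9, 2, 7, 4, 8, 1, 6]   # the "real" high-res signal

capture_a = scene[0::2]            # one low-res shot: even samples
capture_b = scene[1::2]            # a half-pixel-shifted shot: odd samples

# Neither capture alone contains the full detail...
print(capture_a)                   # [5, 2, 4, 1]
print(capture_b)                   # [9, 7, 8, 6]

# ...but interleaving the shifted captures rebuilds the original signal.
merged = [p for pair in zip(capture_a, capture_b) for p in pair]
print(merged == scene)             # True
```

The key point: the extra detail comes from having *more measurements* of the scene, not from inventing pixels, which is exactly what a single frozen video frame cannot provide.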
So is there no image processing technique that actually reveals stuff in digital images? In fact, you might be lucky with contrast enhancement techniques, which are equivalent to letting your eyes get used to darker areas so that you can actually perceive shapes and such. If you capture too dark or too bright an image, the scene information can still be there; it’s only that the human eye finds it hard to distinguish between a dark gray equal to 20 and one equal to 22 (if you represent illumination on a 0-100 scale, as most digital color spaces do). The computer, however, has no trouble separating a 20 from a 22, so the only thing you have to do to see things better is to tell the computer to repaint all pixels equal to 20 or less as 10 and all those equal to 21 or more as 30. Your eyes will most likely be able to distinguish 10 from 30. And, fortunately for us lousy photographers, there is a tool in almost every image processing program called “Levels” that will do the trick for us. The images below (taken from here) show the effect of this trick (called histogram stretching) for the pic on the left.
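The remapping trick can be sketched as a simple linear stretch in Python. The values below are toy intensities on the 0-100 scale mentioned above; real “Levels” tools offer more controls (gamma, per-channel curves), but the core idea is this:

```python
# A minimal sketch of histogram stretching on a 0-100 intensity scale:
# spread the narrow range actually used by a dark image across the whole
# scale, so differences like 20 vs 22 become easy for the eye to see.

def stretch(pixels, lo=0, hi=100):
    """Linearly remap intensities so the darkest becomes lo, brightest hi."""
    pmin, pmax = min(pixels), max(pixels)
    scale = (hi - lo) / (pmax - pmin)  # assumes pmax > pmin
    return [round(lo + (p - pmin) * scale) for p in pixels]

dark = [18, 20, 22, 25, 30]   # an underexposed image: values all crowded
print(stretch(dark))          # [0, 17, 33, 58, 100]
```

Notice that the 20 and the 22, nearly indistinguishable before, end up 16 gray levels apart: no information was created, it was just spread out where our eyes can see it.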
The main difference between contrast enhancement and resolution enhancement is that in the first case the information is actually contained in the picture, whereas in the second it is not. You can’t superzoom a camera image unless it has been captured at millions and millions of pixels … and, even if you work with Chloe O’Brian at the CTU, that’s not happening with your everyday traffic camera.