Eulerian Video Magnification

For the uninitiated, Eulerian Video Magnification (EVM), the combination of algorithms that reveals the tiny changes in relatively static video feeds, is nothing short of magical. A lot of rumors are circulating about what it can and cannot do, and I intend to explain enough about it to put a few of them to rest.

EVM is a collection of algorithms that “magnifies” the subtle changes in a video feed. Depending on which filters it uses, it can be set to focus on color change or on movement. It is a project out of MIT (patent pending) that, in my opinion, is incredibly promising. Potential applications include baby monitors, portable lie detectors, and medical diagnostics, though these will probably only become a reality after the technique has been thoroughly tested. But the question remains: how does it work, and what does it require?

The “how” part is a bit complicated: it involves a chain of intimidating image processing algorithms. I will walk through these one by one and describe what each does, and then explain what you need in order to get it all to work.

EVM begins by splitting a video into two buffers: the original video, and a copy that will be processed by the algorithms. The original video is set aside while the copy runs through the processing chain. The first algorithm is one of two image processing algorithms that make use of image pyramids. An image pyramid is built by blurring an image and then removing every other row and column, producing an image with 1/4 the area of the original while preserving as much detail as possible; repeating this yields a stack of ever-smaller images. Gaussian pyramids are used to highlight color change, while Laplacian pyramids are used to highlight movement.
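To make the pyramid step concrete, here is a minimal sketch in C++ with OpenCV (my own illustration, not MIT’s released code) of building a Gaussian pyramid from a single frame. cv::pyrDown blurs the image and then drops every other row and column, so each level has a quarter of the area of the level before it; a Laplacian level is simply the difference between a level and the upsampled version of the next smaller one.

    // Minimal sketch: build a Gaussian pyramid for one frame with OpenCV.
    // cv::pyrDown blurs with a Gaussian kernel, then drops every other row
    // and column, so each level has 1/4 the area of the previous level.
    #include <opencv2/opencv.hpp>
    #include <vector>

    std::vector<cv::Mat> buildGaussianPyramid(const cv::Mat& frame, int levels)
    {
        std::vector<cv::Mat> pyramid;
        pyramid.push_back(frame);
        for (int i = 0; i < levels; ++i)
        {
            cv::Mat smaller;
            cv::pyrDown(pyramid.back(), smaller);   // blur + 2x downsample
            pyramid.push_back(smaller);
        }
        return pyramid;   // pyramid.back() is the level fed to the temporal filter
    }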

After the processed video feed is shrunk down by the pyramid algorithm, it is fed into a bandpass filter. In this case, the bandpass filter consists of three steps. We begin by running an FFT (Fast Fourier Transform) on each channel of each pixel in the video buffer, which produces a buffer representing the frequencies of the changes that occur in each pixel over time. Using the frame rate, we figure out which frequencies we want to keep (based on high and low cutoff values) and zero out the frequencies that fall outside of that range; in other words, the high and low values define a “band” of frequencies allowed to pass through the filter, and everything else is set to zero. The last step is to undo the FFT by applying the inverse FFT. The result is a video in which only the changes that occur inside the target frequency band are allowed through (in an ideal universe, at least; the reality is messier and far beyond the scope of this post).
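As an illustration of that bandpass step, the sketch below (again my own C++/OpenCV code, not the reference implementation) applies an “ideal” bandpass to the time series of a single pixel channel. In the full algorithm this is done for every pixel of the downsampled buffer, and the lowHz/highHz parameters are simply placeholders for whatever cutoffs you choose.

    // Minimal sketch: an "ideal" temporal bandpass on one pixel channel's
    // time series, done by a forward FFT, zeroing the out-of-band bins,
    // and an inverse FFT. Assumes the caller supplies fps and the cutoffs.
    #include <opencv2/opencv.hpp>

    void bandpassTimeSeries(cv::Mat& samples,   // 1 x N, CV_32F: one channel of one pixel over N frames
                            double fps, double lowHz, double highHz)
    {
        const int n = samples.cols;
        cv::Mat spectrum;
        cv::dft(samples, spectrum, cv::DFT_COMPLEX_OUTPUT);   // 1 x N, 2 channels (re, im)

        for (int k = 0; k < n; ++k)
        {
            // Bin k corresponds to frequency k*fps/n; bins above n/2 are the
            // mirrored negative frequencies, so measure from the nearer end.
            double freq = (k <= n / 2) ? k * fps / n : (n - k) * fps / n;
            if (freq < lowHz || freq > highHz)
                spectrum.at<cv::Vec2f>(0, k) = cv::Vec2f(0.f, 0.f);
        }

        cv::idft(spectrum, samples, cv::DFT_SCALE | cv::DFT_REAL_OUTPUT);  // back to the time domain
    }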

At this point, an amplification factor is applied to the processed video buffer: each value in the buffer is multiplied by this factor, which produces the amplification effect in the final video. Once this is finished, we need to scale each image in the buffer back up to its original size, which we do by reversing the pyramid algorithm we used at the beginning. Once the buffer is back at its original size, the processed video is combined with the original video to produce the output video that you see in the YouTube demos.
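Put together, the amplify-and-recombine step might look something like the sketch below (my own code; it assumes the filtered frame is still at the reduced pyramid size and that frames are stored as floating-point images). The resize guard is only there because pyrDown followed by pyrUp does not restore the exact dimensions of odd-sized frames.

    // Minimal sketch: amplify the filtered signal, upsample it back to the
    // original resolution, and add it onto the untouched original frame.
    #include <opencv2/opencv.hpp>

    cv::Mat reconstructFrame(const cv::Mat& original,   // full-size frame, CV_32FC3
                             cv::Mat filtered,          // bandpassed frame at the reduced pyramid size
                             double alpha,              // amplification factor
                             int levels)                // how many pyrDown steps to undo
    {
        filtered = filtered * alpha;                     // amplify the subtle changes
        for (int i = 0; i < levels; ++i)
        {
            cv::Mat up;
            cv::pyrUp(filtered, up);                     // reverse one pyrDown step
            filtered = up;
        }
        if (filtered.size() != original.size())
        {
            cv::Mat resized;                             // odd-sized frames can end up a pixel off
            cv::resize(filtered, resized, original.size());
            filtered = resized;
        }
        return original + filtered;                      // recombine with the stored original
    }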

Because the algorithm runs on a buffer, it is difficult to write a program that runs EVM on a live video feed. All publicly available open source code that I know of requires the user to first record a video to a video file, then run that video file through EVM to produce a new video file with the desired effects. The problem with this is that you can’t check your settings live; you have to wait until the video is “compiled” in order to view it, and the process can be somewhat tedious.

Now for what is required to do this: EVM can be run using just about any web camera, but the higher the quality of the camera, the better the results. Some web cameras introduce a lot of noise into the video feed, and these are not ideal. For serious exploration of EVM, I suggest purchasing a better web camera, though modern laptops (those purchased within the last year or so) and modern smartphones usually have good enough cameras built in. No, you don’t need a Kinect in order to do EVM. My suspicion is that Microsoft, given the rumors about the Xbox One’s ability to measure heart rate, has either developed a method that doesn’t require EVM and instead relies on infrared light, or has come to an agreement with the folks at MIT that lets them use it without the threat of future lawsuits.

The current legal status of EVM is that MIT has a patent pending on it. They do, however, provide source code for free so that computer-savvy individuals can experiment with it, under the condition that the code not be used for any commercial purpose and that, if asked, an entity must stop distributing its EVM implementation. It is legal to use the code for research and development purposes, and if you want to use it in a commercial application, you can contact MIT to arrive at an agreement.

As a computer science student, I am required to produce a senior project, and the subject of mine is EVM. I plan to implement it in C++ using the OpenCV library. It will use a circular frame buffer that holds enough frames to cover a few seconds of video. While there will be some delay between recording and displaying video, it will be a far better workflow than is currently available through publicly released code. Once finished, a user will be able to switch between filters, adjust the amplification level, and adjust the bandpass high and low cutoffs while watching the video feed. I hope this will facilitate the live study of EVM, which could reveal many more applications.
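The circular buffer idea is nothing exotic; a bare-bones version might look like the sketch below (the FrameRing name and interface are my own, not from any released EVM code). At 30 frames per second, a capacity of 120 frames holds about four seconds of video, which should be enough for the bandpass filter to resolve frequencies in the heart-rate range.

    // Minimal sketch: keep the last few seconds of frames in a ring so the
    // temporal filter always has a full window to work on.
    #include <opencv2/opencv.hpp>
    #include <vector>

    class FrameRing
    {
    public:
        explicit FrameRing(size_t capacity) : frames_(capacity), next_(0), filled_(0) {}

        void push(const cv::Mat& frame)
        {
            frames_[next_] = frame.clone();          // overwrite the oldest slot
            next_ = (next_ + 1) % frames_.size();
            if (filled_ < frames_.size()) ++filled_;
        }

        bool full() const { return filled_ == frames_.size(); }

        // Oldest-to-newest copy of the buffer, ready to hand to the pyramid,
        // bandpass, and amplification steps described above.
        std::vector<cv::Mat> ordered() const
        {
            std::vector<cv::Mat> out;
            size_t start = full() ? next_ : 0;       // slot holding the oldest frame
            for (size_t i = 0; i < filled_; ++i)
                out.push_back(frames_[(start + i) % frames_.size()]);
            return out;
        }

    private:
        std::vector<cv::Mat> frames_;
        size_t next_, filled_;
    };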

Imagine a “poor man’s x-ray” created by running EVM on video captured at different light frequencies! Imagine portable medical devices (Star Trek’s Tricorder, for example…) that could one day be used to diagnose problems! Imagine being able to use your cell phone to gauge how nervous people get when you are around! Honestly, the potential applications of this are enormous, though it has a long way to go between now and then. My hope is that my project, once I make it available, will aid in the process of improving the art.

Questions? Comments? By all means!

Adam Nickle

About Adam Nickle

I'm a total nerd, intellectual explorer, number theory enthusiast, and computer science nut. I'll write about anything from math and programming to religion and science fiction, all of which play central roles in my life.

3 Responses to Eulerian Video Magnification

  1. ali says:

    Thank you very much for your great post and great explanation! :)

  2. J.Smith says:

    I was wondering if you could elaborate on some of the parameters and how they work such as spatial wavelength cutoff and chrominance scaling.

    • Adam Nickle says:

      To be honest, I don’t know much about the mathematics behind these algorithms, especially when it comes to the bandpass filter. In this case, I prefer using an FFT on a frame buffer to “flatten” frequencies outside of the range I’m interested in. I don’t know all of the effects this introduces into the signal when it is converted back to the time domain using the IFFT, but I do know it constitutes a rough bandpass filter that is adjustable and efficient enough to run on a live video stream without introducing more than a few seconds of delay.
