Real-Time Music Video Synthesis by Reconstruction

The world truly is full of wonder.

You have to check this out. A guy named Sven König, who I assume is from Germany, wrote some amazing software for creating music videos in real time. The idea is, you feed the system existing material (anything from existing music videos to someone giving a filmed interview). The system analyzes the input media, breaking it apart into sound/video snippets that are, say, as long as a quarter note in the input song, or an eighth note, a sixteenth, etc. It then creates a sound signature of each sample (I assume by performing some sort of time/frequency analysis) and stores all the samples along with their signatures in a database.

Then, when you run the software, you sing/talk/beatbox/make random noise into a microphone. The software takes apart your incoming voice and generates signatures in the same manner. It matches those signatures to the nearest existing sound samples in its database, and plays back the sound/video clip corresponding to the closest match, instead of just playing back your own voice.
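Just to make the idea concrete, here's a minimal sketch of what the analysis side might look like, in Python with NumPy. Everything in it is an assumption on my part: the fixed tempo, the way the snippets are sliced, and the log-magnitude frequency-band energies standing in for the signature. I have no idea what analysis König actually uses.

```python
import numpy as np

SAMPLE_RATE = 44100   # assumed sample rate of the source media
BPM = 120             # assumed (fixed) tempo of the source song

def slice_into_notes(audio, note_fraction=16):
    """Chop a mono signal into snippets each 1/note_fraction of a
    whole note long (16 -> sixteenth notes) at the assumed tempo."""
    samples_per_snippet = int(SAMPLE_RATE * (60.0 / BPM) * (4.0 / note_fraction))
    n = len(audio) // samples_per_snippet
    return audio[:n * samples_per_snippet].reshape(n, samples_per_snippet)

def signature(snippet, n_bands=32):
    """A crude sound signature: log magnitudes of the FFT, averaged
    into a handful of frequency bands."""
    spectrum = np.abs(np.fft.rfft(snippet * np.hanning(len(snippet))))
    bands = np.array_split(spectrum, n_bands)
    return np.log1p(np.array([band.mean() for band in bands]))

def build_database(audio):
    """Store every snippet alongside its signature (the real system
    would also keep the matching video frames)."""
    return [(signature(s), s) for s in slice_into_notes(audio)]

# e.g. index ten seconds of a "song" (noise standing in for real audio):
db = build_database(np.random.randn(SAMPLE_RATE * 10))
```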

What ends up happening is that if you were to say “Hello” into the mic, the computer would break that apart into maybe “Heh” and “Low”, search its database for moments in the source media you fed it at the beginning that sound like “heh” or “low”, and play back those two clips in order, thus reconstructing output that sounds like “Hello” but isn’t being said by you.
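Continuing the sketch above, the live half could be as simple as a nearest-neighbor lookup over those signatures. Again, the Euclidean distance and the chunking here are my guesses at the mechanics, not how “sCrAmBlEd?HaCkZ!” actually does it:

```python
def closest_snippet(database, live_chunk):
    """Match a chunk of mic input to the stored snippet whose
    signature is nearest (Euclidean distance), and return the
    stored audio instead of the live voice."""
    live_sig = signature(live_chunk)
    sigs = np.stack([sig for sig, _ in database])
    best = int(np.argmin(np.linalg.norm(sigs - live_sig, axis=1)))
    return database[best][1]

# Saying "Hello" would arrive as a couple of chunks; each comes back
# as whatever source moment sounded most like "Heh" or "Low":
mic_chunks = slice_into_notes(np.random.randn(SAMPLE_RATE * 2))
output = np.concatenate([closest_snippet(db, chunk) for chunk in mic_chunks])
```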

I’m not sure if that made any sense, but check out this video of Sven explaining his “sCrAmBlEd?HaCkZ!” software and pay attention to the examples; I think you’ll understand better. It’s a really awesome idea. I don’t know if nobody’s thought of it before, or if nobody has done it completely, or if I just wasn’t aware of it, but as a sound guy, I think it’s one of the coolest things I’ve ever seen a computer do.