Random thoughts
This page contains the latest entries thave I've published in my personal weblog. It discusses random bits related to my life, software development, my studies and such. The RSS for this feed is available from http://blogs.gnome.org/, as is the complete history.
The world’s fastest VP8 decoder: FFmpeg

Performance chart for FFmpeg's VP8 decoder vs. libvpx
Jason does a great job explaining what we did and how we did it.
(Posted at Fri, 23 Jul 2010 23:07:39 +0000 - Comments)
Google’s VP8 video codec
Now that the hype is over, let’s talk the real deal. How good is Google’s VP8 video codec? Since “multiple independent implementations help a standard mature quicker and become more useful to its users”, me and others (David for the decoder core and PPC optimizations, Jason for x86 optimizations) decided that we should implement a native VP8 decoder in FFmpeg. This has several advantages from other approaches (e.g. linking to libvpx, which is Google’s decoder library for VP8):
- we can share code (and more importantly: optimizations) between FFmpeg’s VP8 decoder and decoders for previous versions of the VPx codec series (e.g. the entropy coder is highly similar compared to VP5/6). Thus, your phone’s future media player will be smaller and faster.
- since H.264 (the current industry standard video codec) and VP8 are highly similar, we can share code (and more importantly: optimizations) between FFmpeg’s H.264 and VP8 decoders (e.g. intra prediction). Thus, again, your desktop computer’s future media player will be smaller and faster.
- Since FFmpeg’s native VP3/Theora and Vorbis decoders (these are video/audio codecs praised by free software advocates) already perform better than the ones provided by Xiph (libvorbis/libtheora), it is highly likely that our native VP8 decoder will (once properly optimized) also perform better than Google’s libvpx. The pattern here is that since each libXYZ has to reinvent its own wheel, they’ll always fall short of reaching the top. FFmpeg comes closer simply because our existing wheels are like what you’d want on your next sports car.
- Making a video decoder is fun!
In short, we wrote a video decoder that heavily reuses existing components in FFmpeg, leading to a vp8.c file that is a mere 1400 lines of code (including whitespace, comments and headers) and another 450 for the DSP functions (the actual math backend of the codec, which will be heavily optimized using SIMD). And it provides binary-identical output compared to libvpx for all files in the vector testsuite. libvpx’ vp8/decoder/*.c plus vp8/common/*.c alone is over 10,000 lines of code (i.e. this excludes optimizations), with another > 1000 lines of code in vpx/, which is the public API to actually access the decoder.
Current work is ongoing to optimize the decoder to outperform libvpx on a variety of computer devices (think beyond your desktop, it will crunch anything; performance becomes much more relevant on phones and such devices). More on that later.
Things to notice so so far:
- Google’s VP8 specs are not always equally useful. They only describe the baseline profile (0). Other profiles (including those part of the vector testsuite, i.e. 1-3) use features not described in the specifications, such as chroma fullpixel motion vector (MV) rounding, a bilinear motion compensation (MC) filter (instead of a subpixel six-tap MC filter). Several parts of the spec are incomplete (“what if a MV points outside the frame?”) or confusing (the MV reading is oddly spread through 3 sections in a chapter, where the code in each section specifically calls code from the previous section, i.e. they really are one section), which means that in the end, it’s much quicker to just read libvpx source code rather than depend on the spec. Most importantly, the spec really is a straight copypaste of the decoder’s source code. As a specification, that’s not very useful or professional. We hope that over time, this will improve.
- Google’s libvpx is full of (hopefully) well-performing assembly code, quite some of which isn’t actually compiled or used (e.g. the PPC code), which makes some of us wonder what the purpose of its presence is.
- Now that VP8 is released, will Google release specifications for older (currently undocumented) media formats such as VP7?
(Posted at Sun, 27 Jun 2010 17:31:03 +0000 - Comments)
WMAVoice postfilter
I previously posted about my ongoing studies on the WMA Voice codec. A basic implementation of the actual codec was submitted and accepted/applied into FFmpeg SVN. Speech codecs work at ultra-low bitrates (~10kbps and lower) and suffer from obvious encoding artifacts, leading to “robotic” output sounds. Also, depending on the source (imaging a phone conversation in a mall), samples often have considerable levels of background noise. These types of artifacts are common to all speech codecs, and there are a variety of postfilters meant to reduce their effects. In fact, most speech codecs use the exact same filters. Imagine the smile on a developer’s face if a common proprietary postfilter can be implemented by calling no more than 3-4 already-implemented functions (as was the case with QCELP, another speech codec).
This was almost the case with WMAVoice, with one exception. This was the first time we saw an implementation of a Wiener filter. The purpose of the filter is noise reduction. Clearly, if noisy signal = signal + noise, then signal = noisy signal – noise. Sounds simple, right? The math is actually a little complex, but fortunately this is quite well-documented in the scientific literature of signal processing. The idea is that noise has lower signal strength than the intended signal. By increasing the contrast between the strength of these two, you decrease noise and thus enhance perception of the signal itself.
Here’s what the filter does:
- Take FFT (“frequency distribution”) of the LPCs (“time-independent representation of signal”);
- Calculate a power spectrum of these, which is basically a representation of the strongest power/frequency pairs versus the weakest ones, along with the desired level/strength of noise subtraction, as quasi-coefficients;
- turn these into actual denoising filter coefficients using a Hilbert/Laplace transform;
- apply these to the FFT of the “noisy” output of the speech synthesis filter.
The resulting patch was applied to SVN trunk last week. Thanks to Alex (hm, old…) and Vitor (hm, no blog…) for helping me understand! Time for something new, I guess…
(Posted at Fri, 30 Apr 2010 22:18:37 +0000 - Comments)
Google Summer-of-Code 2010 deadline nearing
I blogged about it before, but let’s remind all students that you can work on FFmpeg this summer, and earn money ($5000) while doing so. The deadline is this Friday, the 9th.
Google’s Summer-of-Code is a yearly recurring event where students spend their summer coding for free software projects, and make a buck. In the past few years, some of our much-valued contributions created during the Summer-of-Code have included a VC-1/WMV9, RealVideo3/4, WMAPro and AMR-NB decoder and an MPEG-4/AAC encoder/decoder (and many, many more!). This year, we have had several high-quality proposals from students wanting to work on network-related protocols or audio codecs, but are still looking for applications related to:
- better seeking
- filter support (libavfilter)
- video codecs (e.g. WVP2 or VC-1 interlaced)
- optimizations
- many more
If you’re interested in learning more about the innermost workings of multimedia, you have good C-skills and are willing to learn a lot more about these, then send an email to the ffmpeg-soc mailinglist, or come to IRC (#ffmpeg-devel on Freenode) to find out more. Please apply before Friday!
(Posted at Tue, 06 Apr 2010 14:28:10 +0000 - Comments)
Google’s Summer of Code 2010
It’s that time again – the time where Google will announce their Summer of Code! In the summer of code, students can work on free software projects during their summer break, and make $4500 while they’re at it. FFmpeg has traditionally been a strong contender, and some of its highest profile code (VC-1, WMAPro and RealVideo4 decoders, just to name a few) was developed in part in the Summer of Code.
Are you a student, proficient in C, with excellent technical skills / insight (or you want to learn to develop these) and you want to contribute to one of the most exciting free software projects out there? Then apply for one of FFmpeg’s suggested projects for GSoC 2010!
(Posted at Mon, 01 Feb 2010 16:03:54 +0000 - Comments)
