Metric Quality Comparison of x264 and the VP8 Reference Encoder
Now that Google (which acquired On2) has released its newest patent-free video coding standard, VP8, right in the middle of the HTML5 debate, people are naturally curious how this codec performs compared with existing video coding standards. Google claims VP8 is a good-quality codec with low complexity, which sounds rather promising. In application fields such as anime and movie fansubbing, people have been well acquainted with H.264 for almost four years, and x264 has long been established as the best encoder available, and a free one that everyone can use. So it is natural to wonder what happens when x264 meets VP8's encoder, and that is the motivation for the comparison below.
Before we get to the benchmark, the first thing I want to mention is the standard-level comparison published a few weeks ago by Jason, the main developer of x264. From his analysis it is easy to conclude that VP8 is in general inferior to H.264. Jason predicted that VP8 will land somewhere between H.264 Baseline Profile and High Profile in compression efficiency. In terms of complexity, however, VP8 fails to deliver the “low complexity” Google promised, because of its cumbersome DCT approximation, its chroma interpolation in motion estimation, and its loop filtering. In this comparison I do not analyze those differences between the coding standards; instead I want to give readers an intuitive picture of quality using some commonly used quality metrics.
When we talk about actual encoder quality, an actual implementation of each coding standard has to be used. For H.264 the obvious choice is x264: it is very popular, it is the best of the H.264 encoders, and it is free. For VP8 there is no choice but the VP8 reference encoder, which was developed by On2.
I also do not compare encoding speed in this article, since that is a separate issue and depends more on the actual implementation.
OK, let’s start:
Methodology:
Sequences: city, crew, harbour, ice, soccer; all of them are retrieved from xiph.org
Spatial resolution: 4CIF, which is 704×576
Framerate: 30 FPS nominal (29.97 in the commands below), constant
Bitrate: 500 to 3000 kbps, in steps of 500 kbps
Encoder:
x264: rev 1612, checked out with git from the x264 development tree and compiled on Ubuntu 9.04.
VP8: the latest source code from WebM’s development tree.
Metrics: average PSNR and SSIM; the SSIM analysis is based on the Y channel only.
In general, I encode each sequence with x264 and with VP8 to produce the bitstreams, then get the coded YUV sequences back: for x264 I use --dump-yuv, and for VP8 I use its ivfdec.
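To make the metric concrete, here is a minimal Python sketch of how an average Y-channel PSNR could be computed from an original YUV file and a decoded one. It is not the tool used for the numbers in this post. It assumes 8-bit I420 (YUV 4:2:0) files and takes "average PSNR" to mean the mean of the per-frame values; the function names are made up for illustration.

```python
import math


def i420_frame_bytes(width, height):
    """Bytes in one 8-bit I420 frame: a full-size Y plane plus quarter-size U and V."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for the given mean squared error; identical planes give infinity."""
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def luma_mse(ref_frame, dec_frame, width, height):
    """Mean squared error over the Y plane (the first width*height bytes of a frame)."""
    n = width * height
    total = sum((a - b) ** 2 for a, b in zip(ref_frame[:n], dec_frame[:n]))
    return total / n


def average_y_psnr(ref_path, dec_path, width, height):
    """Mean of per-frame Y-PSNR over the frames present in both files."""
    size = i420_frame_bytes(width, height)
    values = []
    with open(ref_path, "rb") as ref, open(dec_path, "rb") as dec:
        while True:
            a, b = ref.read(size), dec.read(size)
            if len(a) < size or len(b) < size:
                break  # stop at the end of the shorter file
            values.append(psnr_from_mse(luma_mse(a, b, width, height)))
    return sum(values) / len(values) if values else float("nan")
```

For the 4CIF sequences here the call would look like average_y_psnr("original.yuv", "decoded.yuv", 704, 576), where both file names are placeholders. Averaging per-frame PSNR is only one convention; computing PSNR from the MSE averaged over all frames gives a slightly different number.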
Coding Parameters:
Both encoders were run in two-pass mode. For x264 I used --preset placebo but did not use --slow-firstpass; for VP8 I used the parameters recommended for best quality, you can find the manual here. I did not apply x264's SSIM or PSNR tuning (--tune ssim or --tune psnr), so that the results approximate what we use in practice. The detailed coding parameters are as follows:
x264
First pass
x264 --preset placebo --pass 1 --psnr --ssim --no-dct-decimate --bitrate $b --stats "$seq_name" --fps 29.97 --force-cfr -o NUL $seq_name 704x576
Second pass
x264 --preset placebo --pass 2 --psnr --ssim --no-dct-decimate --bitrate $b --stats "$seq_name" --fps 29.97 --force-cfr -o $coding_name $seq_name 704x576
VP8
ivfenc $seq_name $coding_name --i420 -w 704 -h 576 -p 2 -t 4 --best --target-bitrate=$b --end-usage=0 --auto-alt-ref=1 --timebase=1001/30000 --psnr --minsection-pct=5 --maxsection-pct=800 --lag-in-frames=16 --kf-min-dist=0 --kf-max-dist=360 --token-parts=2 --static-thresh=0 --drop-frame=0 --min-q=0 --max-q=60
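The commands above have to be repeated for every sequence and every bitrate. A small Python sketch of a driver that only builds those command lines is given below. The flags are copied from the commands above; the input, output, and stats file names are hypothetical (the post does not give them), and the stats file is deliberately given its own name so that it can never collide with the input sequence.

```python
SEQUENCES = ["city", "crew", "harbour", "ice", "soccer"]
BITRATES = list(range(500, 3001, 500))  # 500, 1000, ..., 3000 kbps


def x264_pass(pass_no, src, stats, out, bitrate):
    """Argument list for one x264 pass, using the flags quoted in the post."""
    return ["x264", "--preset", "placebo", "--pass", str(pass_no),
            "--psnr", "--ssim", "--no-dct-decimate",
            "--bitrate", str(bitrate), "--stats", stats,
            "--fps", "29.97", "--force-cfr", "-o", out, src, "704x576"]


def vp8_encode(src, out, bitrate):
    """Argument list for the two-pass ivfenc run, using the flags quoted in the post."""
    return ["ivfenc", src, out, "--i420", "-w", "704", "-h", "576",
            "-p", "2", "-t", "4", "--best", f"--target-bitrate={bitrate}",
            "--end-usage=0", "--auto-alt-ref=1", "--timebase=1001/30000",
            "--psnr", "--minsection-pct=5", "--maxsection-pct=800",
            "--lag-in-frames=16", "--kf-min-dist=0", "--kf-max-dist=360",
            "--token-parts=2", "--static-thresh=0", "--drop-frame=0",
            "--min-q=0", "--max-q=60"]


def build_jobs():
    """All command lines: two x264 passes and one ivfenc run per sequence and bitrate."""
    jobs = []
    for seq in SEQUENCES:
        src = f"{seq}_4cif.yuv"            # hypothetical input file name
        for b in BITRATES:
            stats = f"{seq}_{b}.stats"     # hypothetical stats file name
            jobs.append(x264_pass(1, src, stats, "NUL", b))
            jobs.append(x264_pass(2, src, stats, f"{seq}_{b}.264", b))
            jobs.append(vp8_encode(src, f"{seq}_{b}.ivf", b))
    return jobs


if __name__ == "__main__":
    for cmd in build_jobs():
        print(" ".join(cmd))
```

Each list could be handed to subprocess.run to actually launch the encoder; printing them first makes it easy to check the 5 × 6 grid of runs before spending hours on placebo encodes.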
Without doubt, with both encoders in their best possible quality mode, and judging only from these admittedly imperfect metric measurements, x264 outperforms VP8’s reference encoder by 2dB in PSNR and 0.025 in SSIM on average. This is not a marginal win, since we often treat a 0.5dB improvement in PSNR as very good. Perceptually, x264 looks way better at retaining fine details, while VP8 seems biased towards a smooth (blurry) result, particularly at low bitrates (<1500k).
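To put the PSNR gap in perspective: since PSNR = 10·log10(peak²/MSE), a gain in dB corresponds to a fixed ratio in mean squared error. The short Python sketch below (the function name is made up for illustration) converts a dB gain into that ratio. A 0.5dB gain means roughly 11% lower MSE, while 2dB means roughly 37% lower.

```python
def mse_ratio_for_psnr_gain(gain_db):
    """Factor by which MSE is divided when PSNR rises by gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


for gain in (0.5, 2.0):
    ratio = mse_ratio_for_psnr_gain(gain)
    reduction = (1.0 - 1.0 / ratio) * 100.0
    print(f"{gain} dB -> MSE divided by {ratio:.3f} ({reduction:.1f}% lower)")
```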
There are two sequences where VP8 got a higher SSIM than x264: crew at 500k and 1000k, and ice at 500k. In both cases the scene can be described as “multiple persons with moderate to high in-scene motion”; maybe the smoother nature of VP8 leads to a higher score than x264 here, particularly when the bitrate is insufficient. I am not quite sure about that, and it may also be an indication that VP8 is better suited than H.264 to low-bitrate applications. Anyway, I am looking forward to a more sophisticated and better implementation of VP8 in the near future.

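Requesting permission to repost a Chinese version to 妇联评论 = = (mysterious voice: hey, you still haven't reposted the previous one)
P.S. 秋月 (big sis), are you planning to write a Chinese version as well?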
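Aki replied:
June 2nd, 2010 at 9:08 AM
Sure. I'm planning to toss the Chinese version over to nmm.
Or, I have a better idea: little sis Snake can do the translation XDDD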
ssnake replied:
June 2nd, 2010 at 9:41 AM
@Aki, it would be more reliable for "the person themselves" to do this... By the way, why was my Anti-spam word "maid" both times =_,=
ssnake replied:
June 2nd, 2010 at 9:42 AM
@ssnake, = = it just occurred to me that "the person themselves" is ambiguous: I mean the author, not this snake
Aki replied:
June 2nd, 2010 at 10:19 AM
Snake-sis is such a masochist
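woshenmedoubuzhidao replied:
June 3rd, 2010 at 7:41 AM
@ssnake, actually this is a certain someone's plot: a certain someone wants everyone who replies to become that someone's maid
Ah, so 秋月-senpai wants that many maids~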