This article describes briefly what H.264 is and how to get H.264 encoding support for Avidemux. It also summarizes and explains the x264 options available in Avidemux. This can be considered a (simple) guide to the encoder.
H.264, also known as “MPEG-4 Part 10” or “MPEG-4 Advanced Video Coding” (AVC), is a digital video compression standard noted for achieving very high data compression. While H.264 generally requires more CPU power for playback than video encoded with the older MPEG-4 Part 2 standard (as used by Xvid or DivX), its compression efficiency is much better! That means: with H.264/AVC you can get significantly better quality at the same file size, or the same quality at a significantly smaller file size (compared to MPEG-4 ASP). While H.264 compresses much more efficiently than MPEG-4 Part 2, the advantage over MPEG-2 is even greater.
More detailed information about H.264 can be found in the corresponding Wikipedia article. A comparison of various H.264 encoders against MPEG-4 Part-2, MPEG-2 and other video formats can be found at http://mirror05.x264.nl/Dark/website/compare.html.
While Avidemux uses the “built-in” libavcodec from FFmpeg for H.264 decoding, it needs an additional (external) library for H.264 encoding. For this purpose Avidemux uses x264, a free library for encoding H.264/AVC video streams. The code is written from scratch by Laurent Aimar, Loren Merritt, Eric Petit (OS X), Min Chen (VfW/asm), Justin Clay (VfW), Måns Rullgård, Radek Czyz, Christian Heine (asm), Alex Izvorski (asm), and Alex Wright. It is released under the terms of the GPL. To clarify: the encoder library is called x264, while the compression standard it implements is called H.264 (or MPEG-4 AVC). In other words, the x264 encoder software creates H.264/AVC video. It should be noted that x264, while being “free” software, can compete with commercial H.264 encoders in terms of quality and speed. Major companies in the video business, such as YouTube and Facebook, are known to use the x264 encoder.
If x264 is not available in your version of Avidemux, there is a guide on how to download and compile x264 by yourself. It is in the Compiling x264 section.
After you compile x264, you will have to re-compile Avidemux to build in the x264 feature. There is also a guide on how to do this in the Compiling Avidemux section.
Note that if you are using the pre-compiled Avidemux builds for Microsoft Windows, the required x264 library ships with the installer. Hence no additional software is required! Things like “Codec Packs”, “VFW Codecs” or “DirectShow Filters” will not work with Avidemux! The latest builds of the x264 library for Avidemux can be found in the libx264 GIT builds thread (make sure you navigate to the very last post!). These builds are usually newer (and less tested) than the ones that ship with Avidemux.
Avidemux contains most of the options available in the x264 library. For options not yet available, see the “Unavailable” section in this article.
The human eye doesn't just want the image to look similar to the original, it wants the image to have similar complexity. Therefore, we would rather see a somewhat distorted but still detailed block than a non-distorted but completely blurred block. The result is a bias towards a detailed and/or grainy output image, a bit like Xvid, except that it's actual detail rather than ugly blocking (see http://x264dev.multimedia.cx/?p=164 and http://forum.doom9.org/showpost.php?p=1144270&postcount=1 for more info). The purpose of Psy RDO is to keep the complexity of an encoded block similar to the complexity of the original block. This way Psy RDO produces an image that looks much sharper and more detailed in many cases (compared to encoding without Psy RDO). It also helps greatly to preserve film grain! Please note that Psy RDO will inherently hurt metrics such as PSNR and SSIM. As soon as psycho-visual optimizations are involved, the classical metrics become useless! Also note that Psy RDO works with RDO modes only: if Partition Decision is set to 6 (or higher), then Psy RDO will be on by default, otherwise it will be disabled. In addition to Psy RDO there is now also Psy-Trellis. This is still considered an “experimental” feature and is disabled by default, but it seems to help greatly in retaining textures in the video. Note that Psy-Trellis is based on Trellis quantization. Consequently it will only be effective with Trellis quantization enabled too (Trellis 1 is sufficient now, but 2 will be more effective).
Adaptive Quantization (AQ) allows each macroblock within the frame to choose a different quantizer, instead of assigning the same quantizer to all blocks within the frame. The purpose of AQ is to move more bits into “flat” macroblocks. This is done by adaptively lowering the quantizers of certain blocks (and raising the quantizers of other blocks). Without AQ, flat and dark areas of the image tend to show ugly blocking or banding. Thanks to the new AQ algorithm, blocking and banding can be greatly reduced! With AQ enabled, you can expect a significant(!) gain in overall image quality. Especially in dark scenes and scenes with “flat” backgrounds (sky, grass, walls, etc.), much more detail can be preserved. AQ seems to perform less efficiently with “Animation” material than it does with “Film” material, but it still helps to prevent banding. Note that AQ can be used with the bitrate-based modes (Single-Pass and Two-Pass) as well as with the CRF mode. It cannot be used with the QP mode! That's because QP mode uses constant quantizers by definition, which is one of the reasons why QP mode should generally be avoided nowadays.
VBV (Video Buffering Verifier) defines a specific buffering model. In that model the decoder (player) reads the input data from a buffer. That buffer has a limited size and is filled at a limited data rate. VBV makes sure that the buffer will never run out of data, i.e. it makes sure that there is always enough data left in the buffer to decode the next frame. To achieve this, VBV imposes additional bitrate and buffering constraints on the encoder. It's highly recommended not to use VBV unless you can't get around it. VBV may hurt the video quality, but it will never improve the quality! Unfortunately hardware players (including mobile devices) may need VBV for proper playback. You will have to look up the particular VBV requirements for each device individually. In particular, Blu-ray has strict VBV requirements. Note that x264's VBV implementation now works just fine with both 1-Pass and 2-Pass modes. There's no need to use 2-Pass mode for VBV anymore. (See http://en.wikipedia.org/wiki/Video_buffering_verifier for details about VBV.)
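The buffering model described above can be illustrated with a small simulation. This is only a simplified sketch and not x264's actual rate control: the function name `vbv_underflows`, its parameters and the 90% initial fill level are assumptions made for this example. The buffer is refilled at a fixed rate, each decoded frame drains its own size from the buffer, and a frame that is larger than the current fill level counts as an underflow.

```python
def vbv_underflows(frame_bits, fps, max_rate, buffer_size, initial_fill=0.9):
    """Return the indices of frames that would underflow a simplified VBV buffer.

    frame_bits   -- size of each encoded frame in bits
    fps          -- frames decoded per second
    max_rate     -- rate at which the buffer is refilled (bits per second)
    buffer_size  -- capacity of the buffer (bits)
    initial_fill -- assumed fraction of the buffer that is full at start
    """
    fullness = buffer_size * initial_fill
    input_per_frame = max_rate / fps  # bits arriving between two frames
    underflows = []
    for index, bits in enumerate(frame_bits):
        if bits > fullness:
            # Not enough data buffered to decode this frame in time.
            underflows.append(index)
            fullness = 0.0
        else:
            fullness -= bits
        # Refill, but never beyond the buffer capacity.
        fullness = min(buffer_size, fullness + input_per_frame)
    return underflows


# Hypothetical stream: 25 fps, 1 Mbit/s refill rate, 1 Mbit buffer.
sizes = [300_000, 30_000, 30_000, 800_000, 30_000]
print(vbv_underflows(sizes, fps=25, max_rate=1_000_000, buffer_size=1_000_000))
# prints [3]
```

In this toy stream the fourth frame is too large for the buffer to supply in time. A VBV-constrained encoder would have to spend fewer bits on that frame, which is why VBV can only hurt quality.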
H.264 allows each frame to be segmented into several parts. These parts are called “slices”. The advantage of using multiple slices (per frame) is that the slices can be processed independently and in parallel. This allows easy multi-threading implementations in H.264 encoders and decoders. Unfortunately using multiple slices hurts compression efficiency, and the more slices are used, the worse it gets! Therefore you should not use slices if you don't have to. But if your H.264 decoder uses slice-based multi-threading (i.e. multiple slices are decoded in parallel), then multi-threading will only be used if the video was encoded with multiple slices. Fortunately most software decoders do not require slices, because they use frame-based multi-threading (i.e. multiple frames are decoded in parallel). Hardware decoders may require slices though. In particular, the Blu-ray specs say that at least 4 slices must be used.
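To give a rough idea of what segmenting a frame into slices means, the following sketch divides the macroblock rows of a frame evenly among a given number of slices. It is purely illustrative: the function name `split_rows_into_slices` is made up for this example, and real encoders are free to place slice boundaries differently.

```python
def split_rows_into_slices(height, num_slices):
    """Divide the 16-pixel macroblock rows of a frame among num_slices slices.

    Returns a list of (first_row, end_row) pairs, end_row being exclusive.
    """
    mb_rows = (height + 15) // 16  # round up to whole macroblock rows
    base, extra = divmod(mb_rows, num_slices)
    slices = []
    start = 0
    for i in range(num_slices):
        rows = base + (1 if i < extra else 0)  # spread the remainder
        slices.append((start, start + rows))
        start += rows
    return slices


# A 1080-line frame has 68 macroblock rows; with 4 slices each gets 17 rows.
print(split_rows_into_slices(1080, 4))
# prints [(0, 17), (17, 34), (34, 51), (51, 68)]
```

Each of these row ranges could then be handed to a separate thread, which is the idea behind slice-based multi-threading.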
Zones can be used to manually assign a lower or higher bitrate to a certain section of the video (e.g. to enforce a lower bitrate for the ending credits). There are two modes to control the bitrate of a zone: with a “Bitrate Factor” you change the bitrate relative to the encoder's decision, and with a “Quantizer” you override the encoder's decision with a constant quantizer value.
This setting defines the “Pixel Aspect Ratio” (PAR) of the video. Do not change the default value of 1:1 (aka “Square Pixels”) unless you are encoding anamorphic video! If you are encoding anamorphic material and you want to keep it anamorphic, then you will have to set the correct PAR value. Otherwise your video will be displayed with the wrong aspect ratio! If you have an anamorphic source and you want to convert it to “Square Pixels” (PAR = 1:1), then you must invoke the Resize filter and resize the video accordingly. Note that “Pixel Aspect Ratio” is not the same as “Display Aspect Ratio” (DAR). However, the DAR can be calculated from the PAR using this formula: DAR = Width/Height * PAR. For example: 720/576 * 64/45 = 16/9. The advantage of working with PAR values is that the PAR of a video won't change when cropping the video, while the DAR most likely will change. The following PAR options are available:
These settings are only suggestions for the playback equipment. Use them at your own risk!
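The DAR formula given for the Pixel Aspect Ratio setting (DAR = Width/Height * PAR) can be checked with a few lines of Python. This is just a small sketch using exact fractions; the function names are chosen for this example and are not part of Avidemux or x264.

```python
from fractions import Fraction


def display_aspect_ratio(width, height, par):
    """Return DAR = Width/Height * PAR as an exact fraction."""
    return Fraction(width, height) * par


def square_pixel_width(width, par):
    """Width to resize to when converting to square pixels (PAR = 1:1)."""
    return round(width * par)


par = Fraction(64, 45)

# The example from the text: 720/576 * 64/45 = 16/9
print(display_aspect_ratio(720, 576, par))  # prints 16/9

# Cropping 8 pixels from each side changes the DAR, but not the PAR.
print(display_aspect_ratio(704, 576, par))  # prints 704/405

# Resizing to square pixels while keeping the height of 576 lines.
print(square_pixel_width(720, par))         # prints 1024
```

The second call shows why working with PAR values is convenient: after cropping, the same PAR of 64:45 still applies, while the DAR is no longer 16:9.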
The H.264/AVC specifications define a number of different profiles. Each profile specifies which features of H.264 are allowed (or not allowed). If you want your H.264 video stream to be compliant with a certain profile, then you may only enable features allowed in that profile. Profiles are needed to make sure your video file will play fine on a certain decoder. For example, a “Main” profile compliant video will play 100% fine on every “Main” profile capable decoder/player. When working with the x264 encoder, there are basically two profiles you have to take care of: the “Main” profile and the “High” profile. Note that x264 is missing the Error Resilience feature from the “Baseline” profile as well as the Interlacing Support from the “Extended” profile. If you want to play your video on software players, then you don't need to care about profiles that much. The H.264 decoder from “libavcodec”, which is used in MPlayer, VLC Player, ffdshow and many more, supports all of x264's features, including the “High” and “Predictive Lossless” profile features. The same goes for proprietary decoders, such as CoreAVC. However, if you are targeting a hardware player, then profiles are very important, as hardware players are very restrictive about which profiles they support.
In addition to the profiles, the H.264/AVC specifications also define a number of levels. While profiles define which compression features of H.264 may (or may not) be used, the levels put further restrictions on other properties of the video. These restrictions include the maximum resolution, the maximum bitrate, the maximum framerate (for a given resolution) and the maximum number of reference frames (indirectly limited through MaxDPB). In order to play your H.264 video on a specific hardware player, that player must not only support your video's profile, but also your video's level (or a higher one). Again, software players usually don't have such restrictions, as long as your CPU is powerful enough.
Note: The common notation for Profiles and Levels is “Profile@Level”, for example High@4.1. Furthermore there is no way to directly encode your video to a specific level and/or profile. If you want your video to comply with a certain profile/level, you must choose the encoder settings accordingly. Presets may be helpful to find the correct settings. It may still be necessary to resize your video and/or change the framerate.
Feature | Baseline | Extended | Main | High | High 10 | High 4:2:2 | High 4:4:4 Predictive |
---|---|---|---|---|---|---|---|
I and P Slices | YES | YES | YES | YES | YES | YES | YES |
B Slices | NO | YES | YES | YES | YES | YES | YES |
SI and SP Slices | NO | YES | NO | NO | NO | NO | NO |
Multiple Reference Frames | YES | YES | YES | YES | YES | YES | YES |
In-Loop Deblocking Filter | YES | YES | YES | YES | YES | YES | YES |
CAVLC Entropy Coding | YES | YES | YES | YES | YES | YES | YES |
CABAC Entropy Coding | NO | NO | YES | YES | YES | YES | YES |
Flexible Macroblock Ordering (FMO) | YES | YES | NO | NO | NO | NO | NO |
Arbitrary Slice Ordering (ASO) | YES | YES | NO | NO | NO | NO | NO |
Redundant Slices (RS) | YES | YES | NO | NO | NO | NO | NO |
Data Partitioning | NO | YES | NO | NO | NO | NO | NO |
Interlaced Coding (PicAFF, MBAFF) | NO | YES | YES | YES | YES | YES | YES |
4:2:0 Chroma Format | YES | YES | YES | YES | YES | YES | YES |
Monochrome Video Format (4:0:0) | NO | NO | NO | YES | YES | YES | YES |
4:2:2 Chroma Format | NO | NO | NO | NO | NO | YES | YES |
4:4:4 Chroma Format | NO | NO | NO | NO | NO | NO | YES |
8 Bit Sample Depth | YES | YES | YES | YES | YES | YES | YES |
9 and 10 Bit Sample Depth | NO | NO | NO | NO | YES | YES | YES |
11 to 14 Bit Sample Depth | NO | NO | NO | NO | NO | NO | YES |
8×8 vs. 4×4 Transform Adaptivity | NO | NO | NO | YES | YES | YES | YES |
Quantization Scaling Matrices | NO | NO | NO | YES | YES | YES | YES |
Separate Cb and Cr QP control | NO | NO | NO | YES | YES | YES | YES |
Separate Color Plane Coding | NO | NO | NO | NO | NO | NO | YES |
Predictive Lossless Coding | NO | NO | NO | NO | NO | NO | YES |
From Wikipedia, the free encyclopedia
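The feature table above can be turned into a simple lookup that answers the question “which profiles still allow the features I have enabled?”. The sketch below copies only a handful of rows from the table. The names `PROFILES`, `FEATURES` and `compatible_profiles` are invented for this example and do not correspond to anything in x264 or Avidemux.

```python
PROFILES = ["Baseline", "Extended", "Main", "High",
            "High 10", "High 4:2:2", "High 4:4:4 Predictive"]

# A subset of rows from the table; one Y/N flag per profile, in PROFILES order.
FEATURES = {
    "B Slices":                        "NYYYYYY",
    "CABAC Entropy Coding":            "NNYYYYY",
    "Interlaced Coding":               "NYYYYYY",
    "8x8 vs. 4x4 Transform Adaptivity": "NNNYYYY",
    "Quantization Scaling Matrices":   "NNNYYYY",
    "4:2:2 Chroma Format":             "NNNNNYY",
    "Predictive Lossless Coding":      "NNNNNNY",
}


def compatible_profiles(used_features):
    """Return all profiles that allow every feature in used_features."""
    return [
        profile
        for column, profile in enumerate(PROFILES)
        if all(FEATURES[name][column] == "Y" for name in used_features)
    ]


print(compatible_profiles(["B Slices", "CABAC Entropy Coding"]))
# prints ['Main', 'High', 'High 10', 'High 4:2:2', 'High 4:4:4 Predictive']

print(compatible_profiles(["B Slices", "8x8 vs. 4x4 Transform Adaptivity"]))
# prints ['High', 'High 10', 'High 4:2:2', 'High 4:4:4 Predictive']
```

This mirrors the rule of thumb for x264: as soon as a feature such as the adaptive 8×8 transform is enabled, the stream is no longer “Main” profile compliant and requires at least a “High” profile decoder.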
Level number | Max macroblocks per second | Max frame size (macroblocks) | Max video bit rate (VCL) for Baseline, Extended and Main Profiles | Max video bit rate (VCL) for High Profile | Max video bit rate (VCL) for High 10 Profile | Max video bit rate (VCL) for High 4:2:2 and High 4:4:4 Predictive Profiles | Examples for high resolution @ frame rate (max stored frames) in Level |
---|---|---|---|---|---|---|---|
1 | 1485 | 99 | 64 kbit/s | 80 kbit/s | 192 kbit/s | 256 kbit/s | 128×96@30.9 (8) 176×144@15.0 (4) |
1b | 1485 | 99 | 128 kbit/s | 160 kbit/s | 384 kbit/s | 512 kbit/s | 128×96@30.9 (8) 176×144@15.0 (4) |
1.1 | 3000 | 396 | 192 kbit/s | 240 kbit/s | 576 kbit/s | 768 kbit/s | 176×144@30.3 (9) 320×240@10.0 (3) 352×288@7.5 (2) |
1.2 | 6000 | 396 | 384 kbit/s | 480 kbit/s | 1152 kbit/s | 1536 kbit/s | 320×240@20.0 (7) 352×288@15.2 (6) |
1.3 | 11880 | 396 | 768 kbit/s | 960 kbit/s | 2304 kbit/s | 3072 kbit/s | 320×240@36.0 (7) 352×288@30.0 (6) |
2 | 11880 | 396 | 2 Mbit/s | 2.5 Mbit/s | 6 Mbit/s | 8 Mbit/s | 320×240@36.0 (7) 352×288@30.0 (6) |
2.1 | 19800 | 792 | 4 Mbit/s | 5 Mbit/s | 12 Mbit/s | 16 Mbit/s | 352×480@30.0 (7) 352×576@25.0 (6) |
2.2 | 20250 | 1620 | 4 Mbit/s | 5 Mbit/s | 12 Mbit/s | 16 Mbit/s | 352×480@30.7 (10) 352×576@25.6 (7) 720×480@15.0 (6) 720×576@12.5 (5) |
3 | 40500 | 1620 | 10 Mbit/s | 12.5 Mbit/s | 30 Mbit/s | 40 Mbit/s | 352×480@61.4 (12) 352×576@51.1 (10) 720×480@30.0 (6) 720×576@25.0 (5) |
3.1 | 108000 | 3600 | 14 Mbit/s | 14 Mbit/s | 42 Mbit/s | 56 Mbit/s | 720×480@80.0 (13) 720×576@66.7 (11) 1280×720@30.0 (5) |
3.2 | 216000 | 5120 | 20 Mbit/s | 25 Mbit/s | 60 Mbit/s | 80 Mbit/s | 1280×720@60.0 (5) 1280×1024@42.2 (4) |
4 | 245760 | 8192 | 20 Mbit/s | 25 Mbit/s | 60 Mbit/s | 80 Mbit/s | 1280×720@68.3 (9) 1920×1080@30.1 (4) 2048×1024@30.0 (4) |
4.1 | 245760 | 8192 | 50 Mbit/s | 62.5 Mbit/s | 150 Mbit/s | 200 Mbit/s | 1280×720@68.3 (9) 1920×1080@30.1 (4) 2048×1024@30.0 (4) |
4.2 | 522240 | 8704 | 50 Mbit/s | 62.5 Mbit/s | 150 Mbit/s | 200 Mbit/s | 1920×1080@64.0 (4) 2048×1080@60.0 (4) |
5 | 589824 | 22080 | 135 Mbit/s | 168.75 Mbit/s | 405 Mbit/s | 540 Mbit/s | 1920×1080@72.3 (13) 2048×1024@72.0 (13) 2048×1080@67.8 (12) 2560×1920@30.7 (5) 3680×1536@26.7 (5) |
5.1 | 983040 | 36864 | 240 Mbit/s | 300 Mbit/s | 720 Mbit/s | 960 Mbit/s | 1920×1080@120.5 (16) 4096×2048@30.0 (5) 4096×2304@26.7 (5) |
From Wikipedia, the free encyclopedia
For more detailed information, please refer to “Annex A” in the official ITU-T H.264 specifications!
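As a rough aid for reading the level table, the following sketch looks up the lowest level whose macroblock limits are satisfied by a given resolution and framerate. It uses only the “Max macroblocks per second” and “Max frame size” columns from the table above and ignores the bitrate, reference frame (MaxDPB) and other constraints, so it should be treated as a first estimate only. The names `LEVELS`, `macroblocks` and `minimum_level` are made up for this example.

```python
# (level, max macroblocks per second, max frame size in macroblocks)
LEVELS = [
    ("1", 1485, 99), ("1b", 1485, 99), ("1.1", 3000, 396),
    ("1.2", 6000, 396), ("1.3", 11880, 396), ("2", 11880, 396),
    ("2.1", 19800, 792), ("2.2", 20250, 1620), ("3", 40500, 1620),
    ("3.1", 108000, 3600), ("3.2", 216000, 5120), ("4", 245760, 8192),
    ("4.1", 245760, 8192), ("4.2", 522240, 8704), ("5", 589824, 22080),
    ("5.1", 983040, 36864),
]


def macroblocks(width, height):
    """Number of 16x16 macroblocks needed to cover the frame."""
    return ((width + 15) // 16) * ((height + 15) // 16)


def minimum_level(width, height, fps):
    """Lowest level whose macroblock limits fit, or None if none does.

    Only frame size and macroblock rate are checked; bitrate and
    reference frame limits are NOT taken into account.
    """
    frame_size = macroblocks(width, height)
    rate = frame_size * fps
    for level, max_rate, max_frame_size in LEVELS:
        if frame_size <= max_frame_size and rate <= max_rate:
            return level
    return None


print(minimum_level(720, 576, 25))    # prints 3
print(minimum_level(1280, 720, 30))   # prints 3.1
print(minimum_level(1920, 1080, 30))  # prints 4
print(minimum_level(1920, 1080, 60))  # prints 4.2
```

For example, a 1920×1080 frame consists of 120 × 68 = 8160 macroblocks. At 30 fps this gives 244800 macroblocks per second, which just fits into the 245760 allowed by Level 4 and matches the “1920×1080@30.1” entry in the table.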
Since GPGPU has become a hot topic, people began asking for GPU support in Avidemux. These people need to understand that Avidemux cannot offer GPU support for H.264 encoding, until GPU support is implemented in the x264 library. There is a project scheduled to add CUDA support to x264 (see http://wiki.videolan.org/SoC_x264_2009#GPU_Motion_Estimation), but there are no results yet (May 2009). We know that there are commercial H.264 encoders with GPU support available already. But if you look at these encoders closely, you will notice that their speed-up claims are marketing blabber. These encoders may be fast, but their quality isn't anywhere near x264's quality! Also note that marketing people tend to compare their encoders to the completely unoptimized H.264 Reference Encoder. x264 is faster than the reference encoder by several orders of magnitude, which renders these speed comparisons meaningless. x264 can run extremely fast on a CPU and scales up to at least 16 cores. So don't believe everything that marketing people claim!
An IDR frame is what has traditionally been known as an I frame. An IDR frame, just like an I frame in MPEG-1/2 and MPEG-4 ASP, starts with a clean slate: all subsequent frames can only reference the IDR frame and the frames that follow it. Non-IDR I frames should be rare, but since they cannot be ruled out, enforcing a minimal IDR interval can help improve compression in some high-motion scenes. In H.264/AVC you can also have I frames inside a GOP which are not seekable, because the long-term references introduced in H.264/AVC allow a P frame after the I frame to reference a P frame before the I frame.
Max IDR-keyframe interval indicates the maximum distance between two IDR frames. Similarly, Min IDR-keyframe interval indicates the minimum distance between two IDR frames.