Glossary

AAC

Advanced Audio Coding is the audio coding standard defined by the International Organization for Standardization (ISO) as part of the MPEG-2 specification. It is considered “state of the art” in general audio coding and the successor to MP3. Compared to MP3, AAC provides higher-quality music using approximately 30% less storage space or bandwidth. AAC provides up to 48 full-bandwidth audio channels with sample rates up to 96 kHz, plus 15 low-frequency channels (limited to 120 Hz). Use of the format is subject to patent licensing; see http://www.vialicensing.com/products/mpeg4aac/licenseFAQ.html.

Among widely used audio codecs, the major competitor of AAC is Vorbis. AAC reportedly offers the best available acoustic quality at bit rates of 96 kb/s or lower while keeping files as small as possible. At bit rates above 96 kb/s, Vorbis acoustic quality is notably better while producing smaller files.

In short, at very low quality levels AAC appears to test as preferable. If you want audio that sounds equivalent to 128 kb/s Vorbis (which in turn sounds equivalent to 160 to 192 kb/s MP3), use AAC at around 96 kb/s. AAC gives the best sound for the smallest file at lower quality. If you want higher quality at the smallest file size, the next choice is Vorbis at bit rates above 128 kb/s.

ABR

Average Bit Rate. A lossy compression algorithm does better to encode the signal with more bits when it is complex (say, a piece of music with many instruments playing together) and fewer bits when it is simple. The bit rate is then not constant, so we speak instead of an average bit rate. Seeking to a specific time offset in an ABR file is not simple without decoding all the previous samples or building a time map.
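
The idea of a time map can be sketched in Python as follows. This is only an illustration under stated assumptions: the frame sizes, the fixed frame duration, and the function names build_time_map and seek_offset are hypothetical and do not describe any particular file format or decoder.

```python
from bisect import bisect_right

def build_time_map(frame_sizes, frame_duration):
    """Return (start_times, byte_offsets), one entry per frame."""
    times, offsets = [], []
    offset = 0
    for index, size in enumerate(frame_sizes):
        times.append(index * frame_duration)  # when this frame starts
        offsets.append(offset)                # where this frame starts
        offset += size
    return times, offsets

def seek_offset(times, offsets, target_time):
    """Byte offset of the frame that contains target_time."""
    index = max(bisect_right(times, target_time) - 1, 0)
    return offsets[index]

# Hypothetical frame sizes in bytes; each frame is assumed to last 0.5 s.
frame_sizes = [400, 250, 600, 300]
times, offsets = build_time_map(frame_sizes, 0.5)
print(seek_offset(times, offsets, 1.2))  # frame 2 starts at byte 650
```

Building the map requires one pass over the frame sizes, after which each seek is a binary search instead of a full decode.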

AC-3

An audio format also known as Dolby Digital. It can contain from 1 to 5 full-range channels (10 Hz to 22 kHz), plus a band-limited Low Frequency Effects channel (10 Hz to 120 Hz) reserved for bass tones to be reproduced by a subwoofer. The audio on a DVD-Video disc is generally coded in some AC-3 variant. Audio is compressed approximately 12:1 compared to PCM.

ACM

Audio Compression Manager is the Windows Multimedia software component that manages audio codecs (compressors/decompressors). ACM can also be considered an API specification. A codec must conform to the implicit ACM specification to work with Windows Multimedia.

AVC

H.264/MPEG-4 AVC. Stands for Advanced Video Coding. It is a digital video codec standard which is noted for achieving very high data compression.

Bitrate

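The number of bits used to encode each unit of time of audio or video, usually expressed in kbit/s. See also ABR, CBR and VBR.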

CBR

Constant Bit Rate. The signal is encoded with the same number of bits for each unit of time. For example, a rate of 128 kbit/s uses (128 * 1000) / 8 = 16000 bytes per second. With CBR files it is easy to seek to an exact time offset, because offset = time * bitrate.
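
The arithmetic above can be written as a short Python sketch. It assumes the bit rate is given in kbit/s with 1 kbit = 1000 bits, and it ignores any file header or frame alignment a real container may add; the function names are illustrative only.

```python
def cbr_bytes_per_second(bitrate_kbps):
    """Bytes used per second of audio at a constant bit rate."""
    return bitrate_kbps * 1000 // 8

def cbr_byte_offset(seconds, bitrate_kbps):
    """Byte offset of a time position: offset = time * bitrate."""
    return int(seconds * cbr_bytes_per_second(bitrate_kbps))

print(cbr_bytes_per_second(128))  # 16000
print(cbr_byte_offset(90, 128))   # 1440000
```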

Joint Stereo

Joint stereo coding methods try to increase coding efficiency for stereo signals by exploiting commonalities between the left and right channels. It can disturb Dolby Surround (Dolby Pro Logic) signals. It is suggested especially for low bit rates (≤ 128 kbit/s) and when the sound has few stereo effects.

HDTV

High-definition television means broadcast of television signals with a higher resolution than traditional formats (NTSC, PAL) allow. Except for early analog formats in Europe and Japan, HDTV is broadcast digitally, and therefore its introduction sometimes coincides with the introduction of digital television (DTV). HDTV is usually broadcast in the 16:9 widescreen aspect ratio format.

The format 720p60 is 1280×720 pixels, progressive encoding with 60 frames per second. The format 1080i50 is 1920×1080 pixels, interlaced encoding with 50 fields per second (25 frames per second). Often the frame or field rate is left out. It can then usually be assumed to be either 50 or 60, except for 1080p, which current technology supports only as 1080p24, 1080p25 or 1080p30.

FFmpeg

FFmpeg is a free, open-source tool for recording, converting and streaming audio and video. Its main part is libavcodec, a library containing many audio and video codecs such as MPEG-4 and MPEG-2. A built-in copy of libavcodec is used in many popular applications, including MPlayer, xine and Avidemux, where its codecs are used by default for decoding and encoding.

LPCM

Linear Pulse Code Modulation (LPCM) is an uncompressed audio format that is a popular choice in music production. It can carry up to 8 channels of audio at a 48 kHz or 96 kHz sampling frequency with 16, 20 or 24 bits per sample, and has a maximum bit rate of 6.144 Mbit/s. It samples analog signals and stores them as digital values without compressing the sound data.
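
Because the data is not compressed, the raw bit rate follows directly from the channel count, sampling frequency and sample size. The sketch below is a worked calculation only; the function name lpcm_bit_rate is illustrative.

```python
def lpcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Raw LPCM bit rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample

print(lpcm_bit_rate(2, 48_000, 16))  # 1536000
print(lpcm_bit_rate(8, 48_000, 16))  # 6144000
print(lpcm_bit_rate(2, 96_000, 24))  # 4608000
```

Eight channels of 16-bit audio at 48 kHz come to 6,144,000 bits per second, or 6.144 Mbit/s.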

MP2

MPEG-1 Layer 2 audio files. The compression technique is less powerful than Layer 3 (MP3) but very similar in principle. The compression ratio is usually 8:1 (corresponding to 256 to 192 kbps for a stereo signal). No patents appear to be claimed on MP2 compression, but because MP2 and MP3 derive from the same algorithm family, patent issues cannot be ruled out for this technique either.

MP3

MPEG-1 Layer 3 audio files. It is a compression technique for audio recordings. The compression ratio is usually 12:1 (128 to 112 kbps for a stereo signal) with very little loss of quality. This is achieved by perceptual coding techniques that exploit how the human ear perceives sound waves. The technique was developed by Fraunhofer IIS (http://www.iis.fraunhofer.de/amm/techinf/layer3/). There are some patent issues around software licensing that concern Free Software.
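
As a rough check on the quoted ratio, the sketch below compares an encoded bit rate against uncompressed CD-quality stereo PCM. The reference source (44.1 kHz, 16 bits, 2 channels) is an assumption made for this example; the entry above does not state which source the ratio refers to.

```python
# Assumed reference: CD-quality stereo PCM, 44.1 kHz, 16 bits, 2 channels.
CD_PCM_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbit/s

def compression_ratio(encoded_kbps, source_kbps=CD_PCM_KBPS):
    """How many times smaller the encoded stream is than the source."""
    return source_kbps / encoded_kbps

for kbps in (128, 112):
    print(kbps, round(compression_ratio(kbps), 1))
```

Under that assumption, 128 and 112 kbps correspond to ratios of roughly 11:1 and 12.6:1, in line with the approximate 12:1 figure.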

NTSC

Video format for the United States, Japan, Canada, and some other countries in the Americas. It consists of 29.97 interlaced frames of video per second. Each frame consists of 480 visible lines out of a total of 525 (the rest are used for sync, vertical retrace, and other data such as captioning). The NTSC refresh frequency was originally exactly 60 Hz in the black-and-white system. In the color system it was shifted slightly downward to 59.94 Hz to eliminate stationary dot patterns in the color carrier. In practice, 60 Hz can generally be used as a working value. All NTSC television video should be captured or encoded at 704×480 at 29.97 or 30 fps (frames per second). All NTSC DVD video should be encoded at 720×480 for full screen at 29.97 or 30 fps. NTSC widescreen sizes may vary.

Confusion sometimes arises about the proper resolution for NTSC video, because VGA monitors have a natural size of 640×480. That size applies only to monitors, not to digital video such as television or DVD. You should never encode NTSC video at 640×480. Similar confusion arises with the monitor resolution 1280×1024, which is not a correct size because it does not preserve the aspect ratio and leads to distorted video and images. To avoid this, use 1280×960.
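
The aspect ratio point can be checked by reducing each resolution to its simplest ratio. This sketch assumes square pixels, which holds for computer monitors but not necessarily for broadcast or DVD video; the function name aspect_ratio is illustrative.

```python
from fractions import Fraction

def aspect_ratio(width, height):
    """Width-to-height ratio in lowest terms (square pixels assumed)."""
    return Fraction(width, height)

for width, height in ((640, 480), (1280, 960), (1280, 1024)):
    ratio = aspect_ratio(width, height)
    print(f"{width}x{height}: {ratio.numerator}:{ratio.denominator}")
```

Both 640×480 and 1280×960 reduce to 4:3, while 1280×1024 reduces to 5:4, which is why 4:3 material looks stretched at that size.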

OCR

Optical Character Recognition. This involves computer software designed to translate images of typewritten text (usually captured by a scanner) into machine-editable text, or to translate pictures of characters into a standard encoding scheme (ASCII or Unicode) that represents them.

Ogg

Ogg is a multimedia container format maintained by Xiph.Org. It is primarily used for storing Vorbis audio and Theora video.

Ogg Writ

A text-phrase codec used with the Ogg encapsulation format. It was initially designed to provide subtitles for Ogg Theora videos, but is also useful for song lyrics with Ogg Vorbis, transcripts with Ogg Speex, or any other place where it's useful to combine text with audio or video. Unlike most subtitle formats which are in separate files from the audio/video content, Ogg Writ is mixed with audio/video streams so that it can be delivered as one file. Its design makes it easy to extend with new features. It currently supports multiple languages and specific placement of the text in one or more windows.

PAL

PAL is the alternative color and video size encoding system used in video. The PAL television system is usually used with a video format that has 625 lines per frame (576 visible lines, the rest being used for other information such as sync data and captioning) and a refresh rate of 50 Hz, 25 fps (frames per second). PAL non-television video (usually cinema) is typically recorded at 24 fps. All PAL television video should be captured or encoded at 768×576 or 720×576 at 25 fps. All PAL DVD video should be encoded at 720×576 for full screen at 25 or 24 fps. PAL widescreen sizes may vary.

In Brazil, PAL is used in conjunction with the 525-line, 29.97 frame/s system M, using (very nearly) the NTSC color subcarrier frequency. Almost all other countries using system M use NTSC. In Argentina, Paraguay and Uruguay, PAL is used with the standard 625-line system, but again with (very nearly) the NTSC color subcarrier frequency; these variants are called PAL-N and PAL-CN.

SRT

SRT is a simple, straightforward type of text subtitle file. It originates from the SubRip program. This format is natively used by Avidemux for its own subtitle suite.
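
To show how simple the format is, the sketch below builds a single subtitle entry using the layout commonly used in SubRip files: a sequence number, a line with start and end times written as HH:MM:SS,mmm separated by " --> ", then the text. The function names srt_timestamp and srt_cue are illustrative and do not come from Avidemux or SubRip.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def srt_cue(index, start, end, text):
    """One subtitle entry: number, time range, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

print(srt_cue(1, 1.5, 4.0, "Hello, world."))
```

In a complete file, entries like this follow one another separated by blank lines.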

Streaming/Streamable

“Streaming media” is media that is consumed (read, heard, viewed) while it is being delivered. Streaming is more a property of the delivery system than the media itself. The distinction is usually applied to media that are distributed over computer networks; most other delivery systems are either inherently streaming (radio, television) or inherently non-streaming (books, video cassettes, audio CDs).

Theora

A video codec being developed by the Xiph.org Foundation as part of their Ogg project. Theora is targeted at competing with MPEG-4 video (implemented for example by Xvid and DivX), RealVideo, Windows Media Video, and similar lower-bitrate video compression schemes. In the Ogg multimedia framework, Theora provides a video layer, while Vorbis usually acts as the audio layer (Speex and FLAC can also act as audio layers).

TTXT

MPEG-4 Part 17, or MPEG-4 Timed Text, is the text-based subtitle format for MPEG-4. It is also streamable, which was one of the main goals when the format was created. It is mainly intended for use in the *.mp4 container, but can also be used in the *.3gp container (as 3GPP Timed Text), which is technically almost identical to *.mp4 but more common on cell phones. 3GPP Timed Text is exactly the same as MPEG-4 Timed Text when used in the *.mp4 container. QuickTime Pro and MP4Box can produce this kind of subtitle stream from various subtitle input formats. MP4Box uses the fourcc tx3g for MPEG-4 Timed Text because of its inherently higher compatibility. MPEG-4 Timed Text is heavily based on XML semantics.

VBR

Variable Bit Rate. The same considerations made for ABR apply here. For the LAME encoder (http://www.mp3dev.org/), VBR compression is also called Extreme because it works by computing the actual encoding/quantization error and using more bits to minimize it.

VFW

Stands for “Video For Windows”. It was a multimedia technology developed by Microsoft that allowed Microsoft Windows to play digital video.

VOB

A VOB file (DVD-Video Object) is a file type contained in DVD-Video media. It contains the actual Video, Audio, Subtitle and Menu contents in stream form. VOB files are encoded very much like standard MPEG-2 files. When the extension is renamed from .vob to .mpg or .mpeg the file will still be readable and will continue to hold all information, although most players supporting MPEG-2 don't support subtitle tracks. In order to burn the VOB files to a DVD±R disc, other standard DVD-Video files are needed as well, including IFO and BUP files.

VobSub

VSFilter is a DirectShow filter that is able to rip subtitles from VOB files into a separate format. It can also render several different subtitle formats onto video, either during playback or encoding. It is part of the guliverkli project, and was previously named VobSub.

Vorbis

Vorbis is a fully open, non-proprietary, patent-and-royalty-free, general-purpose compressed audio format for audio and music at fixed and variable bitrates from 16 to 128 kbps/channel. Vorbis has been designed to completely replace all proprietary, patented audio formats. The quality of Vorbis audio is always slightly better than MP3 at the same bitrate, and Ogg Vorbis files are slightly smaller than MP3 files of the same quality. Using a modern Vorbis audio codec, you can achieve better acoustic quality than AAC audio files of similar bitrate at the same or a smaller file size. The home page for the project is http://www.vorbis.com/.

WAV LPCM

See LPCM

WAV PCM

See PCM

WMA

Windows Media Audio (WMA) is a proprietary compressed audio file format developed by Microsoft. Its use is not recommended under any circumstances.