Terms & Definitions

For the purposes of this document, the following terms and definitions apply:

AC coefficient

Any transform coefficient whose frequency indices are non-zero in at least one dimension.


AltRef

(Alternative reference frame) A frame that can be used in inter coding.

Base layer

The layer with spatial_id and temporal_id values equal to 0.


Bitstream

The sequence of bits generated by encoding a sequence of frames.

Bit string

An ordered string with a limited number of bits. The leftmost bit is the most significant bit (MSB); the rightmost bit is the least significant bit (LSB).
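
The bit ordering described above can be illustrated with a minimal sketch. The helper names `bits_msb_first` and `bit_string_value` are hypothetical and are not taken from this document; the sketch only assumes that bits are taken from each byte starting at the most significant bit.

```python
def bits_msb_first(data: bytes):
    """Yield the bits of `data` one at a time, most significant bit first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def bit_string_value(bits):
    """Interpret a bit string as an unsigned integer (leftmost bit is the MSB)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


first_byte_bits = list(bits_msb_first(b"\xa0"))  # 0xA0 is 1010 0000 in binary
three_bit_value = bit_string_value([1, 0, 1])    # the bit string "101"
```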


Block

A square or rectangular region of samples.

Block scan

A specified serial ordering of quantized coefficients.


Byte

An 8-bit bit string.

Byte alignment

A bit is byte aligned if its position is an integer multiple of eight bits from the position of the first bit in the bitstream.
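
As a rough illustration, the alignment test reduces to a remainder check on the bit position. The function names below are hypothetical, and positions are assumed to be counted in bits from 0 at the first bit of the bitstream.

```python
def is_byte_aligned(bit_position: int) -> bool:
    """Return True if the bit at `bit_position` is byte aligned."""
    return bit_position % 8 == 0


def bits_until_aligned(bit_position: int) -> int:
    """Return how many bits must be skipped to reach the next aligned position."""
    return (-bit_position) % 8
```

For example, under these assumptions position 16 is aligned, while position 13 is three bits short of the next aligned position.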


CDEF

Constrained Directional Enhancement Filter, designed to adaptively filter blocks based on identifying the direction.


CDF

Cumulative distribution function representing the probability times 32768 that a symbol has a value less than or equal to a given level.
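
The scaling by 32768 can be sketched as follows. This is an illustration of the definition only: the function `build_cdf` is hypothetical, it starts from floating-point symbol probabilities (an assumption made for readability), and it does not claim to match how the decoding process stores or updates its arrays.

```python
CDF_SCALE = 32768  # probabilities are expressed as multiples of 1/32768


def build_cdf(probabilities):
    """Return the scaled cumulative distribution for a list of symbol probabilities.

    Entry i is the probability, times 32768, that the symbol value is <= i.
    """
    cdf = []
    cumulative = 0.0
    for p in probabilities:
        cumulative += p
        cdf.append(min(CDF_SCALE, round(cumulative * CDF_SCALE)))
    return cdf


example_cdf = build_cdf([0.5, 0.25, 0.25])
```

The last entry of such a table equals 32768, since every symbol value is less than or equal to the largest level.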


Chroma

A sample value matrix or a single sample value of one of the two color difference signals.

Note: Symbols of chroma are U and V.

Coded frame

The representation of one frame before the decoding process.


Component

One of the three sample value matrices (one luma matrix and two chroma matrices) or its single sample value.

Compound prediction

A type of inter prediction where sample values are computed by blending together predictions from two reference frames (the frames blended can be the same or different).
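
One simple way to picture the blending is an equal-weight average of two prediction blocks. This is only a sketch under that assumption: the function `blend_average` is hypothetical, and the weights actually used by the decoding process are not described in this section.

```python
def blend_average(pred_a, pred_b):
    """Blend two equally sized prediction blocks with equal weights, rounding half up."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pred_a, pred_b)
    ]


blended = blend_average([[100, 50]], [[101, 60]])
```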

DC coefficient

A transform coefficient whose frequency indices are zero in both dimensions.

Decoded frame

The frame reconstructed out of the bitstream by the decoder.


Decoder

One embodiment of the decoding process.

Decoding process

The process that derives decoded frames from syntax elements, including any processing steps used prior to and for the film grain synthesis process.


Dequantization

The process in which transform coefficients are obtained by scaling the quantized coefficients.


Encoder

One embodiment of the encoding process.

Encoding process

A process not specified in this Specification that generates the bitstream that conforms to the description provided in this document.

Enhancement layer

A layer with either spatial_id greater than 0 or temporal_id greater than 0.
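
The base layer and enhancement layer definitions translate directly into two predicates on spatial_id and temporal_id. The function names are hypothetical; only the two identifiers come from this document.

```python
def is_base_layer(spatial_id: int, temporal_id: int) -> bool:
    """Base layer: both identifiers are equal to 0."""
    return spatial_id == 0 and temporal_id == 0


def is_enhancement_layer(spatial_id: int, temporal_id: int) -> bool:
    """Enhancement layer: at least one identifier is greater than 0."""
    return spatial_id > 0 or temporal_id > 0
```

For non-negative identifiers the two predicates are complementary, so every layer is exactly one of the two kinds.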


Flag

A binary variable. Some variables and syntax elements (e.g. obu_extension_flag) are described using the word flag to highlight that the syntax element can only be equal to 0 or equal to 1.


Frame

The representation of video signals in the spatial domain, composed of one luma sample matrix (Y) and two chroma sample matrices (U and V).

Frame context

A set of probabilities used in the decoding process.

Golden frame

A frame that can be used in inter coding. Typically the golden frame is encoded with higher quality and is used as a reference for multiple inter frames.

Inter coding

Coding one block or frame using inter prediction.

Inter frame

A frame compressed by referencing previously decoded frames and which may use intra prediction or inter prediction.

Inter prediction

The process of deriving the prediction value for the current frame using previously decoded frames.

Intra coding

Coding one block or frame using intra prediction.

Intra frame

A frame compressed using only intra prediction, which can be independently decoded.

Intra prediction

The process of deriving the prediction value for the current sample using previously decoded sample values in the same decoded frame.

Inverse transform

The process in which a transform coefficient matrix is transformed into a spatial sample value matrix.

Key frame

An intra frame which resets the decoding process when it is shown.


Layer

A set of tile group OBUs with identical spatial_id and identical temporal_id values.


Level

A defined set of constraints on the values for the syntax elements and variables.

Loop filter

A filtering process applied to the reconstruction intended to reduce the visibility of block edges.


Luma

A sample value matrix or a single sample value representing the monochrome signal related to the primary colors.

Note: The symbol representing luma is Y.

Mode info

Syntax elements sent for a block containing an indication of how a block is to be predicted during the decoding process.

Mode info block

A luma sample value block of size 4x4 or larger and its two corresponding chroma sample value blocks (if present).

Motion vector

A two-dimensional vector used for inter prediction that relates the current frame to the reference frame. Its value provides the coordinate offsets from a location in the current frame to a location in the reference frame.
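
The offset relationship can be sketched as a coordinate addition. The function `reference_location` is hypothetical, and the sketch assumes offsets expressed in whole samples; this section does not state the units or precision of motion vectors.

```python
def reference_location(row: int, col: int, mv_row: int, mv_col: int):
    """Return the location in the reference frame addressed by a motion vector.

    (row, col) is a location in the current frame and (mv_row, mv_col) is the
    motion vector, assumed here to be in whole-sample units.
    """
    return row + mv_row, col + mv_col


location = reference_location(16, 32, -2, 5)
```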


OBU

All structures are packetized in "Open Bitstream Units" (OBUs). Each OBU has a header, which provides identifying information for the contained data (payload).


Parse

The procedure of getting the syntax element from the bitstream.


Prediction

The implementation of the prediction process consisting of either inter or intra prediction.

Prediction process

The process of estimating the decoded sample value or data element using a predictor.

Prediction value

A value, formed by combining previously decoded sample values or data elements, that is used in the decoding process of the next sample value or data element.


Profile

A subset of syntax, semantics and algorithms defined in a part.

Quantization parameter

A variable used for scaling the quantized coefficients in the decoding process.

Quantized coefficient

A transform coefficient before dequantization.

Raster scan

A mapping of a two-dimensional rectangular raster into a one-dimensional raster. The one-dimensional raster starts with the first row of the two-dimensional raster, followed by the second row, the third row, and so on. Each raster row is scanned in left-to-right order.
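
A minimal sketch of this ordering, using hypothetical helper names and a list of rows as the two-dimensional raster:

```python
def raster_scan(matrix):
    """Flatten a list of rows into one list, row by row, left to right."""
    return [value for row in matrix for value in row]


def raster_index(row: int, col: int, width: int) -> int:
    """Return the one-dimensional position of entry (row, col) in a raster `width` wide."""
    return row * width + col


flat = raster_scan([[1, 2, 3], [4, 5, 6]])
```

Under this mapping, the entry at row 1, column 2 of a raster three entries wide lands at position 5 of the one-dimensional raster.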


Reconstruction

The process of adding the decoded residual to the corresponding prediction values.


One of a set of tags, each of which is mapped to a reference frame.

Reference frame

A storage area for a previously decoded frame and associated information.


Reserved

A special syntax element value which may be used to extend this part in the future.


Residual

The differences between the reconstructed samples and the corresponding prediction values.
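
The relationship between residual, prediction values, and reconstruction can be sketched on a flat list of samples. The function names are hypothetical, and the sketch deliberately omits any clipping to the valid sample range, which this section does not describe.

```python
def residual(reconstructed, prediction):
    """Residual: reconstructed samples minus the corresponding prediction values."""
    return [r - p for r, p in zip(reconstructed, prediction)]


def reconstruct(prediction, residual_values):
    """Reconstruction: prediction values plus the decoded residual."""
    return [p + d for p, d in zip(prediction, residual_values)]


prediction = [100, 120, 130]
reconstructed = [103, 118, 130]
differences = residual(reconstructed, prediction)
```

Adding `differences` back to `prediction` with `reconstruct` returns the original reconstructed samples, which is the sense in which the two definitions are inverses of each other.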


Sample

The basic elements that compose the frame.

Sample value

The value of a sample. This is an integer from 0 to 255 (inclusive) for 8-bit frames, from 0 to 1023 (inclusive) for 10-bit frames, and from 0 to 4095 (inclusive) for 12-bit frames.
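
The three ranges above follow a single pattern: the largest sample value is 2 raised to the bit depth, minus 1. A short sketch, with hypothetical function names:

```python
def max_sample_value(bit_depth: int) -> int:
    """Return the largest sample value for 8-bit, 10-bit, or 12-bit frames."""
    if bit_depth not in (8, 10, 12):
        raise ValueError("bit depth must be 8, 10, or 12")
    return (1 << bit_depth) - 1


def is_valid_sample(value: int, bit_depth: int) -> bool:
    """Return True if `value` lies in the inclusive range for the given bit depth."""
    return 0 <= value <= max_sample_value(bit_depth)
```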

Segmentation map

A 3-bit number containing the segment affiliation for each 4x4 block in the image. A segmentation map is stored for each reference frame to allow new frames to use a previously coded map.


Sequence

The highest-level syntax structure of the coded bitstream, including one or several consecutive coded frames.


Superblock

The top level of the block quadtree within a tile. All superblocks within a frame are the same size and are square. The superblocks may be 128x128 luma samples or 64x64 luma samples. A superblock may contain 1, 2, or 4 mode info blocks, or may be bisected in each direction to create 4 sub-blocks, which may themselves be further subpartitioned, forming the block quadtree.

Switch frame

An inter frame that can be used as a point to switch between sequences. Switch frames overwrite all the reference frames without forcing the use of intra coding. The intention is to allow a streaming use case where videos can be encoded in small chunks (say of 1 second duration), each starting with a switch frame. If the available bandwidth drops, the server can start sending chunks from a lower bitrate encoding instead. When this happens the inter prediction uses the existing higher quality reference frames to decode the switch frame. This approach allows a bitrate switch without the cost of a full key frame.

Syntax element

An element of data represented in the bitstream.

Temporal delimiter OBU

An indication that the following OBUs will have a different presentation/decoding time stamp from that of the last frame prior to the temporal delimiter.

Temporal unit

All the OBUs that are associated with a specific, distinct time instant. A temporal unit consists of a temporal delimiter OBU and all the OBUs that follow, up to but not including the next temporal delimiter.
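
The grouping rule can be sketched as a split of an OBU sequence at each temporal delimiter. The string labels and the function `split_temporal_units` are hypothetical stand-ins for real OBUs, used only to show the grouping.

```python
TEMPORAL_DELIMITER = "temporal_delimiter"  # hypothetical label for a temporal delimiter OBU


def split_temporal_units(obu_types):
    """Group a sequence of OBU labels into temporal units.

    Each unit starts with a temporal delimiter and runs up to, but not
    including, the next one. OBUs seen before the first delimiter are
    ignored in this sketch.
    """
    units = []
    for obu_type in obu_types:
        if obu_type == TEMPORAL_DELIMITER:
            units.append([obu_type])
        elif units:
            units[-1].append(obu_type)
    return units


units = split_temporal_units(
    [TEMPORAL_DELIMITER, "sequence_header", "frame", TEMPORAL_DELIMITER, "frame"]
)
```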

Temporal group

A set of frames whose temporal prediction structure is used periodically in a video sequence.


Tile

A rectangular region of the frame that can be decoded and encoded independently, although loop-filtering across tile edges is still applied.

Transform block

A rectangular transform coefficient matrix, used as input to the inverse transform process.

Transform coefficient

A scalar value, considered to be in a frequency domain, contained in a transform block.

Uncompressed header

A high-level description of the frame to be decoded, encoded without the use of arithmetic encoding.