Conventions
General
The mathematical operators and their precedence rules used to describe this Specification are similar to those used in the C programming language. However, the operation of integer division with truncation is specifically defined.
In addition, a length 2 array used to hold a motion vector (indicated
by the variable name ending with the letters Mv
or Mvs
) can be accessed
using either array notation (e.g. Mv[ 0 ]
and Mv[ 1 ]
), or by just
the name (e.g., Mv
). The only operations defined when using the name are
assignment and equality/inequality testing. Assignment of an array is
represented using the notation A = B
and is specified to mean the same
as doing both the individual assignments A[ 0 ] = B[ 0 ]
and
A[ 1 ] = B[ 1 ].
Equality testing of 2 motion vectors is represented using the
notation A == B
and is specified to mean the same as
(A[ 0 ] == B[ 0 ] && A[ 1 ] == B[ 1 ])
. Inequality testing is defined as
A != B
and is specified to mean the same as
(A[ 0 ] != B[ 0 ] || A[ 1 ] != B[ 1 ])
.
When a variable is said to be representable by a signed integer with x
bits,
it means that the variable is greater than or equal to -(1 << (x-1))
, and that
the variable is less than or equal to (1 << (x-1))-1
.
The key words "must", "must not", "required", "shall", "shall not", "should", "should not", "recommended", "may", and "optional" in this document are to be interpreted as described in RFC 2119.
Arithmetic operators
+ | Addition |
– | Subtraction (as a binary operator) or negation (as a unary prefix operator) |
* | Multiplication |
/ | Integer division with truncation of the result toward zero. For example, 7/4 and -7/-4 are truncated to 1 and -7/4 and 7/-4 are truncated to -1 . |
a % b | Remainder from division of a by b . Both a and b are positive integers. |
÷ | Floating point (arithmetical) division. |
ceil(x) | The smallest integer that is greater or equal than x . |
floor(x) | The largest integer that is smaller or equal than x . |
Logical operators
a && b | Logical AND operation between a and b |
a || b | Logical OR operation between a and b |
! | Logical NOT operation. |
Relational operators
> | Greater than |
>= | Greater than or equal to |
< | Less than |
<= | Less than or equal to |
== | Equal to |
!= | Not equal to |
Bitwise operators
& | AND operation |
| | OR operation |
^ | XOR operation |
~ | Negation operation |
a >> b | Shift a in 2's complement binary integer representation format to the right by b bit positions. This operator is only used with b being a non-negative integer. Bits shifted into the MSBs as a result of the right shift have a value equal to the MSB of a prior to the shift operation. |
a << b | Shift a in 2's complement binary integer representation format to the left by b bit positions. This operator is only used with b being a non-negative integer. Bits shifted into the LSBs as a result of the left shift have a value equal to 0 . |
Assignment
= | Assignment operator |
++ | Increment, x++ is equivalent to x = x + 1 . When this operator is used for an array index, the variable value is obtained before the auto increment operation |
- - | Decrement, i.e. x-- is equivalent to x = x - 1 . When this operator is used for an array index, the variable value is obtained before the auto decrement operation |
+= | Addition assignment operator, for example x += 3 corresponds to x = x + 3 |
-= | Subtraction assignment operator, for example x -= 3 corresponds to x = x - 3 |
Mathematical functions
The following mathematical functions (Abs, Clip3, Clip1, Min, Max, Round2 and Round2Signed) are defined as follows:
The definition of Round2 uses standard mathematical power and division operations, not integer operations. An equivalent definition using integer operations is:
Round2( x, n ) {
if ( n == 0 )
return x
return ( x + ( 1 << (n - 1) ) ) >> n
}
The FloorLog2(x) function is defined to be the floor of the base 2 logarithm of the input x.
The input x will always be an integer, and will always be greater than or equal to 1.
This function extracts the location of the most significant bit in x.
An equivalent definition (using the pseudo-code notation introduced in the following section) is:
FloorLog2( x ) {
s = 0
while ( x != 0 ) {
x = x >> 1
s++
}
return s - 1
}
The CeilLog2(x) function is defined to be the ceiling of the base 2 logarithm of the input x (when x is 0, it is defined to return 0).
The input x will always be an integer, and will always be greater than or equal to 0.
This function extracts the number of bits needed to code a value in the range 0 to x-1.
An equivalent definition (using the pseudo-code notation introduced in the following section) is:
CeilLog2( x ) {
if ( x < 2 )
return 0
i = 1
p = 2
while ( p < x ) {
i++
p = p << 1
}
return i
}
Method of describing bitstream syntax
The description style of the syntax is similar to the C programming language. Syntax elements in the bitstream are represented in bold type. Each syntax element is described by its name (using only lower case letters with underscore characters) and a descriptor for its method of coded representation. The decoding process behaves according to the value of the syntax element and to the values of previously decoded syntax elements. When a value of a syntax element is used in the syntax tables or the text, it appears in regular (i.e. not bold) type. If the value of a syntax element is being computed (e.g. being written with a default value instead of being coded in the bitstream), it also appears in regular type (e.g. tile_size_minus_1).
In some cases the syntax tables may use the values of other variables derived from syntax elements values. Such variables appear in the syntax tables, or text, named by a mixture of lower case and upper case letter and without any underscore characters. Variables starting with an upper case letter are derived for the decoding of the current syntax structure and all depending syntax structures. These variables may be used in the decoding process for later syntax structures. Variables starting with a lower case letter are only used within the process from which they are derived. (Single character variables are allowed.)
Constant values appear in all upper case letters with underscore characters (e.g. MI_SIZE).
Constant lookup tables appear as words (with the first letter of each word in upper case, and remaining letters in lower case) separated with underscore characters (e.g. Block_Width[…]).
Hexadecimal notation, indicated by prefixing the hexadecimal number by 0x
,
may be used when the number of bits is an integer multiple of 4. For example,
0x1a
represents a bit string 0001 1010
.
Binary notation is indicated by prefixing the binary number by 0b
. For
example, 0b00011010
represents a bit string 0001 1010
. Binary numbers may
include underscore characters to enhance readability. If present, the
underscore characters appear every 4 binary digits starting from the LSB. For
example, 0b11010
may also be written as 0b1_1010
.
A value equal to 0 represents a FALSE condition in a test statement. The value TRUE is represented by any value not equal to 0.
The following table lists examples of the syntax specification format. When
syntax_element
appears (with bold face font), it specifies that this syntax
element is parsed from the bitstream.
Type | |
/* A statement can be a syntax element with associated | |
descriptor or can be an expression used to specify its | |
existence, type, and value, as in the following | |
examples */ | |
syntax_element | f(1) |
/* A group of statements enclosed in brackets is a | |
compound statement and is treated functionally as a single | |
statement. */ | |
{ | |
statement | |
… | |
} | |
/* A "while" structure specifies that the statement is | |
to be evaluated repeatedly while the condition remains | |
true. */ | |
while ( condition ) | |
statement | |
/* A "do .. while" structure executes the statement once, | |
and then tests the condition. It repeatedly evaluates the | |
statement while the condition remains true. */ | |
do | |
statement | |
while ( condition ) | |
/* An "if .. else" structure tests the condition first. If | |
it is true, the primary statement is evaluated. Otherwise, | |
the alternative statement is evaluated. If the alternative | |
statement is unnecessary to be evaluated, the "else" and | |
corresponding alternative statement can be omitted. */ | |
if ( condition ) | |
primary statement | |
else | |
alternative statement | |
/* A "for" structure evaluates the initial statement at the | |
beginning then tests the condition. If it is true, the primary | |
and subsequent statements are evaluated until the condition | |
becomes false. */ | |
for ( initial statement; condition; subsequent statement ) | |
primary statement | |
/* The return statement in a syntax structure specifies | |
that the parsing of the syntax structure will be terminated | |
without processing any additional information after this stage. | |
When a value immediately follows a return statement, this value | |
shall also be returned as the output of this syntax structure. */ | |
return x |
Functions
Bitstream functions used for syntax description are specified in this section.
Other functions are included in the syntax tables. The convention is that a section is called syntax if it causes syntax elements to be read from the bitstream, either directly or indirectly through subprocesses. The remaining sections are called functions.
The specification of these functions makes use of a bitstream position indicator. This bitstream position indicator locates the position of the bit that is going to be read next.
get_position( ): Return the value of the bitstream position indicator.
init_symbol( sz ): Initialize the arithmetic decode process for the Symbol decoder with a size of sz bytes as specified in Initialization process for symbol decoder.
exit_symbol( ): Exit the arithmetic decode process as described in Exit process for symbol decoder (this includes reading trailing bits).
Descriptors
General
The following descriptors specify the parsing of syntax elements. Lower case descriptors specify syntax elements that are represented by an integer number of bits in the bitstream; upper case descriptors specify syntax elements that are represented by arithmetic coding.
f(n)
Unsigned n-bit number appearing directly in the bitstream. The bits are read from high to low order. The parsing process specified in Parsing process for f(n) is invoked and the syntax element is set equal to the return value.
uvlc()
Variable length unsigned n-bit number appearing directly in the bitstream. The parsing process for this descriptor is specified below:
uvlc() { | Type |
leadingZeros = 0 | |
while ( 1 ) { | |
done | f(1) |
if ( done ) | |
break | |
leadingZeros++ | |
} | |
if ( leadingZeros >= 32 ) { | |
return ( 1 << 32 ) - 1 | |
} | |
value | f(leadingZeros) |
return value + ( 1 << leadingZeros ) - 1 | |
} |
le(n)
Unsigned little-endian n-byte number appearing directly in the bitstream. The parsing process for this descriptor is specified below:
le(n) { | Type |
t = 0 | |
for ( i = 0; i < n; i++) { | |
byte | f(8) |
t += ( byte << ( i * 8 ) ) | |
} | |
return t | |
} |
Note: This syntax element will only be present when the bitstream position is byte aligned.
leb128()
Unsigned integer represented by a variable number of little-endian bytes.
Note: This syntax element will only be present when the bitstream position is byte aligned.
In this encoding, the most significant bit of each byte is equal to 1 to signal that more bytes should be read, or equal to 0 to signal the end of the encoding.
A variable Leb128Bytes is set equal to the number of bytes read during this process.
The parsing process for this descriptor is specified below:
leb128() { | Type |
value = 0 | |
Leb128Bytes = 0 | |
for ( i = 0; i < 8; i++ ) { | |
leb128_byte | f(8) |
value |= ( (leb128_byte & 0x7f) << (i*7) ) | |
Leb128Bytes += 1 | |
if ( !(leb128_byte & 0x80) ) { | |
break | |
} | |
} | |
return value | |
} |
It is a requirement of bitstream conformance that the value returned from the leb128 parsing process is less than or equal to (1 << 32) - 1.
leb128_byte contains 8 bits read from the bitstream. The bottom 7 bits are used to compute the variable value. The most significant bit is used to indicate that there are more bytes to be read.
It is a requirement of bitstream conformance that the most significant bit of leb128_byte is equal to 0 if i is equal to 7. (This ensures that this syntax descriptor never uses more than 8 bytes.)
Note: There are multiple ways of encoding the same value depending on how many leading zero bits are encoded. There is no requirement that this syntax descriptor uses the most compressed representation. This can be useful for encoder implementations by allowing a fixed amount of space to be filled in later when the value becomes known.
su(n)
Signed integer converted from an n bits unsigned integer in the bitstream. (The unsigned integer corresponds to the bottom n bits of the signed integer.) The parsing process for this descriptor is specified below:
su(n) { | Type |
value | f(n) |
signMask = 1 << (n - 1) | |
if ( value & signMask ) | |
value = value - 2 * signMask | |
return value | |
} |
ns(n)
Unsigned encoded integer with maximum number of values n (i.e. output in range 0..n-1).
This descriptor is similar to f(CeilLog2(n)), but reduces wastage incurred when encoding non-power of two value ranges by encoding 1 fewer bits for the lower part of the value range. For example, when n is equal to 5, the encodings are as follows (full binary encodings are also presented for comparison):
Value | Full binary encoding | ns(n) encoding |
---|---|---|
0 | 000 | 00 |
1 | 001 | 01 |
2 | 010 | 10 |
3 | 011 | 110 |
4 | 100 | 111 |
The parsing process for this descriptor is specified as:
ns( n ) { | Type |
w = FloorLog2(n) + 1 | |
m = (1 << w) - n | |
v | f(w - 1) |
if ( v < m ) | |
return v | |
extra_bit | f(1) |
return (v << 1) - m + extra_bit | |
} |
The abbreviation ns stands for non-symmetric. This encoding is non-symmetric because the values are not all coded with the same number of bits.
L(n)
Unsigned arithmetic encoded n-bit number encoded as n flags (a "literal"). The flags are read from high to low order. The syntax element is set equal to the return value of read_literal( n ) (see Parsing process for read_literal for a specification of this process).
S()
An arithmetic encoded symbol coded from a small alphabet of at most 16 entries.
The symbol is decoded based on a context sensitive CDF (see Parsing process for CDF encoded syntax elements for the specification of this process).
NS(n)
Unsigned arithmetic encoded integer with maximum number of values n (i.e. output in range 0..n-1).
This descriptor is the same as ns(n) except the underlying bits are coded arithmetically.
The parsing process for this descriptor is specified as:
NS( n ) { | Type |
w = FloorLog2(n) + 1 | |
m = (1 << w) - n | |
v | L(w - 1) |
if ( v < m ) | |
return v | |
extra_bit | L(1) |
return (v << 1) - m + extra_bit | |
} |