US10911773B2 - Motion vector difference coding and decoding - Google Patents

Info

Publication number
US10911773B2
Authority
US
United States
Prior art keywords
mvd
mvd component
motion vector
component
xor
Prior art date
Legal status
Active, expires
Application number
US16/097,344
Other versions
US20190141346A1 (en)
Inventor
Per Wennersten
Ruoyang Yu
Current Assignee
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB
Priority to US16/097,344
Assigned to TELEFONAKTIEBOLAGET LM ERICSSON (PUBL). Assignors: WENNERSTEN, PER; YU, Ruoyang
Publication of US20190141346A1
Application granted
Publication of US10911773B2
Legal status: Active
Adjusted expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors
    • H04N19/517: Processing of motion vectors by encoding
    • H04N19/52: Processing of motion vectors by encoding by predictive encoding
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46: Embedding additional information in the video signal during the compression process
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46: Embedding additional information in the video signal during the compression process
    • H04N19/467: Embedding additional information in the video signal during the compression process characterised by the embedded information being invisible, e.g. watermarking

Definitions

  • A method, performed by a video decoder, for reconstructing a motion vector from a motion vector prediction and a motion vector difference is provided, as illustrated in FIG. 5.
  • The MVD comprises a first MVD component y and a second MVD component x.
  • The method comprises a step S4 of receiving information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. This ensures a correct reconstruction of the second MVD component x.
  • The received information may be signaled by using a sequence parameter set, picture parameter set or video parameter set.
  • The video encoder and the video decoder may also exchange the details of the reconstruction process in advance.
  • The method comprises a step S5 of decoding the first MVD component y.
  • The method further comprises a step S6 of decoding the representation {circumflex over (x)} of the second MVD component x.
  • The representation {circumflex over (x)} may be a reduced-size version of the second MVD component x and may, for example, be obtained by removing at least one bit of the second MVD component x.
  • The representation may contain all but the least significant bit of the second MVD component, in which case the parity of the second MVD component is removed, or all but the most significant bit of the second MVD component, in which case the sign of the MVD component is removed and the representation corresponds to the magnitude of the second MVD component.
  • The method further comprises a step S7 of reconstructing the second MVD component x in accordance with the received information.
  • In one decoder embodiment, the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x. The reconstruction process in this embodiment is identical to the one described for the first encoder embodiment; for more details, please refer to the corresponding section of the detailed description.
  • In further decoder embodiments, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x); the second MVD component x is then reconstructed exactly as in the corresponding (second, third and fourth) encoder embodiments.
  • FIG. 6 is a schematic block diagram of a video encoder 100, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction and a motion vector difference between the motion vector and the MVP.
  • The MVD comprises a first MVD component y and a second MVD component x.
  • The video encoder 100 comprises, according to this aspect, an encoding unit 160, configured to encode the first MVD component y and to encode a representation {circumflex over (x)} of the second MVD component x.
  • The video encoder 100 comprises, according to this aspect, a sending unit 170, configured to send information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
  • The encoding 160 and sending 170 units may be hardware based, software based (in this case they are called encoding and sending modules respectively) or may be a combination of hardware and software.
  • The encoder 100 may be an HEVC encoder or any other state-of-the-art or future video encoder.
  • The video encoder 100 can be implemented in hardware, in software or a combination of hardware and software.
  • The video encoder 100 can be implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
  • The video encoder 100 may also be implemented in a network device in the form of, or connected to, a network node, such as a radio base station, in a communication network or system.
  • Although the respective units disclosed in conjunction with FIG. 6 have been disclosed as physically separate units in the device, where all may be special purpose circuits, such as ASICs (Application Specific Integrated Circuits), alternative embodiments of the device are possible where some or all of the units are implemented as computer program modules running on a general purpose processor. Such an embodiment is disclosed in FIG. 7.
  • FIG. 7 schematically illustrates an embodiment of a computer 150 having a processing unit 110 such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit).
  • The processing unit 110 can be a single unit or a plurality of units for performing different steps of the method described herein.
  • The computer also comprises an input/output (I/O) unit 120 for receiving a video sequence.
  • The I/O unit 120 has been illustrated as a single unit in FIG. 7 but can likewise be in the form of a separate input unit and a separate output unit.
  • The computer 150 comprises at least one computer program product 130 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive.
  • The computer program product 130 comprises a computer program 140, which comprises code means which, when run on the computer 150, such as by the processing unit 110, causes the computer 150 to perform the steps of the method described in the foregoing in connection with FIG. 2.
  • FIG. 8 is a schematic block diagram of a video decoder 200 for reconstructing a motion vector from a motion vector prediction and a motion vector difference, wherein the motion vector difference comprises a first MVD component y and a second MVD component x.
  • The video decoder 200 comprises, according to this aspect, a receiving unit 260, configured to receive information from a video encoder 100 on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
  • The video decoder 200 comprises, according to this aspect, a decoding unit 270, configured to decode the first MVD component y and to decode the representation {circumflex over (x)} of the second MVD component x.
  • The video decoder comprises, according to this aspect, a reconstructing unit 280, configured to reconstruct the second MVD component x in accordance with the received information.
  • The receiving 260, decoding 270 and reconstructing 280 units may be hardware based, software based (in this case they are called receiving, decoding and reconstructing modules respectively) or may be a combination of hardware and software.
  • The video decoder 200 may be an HEVC decoder or any other state-of-the-art or future video decoder.
  • The video decoder 200 can be implemented in hardware, in software or a combination of hardware and software.
  • The video decoder 200 can be implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
  • The video decoder 200 may also be implemented in a network device in the form of, or connected to, a network node, such as a radio base station, in a communication network or system.
  • Although the respective units disclosed in conjunction with FIG. 8 have been disclosed as physically separate units in the device, where all may be special purpose circuits, such as ASICs (Application Specific Integrated Circuits), alternative embodiments of the device are possible where some or all of the units are implemented as computer program modules running on a general purpose processor. Such an embodiment is disclosed in FIG. 9.
  • FIG. 9 schematically illustrates an embodiment of a computer 250 having a processing unit 210 such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit).
  • The processing unit 210 can be a single unit or a plurality of units for performing different steps of the method described herein.
  • The computer also comprises an input/output (I/O) unit 220 for receiving a video bitstream.
  • The I/O unit 220 has been illustrated as a single unit in FIG. 9 but can likewise be in the form of a separate input unit and a separate output unit.
  • The computer 250 comprises at least one computer program product 230 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive.
  • The computer program product 230 comprises a computer program 240, which comprises code means which, when run on the computer 250, such as by the processing unit 210, causes the computer 250 to perform the steps of the method described in the foregoing in connection with FIG. 5.

Abstract

There are provided mechanisms for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP. The MVD comprises a first MVD component y and a second MVD component x. The method comprises encoding the first MVD component y. The method comprises encoding a representation {circumflex over (x)} of the second MVD component x. The method comprises sending information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. There are also provided mechanisms for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD). The MVD comprises a first MVD component y and a second MVD component x. The method comprises receiving information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. The method comprises decoding the first MVD component y. The method comprises decoding the representation {circumflex over (x)} of the second MVD component x. The method comprises reconstructing the second MVD component x in accordance with the received information.

Description

TECHNICAL FIELD
Embodiments herein relate to the field of video coding, such as High Efficiency Video Coding (HEVC) or the like. In particular, embodiments herein relate to a method and a video encoder for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP. Embodiments herein relate to a method and a video decoder for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD). Corresponding computer programs therefor are also disclosed.
BACKGROUND
High Efficiency Video Coding (HEVC), also known as H.265, is a block-based hybrid video codec, standardized by the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T) and the Moving Picture Experts Group (MPEG), that utilizes both temporal and spatial prediction.
Pictures in a video sequence are divided into small square units as bases for encoding. In HEVC, these units are referred to as coding units (CUs), which can range in size from 8×8 to 64×64. Each unit can be further partitioned into smaller blocks for spatial or temporal prediction. Up to 4 symmetric and 4 asymmetric partition structures are supported to better capture the characteristics of different video content.
Spatial prediction, which is also called intra prediction, is achieved by predicting the current block using previously decoded blocks within the same picture. A picture consisting of only intra-predicted blocks is referred to as an I-picture. As I-pictures do not have any dependency on any other pictures, they provide starting points for encoding or decoding. Therefore, the first picture inside a video sequence is typically encoded as an I-picture.
Temporal prediction, also known as inter prediction, is achieved by predicting blocks in a current picture using blocks from previously decoded pictures, the so-called reference pictures, along with motion vectors indicating the movement between the pictures. Temporal prediction is achieved using inter (P) prediction, i.e., prediction from one block in one reference picture, or bi-directional inter (B) prediction, i.e., prediction from a blending of two blocks in one or two reference pictures. A picture containing only P-predicted blocks is referred to as a P-picture. A picture that contains at least one bi-predicted block is referred to as a B-picture.
When encoding a picture, it is possible to have several reference pictures. These reference pictures are grouped into two lists. The reference pictures that are displayed before the current picture are usually grouped into list 0, while the reference pictures that are displayed after the current picture are usually grouped into list 1. To signal the use of a specific reference picture for the current block, a reference picture list flag and a reference picture index need to be encoded. The reference picture list flag specifies which reference picture list to use, while the reference picture index specifies the targeted reference picture's index inside the list.
The position of a referenced block is indicated using a motion vector (MV). Each MV consists of x and y components that signal the displacement between the current block and the referenced block. The MV may point to full-pel (full-pixel), half-pel (half-pixel) and quarter-pel (quarter-pixel) samples. Half-pel and quarter-pel samples are generated by interpolating their corresponding neighboring full-pel samples. For an encoded video bitstream, MVs can constitute up to about 50% of the total bitrate.
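As an aside on the quarter-pel addressing just described, the following minimal sketch (in C, assuming MV components are stored in quarter-pel units as in HEVC) shows how a component splits into a full-pel position and a sub-pel phase; the function name is illustrative only.

```c
#include <stdio.h>

/* Split a motion vector component stored in quarter-pel units into its
 * full-pel part and its fractional (quarter-pel) phase.  On the usual
 * two's-complement platforms the arithmetic right shift keeps the full-pel
 * part correct for negative components as well. */
static void split_quarter_pel(int mv_comp, int *full_pel, int *frac_pel)
{
    *full_pel = mv_comp >> 2;   /* integer sample position                  */
    *frac_pel = mv_comp & 3;    /* 0 = full-pel, 1..3 = sub-pel phase       */
}

int main(void)
{
    int full, frac;
    split_quarter_pel(-7, &full, &frac);      /* -7 quarter-pel = -1.75 px  */
    printf("full=%d frac=%d\n", full, frac);  /* prints "full=-2 frac=1"    */
    return 0;
}
```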
Since a moving object in a picture may span several blocks, the MV of the current block is usually highly correlated with neighboring blocks' MVs. Thus, one way of improving the compression efficiency is to predict the current MV from neighboring MVs. A common motion vector prediction process begins by constructing a list of neighboring MVs, referred to as the MVP list, selecting one of them as the motion vector predictor (MVP) and then calculating the motion vector difference (MVD) between the predictor and the current MV. Therefore, instead of encoding a “full” MV, only the MVP index, which gives the position of the selected MVP inside the MVP list, and the corresponding MVD are encoded. FIG. 1 shows an example of motion vector prediction. The current MV (−2,4) is to be predicted from a list of two MVP candidates: an MV with index 0 and having a value (24,9) and an MV with index 1 and having a value (−4,7). As the current MV is closer to the MV with index 1, the MV with index 1 is chosen as the MVP candidate. What is encoded is the MVP index 1 and the MVD (2, −3), i.e., the difference between the current MV (−2, 4) and the MVP candidate (−4,7). As shown in the figure, the MVD has x and y components. Each of them consists of two parts: a magnitude part and a sign part. These two parts are usually encoded separately.
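The worked example above can be traced in code. The sketch below picks the MVP and forms the MVD (2, −3); the candidate-selection criterion, a sum of absolute differences, is an assumption made for illustration, as the text only states that the closer candidate is chosen.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct { int x, y; } MV;

/* Pick the MVP candidate closest to the current MV; the sum of absolute
 * differences is used here purely as an illustrative distance measure. */
static int select_mvp(const MV *cand, int num_cand, MV cur)
{
    int best = 0;
    int best_cost = abs(cur.x - cand[0].x) + abs(cur.y - cand[0].y);
    for (int i = 1; i < num_cand; i++) {
        int cost = abs(cur.x - cand[i].x) + abs(cur.y - cand[i].y);
        if (cost < best_cost) { best_cost = cost; best = i; }
    }
    return best;
}

int main(void)
{
    MV cand[2] = { { 24, 9 }, { -4, 7 } };  /* MVP list from the example     */
    MV cur     = { -2, 4 };                 /* current MV                    */
    int idx    = select_mvp(cand, 2, cur);  /* chooses index 1               */
    MV mvd     = { cur.x - cand[idx].x, cur.y - cand[idx].y };  /* (2,-3)    */
    printf("MVP index %d, MVD (%d,%d)\n", idx, mvd.x, mvd.y);
    return 0;
}
```

Only the index 1 and the pair (2, −3) then need to be entropy coded, rather than the full MV.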
SUMMARY
The problem with motion vector prediction is that there still exists redundancy in motion vector difference encoding, which leaves room for further improving compression efficiency. In order to reduce the bitrate or, equivalently, increase coding efficiency, a method and arrangements for encoding and decoding of motion vectors are provided.
A first aspect of the embodiments defines a method performed by a video encoder, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP. The MVD comprises a first MVD component y and a second MVD component x. The method comprises encoding the first MVD component y. The method comprises encoding a representation {circumflex over (x)} of the second MVD component x. The method comprises sending information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
A second aspect of the embodiments defines a video encoder, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP. The MVD comprises a first MVD component y and a second MVD component x. The video encoder comprises processing means operative to encode the first MVD component y. The video encoder comprises processing means operative to encode a representation {circumflex over (x)} of the second MVD component x. The video encoder comprises processing means operative to send information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
A third aspect of the embodiments defines a computer program, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP. The MVD comprises a first MVD component y and a second MVD component x. The computer program comprises code means which, when run on a computer, causes the computer to encode the first MVD component y. The computer program comprises code means which, when run on a computer, causes the computer to encode a representation {circumflex over (x)} of the second MVD component x. The computer program comprises code means which, when run on a computer, causes the computer to send information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
A fourth aspect of the embodiments defines a computer program product comprising computer readable means and a computer program according to the third aspect, stored on the computer readable means.
A fifth aspect of the embodiments defines a method, performed by a video decoder, for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD). The MVD comprises a first MVD component y and a second MVD component x. The method comprises receiving information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. The method comprises decoding the first MVD component y. The method comprises decoding the representation {circumflex over (x)} of the second MVD component x. The method comprises reconstructing the second MVD component x in accordance with the received information.
A sixth aspect of the embodiments defines a video decoder, for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD). The MVD comprises a first MVD component y and a second MVD component x. The video decoder comprises processing means operative to receive information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. The video decoder comprises processing means operative to decode the first MVD component y. The video decoder comprises processing means operative to decode the representation {circumflex over (x)} of the second MVD component x. The video decoder comprises processing means operative to reconstruct the second MVD component x in accordance with the received information.
A seventh aspect of the embodiments defines a computer program, for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD). The MVD comprises a first MVD component y and a second MVD component x. The computer program comprises code means which, when run on a computer, causes the computer to receive information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
The computer program comprises code means which, when run on a computer, causes the computer to decode the first MVD component y. The computer program comprises code means which, when run on a computer, causes the computer to decode the representation {circumflex over (x)} of the second MVD component x. The computer program comprises code means which, when run on a computer, causes the computer to reconstruct the second MVD component x in accordance with the received information.
An eighth aspect of the embodiments defines a computer program product comprising computer readable means and a computer program according to the seventh aspect, stored on the computer readable means.
Advantageously, at least some of the embodiments provide a reduced bitrate, without introducing noticeable decoding complexity.
It is to be noted that any feature of the first, second, third, fourth, fifth, sixth, seventh and eighth aspects may be applied to any other aspect, whenever appropriate. Likewise, any advantage of the first aspect may equally apply to the second, third, fourth, fifth, sixth, seventh and eighth aspect respectively, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims and from the drawings.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to “a/an/the element, apparatus, component, means, step, etc.” are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention, together with further objects and advantages thereof, may best be understood by referring to the following description taken together with the accompanying drawings, in which:
FIG. 1 illustrates an example of motion vector prediction according to the prior art.
FIG. 2 illustrates the steps performed in an encoding method according to the embodiments of the present invention.
FIGS. 3 (A) and (B) illustrate examples of motion vector search patterns according to an embodiment of the present invention.
FIG. 4 illustrates a motion vector search pattern according to an embodiment of the present invention.
FIG. 5 illustrates the steps performed in a decoding method according to the embodiments of the present invention.
FIG. 6 depicts a schematic block diagram illustrating functional units of a video encoder for encoding a motion vector according to embodiments of the present invention.
FIG. 7 illustrates a schematic block diagram illustrating a computer comprising a computer program product with a computer program for encoding a motion vector, according to embodiments of the present invention.
FIG. 8 depicts a schematic block diagram illustrating functional units of a video decoder for reconstructing a motion vector, according to an embodiment of the present invention.
FIG. 9 illustrates a schematic block diagram illustrating a computer comprising a computer program product with a computer program for reconstructing a motion vector, according to an embodiment of the present invention.
DETAILED DESCRIPTION
The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the art to make and use the invention.
Even though the description of the invention is based on the HEVC codec, it is to be understood by a person skilled in the art that the invention could be applied to any other state-of-the-art or future block-based video coding standard.
The present embodiments generally relate to a method and an encoder for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction and a motion vector difference between the motion vector and the MVP, as well as a method and a decoder for reconstructing a motion vector from a motion vector prediction and a motion vector difference.
According to one aspect, a method, performed by a video encoder, for encoding a motion vector is provided, as described in FIG. 2. The motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP. The MVD comprises a first MVD component y and a second MVD component x. The method comprises a step S1 of encoding the first MVD component y. The first MVD component y may, for example, be encoded as is currently done in HEVC.
The method comprises a step S2 of encoding a representation {circumflex over (x)} of the second MVD component x. The representation {circumflex over (x)} may be a reduced size version of the second MVD component x, i.e., may carry a part of the second MVD component x. For example, it may be an incomplete version of x, obtained by removing at least one bit of the second MVD component x. Typically, the representation may contain all but the least significant bit of the second MVD component, in which case the parity of the second MVD component is removed, or all but the most significant bit of the second MVD component, in which case the sign of the MVD component is removed and the representation corresponds to the magnitude of the second MVD component. More details about the representation {circumflex over (x)} will be given below.
The method comprises a step S3 of sending information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. This in principle means that the representation {circumflex over (x)} is created in such a way that the second MVD component x can be reconstructed from {circumflex over (x)} and at least one of: the first MVD component y and syntax elements such as MVP index, reference picture index, reference picture list flag, block partition size, block partition type etc. Sending this information (i.e., signaling information) ensures identical reconstruction at the video decoder. In this way, the video decoder is informed about how the reconstruction of the second MVD component, and accordingly of the MVD, needs to be performed, i.e., which parts of the MVD components and syntax elements are used and what operation needs to be performed on them.
The semantics of the aforementioned syntax elements are as follows: the MVP index specifies the MVP candidate index in an MVP list; the reference picture list flag specifies which reference picture list to use, i.e., list 0 or list 1; the reference picture index specifies the index of the targeted reference picture inside the reference picture list; the block partition size specifies the partitioning size and the block partition type specifies the partitioning structure. As can be seen, these elements carry various prediction information for an inter-coded block. This gives the video encoder additional freedom in deciding the best way to represent {circumflex over (x)} by selectively using these syntax elements. As will be further seen below, an identical reconstruction needs to be performed at both the video encoder and the video decoder in order to correctly reconstruct the second MVD component x.
Sending (signaling) the information may be done by, but is not limited to, using a sequence parameter set (SPS), picture parameter set (PPS) or video parameter set (VPS). The information may also be sent “offline”, i.e., the video encoder and the video decoder may exchange the details of the reconstruction process in advance.
According to the first embodiment of the present invention, the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x. The sign of the second MVD component x is reconstructed as:
Sign(x)=(Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01,
wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation. In this expression, Magnitude(x) and Magnitude(y) respectively refer to the magnitudes of the second and first MVD components, and PartSize and RefIdx refer to the block partition size and the reference picture index, as described above. ‘AND 0x01’ in the expression above means that only the last bit of the operation (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) is taken as Sign(x). Sign(x) may also be referred to as the most significant bit of x, or MSB(x).
According to this embodiment, the sign of the second MVD component x is not encoded, i.e., it is not sent to the video decoder. Instead, Sign(x) is reconstructed at the decoder from the sent information, which tells the decoder to use the expression above. Not encoding a bit of the second MVD component, while knowing how to reconstruct it, results in bitrate savings. This also implies that the reconstruction process of the second MVD component, according to this embodiment, needs to be identical at the video encoder and the video decoder side, as mentioned above.
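A minimal sketch of this first embodiment follows, under the assumptions that a derived bit of 1 denotes a negative sign and that the encoder restricts itself to MVDs whose actual sign matches the derived one (the text does not spell out the encoder-side enforcement); the helper names are illustrative.

```c
/* First embodiment: Sign(x) is not coded but derived from values that both
 * sides already know.  The same derivation must be used at the encoder and
 * the decoder. */
static unsigned derive_sign_x(unsigned mag_x, unsigned mag_y,
                              unsigned part_size, unsigned ref_idx)
{
    /* Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01 */
    return (mag_x ^ mag_y ^ part_size ^ ref_idx) & 0x01u;
}

/* Decoder side: the bitstream carries only Magnitude(x); a derived bit of 1
 * is assumed to mean a negative component. */
static int reconstruct_x_sign(unsigned mag_x, unsigned mag_y,
                              unsigned part_size, unsigned ref_idx)
{
    unsigned sign = derive_sign_x(mag_x, mag_y, part_size, ref_idx);
    return sign ? -(int)mag_x : (int)mag_x;
}
```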
According to the second embodiment of the present invention, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x). The least significant bit of the second MVD component x is, according to this embodiment, embedded (contained) in the parity information (i.e. the least significant bit) of the first MVD component y, LSB(y), and the MVP index:
LSB(x)=(LSB(y) XOR MVPindex) AND 0x01.
In this case, the second MVD component x is reconstructed as:
x=2{circumflex over (x)}+((LSB(y) XOR MVPindex) AND 0x01).
As above, ‘AND 0x01’ in the expression means that only the last bit of the operation LSB(y) XOR MVPindex is taken as LSB(x). The same operation is performed at the video encoder and the video decoder, so that the least significant bit of the second MVD component x does not need to be sent.
This embodiment may also be generalized such that a bit other than the LSB is embedded in a corresponding bit from the first MVD component.
According to the third embodiment of the present invention, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x). However, in this case, the least significant bit of the second MVD component is obtained by combining multiple syntax elements:
LSB(x) = (LSB(y) XOR MVPindex XOR RefIdx XOR PartSize) AND 0x01.
In this case the second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex XOR RefIdx XOR PartSize) AND 0x01).
As in the second embodiment above, this embodiment may also be generalized such that a bit other than the LSB is embedded in a corresponding bit from the first MVD component.
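To make the second and third embodiments concrete, the following Python sketch reconstructs x from the transmitted representation {circumflex over (x)} and the parity of the chosen syntax elements; the function name and the list-based interface are illustrative assumptions rather than part of the described method.

    def reconstruct_x_from_parity(x_hat, lsb_sources):
        """Restore the second MVD component x when its LSB is hidden.

        x_hat carries all bits of x except the least significant one. The hidden
        LSB is recovered by XOR-ing the listed syntax element values, e.g.
        [LSB(y), MVPindex] for the second embodiment or
        [LSB(y), MVPindex, RefIdx, PartSize] for the third embodiment,
        and keeping only the last bit ('AND 0x01').
        """
        hidden_lsb = 0
        for value in lsb_sources:
            hidden_lsb ^= value
        return 2 * x_hat + (hidden_lsb & 0x01)

On the encoder side, {circumflex over (x)} would presumably be formed by dropping the least significant bit of x, with the motion search constrained so that the XOR of the chosen elements actually equals the parity of x.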
According to the fourth embodiment of the present invention, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), which is instead embedded in the parity information of the first MVD component y:
LSB(x)=LSB(y).
For example, if the MVD is (4, −8), i.e. x=4 and y=−8, then the least significant bit of the second MVD component, which is 0, can be derived from the parity of the first MVD component. Thus, only (2, −8) will be encoded.
According to this embodiment, the second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + LSB(y).
Setting the condition LSB(x)=LSB(y) may imply that, on the video encoder side, a motion vector search pattern as shown in FIG. 3(A) may be applied when searching for the best motion vector around a certain MVP. The circles correspond to allowed MV positions, i.e. positions for which LSB(x)=LSB(y), while crosses indicate disallowed positions, i.e. LSB(x)≠LSB(y). Using this search pattern ensures that the condition LSB(x)=LSB(y) holds, so that the second MVD component x can always be reconstructed as above.
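A minimal sketch of the allowed-position check of FIG. 3(A) and the matching encode/decode pair for this embodiment is given below; the function names are illustrative assumptions.

    def is_allowed_position(x, y):
        """True for the circle positions of FIG. 3(A), i.e. LSB(x) equals LSB(y)."""
        return (x & 0x01) == (y & 0x01)


    def encode_x(x, y):
        """Form the representation x_hat by dropping the LSB of x.

        Assumes the motion search only proposed allowed positions, so that
        LSB(x) equals LSB(y) and the dropped bit can be recovered from y.
        """
        assert is_allowed_position(x, y)
        return (x - (y & 0x01)) // 2


    def decode_x(x_hat, y):
        """Reconstruct the second MVD component as 2*x_hat + LSB(y)."""
        return 2 * x_hat + (y & 0x01)

For the example above, encode_x(4, -8) yields {circumflex over (x)}=2, and decode_x(2, -8) restores x=4.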
Another alternative is to embed the second MVD component's LSB using the inverse parity of the first MVD component y. In this case, a motion search pattern as shown in FIG. 3(B) may be applied on the encoder side.
As above, this embodiment may also be generalized such that a bit other than the LSB is embedded in a corresponding bit from another motion vector component.
According to another version of this embodiment, when reconstructing, instead of always multiplying the second MVD component's magnitude by two and adding the LSB of the first MVD component's magnitude, the procedure depends on the value of the second MVD component. For example, when this value is 0, no modification is made. When the value is smaller than 0, it is multiplied by 2 and the LSB of the first component's magnitude is added. When it is larger than 0, it is multiplied by 2, and the inverse of the LSB of the y component's magnitude is subtracted. In this case, a motion vector search pattern as shown in FIG. 4 may be applied by the video encoder. The benefit of this is that the commonly used motion vectors (0,0), (0,1), (1,0), (0,−1) and (−1,0) can all be used and reconstructed, which leads to slightly better overall compression.
As above, this embodiment may also be generalized such that a bit other than the LSB is embedded in a corresponding bit from another MVD component.
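The value-dependent rules above can be transcribed literally into the following Python sketch; the exact mapping of each branch is presumably fixed by FIG. 4, so this should be read as an assumption rather than a definitive implementation.

    def reconstruct_x_value_dependent(x_hat, lsb_abs_y):
        """Value-dependent reconstruction of the second MVD component (FIG. 4 variant).

        x_hat is the decoded value of the second component's representation and
        lsb_abs_y is the least significant bit of the first MVD component's magnitude.
        """
        if x_hat == 0:
            return 0                               # value 0: no modification
        if x_hat < 0:
            return 2 * x_hat + lsb_abs_y           # multiply by 2, add LSB(|y|)
        return 2 * x_hat - (1 - lsb_abs_y)         # multiply by 2, subtract the inverse of LSB(|y|)

The encoder's search pattern of FIG. 4 would then admit exactly those MVD positions that are reachable under this mapping.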
According to another aspect, a method, performed by a video decoder, for reconstructing a motion vector from a motion vector prediction and a motion vector difference is provided, as illustrated in FIG. 5. The MVD comprises a first MVD component y and a second MVD component x. The method comprises a step S4 of receiving information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. This ensures a correct reconstruction of the second MVD component x. The received information may be signaled by using a sequence parameter set, picture parameter set or video parameter set, as mentioned above. The video encoder and the video decoder may also exchange the details of the reconstruction process in advance.
The method comprises a step S5 of decoding the first MVD component y. The method further comprises a step S6 of decoding the representation {circumflex over (x)} of the second MVD component x. As described for step S2 above, the representation {circumflex over (x)} may be a reduced-size version of the second MVD component x and may, for example, be obtained by removing at least one bit of the second MVD component x. Typically, the representation may contain all but the least significant bit of the second MVD component, in which case the parity of the second MVD component is removed, or all but the most significant bit of the second MVD component, in which case the sign of the MVD component is removed and the representation corresponds to the magnitude of the second MVD component.
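For completeness, a small sketch of how an encoder might form the two kinds of representation mentioned here is given below; the function name, the mode strings and the convention for negative values are assumptions for illustration only.

    def form_representation(x, mode):
        """Form the reduced-size representation x_hat of the second MVD component x.

        mode "drop_lsb":  keep all but the least significant bit (the parity is removed).
        mode "drop_sign": keep all but the sign, i.e. the magnitude of x.
        """
        if mode == "drop_lsb":
            return x >> 1      # arithmetic shift; one possible convention, also for negative x
        if mode == "drop_sign":
            return abs(x)      # magnitude only; the sign bit is removed
        raise ValueError("unknown representation mode")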
The method further comprises a step S7 of reconstructing the second MVD component x in accordance with the received information.
According to an embodiment, which corresponds to the first encoder embodiment above, the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x, and a sign of the second MVD component x is calculated as:
Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01.
The reconstruction process in this embodiment is identical to the one described in the first encoder embodiment. For more details, please refer to the related section above.
According to another embodiment, which is related to the second encoder embodiment above, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x). The second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex) AND 0x01).
According to yet another embodiment, which is related to the third encoder embodiment above, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x). The second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex XOR RefIdx XOR PartSize) AND 0x01).
According to yet another embodiment, which is related to the fourth encoder embodiment above, the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x). The second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + LSB(y).
FIG. 6 is a schematic block diagram of a video encoder 100, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction and a motion vector difference between the motion vector and the MVP. The MVD comprises a first MVD component y and a second MVD component x. The video encoder 100 comprises, according to this aspect, an encoding unit 160, configured to encode the first MVD component y and to encode a representation {circumflex over (x)} of the second MVD component x. The video encoder 100 comprises, according to this aspect, a sending unit 170, configured to send information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type.
The encoding 160 and sending 170 units may be hardware based, software based (in this case they are called encoding and sending modules respectively) or may be a combination of hardware and software.
The encoder 100 may be an HEVC encoder or any other state of the art or future video encoder.
The sending unit 170 may send information to the video decoder that the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x, and that the sign of the second MVD component x is reconstructed as:
Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01,
wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation.
The sending unit 170 may send information to the video decoder that the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), and that the second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex) AND 0x01).
The sending unit 170 may send information to the video decoder that the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), and that the second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex XOR RefIdx XOR PartSize) AND 0x01).
The sending unit 170 may send information to the video decoder that the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), and that the second MVD component x is reconstructed as:
x = 2{circumflex over (x)} + LSB(y).
The video encoder 100 can be implemented in hardware, in software or a combination of hardware and software. The video encoder 100 can be implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer. The video encoder 100 may also be implemented in a network device in the form of or connected to a network node, such as radio base station, in a communication network or system.
Although the respective units disclosed in conjunction with FIG. 6 have been disclosed as physically separate units in the device, where all may be special purpose circuits, such as ASICs (Application Specific Integrated Circuits), alternative embodiments of the device are possible where some or all of the units are implemented as computer program modules running on a general purpose processor. Such an embodiment is disclosed in FIG. 7.
FIG. 7 schematically illustrates an embodiment of a computer 150 having a processing unit 110 such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit). The processing unit 110 can be a single unit or a plurality of units for performing different steps of the method described herein. The computer also comprises an input/output (I/O) unit 120 for receiving a video sequence. The I/O unit 120 has been illustrated as a single unit in FIG. 7 but can likewise be in the form of a separate input unit and a separate output unit.
Furthermore, the computer 150 comprises at least one computer program product 130 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive. The computer program product 130 comprises a computer program 140, which comprises code means which, when run on the computer 150, such as by the processing unit 110, causes the computer 150 to perform the steps of the method described in the foregoing in connection with FIG. 2.
FIG. 8 is a schematic block diagram of a video decoder 200 for reconstructing a motion vector from a motion vector prediction and a motion vector difference, wherein the motion vector difference comprises a first MVD component y and a second MVD component x. The video decoder 200 comprises, according to this aspect, a receiving unit 260, configured to receive information from a video encoder 100 on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index MVPindex, reference picture index RefIdx, reference picture list flag, block partition size PartSize and block partition type. The video decoder 200 comprises, according to this aspect, a decoding unit 270, configured to decode the first MVD component y and to decode the representation {circumflex over (x)} of the second MVD component x. The video decoder comprises, according to this aspect, a reconstructing unit 280, configured to reconstruct the second MVD component x in accordance with the received information.
The receiving 260, decoding 270 and reconstructing 280 units may be hardware based, software based (in this case they are called receiving, decoding and reconstructing modules respectively) or may be a combination of hardware and software.
The video decoder 200 may be an HEVC decoder or any other state of the art or future video decoder.
The reconstructing unit 280 may, when the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x, reconstruct the sign of the second MVD component x as:
Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01,
wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation.
The reconstructing unit 280 may, when the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), reconstruct the second MVD component x as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex) AND 0x01).
The reconstructing unit 280 may, when the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), reconstruct the second MVD component x as:
x = 2{circumflex over (x)} + ((LSB(y) XOR MVPindex XOR RefIdx XOR PartSize) AND 0x01).
The reconstructing unit 280 may, when the representation {circumflex over (x)} of the second MVD component x comprises all but the least significant bit of the second MVD component x, LSB(x), reconstruct the second MVD component x as:
x = 2{circumflex over (x)} + LSB(y).
The video decoder 200 can be implemented in hardware, in software or a combination of hardware and software. The video decoder 200 can be implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer. The video decoder 200 may also be implemented in a network device in the form of or connected to a network node, such as radio base station, in a communication network or system.
Although the respective units disclosed in conjunction with FIG. 8 have been disclosed as physically separate units in the device, where all may be special purpose circuits, such as ASICs (Application Specific Integrated Circuits), alternative embodiments of the device are possible where some or all of the units are implemented as computer program modules running on a general purpose processor. Such an embodiment is disclosed in FIG. 9.
FIG. 9 schematically illustrates an embodiment of a computer 250 having a processing unit 210 such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit). The processing unit 210 can be a single unit or a plurality of units for performing different steps of the method described herein. The computer also comprises an input/output (I/O) unit 220 for receiving a video bitstream. The I/O unit 220 has been illustrated as a single unit in FIG. 9 but can likewise be in the form of a separate input unit and a separate output unit.
Furthermore, the computer 250 comprises at least one computer program product 230 in the form of a non-volatile memory, for instance an EEPROM (Electrically Erasable Programmable Read-Only Memory), a flash memory or a disk drive. The computer program product 230 comprises a computer program 240, which comprises code means which, when run on the computer 250, such as by the processing unit 210, causes the computer 250 to perform the steps of the method described in the foregoing in connection with FIG. 5.
The embodiments described above are to be understood as a few illustrative examples of the present invention. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the scope of the present invention. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.

Claims (4)

The invention claimed is:
1. A method, performed by a video encoder, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP, wherein the MVD comprises a first MVD component y and a second MVD component x, the method comprising:
encoding the first MVD component y;
encoding a representation {circumflex over (x)} of the second MVD component x;
sending information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index (MVPindex), reference picture index (RefIdx), reference picture list flag, block partition size (PartSize), or block partition type;
wherein the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x; and
wherein the sign of the second MVD component x is reconstructed as:

Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01,
wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation.
2. A method, performed by a video decoder, for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD), wherein the MVD comprises a first MVD component y and a second MVD component x, the method comprising:
receiving information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index (MVPindex), reference picture index (RefIdx), reference picture list flag, block partition size (PartSize), or block partition type;
decoding the first MVD component y;
decoding the representation {circumflex over (x)} of the second MVD component x;
reconstructing the second MVD component x in accordance with the received information;
wherein the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x; and
wherein a sign of the second MVD component x is calculated as:

Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01,
wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation.
3. A video encoder, for encoding a motion vector, wherein the motion vector is represented by a sum of a motion vector prediction (MVP) and a motion vector difference (MVD) between the motion vector and the MVP, wherein the MVD comprises a first MVD component y and a second MVD component x, the video encoder comprising:
processing circuitry; and
memory comprising instructions executable by the processing circuitry whereby the video encoder is operative to:
encode the first MVD component y;
encode a representation {circumflex over (x)} of the second MVD component x;
send information to a video decoder on how to reconstruct the second MVD component x from the representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index (MVPindex), reference picture index (RefIdx), reference picture list flag, block partition size (PartSize), or block partition type;
wherein the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x; and
wherein the sign of the second MVD component x is reconstructed as:

Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01,
wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation.
4. A video decoder, for reconstructing a motion vector from a motion vector prediction (MVP) and a motion vector difference (MVD), wherein the MVD comprises a first MVD component y and a second MVD component x, the video decoder comprising:
processing circuitry;
memory containing instructions executable by the processing circuitry whereby the video decoder is operative to:
receive information from a video encoder on how to reconstruct the second MVD component x from a representation {circumflex over (x)} of the second MVD component x and at least one of: the first MVD component y, MVP index (MVPindex), reference picture index (RefIdx), reference picture list flag, block partition size (PartSize), or block partition type;
decode the first MVD component y;
decode the representation {circumflex over (x)} of the second MVD component x;
reconstruct the second MVD component x in accordance with the received information;
wherein the representation {circumflex over (x)} of the second MVD component x corresponds to a magnitude of the second MVD component x; and
wherein a sign of the second MVD component x is calculated as:
Sign(x) = (Magnitude(x) XOR Magnitude(y) XOR PartSize XOR RefIdx) AND 0x01, wherein XOR is an exclusive OR logical operation, and AND is a logical AND operation.
US16/097,344 2016-05-13 2017-05-12 Motion vector difference coding and decoding Active 2037-07-08 US10911773B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/097,344 US10911773B2 (en) 2016-05-13 2017-05-12 Motion vector difference coding and decoding

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201662335998P 2016-05-13 2016-05-13
US16/097,344 US10911773B2 (en) 2016-05-13 2017-05-12 Motion vector difference coding and decoding
PCT/EP2017/061531 WO2017194773A1 (en) 2016-05-13 2017-05-12 Motion vector difference coding and decoding

Publications (2)

Publication Number Publication Date
US20190141346A1 US20190141346A1 (en) 2019-05-09
US10911773B2 true US10911773B2 (en) 2021-02-02

Family

ID=58701657

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/097,344 Active 2037-07-08 US10911773B2 (en) 2016-05-13 2017-05-12 Motion vector difference coding and decoding

Country Status (2)

Country Link
US (1) US10911773B2 (en)
WO (1) WO2017194773A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3806461A4 (en) 2018-06-07 2022-03-16 Wilus Institute of Standards and Technology Inc. Video signal processing method and apparatus using adaptive motion vector resolution
TWI723430B (en) * 2018-06-19 2021-04-01 大陸商北京字節跳動網絡技術有限公司 Multi-candidates of different precisions
EP3827586A1 (en) 2018-09-19 2021-06-02 Beijing Bytedance Network Technology Co. Ltd. Syntax reuse for affine mode with adaptive motion vector resolution
US10841356B2 (en) * 2018-11-28 2020-11-17 Netflix, Inc. Techniques for encoding a media title while constraining bitrate variations
US10880354B2 (en) 2018-11-28 2020-12-29 Netflix, Inc. Techniques for encoding a media title while constraining quality variations
WO2020181476A1 (en) * 2019-03-11 2020-09-17 华为技术有限公司 Video image prediction method and device
US11153598B2 (en) * 2019-06-04 2021-10-19 Tencent America LLC Method and apparatus for video coding using a subblock-based affine motion model
JP2023553922A (en) * 2021-09-15 2023-12-26 テンセント・アメリカ・エルエルシー Method and apparatus for improved signaling of motion vector differences


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120189055A1 (en) 2011-01-21 2012-07-26 Qualcomm Incorporated Motion vector prediction
EP2618572A1 (en) 2012-01-20 2013-07-24 Research In Motion Limited Multiple sign bit hiding within a transform unit
EP2637409A1 (en) 2012-03-08 2013-09-11 BlackBerry Limited Motion vector sign bit hiding
US20130235936A1 (en) * 2012-03-08 2013-09-12 Research In Motion Limited Motion vector sign bit hiding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Clare, G., "Sign Data Hiding", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 7th Meeting, Geneva, Document: JCTVC-G271, Nov. 21, 2011, pp. 1-9, ISO.
Samuelsson, J., "Motion vector coding optimization", Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC29/WG 11, 3rd Meeting, Geneva, Document JVET-C0068, May 26, 2016, pp. 1-3, ISO.

Also Published As

Publication number Publication date
WO2017194773A1 (en) 2017-11-16
US20190141346A1 (en) 2019-05-09


Legal Events

Date Code Title Description
AS Assignment

Owner name: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL), SWEDEN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WENNERSTEN, PER;YU, RUOYANG;REEL/FRAME:047338/0487

Effective date: 20170515

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE