WO2020052653A1 - Decoding method and device for predicted motion information - Google Patents


Info

Publication number
WO2020052653A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion information
candidate
motion vector
identifier
list
Prior art date
Application number
PCT/CN2019/105711
Other languages
French (fr)
Chinese (zh)
Inventor
陈旭 (Chen Xu)
郑建铧 (Zheng Jianhua)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201811264674.7A (CN110896485B)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP19860217.9A (EP3843404A4)
Priority to SG11202102362UA
Priority to BR112021004429-9A (BR112021004429A2)
Priority to KR1020247028818A (KR20240135033A)
Priority to KR1020217010321A (KR102701208B1)
Priority to JP2021513418A (JP7294576B2)
Priority to CA3112289A (CA3112289A1)
Publication of WO2020052653A1
Priority to US17/198,544 (US20210203944A1)
Priority to ZA2021/01890A (ZA202101890B)

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 — Motion estimation or motion compensation
    • H04N19/513 — Processing of motion vectors
    • H04N19/517 — Processing of motion vectors by encoding
    • H04N19/52 — Processing of motion vectors by encoding by predictive encoding

Definitions

  • the present application relates to the technical field of video encoding and decoding, and in particular, to a decoding method and device for predicted motion information.
  • Digital video technology can be widely used in various devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), notebook computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, video streaming devices, and the like.
  • Digital video devices implement video decoding technology to effectively send, receive, encode, decode, and/or store digital video information.
  • Video compression techniques perform spatial (intra-image) prediction and/or temporal (inter-image) prediction to reduce or remove redundant information inherent in a video sequence.
  • the basic principle of video compression is to remove redundancy as much as possible by exploiting the correlations in the spatial domain, in the temporal domain, and between codewords.
  • the current popular approach is to use a block-based hybrid video coding framework to achieve video coding compression through prediction (including intra prediction and inter prediction), transformation, quantization, and entropy coding.
  • Inter prediction exploits the temporal correlation of video by using the pixels of adjacent coded images to predict the pixels of the current image, so as to effectively remove temporal redundancy in the video.
  • the predicted motion information of each image block is determined from the candidate motion information list, thereby generating its prediction block through a motion compensation process.
  • the motion information includes reference image information and motion vectors.
  • the reference picture information includes unidirectional / bidirectional prediction information, a reference picture list, and a reference picture index corresponding to the reference picture list.
  • Motion vectors refer to horizontal and vertical position shifts.
  • the embodiments of the present application provide a decoding method and device for predicting motion information, which can effectively control the length of the candidate motion information list when more candidate motion information is introduced.
  • a first aspect of the embodiments of the present application provides a decoding method for predicted motion information, including: parsing a bitstream to obtain a first identifier; determining a target element from a first candidate set according to the first identifier, where the elements in the first candidate set include at least one piece of first candidate motion information and a plurality of pieces of second candidate motion information, the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset; when the target element is the first candidate motion information, using that first candidate motion information as the target motion information, where the target motion information is used to predict the motion information of an image block to be processed; and when the target element is obtained based on the plurality of second candidate motion information, parsing the bitstream to obtain a second identifier and determining the target motion information from the plurality of second candidate motion information according to the second identifier.
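The two-level selection described above can be sketched as follows. This is an illustrative toy model, not the patent's normative syntax: the data structures, function names, and the quarter-pel-free integer motion vectors are all assumptions made for clarity.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

MV = Tuple[int, int]  # hypothetical motion vector: (horizontal, vertical) shift

@dataclass
class OffsetGroup:
    """Models second candidate motion information: first motion info (base)
    plus a set of preset motion information offsets."""
    base: MV
    offsets: List[MV]

Element = Union[MV, OffsetGroup]

def decode_target_mv(candidates: List[Element], first_id: int,
                     second_id: int = 0) -> MV:
    """The first identifier selects an element of the first candidate set.
    If that element is an offset group, the second identifier selects which
    preset offset is added to the base motion vector; otherwise the element
    is used directly as the target motion information."""
    target = candidates[first_id]
    if isinstance(target, OffsetGroup):
        dx, dy = target.offsets[second_id]
        return (target.base[0] + dx, target.base[1] + dy)
    return target

# Toy first candidate set: one direct candidate and one offset group.
first_candidate_set: List[Element] = [
    (4, -2),
    OffsetGroup(base=(4, -2), offsets=[(1, 0), (-1, 0), (0, 1), (0, -1)]),
]
print(decode_target_mv(first_candidate_set, 0))     # (4, -2)
print(decode_target_mv(first_candidate_set, 1, 2))  # (4, -1)
```

Note how the offset group occupies a single slot of the first candidate set even though it represents four derivable candidates, which is the length-control idea of the multi-layer structure.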
  • the elements in the first candidate set include the first candidate motion information and a plurality of second candidate motion information.
  • With the multi-layer candidate set structure, when more candidates are introduced, a set of candidate motion information can be added as a single element to the first candidate set.
  • the length of the first candidate set is greatly shortened.
  • the first candidate set is a candidate motion information list for inter prediction; even when more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
  • the first identifier may be a category identifier, which is used to indicate a category to which the target element belongs.
  • the method for decoding predicted motion information provided in this embodiment of the present application may further include: parsing the bitstream to obtain a fourth identifier, where the fourth identifier is the index of the target element within the category of the first candidate set indicated by the first identifier.
  • the target element is uniquely determined by combining the fourth identifier with the first identifier.
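A minimal sketch of this two-part addressing; the category contents and names below are hypothetical placeholders, not from the patent:

```python
# Hypothetical: the first candidate set organized by category.
categories = {
    0: ["spatial_A1", "spatial_B1"],          # e.g. spatially derived candidates
    1: ["offset_group_0", "offset_group_1"],  # e.g. offset-based candidates
}

def select_target(first_id: int, fourth_id: int) -> str:
    """The first identifier names a category; the fourth identifier is the
    index of the target element within that category. Together they
    uniquely determine the target element."""
    return categories[first_id][fourth_id]

print(select_target(1, 0))  # offset_group_0
```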
  • the first candidate motion information includes motion information of spatially adjacent image blocks of the image block to be processed.
  • the first candidate motion information may be candidate motion information generated by a Merge mode.
  • the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
  • determining the target motion information based on one of the plurality of second candidate motion information according to the second identifier includes: determining a target offset from the plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
  • the codeword used to identify the first motion information is the shortest.
  • the method for decoding predicted motion information provided in the present application may further include: parsing the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
  • before determining the target motion information based on one of the plurality of second candidate motion information according to the second identifier, the method further includes: multiplying the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
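The coefficient adjustment above amounts to a uniform scaling of the preset offsets, which can be sketched in one line (an illustrative helper, not the patent's syntax):

```python
def scale_offsets(offsets, coeff):
    """Multiply each preset motion information offset by the preset
    coefficient carried by the third identifier, producing the adjusted
    offsets from which the target offset is later selected."""
    return [(dx * coeff, dy * coeff) for dx, dy in offsets]

preset = [(1, 0), (-1, 0), (0, 1), (0, -1)]
print(scale_offsets(preset, 2))  # [(2, 0), (-2, 0), (0, 2), (0, -2)]
```

A larger coefficient lets the same small table of preset offsets cover a wider search range around the first motion information.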
  • the target motion information is used to predict the motion information of the image block to be processed, which includes: using the target motion information as the motion information of the image block to be processed; or using the target motion information as the predicted motion information of the image block to be processed. After the motion information or predicted motion information of the image block to be processed is obtained, motion compensation is performed to generate its image block or prediction block.
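The motion compensation step can be sketched as a displaced block copy. This toy version handles integer-pel positions only (real codecs also interpolate fractional positions), and all names are illustrative:

```python
def motion_compensate(ref, x, y, w, h, mv):
    """Fetch the w-by-h prediction block for the block at (x, y) from the
    reference picture, displaced by the motion vector (dx, dy)."""
    dx, dy = mv
    return [row[x + dx : x + dx + w] for row in ref[y + dy : y + dy + h]]

# Toy 8x8 "reference picture" whose sample at row r, column c equals r*10 + c.
ref = [[r * 10 + c for c in range(8)] for r in range(8)]
print(motion_compensate(ref, 2, 2, 2, 2, (1, -1)))  # [[13, 14], [23, 24]]
```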
  • the second identifier may adopt a fixed-length encoding manner, which can reduce the number of bits occupied by the identifier.
  • the second identifier may adopt a variable-length encoding manner, so that more candidate motion information can be identified.
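The patent text does not specify particular binarizations; as one hedged illustration of the trade-off between the two options above, a fixed-length code spends the same number of bits on every index, while a unary-style variable-length code keeps small indices cheap and extends to any number of candidates:

```python
import math

def fixed_length(index, num_candidates):
    """Fixed-length binarization: every index costs ceil(log2(N)) bits."""
    bits = max(1, math.ceil(math.log2(num_candidates)))
    return format(index, f"0{bits}b")

def unary(index):
    """Variable-length (unary) binarization: small indices get short
    codewords, so more candidates can be identified without fixing N."""
    return "1" * index + "0"

print(fixed_length(2, 4))  # 10
print(unary(2))            # 110
```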
  • a second aspect of the embodiments of the present application provides another decoding method for predicted motion information, including: parsing a bitstream to obtain a first identifier; and determining a target element from a first candidate set according to the first identifier, where the elements in the first candidate set include at least one piece of first candidate motion information and at least one second candidate set.
  • the elements in the second candidate set include multiple second candidate motion information.
  • when the target element is the first candidate motion information, the first candidate motion information serving as the target element is used as the target motion information, and the target motion information is used to predict the motion information of the image block to be processed.
  • when the target element is a second candidate set, the bitstream is parsed to obtain a second identifier, and the target motion information is determined from the plurality of second candidate motion information according to the second identifier.
  • the elements in the first candidate set include the first candidate motion information and at least one second candidate set.
  • With the multi-layer candidate set structure, when more candidates are introduced, a type of candidate motion information set can be added as an element to the first candidate set.
  • the length of the first candidate set is greatly shortened.
  • the first candidate set is a candidate motion information list for inter prediction; even when more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
  • the first identifier may be a category identifier, which is used to indicate a category to which the target element belongs.
  • the method for decoding predicted motion information provided in this embodiment of the present application may further include: parsing the bitstream to obtain a fourth identifier, where the fourth identifier is the index of the target element within the category of the first candidate set indicated by the first identifier.
  • the target element is uniquely determined by combining the fourth identifier with the first identifier.
  • the first candidate motion information includes motion information of spatially adjacent image blocks of the image block to be processed.
  • the first candidate motion information may be candidate motion information generated by a Merge mode.
  • the second candidate motion information includes motion information of spatially non-adjacent image blocks of the image block to be processed.
  • the second candidate motion information may be candidate motion information generated by the Affine Merge mode.
  • the first candidate motion information includes first motion information
  • the second candidate motion information includes second motion information
  • the second motion information is obtained based on the first motion information and a preset motion information offset.
  • the first candidate motion information includes the first motion information
  • the second candidate motion information includes a preset motion information offset
  • determining the target motion information from the plurality of second candidate motion information includes: determining a target offset from a plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
  • the first candidate motion information includes first motion information
  • the at least one second candidate set included in the first candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one third candidate set and at least one fourth candidate set.
  • the elements in the third candidate set include motion information of a plurality of spatially non-adjacent image blocks of the image block to be processed.
  • the elements in the fourth candidate set include a plurality of pieces of motion information obtained based on the first motion information and a preset motion information offset.
  • the codeword used to identify the first motion information is the shortest.
  • the first motion information does not include motion information obtained according to an alternative temporal motion vector prediction (ATMVP) mode.
  • the at least one second candidate set included in the first candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one fifth candidate set and at least one sixth candidate set.
  • the elements in the fifth candidate set include motion information of a plurality of spatially non-adjacent image blocks of the image block to be processed.
  • the elements in the sixth candidate set include a plurality of preset motion information offsets.
  • the decoding method for predicted motion information provided in this application may further include: parsing the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
  • before determining the target offset from the plurality of preset motion information offsets according to the second identifier, the method further includes: multiplying the plurality of preset motion information offsets by the preset coefficient included in the third identifier to obtain a plurality of adjusted motion information offsets. Correspondingly, determining the target offset from the plurality of preset motion information offsets according to the second identifier includes: determining the target offset from the plurality of adjusted motion information offsets according to the second identifier.
  • the second candidate motion information and the first candidate motion information are different.
  • the first candidate motion information and the second candidate motion information may be candidate motion information selected according to different inter prediction modes.
  • the target motion information is used to predict the motion information of the image block to be processed, which includes: using the target motion information as the motion information of the image block to be processed; or using the target motion information as the predicted motion information of the image block to be processed. After the motion information or predicted motion information of the image block to be processed is obtained, motion compensation is performed to generate its image block or prediction block.
  • the second identifier may adopt a fixed-length encoding manner, which can reduce the number of bits occupied by the identifier.
  • the second identifier may adopt a variable-length encoding manner, so that more candidate motion information can be identified.
  • a third aspect of the embodiments of the present application provides a decoding apparatus for predicted motion information, including: a parsing module configured to parse a bitstream to obtain a first identifier; and a determining module configured to determine a target element from a first candidate set according to the first identifier.
  • the elements in the first candidate set include at least one first candidate motion information and a plurality of second candidate motion information.
  • the first candidate motion information includes the first motion information
  • the second candidate motion information includes a preset motion information offset.
  • an assignment module configured to use the first candidate motion information as the target motion information when the target element is the first candidate motion information, and the target motion information is used to predict the motion information of the image block to be processed;
  • the parsing module is further configured to: when the target element is obtained based on the plurality of second candidate motion information, parse the bitstream to obtain a second identifier; and the determining module is further configured to determine the target motion information based on one of the plurality of second candidate motion information according to the second identifier.
  • the elements in the first candidate set include the first candidate motion information and a plurality of second candidate motion information.
  • With the multi-layer candidate set structure, when more candidates are introduced, a type of candidate motion information set can be added as an element to the first candidate set.
  • the length of the first candidate set is greatly shortened.
  • the first candidate set is a candidate motion information list for inter prediction; even when more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
  • the first candidate motion information may include motion information of a spatially adjacent image block of the image block to be processed.
  • the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
  • the parsing module is specifically configured to determine a target offset from the plurality of preset motion information offsets according to the second identifier, and determine the target motion information based on the first motion information and the target offset.
  • the codeword used to identify the first motion information is the shortest.
  • the parsing module is further configured to parse the code stream to obtain a third identifier, and the third identifier includes a preset coefficient.
  • the device further includes a calculation module configured to multiply the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
  • the determination module is specifically configured to determine a target offset from the plurality of adjusted motion information offsets obtained by the calculation module according to the second identifier, and then determine the target motion information based on the first motion information and the target offset.
  • the determining module is specifically configured to use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
  • the second identifier adopts a fixed-length encoding manner.
  • the second identifier adopts a variable length coding method.
  • the decoding apparatus for predicted motion information provided in the third aspect of the embodiments of the present application is configured to execute the decoding method for predicted motion information provided in the first aspect; the specific implementations are the same, and details are not repeated here.
  • a fourth aspect of the embodiments of the present application provides a decoding apparatus for predicted motion information, including: a parsing module configured to parse a bitstream to obtain a first identifier; and a determining module configured to determine a target element from a first candidate set according to the first identifier.
  • the elements in the first candidate set include at least one first candidate motion information and at least one second candidate set.
  • the elements in the second candidate set include a plurality of pieces of second candidate motion information; the assignment module is configured to, when the target element is the first candidate motion information, use that first candidate motion information as the target motion information, where the target motion information is used to predict the motion information of the image block to be processed; the parsing module is further configured to, when the target element is a second candidate set, parse the bitstream to obtain a second identifier; and the determining module is further configured to determine the target motion information from the plurality of second candidate motion information according to the second identifier.
  • the elements in the first candidate set include the first candidate motion information and at least one second candidate set.
  • With the multi-layer candidate set structure, when more candidates are introduced, a type of candidate motion information set can be added as an element to the first candidate set.
  • the length of the first candidate set is greatly shortened.
  • the first candidate set is a candidate motion information list for inter prediction; even when more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
  • the first candidate motion information may include motion information of spatially adjacent image blocks of the image block to be processed.
  • the second candidate motion information may include motion information of spatially non-adjacent image blocks of the image block to be processed.
  • the first candidate motion information includes first motion information
  • the second candidate motion information includes second motion information
  • the second motion information is obtained based on the first motion information and a preset motion information offset.
  • the first candidate motion information includes first motion information
  • the second candidate motion information includes a preset motion information offset
  • the parsing module is specifically configured to: determine a target offset from the plurality of preset motion information offsets according to the second identifier; and determine the target motion information based on the first motion information and the target offset.
  • the first candidate motion information includes first motion information
  • the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one third candidate set and at least one fourth candidate set.
  • the elements in the third candidate set include motion information of a plurality of spatially non-adjacent image blocks of the image block to be processed, and the elements in the fourth candidate set include a plurality of pieces of motion information obtained based on the first motion information and a preset motion information offset.
  • the codeword used to identify the first motion information is the shortest.
  • the first motion information does not include motion information obtained according to the ATMVP mode.
  • the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets includes at least one fifth candidate set and at least one sixth candidate set.
  • the elements in the fifth candidate set include motion information of a plurality of spatially non-adjacent image blocks of the image block to be processed, and the elements in the sixth candidate set include a plurality of preset motion information offsets.
  • the parsing module is further configured to parse the code stream to obtain a third identifier, and the third identifier includes a preset coefficient.
  • the fourth aspect further includes a calculation module, configured to multiply a plurality of preset motion information offsets by a preset coefficient to obtain a plurality of adjusted motion information offsets.
  • the determination module is specifically configured to determine a target offset from the plurality of adjusted motion information offsets obtained from the calculation module according to the second identifier, and then determine the target motion information based on the first motion information and the target offset.
  • the second candidate motion information and the first candidate motion information are different.
  • the determining module is specifically configured to use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
  • the second identifier adopts a fixed-length encoding manner.
  • the second identifier adopts a variable length coding method.
  • a fifth aspect of the embodiments of the present application provides a decoding apparatus for predicted motion information, including: a processor and a memory coupled to the processor; the processor is configured to execute the decoding method for predicted motion information according to the first aspect or the second aspect.
  • a video decoder which includes a non-volatile storage medium and a central processing unit.
  • the non-volatile storage medium stores an executable program; the central processing unit is connected to the non-volatile storage medium and executes the decoding method for predicted motion information according to the first aspect and/or the second aspect, or any one of their possible implementations.
  • a computer-readable storage medium stores instructions; when the instructions are run on a computer, the computer is caused to execute the decoding method for predicted motion information according to the first aspect or the second aspect.
  • a computer program product including instructions is provided; when the instructions are run on a computer, the computer is caused to execute the decoding method for predicted motion information described in the first aspect or the second aspect.
  • FIG. 1 is an exemplary block diagram of a video decoding system that can be configured for use in an embodiment of the present application
  • FIG. 2 is an exemplary system block diagram of a video encoder that can be configured for use in an embodiment of the present application
  • FIG. 3 is an exemplary system block diagram of a video decoder that can be configured for use in embodiments of the present application
  • FIG. 4 is a block diagram of an exemplary inter prediction module that can be configured for use in an embodiment of the present application
  • FIG. 5 is an exemplary implementation flowchart of a merge prediction mode
  • FIG. 6 is an exemplary implementation flowchart of an advanced motion vector prediction mode
  • FIG. 7 is an exemplary implementation flowchart of motion compensation performed by a video decoder that can be configured for use in an embodiment of the present application
  • FIG. 8 is a schematic diagram of an exemplary coding unit and adjacent position image blocks associated with the coding unit
  • FIG. 9 is an exemplary implementation flowchart of constructing a candidate prediction motion vector list
  • FIG. 10 is an exemplary implementation diagram of adding a combined candidate motion vector to a merge mode candidate prediction motion vector list
  • FIG. 11 is an exemplary implementation diagram of adding a scaled candidate motion vector to a merge mode candidate prediction motion vector list
  • FIG. 12 is an exemplary implementation diagram of adding a zero motion vector to a merge mode candidate prediction motion vector list
  • FIG. 13 is a schematic diagram of another exemplary coding unit and adjacent position image blocks associated with the coding unit
  • FIG. 14A is a schematic diagram of an exemplary method for constructing a candidate motion vector set
  • FIG. 14B is a schematic diagram of an exemplary method for constructing a candidate motion vector set
  • FIG. 15 is a schematic flowchart of a decoding method for predicting motion information according to an embodiment of the present application
  • FIG. 16A is a schematic diagram of an exemplary method for constructing a candidate motion vector set
  • FIG. 16B is a schematic diagram of an exemplary method for constructing a candidate motion vector set
  • FIG. 16C is a schematic diagram of an exemplary method for constructing a candidate motion vector set
  • FIG. 17 is a schematic block diagram of a decoding apparatus for predicting motion information according to an embodiment of the present application.
  • FIG. 18 is a schematic block diagram of a decoding apparatus for predicting motion information according to an embodiment of the present application.
  • words such as “exemplary” or “for example” are used as examples, illustrations, or explanations. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the use of the words “exemplary” or “for example” is intended to present the relevant concept in a concrete manner.
  • FIG. 1 is a block diagram of a video decoding system 1 according to an example described in the embodiment of the present application.
  • video coder generally refers to both video encoders and video decoders.
  • “video coding” or “coding” may generally refer to video encoding or video decoding.
  • the video encoder 100 and the video decoder 200 of the video decoding system 1 are configured to predict the motion information, such as the motion vector, of a currently decoded image block or a sub-block thereof according to any one of multiple new inter prediction modes, so that the predicted motion vector is as close as possible to the motion vector obtained by a motion estimation method; it is then unnecessary to transmit a motion vector difference during encoding, which further improves encoding and decoding performance.
  • the video decoding system 1 includes a source device 10 and a destination device 20.
  • the source device 10 generates encoded video data. Therefore, the source device 10 may be referred to as a video encoding device.
  • the destination device 20 may decode the encoded video data generated by the source device 10. Therefore, the destination device 20 may be referred to as a video decoding device.
  • Various implementations of the source device 10, the destination device 20, or both may include one or more processors and a memory coupled to the one or more processors.
  • the memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other media that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein.
  • the source device 10 and the destination device 20 may include various devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called “smart” phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.
  • the destination device 20 may receive the encoded video data from the source device 10 via the link 30.
  • the link 30 may include one or more media or devices capable of moving the encoded video data from the source device 10 to the destination device 20.
  • the link 30 may include one or more communication media enabling the source device 10 to directly transmit the encoded video data to the destination device 20 in real time.
  • the source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to the destination device 20.
  • the one or more communication media may include wireless and / or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
  • the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from the source device 10 to the destination device 20.
  • the encoded data may be output from the output interface 140 to the storage device 40.
  • the encoded data can be accessed from the storage device 40 through the input interface 240.
  • the storage device 40 may include any of a variety of distributed or locally accessed data storage media, such as a hard disk drive, a Blu-ray disc, a digital video disc (DVD), and a compact disc-read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.
  • the storage device 40 may correspond to a file server or another intermediate storage device that may hold the encoded video produced by the source device 10.
  • the destination device 20 may access the stored video data from the storage device 40 via streaming or download.
  • the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 20.
  • Example file servers include a network server (for example, for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, or a local disk drive.
  • the destination device 20 can access the encoded video data through any standard data connection, including an Internet connection.
  • This may include wireless channels (e.g., wireless-fidelity (Wi-Fi) connections), wired connections (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both suitable for accessing the encoded video data stored on the file server.
  • the transmission of the encoded video data from the storage device 40 may be a streaming transmission, a download transmission, or a combination of the two.
  • the decoding method for predicting motion information can be applied to video encoding and decoding to support a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
  • the video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
  • the video decoding system 1 illustrated in FIG. 1 is merely an example, and the techniques of the present application can be applied to a video decoding setting (for example, video encoding or video decoding) that does not necessarily include any data communication between the encoding device and the decoding device.
  • data is retrieved from local storage, streamed over a network, and so on.
  • the video encoding device may encode the data and store the data to a memory, and / or the video decoding device may retrieve the data from the memory and decode the data.
  • encoding and decoding are performed by devices that do not communicate with each other, but only encode data to and / or retrieve data from memory and decode data.
  • the source device 10 includes a video source 120, a video encoder 100, and an output interface 140.
  • the output interface 140 may include a modulator/demodulator (modem) and/or a transmitter.
  • Video source 120 may include a video capture device (e.g., a camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these sources of video data.
  • the video encoder 100 may encode video data from the video source 120.
  • the source device 10 transmits the encoded video data directly to the destination device 20 via the output interface 140.
  • the encoded video data may also be stored on the storage device 40 for later access by the destination device 20 for decoding and / or playback.
  • the destination device 20 includes an input interface 240, a video decoder 200, and a display device 220.
  • the input interface 240 includes a receiver and / or a modem.
  • the input interface 240 may receive the encoded video data via the link 30 and / or from the storage device 40.
  • the display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays decoded video data.
  • the display device 220 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or other types of display devices.
  • video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer unit or other hardware and software to handle encoding of both audio and video in a common or separate data stream.
  • the multiplexer-demultiplexer (MUX-DEMUX) unit may conform to the International Telecommunication Union (ITU) H.223 multiplexer protocol, or other protocols such as the user datagram protocol (UDP).
  • Each of the video encoder 100 and the video decoder 200 may be implemented as any of a variety of circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may use one or more processors to execute the instructions in hardware, thus implementing the technology of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered as one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a corresponding device.
  • This application may generally refer to video encoder 100 as “signaling” or “transmitting” certain information to another device, such as video decoder 200.
  • the terms “signaling” or “transmitting” may generally refer to the transmission of syntax elements and/or other data used to decode the compressed video data. This transfer can occur in real time or almost in real time. Alternatively, this communication may occur over a period of time, such as when a syntax element is stored in an encoded bitstream to a computer-readable storage medium at the time of encoding; the decoding device may then retrieve the syntax element at any time after the syntax element is stored on this medium.
  • H.265 High Efficiency Video Coding (HEVC)
  • the HEVC standardization is based on an evolution model of a video decoding device called the HEVC test model (HM).
  • the latest standard document of H.265 can be obtained from http://www.itu.int/rec/T-REC-H.265.
  • the latest version of the standard document is H.265 (12/16), which standard document is incorporated herein by reference in its entirety.
  • HM assumes that video decoding devices have several additional capabilities over existing algorithms of ITU-T H.264/AVC. For example, H.264 provides 9 intra-prediction encoding modes, while HM provides up to 35 intra-prediction encoding modes.
  • the H.266 test model (JEM) is an evolution model of the video decoding device.
  • the algorithm description of H.266 can be obtained from http://phenix.int-evry.fr/jvet. The latest algorithm description is included in JVET-F1001-v2.
  • the algorithm description document is incorporated herein by reference in its entirety.
  • the reference software for the JEM test model is available from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/ and is also incorporated herein by reference in its entirety.
  • HM can divide a video frame or image into a sequence of tree blocks or largest coding units (LCUs) that contain both luma and chroma samples.
  • an LCU is also called a coding tree unit (CTU).
  • the tree block serves a purpose similar to that of the macroblock of the H.264 standard.
  • a slice contains several consecutive tree blocks in decoding order.
  • a video frame or image can be split into one or more slices.
  • Each tree block can be split into coding units according to a quadtree. For example, a tree block that is a root node of a quad tree may be split into four child nodes, and each child node may be a parent node and split into another four child nodes.
  • the final indivisible child nodes that are leaf nodes of the quadtree include decoding nodes, such as decoded video blocks.
  • the syntax data associated with the decoded codestream can define the maximum number of times a tree block can be split, and can also define the minimum size of a decoding node.
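The recursive quadtree splitting described above can be sketched as follows. This is an illustrative sketch, not code from any codec reference implementation; the function name and the content-independent "always split when allowed" policy are assumptions made for the example, and real encoders decide each split by rate-distortion cost.

```python
# Hypothetical sketch of the quadtree split described above: a tree block
# (root node) is recursively split into four child nodes until either the
# maximum split depth or the minimum decoding-node size is reached.
def quadtree_leaves(x, y, size, depth, max_depth, min_size):
    """Return (x, y, size) tuples for the leaf (decoding) nodes."""
    if depth == max_depth or size // 2 < min_size:
        return [(x, y, size)]          # indivisible child node: a leaf
    half = size // 2
    leaves = []
    for dy in (0, half):               # four child nodes per parent node
        for dx in (0, half):
            leaves += quadtree_leaves(x + dx, y + dy, half,
                                      depth + 1, max_depth, min_size)
    return leaves

# A 64x64 tree block fully split down to depth 2 yields sixteen 16x16 leaves.
leaves = quadtree_leaves(0, 0, 64, 0, max_depth=2, min_size=16)
```

Here `max_depth` plays the role of the syntax data that defines the maximum number of times a tree block can be split, and `min_size` the minimum size of a decoding node.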
  • a coding unit (CU) includes a decoding node, as well as prediction units (PUs) and transform units (TUs) associated with the decoding node.
  • the size of the CU corresponds to the size of the decoding node and the shape must be square.
  • the size of the CU can range from 8×8 pixels up to a maximum of 64×64 pixels or a larger tree block size.
  • Each CU may contain one or more PUs and one or more TUs.
  • the syntax data associated with a CU may describe a case where a CU is partitioned into one or more PUs.
  • the partitioning mode may be different between cases where the CU is skipped or is encoded in direct mode, intra prediction mode, or inter prediction mode.
  • the PU can be divided into non-square shapes.
  • the syntax data associated with a CU may also describe a case where a CU is partitioned into one or more TUs according to a quadtree.
  • the shape of the TU can be square or non-square.
  • the HEVC standard allows transformation based on the TU, which can be different for different CUs.
  • the TU is usually sized based on the size of the PUs within a given CU defined for the partitioned LCU, but this may not always be the case.
  • the size of the TU is usually the same as or smaller than the PU.
  • a quad-tree structure called "residual quad-tree" (RQT) may be used to subdivide the residual samples corresponding to the CU into smaller units.
  • the leaf node of RQT may be called TU.
  • the pixel difference values associated with the TU may be transformed to produce a transformation coefficient, which may be quantized.
  • the PU contains data related to the prediction process.
  • the PU may include data describing the intra-prediction mode of the PU.
  • the PU may include data defining a motion vector of the PU.
  • the data defining the motion vector of the PU may describe the horizontal component of the motion vector, the vertical component of the motion vector, the resolution of the motion vector (e.g., quarter-pixel accuracy or eighth-pixel accuracy), the reference image to which the motion vector points, and/or the reference image list of the motion vector (e.g., list 0, list 1, or list C).
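The PU motion data fields listed above can be grouped as in the following sketch. The field names are assumptions chosen for illustration, not syntax element names from any standard.

```python
from dataclasses import dataclass

# Illustrative container for the PU motion data fields listed above.
@dataclass
class PuMotionInfo:
    mv_horizontal: int    # horizontal component of the motion vector
    mv_vertical: int      # vertical component of the motion vector
    mv_resolution: str    # e.g. "quarter-pel" or "eighth-pel" accuracy
    ref_idx: int          # index of the reference image pointed to
    ref_list: str         # reference image list: "list0", "list1" or "listC"

info = PuMotionInfo(mv_horizontal=6, mv_vertical=-2,
                    mv_resolution="quarter-pel", ref_idx=0, ref_list="list0")
```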
  • TU uses transform and quantization processes.
  • a given CU with one or more PUs may also contain one or more TUs.
  • video encoder 100 may calculate a residual value corresponding to the PU.
  • the residual values include pixel differences that can be transformed into transform coefficients, quantized, and scanned using the TUs to generate serialized transform coefficients for entropy decoding.
  • This application generally uses the term "video block" to refer to the decoding node of a CU.
  • the term “video block” may also be used in this application to refer to a tree block including a decoding node and a PU and a TU, such as an LCU or a CU.
  • a video sequence usually contains a series of video frames or images.
  • a group of pictures (GOP) exemplarily comprises a series of one or more video pictures.
  • the GOP may include syntax data in the header information of the GOP, the header information of one or more of the pictures, or elsewhere, and the syntax data describes the number of pictures included in the GOP.
  • Each slice of the image may contain slice syntax data describing the coding mode of the corresponding image.
  • Video encoder 100 typically operates on video blocks within individual video slices to encode video data.
  • a video block may correspond to a decoding node within a CU.
  • Video blocks may have fixed or varying sizes, and may differ in size according to a specified decoding standard.
  • HM supports prediction with various PU sizes. Assuming the size of a specific CU is 2N×2N, HM supports intra prediction with PU sizes of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of the CU is not partitioned, while the other direction is partitioned into 25% and 75%.
  • 2N×nU refers to a horizontally partitioned 2N×2N CU with a 2N×0.5N PU at the top and a 2N×1.5N PU at the bottom.
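The four asymmetric modes above can be sketched as a helper that maps a mode name to the two PU sizes it produces. The function name and mode strings are illustrative assumptions, not identifiers from the HM software.

```python
# Hypothetical helper mapping the asymmetric inter-partition modes described
# above to the two PU sizes they produce for a 2Nx2N CU, as (width, height).
def amp_pu_sizes(mode, cu_size):
    quarter = cu_size // 4        # 0.5N, the 25% part
    three_q = 3 * cu_size // 4    # 1.5N, the 75% part
    if mode == "2NxnU":           # horizontal split, small PU on top
        return [(cu_size, quarter), (cu_size, three_q)]
    if mode == "2NxnD":           # horizontal split, small PU at bottom
        return [(cu_size, three_q), (cu_size, quarter)]
    if mode == "nLx2N":           # vertical split, small PU on the left
        return [(quarter, cu_size), (three_q, cu_size)]
    if mode == "nRx2N":           # vertical split, small PU on the right
        return [(three_q, cu_size), (quarter, cu_size)]
    raise ValueError(mode)

# A 32x32 CU in 2NxnU mode: a 32x8 PU on top and a 32x24 PU below it.
sizes = amp_pu_sizes("2NxnU", 32)
```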
  • “N×N” and “N by N” are used interchangeably to refer to the pixel size of a video block in terms of its vertical and horizontal dimensions, for example, 16×16 pixels or 16 by 16 pixels.
  • an N×N block has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value.
  • Pixels in a block can be arranged in rows and columns.
  • the block does not necessarily need to have the same number of pixels in the horizontal direction as in the vertical direction.
  • a block may include N×M pixels, where M is not necessarily equal to N.
  • the video encoder 100 may calculate the residual data of the TU of the CU.
  • a PU may include pixel data in the spatial domain (also referred to as the pixel domain), and a TU may include coefficients in the transform domain after a transform (e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data.
  • the residual data may correspond to a pixel difference between a pixel of an uncoded image and a prediction value corresponding to a PU.
  • the video encoder 100 may form a TU including residual data of a CU, and then transform the TU to generate a transform coefficient of the CU.
  • video encoder 100 may perform quantization of the transform coefficients.
  • Quantization exemplarily refers to the process of quantizing coefficients to possibly reduce the amount of data used to represent the coefficients to provide further compression.
  • the quantization process may reduce the bit depth associated with some or all of the coefficients. For example, n-bit values may be rounded down to m-bit values during quantization, where n is greater than m.
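The bit-depth reduction described above can be sketched as discarding the low-order bits of each coefficient. This is a minimal sketch of the rounding idea only; real codecs divide by a quantization step size derived from a quantization parameter, which this example does not model.

```python
# Sketch of reducing an n-bit coefficient magnitude to m bits by dropping
# the (n - m) least-significant bits, as described above.
def quantize(coeff, n_bits, m_bits):
    shift = n_bits - m_bits
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) >> shift)

def dequantize(level, n_bits, m_bits):
    # Reconstruction is lossy: the discarded low bits cannot be recovered.
    shift = n_bits - m_bits
    sign = -1 if level < 0 else 1
    return sign * (abs(level) << shift)

# A 10-bit magnitude 517 quantized to 6 bits becomes level 32 (517 >> 4),
# which reconstructs to 512, i.e. a quantization error of 5.
level = quantize(517, 10, 6)
```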
  • the JEM model further improves the coding structure of video images.
  • a block coding structure called "Quad Tree Combined with Binary Tree” (QTBT) is introduced.
  • a CU can be square or rectangular.
  • a CTU first performs a quadtree partition, and the leaf nodes of the quadtree further perform a binary tree partition.
  • there are two partitioning modes in binary tree partitioning: symmetric horizontal partitioning and symmetric vertical partitioning.
  • the leaf nodes of a binary tree are called CUs, and JEM's CUs cannot be further divided during the prediction and transformation process, that is, JEM's CU, PU, and TU have the same block size.
  • the maximum size of the CTU is 256×256 luminance pixels.
  • the video encoder 100 may utilize a predefined scan order to scan the quantized transform coefficients to generate a serialized vector that can be entropy encoded.
  • the video encoder 100 may perform adaptive scanning. After scanning the quantized transform coefficients to form a one-dimensional vector, the video encoder 100 may entropy decode the one-dimensional vector using context-based adaptive variable-length coding (CAVLC), context-based adaptive binary arithmetic coding (CABAC), syntax-based adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) decoding, or another entropy decoding method.
  • Video encoder 100 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 200 to decode the video data.
  • video encoder 100 may assign a context within a context model to a symbol to be transmitted. Context can be related to whether adjacent values of a symbol are non-zero.
  • the video encoder 100 may select a variable length code of a symbol to be transmitted. Codewords in variable-length decoding (VLC) may be constructed such that relatively short codes correspond to more likely symbols, and longer codes correspond to less likely symbols. In this way, the use of VLC can achieve the goal of saving code rates relative to using equal length codewords for each symbol to be transmitted.
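The rate saving described above can be made concrete with a small numeric example. The prefix-free codeword table and symbol probabilities below are invented for illustration; they are not codes from any standard.

```python
# Illustrative VLC table: the most probable symbol gets the shortest
# codeword, compared against a 2-bit fixed-length code for four symbols.
vlc_table = {"A": "0", "B": "10", "C": "110", "D": "111"}   # prefix-free
probs     = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}   # assumed

# Expected bits per symbol under the VLC:
# 0.5*1 + 0.25*2 + 0.125*3 + 0.125*3 = 1.75 bits, versus 2 bits fixed.
expected_vlc_bits = sum(probs[s] * len(vlc_table[s]) for s in probs)
fixed_bits = 2
```

With this skewed distribution the VLC saves 0.25 bits per symbol on average, which is exactly the code-rate saving relative to equal-length codewords that the paragraph above describes.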
  • the probability in CABAC can be determined based on the context assigned to the symbol.
  • the video encoder may perform inter prediction to reduce temporal redundancy between images.
  • a CU may have one or more prediction units PU according to the provisions of different video compression codec standards.
  • multiple PUs may belong to a CU, or PUs and CUs are the same size.
  • in this application, the case where the CU's partitioning mode is not divided, or where the CU is divided into a single PU, is uniformly expressed using the PU.
  • the video encoder may signal the motion information of the PU to the video decoder.
  • the motion information of the PU may include: a reference image index, a motion vector, and a prediction direction identifier.
  • a motion vector may indicate a displacement between an image block (also called a video block, a pixel block, a pixel set, etc.) of a PU and a reference block of the PU.
  • the reference block of the PU may be a part of the reference picture similar to the image block of the PU.
  • the reference block may be located in a reference image indicated by a reference image index and a prediction direction identifier.
  • the video encoder may generate a candidate prediction motion vector (MV) list for each of the PUs according to a merge prediction mode or advanced motion vector prediction mode process.
  • Each candidate prediction motion vector in the candidate prediction motion vector list for the PU may indicate motion information, and the MV list may also be referred to as a candidate motion information list.
  • the motion information indicated by some candidate prediction motion vectors in the candidate prediction motion vector list may be based on the motion information of other PUs. If the candidate prediction motion vector indicates motion information specifying one of a spatial candidate prediction motion vector position or a temporal candidate prediction motion vector position, the present application may refer to the candidate prediction motion vector as an "original" candidate prediction motion vector.
  • in merge mode (also referred to herein as merge prediction mode), the video encoder may generate additional candidate prediction motion vectors by combining partial motion vectors from different original candidate prediction motion vectors, modifying the original candidate prediction motion vectors, or inserting only zero motion vectors as candidate prediction motion vectors. These additional candidate prediction motion vectors are not considered original candidate prediction motion vectors and may be referred to in this application as artificially generated candidate prediction motion vectors.
  • the techniques of this application generally relate to a technique for generating a list of candidate prediction motion vectors at a video encoder and a technique for generating the same list of candidate prediction motion vectors at a video decoder.
  • the video encoder and video decoder may generate the same candidate prediction motion vector list by implementing the same techniques used to construct the candidate prediction motion vector list. For example, both a video encoder and a video decoder may build a list with the same number of candidate prediction motion vectors (eg, five candidate prediction motion vectors).
  • Video encoders and decoders may first consider spatial candidate prediction motion vectors (e.g., from neighboring blocks in the same image), then consider temporal candidate prediction motion vectors (e.g., candidate prediction motion vectors in different images), and finally add artificially generated candidate prediction motion vectors until the desired number of candidate prediction motion vectors has been added to the list.
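The fill order above (spatial first, then temporal, then artificial padding up to a fixed count such as five) can be sketched as follows. The candidate values are invented; a real codec derives them from neighboring blocks, and its artificial candidates may also combine partial motion vectors rather than being all-zero.

```python
# Sketch of the candidate list construction order described above, assuming
# a target length of five and pruning of duplicate motion vectors.
def build_candidate_list(spatial, temporal, max_len=5):
    candidates = []
    for mv in spatial + temporal:          # spatial first, then temporal
        if mv not in candidates and len(candidates) < max_len:
            candidates.append(mv)
    while len(candidates) < max_len:       # pad with artificial candidates,
        candidates.append((0, 0))          # e.g. zero motion vectors
    return candidates

cands = build_candidate_list(spatial=[(4, 0), (4, 0), (-2, 1)],
                             temporal=[(3, 3)])
# -> [(4, 0), (-2, 1), (3, 3), (0, 0), (0, 0)]
```

Because the encoder and the decoder run this same deterministic procedure, they build identical lists without the list itself ever being transmitted, which is the point made two bullets above.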
  • a type of candidate prediction motion vector may be indicated in the candidate prediction motion vector list through an identification bit to control the length of the candidate prediction motion vector list.
  • the spatial candidate prediction motion vector set and the temporal candidate prediction motion vector can be used as the original candidate prediction motion vector.
  • an identification bit is added to the candidate prediction motion vector list to indicate an artificially generated candidate prediction motion vector set.
  • a prediction motion vector is selected from a set of candidate prediction motion vectors indicated by the identification bit.
  • the video encoder may select the candidate prediction motion vector from the candidate prediction motion vector list and output the candidate prediction motion vector index in the code stream.
  • the selected candidate prediction motion vector may be a candidate prediction motion vector having a motion vector that most closely matches the predictor of the target PU being decoded.
  • the candidate prediction motion vector index may indicate a position where a candidate prediction motion vector is selected in the candidate prediction motion vector list.
  • the video encoder may also generate a predictive image block for the PU based on a reference block indicated by the motion information of the PU. The motion information of the PU may be determined based on the motion information indicated by the selected candidate prediction motion vector.
  • the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector.
  • the motion information of the PU may be determined based on the motion vector difference of the PU and the motion information indicated by the selected candidate prediction motion vector.
  • the video encoder may generate one or more residual image blocks for the CU based on the predictive image blocks of the PU of the CU and the original image blocks for the CU. The video encoder may then encode one or more residual image blocks and output one or more residual image blocks in a code stream.
  • the bitstream may include data identifying a selected candidate prediction motion vector in the candidate prediction motion vector list of the PU, which is referred to herein as an identifier or signal.
  • the data may include an index into the candidate prediction motion vector list, and the target motion vector is determined through the index; or the index may determine that the target motion vector is a certain type of candidate prediction motion vector, in which case the data further includes information indicating the specific position of the selected candidate prediction motion vector within that type of candidate prediction motion vector.
  • the video decoder can parse the bitstream to obtain the data identifying the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU, and determine the motion information of the PU based on the motion information indicated by the selected candidate prediction motion vector.
  • the video decoder may identify one or more reference blocks for the PU based on the motion information of the PU. After identifying one or more reference blocks of the PU, the video decoder may generate predictive image blocks for the PU based on the one or more reference blocks of the PU. The video decoder may reconstruct an image block for a CU based on a predictive image block for a PU of the CU and one or more residual image blocks for the CU.
  • the present application may describe a position or an image block as having various spatial relationships with a CU or a PU. This description can be interpreted to mean that the position or image block and the image block associated with the CU or PU have various spatial relationships.
  • a PU currently being decoded by a video decoder may be referred to as a current PU, and may also be referred to as a current image block to be processed.
  • This application may refer to the CU that the video decoder is currently decoding as the current CU.
  • This application may refer to the image currently being decoded by the video decoder as the current image. It should be understood that this application is applicable to a case where the PU and the CU have the same size, or the PU is the CU, and the PU is used to represent the same.
  • video encoder 100 may use inter prediction to generate predictive image blocks and motion information for a PU of a CU.
  • the motion information of a given PU may be the same or similar to the motion information of one or more nearby PUs (ie, PUs whose image blocks are spatially or temporally near the image blocks of the given PU). Because nearby PUs often have similar motion information, video encoder 100 may refer to the motion information of nearby PUs to encode motion information for a given PU. Encoding the motion information of a given PU with reference to the motion information of a nearby PU can reduce the number of encoded bits required to indicate the motion information of a given PU in the code stream.
  • Video encoder 100 may refer to motion information of nearby PUs in various ways to encode motion information for a given PU.
  • video encoder 100 may indicate that the motion information of a given PU is the same as the motion information of nearby PUs.
  • This application may use a merge mode to refer to indicating that the motion information of a given PU is the same as that of nearby PUs or may be derived from the motion information of nearby PUs.
  • the video encoder 100 may calculate a Motion Vector Difference (MVD) for a given PU.
  • MVD Motion Vector Difference
  • MVD indicates the difference between the motion vector of a given PU and the motion vector of a nearby PU.
  • Video encoder 100 may include MVD instead of a motion vector of a given PU in the motion information of a given PU. Representing MVD in the codestream requires fewer coding bits than representing the motion vector of a given PU.
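The MVD scheme above can be sketched in a few lines: the encoder transmits only the difference between the PU's motion vector and a predictor taken from a nearby PU, and the decoder adds it back. Vectors are (x, y) pairs in quarter-pel units; all values are illustrative.

```python
# Minimal sketch of the MVD idea described above.
def encode_mvd(mv, predictor):
    # Transmit only the difference from the nearby PU's motion vector.
    return (mv[0] - predictor[0], mv[1] - predictor[1])

def decode_mv(mvd, predictor):
    # The decoder reconstructs the motion vector from predictor + MVD.
    return (predictor[0] + mvd[0], predictor[1] + mvd[1])

predictor = (33, -12)       # motion vector of a nearby PU
mv = (35, -12)              # actual motion vector of the given PU
mvd = encode_mvd(mv, predictor)   # (2, 0): small values, cheap to code
```

Because nearby PUs tend to have similar motion, the MVD components are usually small numbers, which is why representing the MVD requires fewer coding bits than representing the motion vector itself.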
  • This application may use advanced motion vector prediction mode to refer to the motion information of a given PU by using the MVD and an index value identifying a candidate motion vector.
  • the video encoder 100 may generate a list of candidate predicted motion vectors for a given PU.
  • the candidate prediction motion vector list may include one or more candidate prediction motion vectors.
  • Each of the candidate prediction motion vectors in the candidate prediction motion vector list for a given PU may specify motion information.
  • the motion information indicated by each candidate prediction motion vector may include a motion vector, a reference image index, and a prediction direction identifier.
  • the candidate prediction motion vectors in the candidate prediction motion vector list may include “original” candidate prediction motion vectors, each of which indicates motion information from one of the specified candidate prediction motion vector positions of a PU other than the given PU.
  • the video encoder 100 may select one of the candidate prediction motion vectors from the candidate prediction motion vector list for the PU. For example, a video encoder may compare each candidate prediction motion vector with the PU being decoded and may select a candidate prediction motion vector with a desired code rate-distortion cost. Video encoder 100 may output a candidate prediction motion vector index for a PU. The candidate prediction motion vector index may identify the position of the selected candidate prediction motion vector in the candidate prediction motion vector list.
  • the video encoder 100 may generate a predictive image block for a PU based on a reference block indicated by motion information of the PU.
  • the motion information of the PU may be determined based on the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list for the PU.
  • the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector.
  • motion information of a PU may be determined based on a motion vector difference for the PU and motion information indicated by a selected candidate prediction motion vector.
  • Video encoder 100 may process predictive image blocks for a PU as described previously.
  • an identifier bit may be used in the candidate prediction motion vector list to indicate a type of candidate prediction motion vector, so as to control the length of the candidate prediction motion vector list; details are not repeated here.
  • video decoder 200 may generate a list of candidate predicted motion vectors for each of the PUs of the CU.
  • the candidate prediction motion vector list generated by the video decoder 200 for the PU may be the same as the candidate prediction motion vector list generated by the video encoder 100 for the PU.
  • the syntax element parsed by the video decoder 200 from the bitstream may indicate the position of the candidate prediction motion vector selected in the candidate prediction motion vector list of the PU.
  • the video decoder 200 may generate predictive image blocks for the PU based on one or more reference blocks indicated by the motion information of the PU.
  • the video decoder 200 may determine the motion information of the PU from the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list for the PU based on the syntax element obtained by parsing the bitstream. Video decoder 200 may reconstruct an image block for a CU based on a predictive image block for a PU and a residual image block for a CU.
  • the candidate prediction motion vector list may use a flag bit to indicate a type of candidate prediction motion vector.
  • the video decoder 200 first parses the bitstream to obtain a first identifier, and the first identifier indicates the position of the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU.
  • the candidate prediction motion vector list of the PU includes at least one first candidate motion vector and at least one second candidate set, and the second candidate set includes at least one second candidate motion vector.
  • Video decoder 200 determines a target element corresponding to the first identifier from a list of candidate predicted motion vectors of the PU according to the first identifier.
  • if the target element is a first candidate motion vector, the video decoder 200 determines the target element as the target motion information of the PU, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in subsequent decoding processes. If the target element is the second candidate set, the video decoder 200 parses the bitstream to obtain a second identifier, and the second identifier is used to identify a selected candidate prediction motion vector in the second candidate set indicated by the first identifier.
  • the video decoder 200 determines target motion information from a plurality of second candidate motion vectors in the second candidate set indicated by the first identifier according to the second identifier, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in subsequent decoding processes.
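The two-level parsing above can be sketched as follows. This is a minimal illustration, not any codec's API: the names `SecondCandidateSet`, `select_target_motion_vector`, and the example candidate values are all hypothetical.

```python
class SecondCandidateSet:
    """List element standing in for several second candidate motion vectors."""
    def __init__(self, candidates):
        self.candidates = candidates  # list of (vx, vy) motion vectors

def select_target_motion_vector(candidate_list, first_id, read_second_id):
    """Resolve the target motion vector using the first (and, if needed, second) identifier."""
    target = candidate_list[first_id]
    if isinstance(target, SecondCandidateSet):
        # The first identifier points at a set, so a second identifier
        # must be parsed from the bitstream to select within that set.
        second_id = read_second_id()
        return target.candidates[second_id]
    return target  # a first candidate motion vector is used directly

# Index 2 holds a set of two extra candidates; the second identifier picks one.
cand_list = [(4, -1), (0, 3), SecondCandidateSet([(8, 8), (-2, 5)])]
mv = select_target_motion_vector(cand_list, 2, read_second_id=lambda: 1)
print(mv)  # (-2, 5)
```

Because the set occupies a single position in the list, the list length stays bounded regardless of how many second candidates the set contains.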
  • the candidate prediction motion vector list may use a flag bit to indicate a type of candidate prediction motion vector.
  • the video decoder 200 first parses the bitstream to obtain a first identifier, and the first identifier indicates the position of the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU.
  • the candidate motion vector list of the PU includes at least one first candidate motion vector and a plurality of pieces of second candidate motion information.
  • the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset.
  • Video decoder 200 determines a target element corresponding to the first identifier from a list of candidate predicted motion vectors of the PU according to the first identifier.
  • if the target element is a first candidate motion vector, the video decoder 200 determines the target element as the target motion information of the PU, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in subsequent decoding processes. If the target element is obtained according to a plurality of second candidate motion information, the video decoder 200 parses the bitstream to obtain a second identifier, determines the target motion information based on one of the plurality of second candidate motion information according to the second identifier, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in subsequent decoding processes.
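The offset-based variant can be sketched similarly. The particular values in `PRESET_OFFSETS` and the way the base vector is chosen are assumptions for illustration only, not the preset offsets of any actual codec.

```python
# Preset motion information offsets (illustrative values only).
PRESET_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def derive_target_from_offset(base_mv, second_id):
    """Apply the preset offset selected by the second identifier to a base motion vector."""
    dx, dy = PRESET_OFFSETS[second_id]
    return (base_mv[0] + dx, base_mv[1] + dy)

print(derive_target_from_offset((5, 7), 2))  # (5, 8)
```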
  • candidate motion vectors in the candidate prediction motion vector list may be obtained according to different modes, which are not specifically limited in this application.
  • the construction of the candidate prediction motion vector list and the parsing of the selected candidate prediction motion vector from the bitstream are independent of each other, and may be performed in any order or in parallel.
  • the position of the selected candidate prediction motion vector in the candidate prediction motion vector list is first parsed from the bitstream, and the candidate prediction motion vector list is then constructed based on the parsed position.
  • for example, when the selected candidate prediction motion vector parsed from the bitstream is the candidate with index 3 in the candidate prediction motion vector list, only the part of the list from index 0 to index 3 needs to be constructed to determine the candidate prediction motion vector with index 3, which achieves the technical effect of reducing complexity and improving decoding efficiency.
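The parse-first shortcut above might look like the following sketch, where `generate_candidate` is a placeholder for whatever candidate derivation order the codec defines:

```python
def build_list_up_to(index, generate_candidate):
    """Construct candidates 0..index only, then return the selected one."""
    partial = [generate_candidate(i) for i in range(index + 1)]
    return partial[index]

# With index 3 parsed from the bitstream, only four candidates are built,
# even if the full list would contain more.
selected = build_list_up_to(3, generate_candidate=lambda i: (i, -i))
print(selected)  # (3, -3)
```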
  • FIG. 2 is a block diagram of a video encoder 100 according to an example described in the embodiment of the present application.
  • the video encoder 100 is configured to output a video to the post-processing entity 41.
  • the post-processing entity 41 represents an example of a video entity that can process the encoded video data from the video encoder 100, such as a media-aware network element (MANE) or a stitching / editing device.
  • the post-processing entity 41 may be an instance of a network entity.
  • in some cases, the post-processing entity 41 and the video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to the post-processing entity 41 may be performed by the same device that includes the video encoder 100.
  • the post-processing entity 41 is an example of the storage device 40 of FIG. 1.
  • the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a decoded picture buffer (DPB) 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103.
  • the prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109.
  • the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111.
  • the filter unit 106 is intended to represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
  • the filter unit 106 is shown as an in-loop filter in FIG. 2, in other implementations, the filter unit 106 may be implemented as a post-loop filter.
  • the video encoder 100 may further include a video data memory and a segmentation unit (not shown in the figure).
  • the video data memory may store video data to be encoded by the components of the video encoder 100.
  • the video data stored in the video data memory may be obtained from the video source 120.
  • the DPB 107 may be a reference image memory that stores reference video data used by the video encoder 100 to encode video data in an intra-frame or inter-frame decoding mode.
  • the video data memory and the DPB 107 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices.
  • Video data storage and DPB 107 can be provided by the same storage device or separate storage devices.
  • the video data memory may be on-chip with other components of video encoder 100 or off-chip relative to those components.
  • the video encoder 100 receives video data and stores the video data in a video data memory.
  • the segmentation unit divides the video data into several image blocks, and these image blocks can be further divided into smaller blocks, such as image block segmentation based on a quad tree structure or a binary tree structure. This segmentation may also include segmentation into slices, tiles, or other larger units.
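A minimal sketch of the quad-tree style partitioning mentioned above; the stopping criterion (`min_size`) and block sizes are illustrative, not normative:

```python
def quadtree_split(x, y, size, min_size):
    """Return leaf blocks (x, y, size) from recursively quartering a square block."""
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dx in (0, half):
        for dy in (0, half):
            blocks += quadtree_split(x + dx, y + dy, half, min_size)
    return blocks

leaves = quadtree_split(0, 0, 64, min_size=32)
print(leaves)  # four 32x32 leaves covering the 64x64 block
```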
  • Video encoder 100 typically illustrates components that encode image blocks within a video slice to be encoded.
  • the slice can be divided into multiple image blocks (and possibly into sets of image blocks referred to as tiles).
  • the prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes.
  • the prediction processing unit 108 may provide the obtained intra-coded or inter-coded block to the summer 112 to generate a residual block, and to the summer 111 to reconstruct an encoded block used as a reference image.
  • the intra predictor 109 within the prediction processing unit 108 may perform intra predictive encoding of the current image block with respect to one or more neighboring blocks in the same frame or slice as the current block to be encoded to remove spatial redundancy.
  • the inter predictor 110 within the prediction processing unit 108 may perform inter predictive coding of the current image block with respect to one or more prediction blocks in the one or more reference images to remove temporal redundancy.
  • the inter predictor 110 may be configured to determine an inter prediction mode for encoding the current image block. For example, the inter predictor 110 may use rate-distortion analysis to calculate the rate-distortion values of the various inter prediction modes in the set of candidate inter prediction modes, and select the inter prediction mode with the best rate-distortion characteristics. Rate-distortion analysis generally determines the amount of distortion (or error) between the encoded block and the original, unencoded block from which the encoded block was produced, as well as the bit rate (that is, the number of bits) used to produce the encoded block. For example, the inter predictor 110 may determine that the inter prediction mode with the lowest rate-distortion cost for encoding the current image block, among the candidate inter prediction mode set, is the inter prediction mode used for inter prediction of the current image block.
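The rate-distortion selection described above amounts to minimizing a cost J = D + λ·R over the candidate modes. The mode names and the (distortion, bits) numbers below are made up for illustration:

```python
def best_mode(candidates, lam):
    """Return the candidate mode whose cost D + lam * R is smallest."""
    return min(candidates, key=lambda m: m["distortion"] + lam * m["bits"])

modes = [
    {"name": "merge", "distortion": 120.0, "bits": 6},   # cheap to signal
    {"name": "amvp",  "distortion": 100.0, "bits": 30},  # more accurate, costlier
]
print(best_mode(modes, lam=2.0)["name"])  # merge: 120 + 2*6 = 132 < 100 + 2*30 = 160
```

Note how the Lagrange multiplier λ trades signaling cost against distortion: with a small λ the more accurate mode wins instead.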
  • the inter predictor 110 is configured to predict motion information (such as a motion vector) of one or more sub-blocks in the current image block based on the determined inter prediction mode, and to use the motion information (such as the motion vector) of the one or more sub-blocks to obtain or generate a prediction block of the current image block.
  • the inter predictor 110 may locate a prediction block pointed to by the motion vector in one of the reference image lists.
  • the inter predictor 110 may also generate syntax elements associated with image blocks and video slices for use by the video decoder 200 when decoding image blocks of the video slice.
  • the inter predictor 110 uses the motion information of each sub-block to perform a motion compensation process to generate a prediction block of each sub-block, thereby obtaining a prediction block of the current image block. It should be understood that the inter predictor 110 here performs motion estimation and motion compensation processes.
  • the inter predictor 110 may provide information indicating the selected inter prediction mode of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
  • the intra predictor 109 may perform intra prediction on the current image block.
  • the intra predictor 109 may determine an intra prediction mode used to encode the current block.
  • the intra predictor 109 may use rate-distortion analysis to calculate the rate-distortion values of the various intra prediction modes to be tested, and select the intra prediction mode with the best rate-distortion characteristics from the modes to be tested.
  • in any case, after the intra prediction mode is selected for the image block, the intra predictor 109 may provide information indicating the selected intra prediction mode of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
  • the video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded.
  • the summer 112 represents one or more components that perform this subtraction operation.
  • the residual video data in the residual block may be included in one or more transform units (TUs) and applied to the transformer 101.
  • the transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a discrete cosine transform (DCT) or a conceptually similar transform.
  • the transformer 101 may transform the residual video data from a pixel value domain to a transform domain, such as a frequency domain.
  • the transformer 101 may send the obtained transform coefficients to a quantizer 102.
  • the quantizer 102 quantizes the transform coefficients to further reduce the bit rate.
  • the quantizer 102 may then perform a scan of a matrix containing the quantized transform coefficients.
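The coefficient scan mentioned above reorders a 2-D block of quantized coefficients into a 1-D sequence. The anti-diagonal order below is only an illustration, not any codec's normative scan pattern:

```python
def diagonal_scan(block):
    """Flatten an NxN block along anti-diagonals (low-frequency coefficients first)."""
    n = len(block)
    order = []
    for s in range(2 * n - 1):          # each anti-diagonal has constant i + j = s
        for i in range(n):
            j = s - i
            if 0 <= j < n:
                order.append(block[i][j])
    return order

print(diagonal_scan([[9, 2], [3, 0]]))  # [9, 2, 3, 0]
```

Grouping low-frequency (typically nonzero) coefficients at the front tends to produce long zero runs at the tail, which entropy coding exploits.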
  • alternatively, the entropy encoder 103 may perform the scan.
  • after quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 can perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique.
  • the encoded code stream may be transmitted to the video decoder 200, or archived for later transmission or retrieved by the video decoder 200.
  • the entropy encoder 103 may also perform entropy encoding.
  • the inverse quantizer 104 and the inverse transformer 105 respectively apply inverse quantization and inverse transform to reconstruct the residual block in the pixel domain, for example, for later use as a reference block of a reference image.
  • the summer 111 adds the reconstructed residual block to a prediction block generated by the inter predictor 110 or the intra predictor 109 to generate a reconstructed image block.
  • the filter unit 106 may be applied to reconstructed image blocks to reduce distortion, such as block artifacts. The reconstructed image block is then stored as a reference block in the decoded picture buffer 107 and can be used by the inter predictor 110 as a reference block to perform inter prediction on blocks in subsequent video frames or images.
  • in addition, the video encoder 100 may directly quantize the residual signal without processing by the transformer 101, and correspondingly without processing by the inverse transformer 105; or, for some image blocks or image frames, the video encoder 100 does not generate residual data, and correspondingly does not need processing by the transformer 101, quantizer 102, inverse quantizer 104, and inverse transformer 105; or, the video encoder 100 may store the reconstructed image blocks directly as reference blocks without processing by the filter unit 106; alternatively, the quantizer 102 and the inverse quantizer 104 in the video encoder 100 may be merged together.
  • FIG. 3 is a block diagram of an example video decoder 200 described in the embodiment of the present application.
  • the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a DPB 207.
  • the prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209.
  • video decoder 200 may perform a decoding process that is substantially inverse to the encoding process described with respect to video encoder 100 from FIG. 2.
  • the video decoder 200 receives from the video encoder 100 an encoded video codestream representing image blocks of the encoded video slice and associated syntax elements.
  • the video decoder 200 may receive video data from the network entity 42, optionally, the video data may also be stored in a video data storage (not shown in the figure).
  • the video data memory may store video data, such as an encoded video code stream, to be decoded by components of the video decoder 200.
  • the video data stored in the video data memory can be obtained, for example, from the storage device 40, from a local video source such as a camera, via wired or wireless network communication of the video data, or by accessing a physical data storage medium.
  • the video data memory can serve as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream. Therefore, although the video data memory is not shown in FIG. 3, the video data memory and the DPB 207 may be the same memory, or may be separately provided memories. The video data memory and the DPB 207 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. In various examples, the video data memory may be integrated on-chip with other components of the video decoder 200, or provided off-chip relative to those components.
  • the network entity 42 may be, for example, a server, a MANE, a video editor / splicer, or other such device for implementing one or more of the techniques described above.
  • the network entity 42 may or may not include a video encoder, such as video encoder 100.
  • the network entity 42 may implement some of the techniques described in this application.
  • the network entity 42 and the video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to the network entity 42 may be performed by the same device including the video decoder 200.
  • the network entity 42 may be an example of the storage device 40 of FIG. 1.
  • the entropy decoder 203 of the video decoder 200 entropy decodes the code stream to produce quantized coefficients and some syntax elements.
  • the entropy decoder 203 forwards the syntax elements to the prediction processing unit 208.
  • Video decoder 200 may receive syntax elements at a video slice level and / or an image block level.
  • the intra predictor 209 of the prediction processing unit 208 may generate prediction blocks for image blocks of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or image.
  • the inter predictor 210 of the prediction processing unit 208 may determine, based on the syntax elements received from the entropy decoder 203, an inter prediction mode for decoding the current image block of the current video slice, and decode the current image block based on the determined inter prediction mode (for example, perform inter prediction).
  • specifically, the inter predictor 210 may determine whether to use a new inter prediction mode to predict the current image block of the current video slice. If the syntax elements indicate that a new inter prediction mode is used to predict the current image block, the inter predictor 210 predicts, based on the new inter prediction mode (for example, a new inter prediction mode specified by a syntax element, or a default new inter prediction mode), the motion information of the current image block of the current video slice or of a sub-block of the current image block, so that the motion information of the current image block or its sub-block is used to obtain or generate a prediction block of the current image block or its sub-block through a motion compensation process.
  • the motion information here may include reference image information and motion vectors, where the reference image information may include but is not limited to unidirectional / bidirectional prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
  • a prediction block may be generated from one of reference pictures within one of the reference picture lists.
  • the video decoder 200 may construct a reference image list, that is, a list 0 and a list 1, based on the reference images stored in the DPB 207.
  • the reference frame index of the current image may be included in one or more of the reference frame list 0 and list 1.
  • the video encoder 100 may signal whether a new inter prediction mode is used to decode a specific syntax element of a specific block, or may signal both whether a new inter prediction mode is used and which new inter prediction mode is used to decode a specific syntax element of a specific block. It should be understood that the inter predictor 210 here performs a motion compensation process.
  • the inverse quantizer 204 inverse quantizes, that is, dequantizes, the quantized transform coefficients provided in the code stream and decoded by the entropy decoder 203.
  • the inverse quantization process may include using a quantization parameter calculated by the video encoder 100 for each image block in the video slice to determine the degree of quantization that should be applied and similarly to determine the degree of inverse quantization that should be applied.
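As a rough sketch of that inverse quantization step: the quantization parameter (QP) determines a step size, and each quantized coefficient level is scaled back by it. The QP-to-step mapping below (doubling every 6 QP units, in the style of H.264/HEVC) is an illustrative approximation, not this document's normative rule:

```python
def qstep(qp):
    """Approximate quantization step size for a given QP (doubles every 6 QP units)."""
    return 0.625 * (2 ** (qp / 6))

def dequantize(levels, qp):
    """Scale quantized transform coefficient levels back toward original magnitudes."""
    step = qstep(qp)
    return [lvl * step for lvl in levels]

print(dequantize([4, -2, 0, 1], qp=12))  # each level scaled by 2.5
```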
  • the inverse transformer 205 applies an inverse transform to transform coefficients, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process to generate a residual block in the pixel domain.
  • the video decoder 200 sums the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210 to obtain the reconstructed block, that is, the decoded image block.
  • the summer 211 represents a component that performs this summing operation.
  • if needed, a loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality.
  • the filter unit 206 may represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
  • the filter unit 206 is shown as an in-loop filter in FIG. 3, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
  • the filter unit 206 is applied to the reconstructed block to reduce block distortion, and the result is output as the decoded video stream.
  • the decoded image block in a given frame or image can also be stored in the decoded image buffer 207, and the reference image used for subsequent motion compensation can be stored via the DPB 207.
  • the DPB 207 may be part of the memory, which may also store the decoded video for later presentation on a display device (such as the display device 220 of FIG. 1), or may be separate from such memory.
  • the video decoder 200 may generate an output video stream without processing by the filter unit 206; or, for certain image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients, and accordingly does not need processing by the inverse quantizer 204 and the inverse transformer 205.
  • the techniques of this application exemplarily involve inter-frame decoding. It should be understood that the techniques of this application may be performed by any of the video decoders described in this application.
  • the video decoder includes, for example, the video encoder 100 and the video decoder 200 as shown and described with respect to FIGS. 1-3. That is, in one feasible implementation, the inter predictor 110 described with respect to FIG. 2 may perform the specific techniques described below when performing inter prediction during encoding of a block of video data. In another feasible implementation, the inter predictor 210 described with respect to FIG. 3 may perform the specific techniques described below when performing inter prediction during decoding of a block of video data.
  • a reference to a generic "video encoder" or "video decoder” may include video encoder 100, video decoder 200, or another video encoding or coding unit.
  • the processing result of a certain step may be further processed and then output to the next step; for example, after steps such as interpolation filtering, motion vector derivation, or loop filtering, the result of the corresponding step is further clipped or shifted.
  • the motion vector of the control point of the current image block derived according to the motion vector of the adjacent affine coding block may be further processed, which is not limited in this application.
  • the value range of the motion vector is restricted so that it fits within a certain bit width. Assuming that the allowed bit width of the motion vector is bitDepth, the range of the motion vector is -2^(bitDepth-1) to 2^(bitDepth-1) - 1, where the "^" symbol represents exponentiation. If bitDepth is 16, the value range is -32768 to 32767; if bitDepth is 18, the value range is -131072 to 131071. The constraint can be implemented in either of the following two ways:
  • Method 1: remove the overflowed high-order bits by wrap-around:
  ux = (vx + 2^bitDepth) % 2^bitDepth
  vx = (ux >= 2^(bitDepth-1)) ? (ux - 2^bitDepth) : ux
  (and likewise for uy and vy)
  • for example, if the value of vx is -32769, the value obtained by the above formulas is 32767. In a computer, values are stored in two's-complement form; the two's complement of -32769 is 1,0111,1111,1111,1111 (17 bits). The computer treats the overflow by discarding the high-order bit, so the value of vx becomes 0111,1111,1111,1111, that is, 32767, which is consistent with the result obtained by the formulas.
  • Method 2: vx = Clip3(-2^(bitDepth-1), 2^(bitDepth-1) - 1, vx)
  • vy = Clip3(-2^(bitDepth-1), 2^(bitDepth-1) - 1, vy)
  • Clip3 clamps the value of z to the interval [x, y]: Clip3(x, y, z) = x if z < x; y if z > y; and z otherwise.
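The two constraint methods can be sketched directly in code (a minimal Python illustration; with bitDepth = 16 the wrap-around method reproduces the -32769 → 32767 behavior described above, while Clip3 instead saturates at the interval bounds):

```python
def wrap_mv(v, bit_depth):
    """Method 1: modular wrap-around (discard overflowed high-order bits)."""
    u = (v + (1 << bit_depth)) % (1 << bit_depth)
    return u - (1 << bit_depth) if u >= (1 << (bit_depth - 1)) else u

def clip3(x, y, z):
    """Clamp z to the interval [x, y]."""
    return x if z < x else y if z > y else z

def clamp_mv(v, bit_depth):
    """Method 2: saturating clamp to [-2^(bitDepth-1), 2^(bitDepth-1) - 1]."""
    return clip3(-(1 << (bit_depth - 1)), (1 << (bit_depth - 1)) - 1, v)

print(wrap_mv(-32769, 16))   # 32767
print(clamp_mv(-32769, 16))  # -32768
```

The two methods agree on in-range values and differ only on overflow: wrap-around reinterprets the low-order bits, while clipping saturates.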
  • FIG. 4 is a schematic block diagram of an inter prediction module 121 according to an embodiment of the present application.
  • the inter prediction module 121 may include a motion estimation unit and a motion compensation unit. The relationship between PU and CU is different in different video compression codecs.
  • the inter prediction module 121 may partition a current CU into a PU according to a plurality of partitioning modes. For example, the inter prediction module 121 may partition a current CU into a PU according to 2N ⁇ 2N, 2N ⁇ N, N ⁇ 2N, and N ⁇ N partition modes. In other embodiments, the current CU is the current PU, which is not limited.
  • the inter prediction module 121 may perform integer motion estimation (IME) and then perform fractional motion estimation (FME) on each of the PUs.
  • the inter prediction module 121 may search a reference block for a PU in one or more reference images. After the reference block for the PU is found, the inter prediction module 121 may generate a motion vector indicating the spatial displacement between the PU and the reference block for the PU with integer precision.
  • the inter prediction module 121 may improve a motion vector generated by performing IME on the PU.
  • a motion vector generated by performing FME on a PU may have sub-integer precision (eg, 1/2 pixel precision, 1/4 pixel precision, etc.).
  • the inter prediction module 121 may use the motion vector for the PU to generate a predictive image block for the PU.
  • the inter prediction module 121 may generate a list of candidate prediction motion vectors for the PU.
  • the candidate prediction motion vector list may include one or more original candidate prediction motion vectors and one or more additional candidate prediction motion vectors derived from the original candidate prediction motion vectors.
  • the inter prediction module 121 may select the candidate prediction motion vector from the candidate prediction motion vector list and generate a motion vector difference (MVD) for the PU.
  • the MVD for a PU may indicate a difference between a motion vector indicated by a selected candidate prediction motion vector and a motion vector generated for the PU using IME and FME.
  • the inter prediction module 121 may output a candidate prediction motion vector index that identifies the position of the selected candidate prediction motion vector in the candidate prediction motion vector list.
  • the inter prediction module 121 may also output the MVD of the PU.
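The AMVP-style signaling just described (select a predictor from the list, transmit its index plus the MVD) can be sketched as follows. The cost metric used to pick the predictor (absolute component sum of the MVD) is an assumption for illustration:

```python
def encode_amvp(mv, predictors):
    """Return (index, mvd) for the predictor giving the smallest MVD."""
    costs = [abs(mv[0] - p[0]) + abs(mv[1] - p[1]) for p in predictors]
    idx = costs.index(min(costs))
    mvd = (mv[0] - predictors[idx][0], mv[1] - predictors[idx][1])
    return idx, mvd

def decode_amvp(idx, mvd, predictors):
    """Reconstruct the motion vector from the predictor index and the MVD."""
    p = predictors[idx]
    return (p[0] + mvd[0], p[1] + mvd[1])

preds = [(10, 4), (6, 6)]
idx, mvd = encode_amvp((9, 5), preds)  # idx 0, mvd (-1, 1)
assert decode_amvp(idx, mvd, preds) == (9, 5)
```

Both sides must derive the same predictor list for the index to be meaningful, which is why the encoder and decoder build their candidate lists identically.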
  • a detailed implementation of the advanced motion vector prediction (AMVP) mode in this embodiment of the present application is described in detail below with reference to FIG. 6.
  • the inter prediction module 121 may also perform a merge operation on each of the PUs.
  • the inter prediction module 121 may generate a list of candidate prediction motion vectors for the PU.
  • the candidate prediction motion vector list for the PU may include one or more original candidate prediction motion vectors and one or more additional candidate prediction motion vectors derived from the original candidate prediction motion vectors.
  • the original candidate prediction motion vector in the candidate prediction motion vector list may include one or more spatial candidate prediction motion vectors and temporal candidate prediction motion vectors.
  • the spatial candidate prediction motion vector may indicate motion information of other PUs in the current image.
  • the temporal candidate prediction motion vector may be based on motion information of a corresponding PU different from the current picture.
  • the temporal candidate prediction motion vector may also be referred to as temporal motion vector prediction (TMVP).
  • the inter prediction module 121 may select one of the candidate prediction motion vectors from the candidate prediction motion vector list. The inter prediction module 121 may then generate a predictive image block for the PU based on the reference block indicated by the motion information of the PU. In the merge mode, the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector.
  • FIG. 5, described below, illustrates an exemplary flowchart of the merge mode.
  • the original candidate prediction motion vectors can be directly included in the candidate prediction motion vector list, and a type of additional candidate prediction motion vector is indicated through an identifier bit, so as to control the length of the candidate prediction motion vector list. In particular, different types of additional candidate prediction motion vectors are indicated by different identifier bits.
  • a prediction motion vector is selected from the set of extra candidate prediction motion vectors indicated by the identification bit.
  • the candidate prediction motion vector indicated by the identification bit may be a preset motion information offset.
  • the inter prediction module 121 may select either the predictive image block generated through the FME operation or the predictive image block generated through the merge operation. In some feasible implementations, the inter prediction module 121 may select a predictive image block for the PU based on a rate-distortion cost analysis of the predictive image block generated by the FME operation and the predictive image block generated by the merge operation.
  • the inter prediction module 121 may select a partitioning mode for the current CU. In some embodiments, the inter prediction module 121 may select the partitioning mode based on a rate-distortion cost analysis of the selected predictive image blocks of the PUs generated by partitioning the current CU according to each of the partitioning modes.
  • the inter prediction module 121 may output a predictive image block associated with a PU belonging to the selected partition mode to the residual generation module 102.
  • the inter prediction module 121 may output a syntax element indicating motion information of a PU belonging to the selected partition mode to the entropy encoding module.
• the inter prediction module 121 includes IME modules 180A to 180N (collectively referred to as "IME module 180"), FME modules 182A to 182N (collectively referred to as "FME module 182"), merge modules 184A to 184N (collectively referred to as "merge module 184"), PU mode decision modules 186A to 186N (collectively referred to as "PU mode decision module 186"), and a CU mode decision module 188 (which may also perform a mode decision process from CTU to CU).
  • the IME module 180, the FME module 182, and the merge module 184 may perform an IME operation, an FME operation, and a merge operation on a PU of the current CU.
  • the inter prediction module 121 is illustrated in the schematic diagram of FIG. 4 as including a separate IME module 180, an FME module 182, and a merging module 184 for each PU of each partitioning mode of the CU. In other feasible implementations, the inter prediction module 121 does not include a separate IME module 180, an FME module 182, and a merge module 184 for each PU of each partitioning mode of the CU.
  • the IME module 180A, the FME module 182A, and the merge module 184A may perform IME operations, FME operations, and merge operations on a PU generated by dividing a CU according to a 2N ⁇ 2N split mode.
  • the PU mode decision module 186A may select one of the predictive image blocks generated by the IME module 180A, the FME module 182A, and the merge module 184A.
  • the IME module 180B, the FME module 182B, and the merge module 184B may perform an IME operation, an FME operation, and a merge operation on a left PU generated by dividing a CU according to an N ⁇ 2N division mode.
  • the PU mode decision module 186B may select one of the predictive image blocks generated by the IME module 180B, the FME module 182B, and the merge module 184B.
  • the IME module 180C, the FME module 182C, and the merge module 184C may perform an IME operation, an FME operation, and a merge operation on a right PU generated by dividing a CU according to an N ⁇ 2N division mode.
  • the PU mode decision module 186C may select one of the predictive image blocks generated by the IME module 180C, the FME module 182C, and the merge module 184C.
• the IME module 180N, the FME module 182N, and the merge module 184N may perform an IME operation, an FME operation, and a merge operation on a lower right PU generated by dividing a CU according to an N×N division mode.
  • the PU mode decision module 186N may select one of the predictive image blocks generated by the IME module 180N, the FME module 182N, and the merge module 184N.
• the PU mode decision module 186 may select a predictive image block based on a rate-distortion cost analysis of a plurality of possible predictive image blocks, selecting the predictive image block that provides the best rate-distortion cost for a given decoding situation. For example, for bandwidth-constrained applications, the PU mode decision module 186 may prefer predictive image blocks that increase the compression ratio, while for other applications it may prefer predictive image blocks that increase the quality of the reconstructed video.
• the CU mode decision module 188 selects a partitioning mode for the current CU and outputs the predictive image blocks and motion information of the PUs belonging to the selected partitioning mode.
  • FIG. 5 is an implementation flowchart of a merge mode in an embodiment of the present application.
• a video encoder (e.g., video encoder 20) may perform the merge operation 200.
  • the merging operation 200 may include: 202. Generate a candidate list for a current prediction unit. 204. Generate a predictive video block associated with a candidate in the candidate list. 206. Select a candidate from the candidate list. 208. Output candidates.
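• The four steps of the merge operation 200 can be sketched as follows; the stub encoder, its candidate motion vectors, and its cost function are invented stand-ins for the real rate-distortion machinery.

```python
class StubEncoder:
    # Toy stand-in for the encoder internals (all values invented).
    def generate_candidate_list(self, pu):
        return [(0, 0), (2, 1), (5, 5)]            # candidate motion vectors
    def predict(self, pu, mv):
        return mv                                   # the "block" is just its MV here
    def rd_cost(self, block):
        # Pretend the block predicted by (2, 1) matches the source best.
        return abs(block[0] - 2) + abs(block[1] - 1)

def merge_operation_200(current_pu, encoder):
    candidates = encoder.generate_candidate_list(current_pu)        # step 202
    blocks = [encoder.predict(current_pu, c) for c in candidates]   # step 204
    merge_idx = min(range(len(candidates)),
                    key=lambda i: encoder.rd_cost(blocks[i]))       # step 206
    return merge_idx                                                # step 208

assert merge_operation_200(None, StubEncoder()) == 1
```

The returned index plays the role of the "merge_idx" output in step 208.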
  • the candidate refers to a candidate motion vector or candidate motion information.
  • the video encoder may perform a merge operation different from the merge operation 200.
  • the video encoder may perform a merge operation, where the video encoder performs more or fewer steps than the merge operation 200 or steps different from the merge operation 200.
  • the video encoder may perform the steps of the merge operation 200 in a different order or in parallel.
  • the encoder may also perform a merge operation 200 on a PU encoded in a skip mode.
  • the video encoder may generate a list of candidate predicted motion vectors for the current PU (202).
  • the video encoder may generate a list of candidate prediction motion vectors for the current PU in various ways.
  • the video encoder may generate a list of candidate prediction motion vectors for the current PU according to one of the example techniques described below with respect to FIGS. 8-12.
  • the candidate prediction motion vector list for the current PU includes at least one first candidate motion vector and at least one second candidate motion vector set identifier.
  • the candidate prediction motion vector list for the current PU may include a temporal candidate prediction motion vector.
  • the temporal candidate prediction motion vector may indicate motion information of a co-located PU in the time domain.
  • a co-located PU may be spatially in the same position in the image frame as the current PU, but in a reference picture instead of the current picture.
  • a reference picture including a PU corresponding to the time domain may be referred to as a related reference picture.
  • a reference image index of a related reference image may be referred to as a related reference image index in this application.
  • the current image may be associated with one or more reference image lists (eg, list 0, list 1, etc.).
  • the reference image index may indicate a reference image by indicating a position in a reference image list of the reference image.
  • the current image may be associated with a combined reference image list.
  • the related reference picture index is the reference picture index of the PU covering the reference index source position associated with the current PU.
  • the reference index source location associated with the current PU is adjacent to the left of the current PU or above the current PU.
• when the image block associated with a PU includes a specific location, the PU may "cover" that specific location.
  • the video encoder can use a zero reference picture index.
  • the reference index source location associated with the current PU is within the current CU.
• the video encoder may need to access motion information of another PU of the current CU in order to determine the reference picture containing the co-located PU. Therefore, these video encoders may use the motion information (i.e., the reference picture index) of a PU belonging to the current CU to generate a temporal candidate prediction motion vector for the current PU. In other words, these video encoders may use motion information of a PU belonging to the current CU to generate the temporal candidate prediction motion vector. Therefore, the video encoder may not be able to generate candidate prediction motion vector lists in parallel for the current PU and the PU covering the reference index source position associated with the current PU.
  • the video encoder may explicitly set the relevant reference picture index without referring to the reference picture index of any other PU. This may enable the video encoder to generate candidate prediction motion vector lists for the current PU and other PUs of the current CU in parallel. Because the video encoder explicitly sets the relevant reference picture index, the relevant reference picture index is not based on the motion information of any other PU of the current CU. In some feasible implementations where the video encoder explicitly sets the relevant reference picture index, the video encoder may always set the relevant reference picture index to a fixed, predefined preset reference picture index (eg, 0).
  • the video encoder may generate a temporal candidate prediction motion vector based on the motion information of the co-located PU in the reference frame indicated by the preset reference picture index, and may include the temporal candidate prediction motion vector in the candidate prediction of the current CU List of motion vectors.
• the video encoder may explicitly signal the related reference picture index in a syntax structure (e.g., an image header, a slice header, an APS, or another syntax structure).
• the video encoder may signal the relevant reference picture index to the decoder for each LCU (i.e., CTU), CU, PU, TU, or other type of sub-block. For example, the video encoder may signal that the relevant reference picture index for each PU of the CU is equal to "1".
  • the relevant reference image index may be set implicitly rather than explicitly.
• the video encoder may use the motion information of PUs in the reference images indicated by the reference image indexes of PUs covering locations outside the current CU to generate each temporal candidate prediction motion vector in the candidate prediction motion vector lists of the PUs of the current CU, even if these locations are not strictly adjacent to the current PU.
  • the video encoder may generate predictive image blocks associated with the candidate prediction motion vectors in the candidate prediction motion vector list (204).
• the video encoder may generate the predictive image block associated with a candidate prediction motion vector by determining the motion information of the current PU based on the motion information indicated by the candidate prediction motion vector and then generating the predictive image block based on one or more reference blocks indicated by the motion information of the current PU.
  • the video encoder may then select one of the candidate prediction motion vectors from the candidate prediction motion vector list (206).
  • the video encoder can select candidate prediction motion vectors in various ways. For example, a video encoder may select one of the candidate prediction motion vectors based on a code rate-distortion cost analysis of each of the predictive image blocks associated with the candidate prediction motion vector.
  • the video encoder may output a candidate prediction motion vector index (208).
  • the candidate prediction motion vector index may indicate a position where a candidate prediction motion vector is selected in the candidate prediction motion vector list.
  • the candidate prediction motion vector index may be represented as "merge_idx".
  • FIG. 6 is an implementation flowchart of an advanced motion vector prediction (AMVP) mode in an embodiment of the present application.
• a video encoder (e.g., video encoder 20) may perform the AMVP operation 210.
• the AMVP operation 210 may include: 211. Generate one or more motion vectors for the current prediction unit. 212. Generate a predictive video block for the current prediction unit. 213. Generate a candidate list for the current prediction unit. 214. Generate motion vector differences. 215. Select a candidate from the candidate list. 216. Output a reference picture index, a candidate index, and a motion vector difference for the selected candidate.
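• Steps 213 to 216 for a uni-directionally predicted PU can be sketched as follows. The absolute-difference cost is only a stand-in for the number of bits needed to code the MVD, and all numeric values are illustrative.

```python
def amvp_operation_210(true_mv, candidate_mvps, ref_idx):
    # Select the predictor whose MVD is cheapest to code, then output
    # (reference index, candidate index, MVD) as in step 216.
    def mvd(pred):
        return (true_mv[0] - pred[0], true_mv[1] - pred[1])
    def cost(pred):
        d = mvd(pred)
        return abs(d[0]) + abs(d[1])     # proxy for the bits coding the MVD
    best = min(range(len(candidate_mvps)),
               key=lambda i: cost(candidate_mvps[i]))          # step 215
    return ref_idx, best, mvd(candidate_mvps[best])

# true_mv plays the role of the motion vector found by motion estimation
# in step 211; the two predictors stand for the candidate list of step 213.
out = amvp_operation_210(true_mv=(9, -3),
                         candidate_mvps=[(0, 0), (8, -2)],
                         ref_idx=0)
assert out == (0, 1, (1, -1))
```

Choosing the predictor closest to the true motion vector minimizes the MVD magnitude, which is the selection criterion described for step 215.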
  • the candidate refers to a candidate motion vector or candidate motion information.
  • the video encoder may generate one or more motion vectors for the current PU (211).
  • the video encoder may perform integer motion estimation and fractional motion estimation to generate motion vectors for the current PU.
  • the current image may be associated with two reference image lists (List 0 and List 1).
  • the video encoder may generate a list 0 motion vector or a list 1 motion vector for the current PU.
  • the list 0 motion vector may indicate a spatial displacement between an image block of the current PU and a reference block in a reference image in list 0.
  • the list 1 motion vector may indicate a spatial displacement between an image block of the current PU and a reference block in a reference image in list 1.
  • the video encoder may generate a list 0 motion vector and a list 1 motion vector for the current PU.
  • the video encoder may generate predictive image blocks for the current PU (212).
  • the video encoder may generate predictive image blocks for the current PU based on one or more reference blocks indicated by one or more motion vectors for the current PU.
  • the video encoder may generate a list of candidate predicted motion vectors for the current PU (213).
• the video encoder may generate the list of candidate prediction motion vectors for the current PU in various ways.
  • the video encoder may generate a list of candidate prediction motion vectors for the current PU according to one or more of the possible implementations described below with respect to FIGS. 8 to 12.
  • the list of candidate prediction motion vectors may be limited to two candidate prediction motion vectors.
  • the list of candidate prediction motion vectors may include more candidate prediction motion vectors (eg, five candidate prediction motion vectors).
  • the video encoder may generate one or more motion vector differences (MVD) for each candidate prediction motion vector in the list of candidate prediction motion vectors (214).
  • the video encoder may generate a motion vector difference for the candidate prediction motion vector by determining a difference between the motion vector indicated by the candidate prediction motion vector and a corresponding motion vector of the current PU.
• if the current PU is uni-directionally predicted, the video encoder may generate a single MVD for each candidate prediction motion vector. If the current PU is bi-directionally predicted, the video encoder may generate two MVDs for each candidate prediction motion vector.
  • the first MVD may indicate a difference between the motion vector of the candidate prediction motion vector and the list 0 motion vector of the current PU.
  • the second MVD may indicate a difference between the motion vector of the candidate prediction motion vector and the list 1 motion vector of the current PU.
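• A minimal sketch of the MVD generation described above, assuming the same candidate motion vector is compared against the list 0 and (for a bi-predicted PU) list 1 motion vectors of the current PU:

```python
def motion_vector_differences(candidate_mv, pu_list0_mv, pu_list1_mv=None):
    # One MVD for a uni-predicted PU, two MVDs for a bi-predicted PU.
    def diff(mv, pred):
        return (mv[0] - pred[0], mv[1] - pred[1])
    mvds = [diff(pu_list0_mv, candidate_mv)]          # first MVD (list 0)
    if pu_list1_mv is not None:
        mvds.append(diff(pu_list1_mv, candidate_mv))  # second MVD (list 1)
    return mvds

assert motion_vector_differences((1, 1), (3, 4)) == [(2, 3)]
assert motion_vector_differences((1, 1), (3, 4), (0, -2)) == [(2, 3), (-1, -3)]
```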
  • the video encoder may select one or more of the candidate prediction motion vectors from the candidate prediction motion vector list (215).
  • the video encoder may select one or more candidate prediction motion vectors in various ways. For example, a video encoder may select a candidate prediction motion vector with an associated motion vector that matches the motion vector to be encoded with minimal error, which may reduce the number of bits required to represent the motion vector difference for the candidate prediction motion vector.
• the video encoder may output one or more reference image indexes for the current PU, one or more candidate prediction motion vector indexes, and one or more motion vector differences for the one or more selected candidate prediction motion vectors (216).
• the video encoder may output a reference picture index for list 0 ("ref_idx_l0") or a reference picture index for list 1 ("ref_idx_l1").
• the video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list.
• the video encoder may output a candidate prediction motion vector index ("mvp_l1_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list.
• the video encoder may also output the MVD of the list 0 motion vector or the list 1 motion vector for the current PU.
• the video encoder may output the reference picture index for list 0 ("ref_idx_l0") and the reference picture index for list 1 ("ref_idx_l1").
• the video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list.
• the video encoder may output a candidate prediction motion vector index ("mvp_l1_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list.
• the video encoder may also output the MVD of the list 0 motion vector for the current PU and the MVD of the list 1 motion vector for the current PU.
  • FIG. 7 is an implementation flowchart of motion compensation performed by a video decoder (such as video decoder 30) in an embodiment of the present application.
  • the video decoder may receive an indication of the selected candidate prediction motion vector for the current PU (222). For example, the video decoder may receive a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within the candidate prediction motion vector list of the current PU.
  • the video decoder may receive the first candidate prediction motion vector index and the second candidate prediction motion vector index.
  • the first candidate prediction motion vector index indicates the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list.
  • the second candidate prediction motion vector index indicates the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list.
  • a single syntax element may be used to identify two candidate prediction motion vector indexes.
• the video decoder may accept a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within the candidate prediction motion vector list of the current PU, or accept an identifier indicating the position, within the candidate prediction motion vector list of the current PU, of the classification to which the selected candidate prediction motion vector belongs, together with an index of the position of the selected candidate prediction motion vector within its classification.
  • the video decoder may generate a list of candidate predicted motion vectors for the current PU (224).
  • the video decoder may generate this candidate prediction motion vector list for the current PU in various ways.
  • the video decoder may use the techniques described below with reference to FIGS. 8 to 12 to generate a list of candidate prediction motion vectors for the current PU.
• the video decoder may explicitly or implicitly set a reference image index identifying the reference image that includes the co-located PU, as described above with respect to FIG. 5.
  • a type of candidate prediction motion vector may be indicated by an identification bit in the candidate prediction motion vector list to control the length of the candidate prediction motion vector list.
• the video decoder may determine the motion information of the current PU based on the motion information indicated by one or more selected candidate prediction motion vectors in the candidate prediction motion vector list of the current PU (225). For example, if the motion information of the current PU is encoded using the merge mode, the motion information of the current PU may be the same as the motion information indicated by the selected candidate prediction motion vector. If the motion information of the current PU is encoded using the AMVP mode, the video decoder may use the one or more motion vectors indicated by the one or more selected candidate prediction motion vectors and the one or more MVDs indicated in the code stream to reconstruct the one or more motion vectors of the current PU.
  • the reference image index and prediction direction identifier of the current PU may be the same as the reference image index and prediction direction identifier of the one or more selected candidate prediction motion vectors.
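• The two reconstruction rules described above (merge copies the candidate's motion vector; AMVP adds the signaled MVD to the selected predictor) can be sketched as:

```python
def decode_motion_vector(mode, selected_candidate_mv, mvd=None):
    # Merge mode: the PU inherits the candidate's motion vector as-is.
    if mode == "merge":
        return selected_candidate_mv
    # AMVP mode: the PU's motion vector is predictor + signaled MVD.
    if mode == "amvp":
        return (selected_candidate_mv[0] + mvd[0],
                selected_candidate_mv[1] + mvd[1])
    raise ValueError("unknown mode: " + mode)

assert decode_motion_vector("merge", (4, -2)) == (4, -2)
assert decode_motion_vector("amvp", (4, -2), mvd=(1, 3)) == (5, 1)
```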
  • the video decoder may generate a predictive image block for the current PU based on one or more reference blocks indicated by the motion information of the current PU (226).
• FIG. 8 is an exemplary schematic diagram of a coding unit (CU) and adjacent image blocks associated with it in an embodiment of the present application, illustrating CU 250 and exemplary candidate prediction motion vector positions 252A to 252E associated with CU 250.
  • This application may collectively refer to the candidate prediction motion vector positions 252A to 252E as the candidate prediction motion vector positions 252.
  • the candidate prediction motion vector position 252 indicates a spatial candidate prediction motion vector in the same image as the CU 250.
  • the candidate prediction motion vector position 252A is positioned to the left of CU250.
  • the candidate prediction motion vector position 252B is positioned above the CU250.
  • the candidate prediction motion vector position 252C is positioned at the upper right of CU250.
  • the candidate prediction motion vector position 252D is positioned at the lower left of CU250.
  • the candidate prediction motion vector position 252E is positioned at the upper left of the CU250.
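• For illustration, one representative sample coordinate per neighbour position might be computed as below for a CU with top-left corner (x, y) and size w×h; the exact sample offsets are an assumption of this sketch, not taken from the figure.

```python
def spatial_candidate_positions(x, y, w, h):
    # One representative sample per neighbour position (252A-252E) of a CU
    # whose top-left corner is (x, y) and whose size is w x h.
    return {
        "252A_left":        (x - 1,     y + h - 1),
        "252B_above":       (x + w - 1, y - 1),
        "252C_above_right": (x + w,     y - 1),
        "252D_below_left":  (x - 1,     y + h),
        "252E_above_left":  (x - 1,     y - 1),
    }

pos = spatial_candidate_positions(16, 16, 8, 8)
assert pos["252A_left"] == (15, 23)
assert pos["252C_above_right"] == (24, 15)
assert pos["252E_above_left"] == (15, 15)
```

A PU "covering" one of these samples supplies the motion information for the corresponding spatial candidate.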
• FIG. 8 is a schematic illustration of a manner in which the inter prediction module 121 and the motion compensation module can generate a candidate prediction motion vector list. The implementations will be explained below with reference to the inter prediction module 121, but it should be understood that the motion compensation module can implement the same technique and thus generate the same candidate prediction motion vector list.
  • FIG. 9 is a flowchart of constructing a candidate prediction motion vector list according to an embodiment of the present application.
  • the technique of FIG. 9 will be described with reference to a list including five candidate prediction motion vectors, but the techniques described herein may also be used with lists of other sizes.
  • the five candidate prediction motion vectors may each have an index (eg, 0 to 4).
  • the technique of FIG. 9 will be described with reference to a general video decoder.
  • a general video decoder may be, for example, a video encoder (such as video encoder 20) or a video decoder (such as video decoder 30).
• the candidate prediction motion vector list constructed based on the technology of the present application is described in detail in the following embodiments, and is not repeated here.
  • the video decoder first considers four spatial candidate prediction motion vectors (902).
  • the four spatial candidate prediction motion vectors may include candidate prediction motion vector positions 252A, 252B, 252C, and 252D.
  • the four spatial candidate prediction motion vectors correspond to motion information of four PUs in the same image as the current CU (for example, CU250).
  • the video decoder may consider the four spatial candidate prediction motion vectors in the list in a particular order. For example, the candidate prediction motion vector position 252A may be considered first. If the candidate prediction motion vector position 252A is available, the candidate prediction motion vector position 252A may be assigned to index 0.
• if the candidate prediction motion vector position 252A is unavailable, the video decoder may not include the candidate prediction motion vector position 252A in the candidate prediction motion vector list.
  • Candidate prediction motion vector positions may be unavailable for various reasons. For example, if the candidate prediction motion vector position is not within the current image, the candidate prediction motion vector position may not be available. In another feasible implementation, if the candidate prediction motion vector position is intra-predicted, the candidate prediction motion vector position may not be available. In another feasible implementation, if the candidate prediction motion vector position is in a slice different from the current CU, the candidate prediction motion vector position may not be available.
  • the video decoder may next consider the candidate prediction motion vector position 252B. If the candidate prediction motion vector position 252B is available and different from the candidate prediction motion vector position 252A, the video decoder may add the candidate prediction motion vector position 252B to the candidate prediction motion vector list.
  • the terms "same” and “different” refer to motion information associated with candidate predicted motion vector locations. Therefore, two candidate prediction motion vector positions are considered the same if they have the same motion information, and are considered different if they have different motion information. If the candidate prediction motion vector position 252A is not available, the video decoder may assign the candidate prediction motion vector position 252B to index 0.
• the video decoder may assign the candidate prediction motion vector position 252B to index 1. If the candidate prediction motion vector position 252B is not available or is the same as the candidate prediction motion vector position 252A, the video decoder skips the candidate prediction motion vector position 252B and does not include it in the candidate prediction motion vector list.
• the candidate prediction motion vector position 252C is similarly considered by the video decoder for inclusion in the list. If the candidate prediction motion vector position 252C is available and not the same as the candidate prediction motion vector positions 252B and 252A, the video decoder assigns the candidate prediction motion vector position 252C to the next available index. If the candidate prediction motion vector position 252C is unavailable or is the same as one of the candidate prediction motion vector positions 252A and 252B, the video decoder does not include the candidate prediction motion vector position 252C in the candidate prediction motion vector list. Next, the video decoder considers the candidate prediction motion vector position 252D.
• if the candidate prediction motion vector position 252D is available and not the same as the candidate prediction motion vector positions 252A, 252B, and 252C, the video decoder assigns the candidate prediction motion vector position 252D to the next available index. If the candidate prediction motion vector position 252D is unavailable or is the same as one of the candidate prediction motion vector positions 252A, 252B, and 252C, the video decoder does not include the candidate prediction motion vector position 252D in the candidate prediction motion vector list.
• the above example describes considering the candidate prediction motion vectors 252A to 252D individually for inclusion in the candidate prediction motion vector list, but in some embodiments, all candidate prediction motion vectors 252A to 252D may first be added to the candidate prediction motion vector list, with duplicates removed from the list afterwards.
  • the candidate prediction motion vector list may include four spatial candidate prediction motion vectors or the list may include less than four spatial candidate prediction motion vectors. If the list includes four spatial candidate prediction motion vectors (904, Yes), the video decoder considers temporal candidate prediction motion vectors (906).
  • the temporal candidate prediction motion vector may correspond to motion information of a co-located PU of a picture different from the current picture. If a temporal candidate prediction motion vector is available and different from the first four spatial candidate prediction motion vectors, the video decoder assigns the temporal candidate prediction motion vector to index 4.
• if the temporal candidate prediction motion vector is unavailable or is the same as one of the first four spatial candidate prediction motion vectors, the video decoder does not include the temporal candidate prediction motion vector in the candidate prediction motion vector list. Therefore, after the video decoder considers the temporal candidate prediction motion vector (906), the candidate prediction motion vector list may include five candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902 and the temporal candidate prediction motion vector) or may include four candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902). If the candidate prediction motion vector list includes five candidate prediction motion vectors (908, Yes), the video decoder completes building the list.
• if the candidate prediction motion vector list includes fewer than five candidate prediction motion vectors (908, No), the video decoder may consider a fifth spatial candidate prediction motion vector (910).
• the fifth spatial candidate prediction motion vector may, for example, correspond to the candidate prediction motion vector position 252E. If the candidate prediction motion vector at position 252E is available and different from the candidate prediction motion vectors at positions 252A, 252B, 252C, and 252D, the video decoder may add the fifth spatial candidate prediction motion vector to the candidate prediction motion vector list, and the fifth spatial candidate prediction motion vector is assigned to index 4.
• if the candidate prediction motion vector at position 252E is unavailable or is the same as one of the candidate prediction motion vectors at positions 252A, 252B, 252C, and 252D, the video decoder may not include the candidate prediction motion vector at position 252E in the candidate prediction motion vector list. So after considering the fifth spatial candidate prediction motion vector (910), the list may include five candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902 and the fifth spatial candidate prediction motion vector considered at block 910) or may include four candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902).
• if the candidate prediction motion vector list includes five candidate prediction motion vectors (912, Yes), the video decoder finishes generating the candidate prediction motion vector list. If the candidate prediction motion vector list includes four candidate prediction motion vectors (912, No), the video decoder adds artificially generated candidate prediction motion vectors (914) until the list includes five candidate prediction motion vectors (916, Yes).
• if the candidate prediction motion vector list includes fewer than four spatial candidate prediction motion vectors (904, No), the video decoder may consider the fifth spatial candidate prediction motion vector (918).
• the fifth spatial candidate prediction motion vector may, for example, correspond to the candidate prediction motion vector position 252E. If the candidate prediction motion vector at position 252E is available and different from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may add the fifth spatial candidate prediction motion vector to the candidate prediction motion vector list, and the fifth spatial candidate prediction motion vector is assigned to the next available index.
• if the candidate prediction motion vector at position 252E is unavailable or is the same as one of the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may not include the candidate prediction motion vector at position 252E in the candidate prediction motion vector list.
• the video decoder may then consider the temporal candidate prediction motion vector (920). If the temporal candidate prediction motion vector is available and different from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may add the temporal candidate prediction motion vector to the candidate prediction motion vector list, and the temporal candidate prediction motion vector is assigned to the next available index. If the temporal candidate prediction motion vector is unavailable or is the same as one of the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may not include the temporal candidate prediction motion vector in the candidate prediction motion vector list.
• If the candidate prediction motion vector list includes five candidate prediction motion vectors (922, Yes), the video decoder finishes generating the candidate prediction motion vector list. If the list includes fewer than five candidate prediction motion vectors (922, No), the video decoder adds artificially generated candidate prediction motion vectors (914) until the list includes five candidate prediction motion vectors (916, Yes).
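The fill-to-five flow described above (blocks 902 to 922) can be sketched as follows. This is a minimal illustration, not the embodiment itself: all function and variable names are placeholders, and real candidate availability checks are far more involved.

```python
# Sketch of the merge-list flow: collect spatial candidates, then the temporal
# candidate, pruning duplicates, and pad with artificially generated candidates
# until the list holds exactly five entries (blocks 914/916 above).
MAX_CANDIDATES = 5

def add_if_new(candidate_list, mv):
    """Append mv only if it is available (not None) and not already listed."""
    if mv is not None and mv not in candidate_list:
        candidate_list.append(mv)

def build_merge_list(spatial_mvs, temporal_mv, generate_artificial):
    candidates = []
    for mv in spatial_mvs:  # the four spatial positions plus the fifth (252E)
        if len(candidates) == MAX_CANDIDATES:
            break
        add_if_new(candidates, mv)
    if len(candidates) < MAX_CANDIDATES:
        add_if_new(candidates, temporal_mv)
    while len(candidates) < MAX_CANDIDATES:  # pad with artificial candidates
        candidates.append(generate_artificial(len(candidates)))
    return candidates
```

Each appended candidate receives the next available index simply by its position in the list, mirroring the index assignment described above.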
• An additional merge candidate prediction motion vector may be artificially generated after the spatial and temporal candidate prediction motion vectors, so that the size of the merge candidate prediction motion vector list is fixed to the specified number of merge candidate prediction motion vectors (for example, five, as in the possible implementation of FIG. 9 above).
• Additional merge candidate prediction motion vectors may include an exemplary combined bi-predictive merge candidate prediction motion vector (candidate prediction motion vector 1), a scaled bi-predictive merge candidate prediction motion vector (candidate prediction motion vector 2), and a zero-vector merge/AMVP candidate prediction motion vector (candidate prediction motion vector 3).
  • a spatial candidate prediction motion vector and a temporal candidate prediction motion vector may be directly included in the candidate prediction motion vector list, and an artificially generated additional merge candidate prediction motion vector is indicated in the candidate prediction motion vector list through an identification bit.
  • FIG. 10 is an exemplary schematic diagram of adding a combined candidate motion vector to a merge mode candidate prediction motion vector list in an embodiment of the present application.
  • the combined bi-directional predictive merge candidate prediction motion vector may be generated by combining the original merge candidate prediction motion vector.
  • two candidate prediction motion vectors (which have mvL0_A and ref0 or mvL1_B and ref0) among the original candidate prediction motion vectors may be used to generate a bidirectional predictive merge candidate prediction motion vector.
  • two candidate prediction motion vectors are included in the original merge candidate prediction motion vector list.
  • the prediction type of one candidate prediction motion vector is List 0 unidirectional prediction
  • the prediction type of the other candidate prediction motion vector is List 1 unidirectional prediction.
  • mvL0_A and ref0 are picked from list 0
  • mvL1_B and ref0 are picked from list 1
• A bi-predictive merge candidate prediction motion vector (which has mvL0_A and ref0 in list 0 and mvL1_B and ref0 in list 1) may be generated, and the video decoder may check whether it is different from the candidate prediction motion vectors already included in the candidate prediction motion vector list. If it is different, the video decoder may include the bi-predictive merge candidate prediction motion vector in the candidate prediction motion vector list.
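The combination step above can be sketched as follows. Names such as `cand_l0`/`cand_l1` are illustrative, and the candidate is modeled as a simple pair of (motion vector, reference index) per list:

```python
# Sketch: combine a list-0 unidirectional candidate (mvL0_A, ref0) and a
# list-1 unidirectional candidate (mvL1_B, ref0) into one bi-predictive merge
# candidate, adding it only if it is not a duplicate of an existing entry.
def combine_bi_predictive(cand_l0, cand_l1, merge_list):
    """cand_l0 = (mv, ref_idx) picked from list 0; cand_l1 likewise from list 1."""
    combined = {'L0': cand_l0, 'L1': cand_l1}
    if combined not in merge_list:  # duplicate check before inclusion
        merge_list.append(combined)
    return merge_list
```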
  • FIG. 11 is an exemplary schematic diagram of adding a scaled candidate motion vector to a merge mode candidate prediction motion vector list in an embodiment of the present application.
  • the scaled bi-directional predictive merge candidate prediction motion vector may be generated by scaling the original merge candidate prediction motion vector.
  • a candidate prediction motion vector (which may have mvL0_A and ref0 or mvL1_A and ref1) from the original candidate prediction motion vector may be used to generate a bidirectional predictive merge candidate prediction motion vector.
  • two candidate prediction motion vectors are included in the original merge candidate prediction motion vector list.
  • the prediction type of one candidate prediction motion vector is List 0 unidirectional prediction
  • the prediction type of the other candidate prediction motion vector is List 1 unidirectional prediction.
• mvL0_A and ref0 may be picked from list 0, and ref0 may be copied to the reference index ref0′ in list 1. Then, mvL0′_A may be calculated by scaling mvL0_A with ref0 and ref0′. The scaling may depend on the POC distance.
  • a bi-directional predictive merge candidate prediction motion vector (which has mvL0_A and ref0 in list 0 and mvL0'_A and ref0 'in list 1) can be generated and checked if it is a duplicate. If it is not duplicate, it can be added to the merge candidate prediction motion vector list.
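The POC-distance scaling just described can be sketched as below. This is a simplified illustration: a real codec performs this in fixed-point arithmetic with rounding and clipping, which are omitted here.

```python
# Sketch: scale mvL0_A (taken with reference ref0 in list 0) by the ratio of
# POC distances to produce mvL0'_A for the copied reference ref0' in list 1.
def scale_mv_by_poc(mv, cur_poc, ref_poc, target_ref_poc):
    dist = cur_poc - ref_poc                 # distance to the original reference
    target_dist = cur_poc - target_ref_poc   # distance to the copied reference
    scale = target_dist / dist
    return (mv[0] * scale, mv[1] * scale)
```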
  • FIG. 12 is an exemplary schematic diagram of adding a zero motion vector to a merge mode candidate prediction motion vector list in an embodiment of the present application.
  • the zero vector merge candidate prediction motion vector may be generated by combining the zero vector with a reference index that can be referred to. If the zero vector candidate prediction motion vector is not duplicated, it can be added to the merge candidate prediction motion vector list. For each generated merge candidate prediction motion vector, the motion information may be compared with the motion information of the previous candidate prediction motion vector in the list.
• The pruning operation may include comparing one or more new candidate prediction motion vectors with the candidate prediction motion vectors already in the candidate prediction motion vector list, and not adding a new candidate prediction motion vector that duplicates a candidate prediction motion vector already in the list.
  • the pruning operation may include adding one or more new candidate prediction motion vectors to a list of candidate prediction motion vectors and removing duplicate candidate prediction motion vectors from the list later.
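The two pruning variants above can be sketched side by side; both yield the same duplicate-free list, differing only in when the comparison happens. Function names are placeholders:

```python
# Variant 1: compare before adding; only append a candidate not already present.
def prune_before_add(existing, new_candidates):
    for cand in new_candidates:
        if cand not in existing:
            existing.append(cand)
    return existing

# Variant 2: append everything first, then remove duplicates, keeping order.
def prune_after_add(existing, new_candidates):
    merged = existing + new_candidates
    deduped = []
    for cand in merged:
        if cand not in deduped:
            deduped.append(cand)
    return deduped
```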
• A newly generated candidate prediction motion vector may be treated as one category of candidate motion vectors, and that category of newly generated candidate prediction motion vectors is indicated by an identification bit in the original candidate prediction motion vector list.
• The code stream includes an identifier 1 indicating the category of the newly generated candidate prediction motion vector, and an identifier 2 indicating the position of the selected candidate motion vector within the category of newly generated candidate prediction motion vectors.
  • the selected candidate motion vector is determined from the candidate prediction motion vector list according to the identifier 1 and the identifier 2 and a subsequent decoding process is performed.
• The spatial candidate prediction mode is exemplified by the five positions 252A to 252E shown in the figure described above.
• The spatial candidate prediction mode may further include, for example, positions within a preset distance from the image block to be processed but not adjacent to it. Exemplarily, such positions may be shown as 252F to 252J in FIG. 13.
• FIG. 13 is an exemplary schematic diagram of a coding unit and adjacent-position image blocks associated with the coding unit in an embodiment of the present application. Such positions fall within image blocks that are in the same image frame as the image block to be processed, that have been reconstructed by the time the image block to be processed is processed, and that are not adjacent to the image block to be processed.
• This type of position may be referred to as a spatial-domain non-adjacent image block, and a first, a second, and a third spatial-domain non-adjacent image block may be available.
• The physical meaning of "available" is as described above and is not repeated here.
• The candidate prediction motion mode list is checked and constructed in the following order. It should be understood that the check includes the "availability" check and the pruning process mentioned above, which are not repeated here.
• The candidate prediction mode list includes: the motion vector of the 252A position image block, the motion vector of the 252B position image block, the motion vector of the 252C position image block, the motion vector of the 252D position image block, a motion vector obtained by the alternative temporal motion vector prediction (ATMVP) technique, the motion vector of the 252E position image block, and a motion vector obtained by the spatio-temporal motion vector prediction (STMVP) technique.
• The ATMVP and STMVP techniques are detailed in sections 2.3.1.1 and 2.3.1.2 of JVET-G1001-v1, which is incorporated herein in its entirety and not repeated here.
  • the candidate prediction mode list includes the above 7 prediction motion vectors.
• The number of prediction motion vectors included in the candidate prediction mode list may be less than 7; for example, the first 5 may be taken to form the candidate prediction mode list. The motion vectors constructed by the feasible embodiments described in FIGS. 10 to 12 may also be added to the candidate prediction mode list so that it contains more predicted motion vectors.
• The motion vectors of the first spatial-domain non-adjacent image block, the second spatial-domain non-adjacent image block, and the third spatial-domain non-adjacent image block may be added to the candidate prediction mode list as predicted motion vectors of the image block to be processed.
• Assume that the motion vector of the 252A position image block, the motion vector of the 252B position image block, the motion vector of the 252C position image block, the motion vector of the 252D position image block, the motion vector obtained by the ATMVP technique, the motion vector of the 252E position image block, and the motion vector obtained by the STMVP technique are MVL, MMU, MVUR, MVDL, MVA, MVUL, and MVS, respectively.
• Assume that the motion vectors of the first, second, and third spatial-domain non-adjacent image blocks are MV0, MV1, and MV2, respectively. The candidate prediction motion vector list may then be checked and constructed in one of the following orders:
  • Example 1 MVL, MMU, MVUR, MVDL, MV0, MV1, MV2, MVA, MVUL, MVS;
  • Example 2 MVL, MMU, MVUR, MVDL, MVA, MV0, MV1, MV2, MVUL, MVS;
  • Example 3 MVL, MMU, MVUR, MVDL, MVA, MVUL, MV0, MV1, MV2, MVS;
  • Example 4 MVL, MMU, MVUR, MVDL, MVA, MVUL, MVS, MV0, MV1, MV2;
  • Example 5 MVL, MMU, MVUR, MVDL, MVA, MV0, MVUL, MV1, MVS, MV2;
  • Example 6 MVL, MMU, MVUR, MVDL, MVA, MV0, MVUL, MV1, MV2, MVS;
  • Example 7 MVL, MMU, MVUR, MVDL, MVA, MVUL, MV0, MV1, MV2, MVS;
• The candidate prediction motion vectors may be used in the Merge mode or AMVP mode described above, or in other prediction modes that obtain the predicted motion vector of the image block to be processed; they may be used at the encoding end, or used at the decoding end consistently with the corresponding encoding end, without limitation.
• The number of candidate prediction motion vectors in the candidate prediction motion vector list is also preset and kept consistent at the encoding and decoding ends; the specific number is not limited.
• Examples 1 to 7 give several feasible compositions of the candidate prediction motion vector list. Based on the motion vectors of spatial-domain non-adjacent image blocks, other compositions of the candidate prediction motion vector list are possible; the arrangement of the candidate prediction motion vectors in the list is not limited.
• This embodiment of the present application provides another method for constructing a candidate prediction motion vector list. Compared with candidate prediction motion vector lists such as those of Examples 1 to 7, this embodiment combines candidate prediction motion vectors determined in other embodiments with preset vector differences to form new candidate prediction motion vectors, which overcomes the shortcoming of low prediction accuracy of the predicted motion vector and improves coding efficiency.
  • the candidate prediction motion vector list of the image block to be processed includes two sub-lists: a first motion vector set and a vector difference set.
• For the composition of the first motion vector set, reference may be made to the various configurations in the foregoing embodiments of the present invention.
  • the vector difference set includes one or more preset vector differences.
• Each vector difference in the vector difference set is added to the original target motion vector determined from the first motion vector set, and the sums of the vector differences and the original target motion vector form a new motion vector set.
• The candidate prediction motion vector list shown in FIG. 14A may include the vector difference set as a subset, the subset being indicated in the candidate prediction motion vector list by an identification bit (MV vector difference set).
  • each vector difference is indicated by an index in the vector difference set
  • a candidate prediction motion vector list constructed is shown in FIG. 14B.
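The subset arrangement of FIGS. 14A/14B can be sketched as below. All values and names are illustrative only: the list holds first motion vectors directly, the vector-difference subset is addressed by its own per-difference index, and each difference is applied to an original target motion vector chosen from the first motion vector set.

```python
# Sketch: a first motion vector set plus a vector difference set held as a
# subset; a new motion vector is the chosen base plus the indexed difference.
first_mv_set = [(2, -3), (0, 1)]
vector_diff_set = [(1, 0), (0, -1), (-1, 0), (0, 1)]

def expand_with_differences(base_index, diff_index):
    base = first_mv_set[base_index]     # original target motion vector
    diff = vector_diff_set[diff_index]  # preset vector difference, by index
    return (base[0] + diff[0], base[1] + diff[1])
```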
• The manner of indicating a class of candidate motion vectors in the predicted motion vector list provided by the technology of the present application can be used in the Merge mode or AMVP mode described above, or in other prediction modes that obtain the predicted motion vector of the image block to be processed; it can be used at the encoding end, or used at the decoding end consistently with the corresponding encoding end, without limitation.
• The number of candidate prediction motion vectors in the candidate prediction motion vector list is also preset and kept consistent at the encoding and decoding ends; the specific number is not limited.
  • the method for decoding the predicted motion information provided in the embodiments of the present application will be described in detail with reference to the accompanying drawings.
• A type of candidate motion information is indicated in the list to control the length of the list, and the method for decoding predicted motion information provided in the embodiment of the present application is developed on this basis.
• The method is executed by a decoding device, which may be the video decoder 200 in the video decoding system 1 shown in FIG. 1, or may be a functional unit in the video decoder 200; this is not specifically limited in this application.
  • FIG. 15 is a schematic flowchart of an embodiment of the present application, which relates to a decoding method for predicting motion information, and specifically may include:
  • the decoding device parses the code stream to obtain a first identifier.
  • the code stream is sent by the encoding end after encoding the current image block.
  • the first identifier indicates the position of the selected candidate motion information when the encoding end encodes the current image block.
• The first identifier is used by the decoding device to determine the selected candidate motion information and further predict the motion information of the image block to be processed.
  • the first identifier may be a specific index of the selected candidate motion information. In this case, the first identifier may uniquely determine one candidate motion information.
  • the first identifier may be an identifier of a category to which the selected candidate motion information belongs.
• The code stream further includes a fourth identifier to indicate the specific position of the selected candidate motion information within its own category.
• The first identifier may adopt a fixed-length encoding method.
• For example, the first identifier may be a 1-bit identifier, in which case the number of categories it can indicate is limited.
• Alternatively, the first identifier may adopt a variable-length encoding method.
  • the decoding device determines a target element from the first candidate set according to the first identifier.
  • the content of the first candidate set may include the following two possible implementations:
  • Elements in the first candidate set include at least one first candidate motion information and at least one second candidate set, and elements in the second candidate set include multiple second candidate motion information.
  • Elements in the first candidate set may include at least one first candidate motion information and a plurality of second candidate motion information, the first candidate motion information includes the first motion information, and the second candidate motion information includes a preset Motion information offset. New motion information may be generated according to the first motion information and a preset motion information offset.
  • the first candidate set may be a constructed candidate motion information list.
  • at least one first candidate motion information is directly included, and a plurality of second candidate motion information is included in the first candidate set in the form of a second candidate set.
  • the second candidate motion information and the first candidate motion information are different.
  • the first candidate motion information and the second candidate motion information included in each second candidate set may be candidate motion information determined by using different MV prediction modes, or may be different types of candidate motion information. This embodiment of the present application does not specifically limit this.
  • the first candidate motion information may be motion information acquired in a Merge manner
  • the second candidate motion information may be motion information acquired in an Affine Merge manner.
  • the first candidate motion information may be original candidate motion information
  • the second candidate motion information may be motion information generated according to the original candidate motion information
  • an identification bit in the list is used to indicate a candidate motion information set.
  • the identification bit can be located at any position in the list, which is not specifically limited in the embodiment of the present application.
  • the identification bit may be located at the end of the list as shown in FIG. 16A; or, the identification bit may be located at the middle of the list as shown in FIG. 16B.
  • the first identifier in the code stream indicates the identifier bit
  • it is determined that the target element is a candidate motion information set indicated by the identifier bit.
  • the candidate motion information set indicated by the identification bit includes a plurality of second candidate motion information. For the candidate motion information set pointed to by the identification bit, one of the candidate motion information is selected as the target motion information according to the further identification (the second identification in S1504), and used to predict the motion information of the image block to be processed.
  • an identification bit in the list is used to indicate a candidate motion information set.
  • the identification bit can be located at any position in the list, which is not specifically limited in the embodiment of the present application.
  • the identification bit may be located at the end of the list as shown in FIG. 16A; or, the identification bit may be located at the middle of the list as shown in FIG. 16B.
  • the first identifier in the code stream indicates the identifier bit
  • it is determined that the target element is a plurality of second candidate motion information indicated by the identifier bit.
  • the second candidate motion information includes a preset motion information offset.
• One of the candidate motion information is selected according to the further identifier (the second identifier in S1504), and the target motion information is determined based on the selected second candidate motion information, to predict the motion information of the image block to be processed.
• In a possible implementation, more than one identification bit may be added to the Merge candidate list, each identification bit pointing to a specific candidate motion information set or to a plurality of motion information including preset motion information offsets.
  • the first identifier in the code stream indicates a certain identifier bit
• It is determined that the target element is the candidate motion information in the candidate motion information set indicated by that identification bit, or the target motion information is determined according to one of the multiple candidate motion information (including preset motion information offsets) indicated by that identification bit.
• FIGS. 16A, 16B, and 16C introduce an identification (pointer) into the Merge list so that candidates can be brought in as a subset.
• In this way, the length of the candidate list is greatly reduced and the complexity of list construction is lowered, which helps simplify the hardware implementation.
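A possible layout of such a list, with ordinary candidates and identification bits coexisting as in FIGS. 16A to 16C, can be sketched as follows. The entries and names are hypothetical:

```python
# Sketch: a Merge list where some slots are motion vectors and one slot is a
# flag (identification bit) pointing at a whole candidate motion information
# set. The first identifier selects a slot; if the slot is a flag, a further
# identifier picks within the referenced set.
merge_list = [
    ('mv', (2, -3)),
    ('mv', (0, 1)),
    ('flag', [(5, 5), (6, 6)]),  # identification bit pointing at a subset
]

def resolve(first_id, second_id=None):
    kind, payload = merge_list[first_id]
    if kind == 'mv':
        return payload
    return payload[second_id]    # flag case: second identifier picks in the set
```

Because the subset occupies a single slot, the list stays short no matter how many candidates the subset contains, which is the length-control point made above.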
  • the first candidate motion information may include motion information of spatially adjacent image blocks of the image block to be processed. It should be noted that the definition of the motion information of the adjacent image blocks in the spatial domain has been described in the foregoing, and is not repeated here.
  • the second candidate motion information may include motion information of a spatial domain non-adjacent image block of the image block to be processed. It should be noted that the definition of motion information of non-adjacent image blocks in the spatial domain has been described in the foregoing, and is not repeated here.
  • the method for acquiring the first motion information may be selected according to actual requirements, which is not specifically limited in the embodiment of the present application.
  • the value of the preset motion information offset used to obtain the second motion information may be a fixed value or a value selected from a set.
• Neither the content nor the form of the preset motion information offset is specifically limited in this embodiment of the present application.
  • the first candidate motion information includes first motion information
  • at least one second candidate set is a plurality of second candidate sets
• the plurality of second candidate sets includes at least one third candidate set and at least one fourth candidate set
• the elements of the third candidate set include motion information of a plurality of spatial-domain non-adjacent image blocks of the image block to be processed
• the elements of the fourth candidate set include a plurality of motion information obtained based on the first motion information and preset motion information offsets.
  • the at least one second candidate set is a plurality of second candidate sets
  • the plurality of second candidate sets includes at least one fifth candidate set and at least one sixth candidate set.
• The elements in the fifth candidate set include motion information of a plurality of spatial-domain non-adjacent image blocks of the image block to be processed, and the elements in the sixth candidate set include a plurality of preset motion information offsets.
  • the coding codeword used to identify the first motion information is the shortest.
  • the first motion information does not include motion information obtained according to the ATMVP mode.
  • the first identifier may be an index in the first candidate set or an identifier classified in the motion information.
  • S1502 may be implemented in the following two cases:
  • the first identifier is an index in the first candidate set.
• The decoding device in S1502 may determine the element at the position indicated by the first identifier in the first candidate set as the target element. Since the first candidate set includes at least one first candidate motion information and at least one second candidate set, the target element determined according to the first identifier may be first candidate motion information or a second candidate set, depending on the content arranged at the position indicated by the first identifier.
• The decoding device in S1502 may determine the element at the position indicated by the first identifier in the first candidate set as the target element. Since the first candidate set includes at least one first candidate motion information and a plurality of second candidate motion information, the target element determined according to the first identifier may be first candidate motion information or information obtained based on a plurality of second candidate motion information, depending on the content arranged at the position indicated by the first identifier.
  • the first identifier is an identifier of candidate motion information classification.
  • the decoding device in S1502 determines the classification to which the target element belongs according to the first identifier.
  • the decoding device parses the bitstream to obtain a fourth identifier, the fourth identifier indicates a specific position of the target element in its classification, and uniquely determines the target element in its classification according to the fourth identifier. Specifically, if the first identifier indicates that the target element belongs to the classification of the first candidate motion information, one first candidate motion information is determined as the target element among the at least one first candidate motion information according to the fourth identifier. If the first identifier indicates that the target element belongs to a certain category of the second candidate motion information, a second candidate set or a second candidate motion information as the target element is determined according to the fourth identifier.
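The category-identifier case just described can be sketched as below. The category layout and contents are placeholder values chosen only to show the two-level lookup (first identifier selects the category, fourth identifier selects within it):

```python
# Sketch of S1502, case 2: the first identifier names a category and the
# fourth identifier gives the position inside that category.
first_candidates = ['merge_A', 'merge_B']            # category 0: first candidate motion information
second_candidate_sets = [['affine_1a', 'affine_1b'],  # category 1: second candidate sets
                         ['affine_2a', 'affine_2b']]

def pick_target(first_id, fourth_id):
    if first_id == 0:  # category of first candidate motion information
        return first_candidates[fourth_id]
    # category of second candidate motion information: a whole second
    # candidate set is determined as the target element
    return second_candidate_sets[fourth_id]
```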
  • the first candidate motion information is Merge motion information
  • the first candidate set includes two second candidate sets
  • the second candidate motion information in one second candidate set is Affine Merge motion information of the first type.
  • the second candidate motion information in another second candidate set is Affine Merge motion information of the second type.
  • a configuration identifier of 0 indicates Merge motion information
  • an identifier of 1 indicates Affine Merge motion information.
• If the first identifier obtained by the decoding device by parsing the code stream in S1501 is 1, the decoding device obtains the fourth identifier by parsing the code stream in S1502 and, according to the fourth identifier, determines one of the two second candidate sets as the target element.
  • the first candidate motion information is Merge motion information
  • the first candidate set includes two second candidate sets
• The second candidate motion information in one second candidate set is a preset motion information offset corresponding to the first type of Affine Merge motion information, and the second candidate motion information in the other second candidate set is a preset motion information offset corresponding to the second type of Affine Merge motion information.
  • a configuration identifier of 0 indicates Merge motion information
  • an identifier of 1 indicates Affine Merge motion information.
• If the first identifier is 0, one Merge motion information among the at least one Merge motion information in the first candidate set is determined as the target element. If the first identifier obtained by the decoding device by parsing the code stream in S1501 is 1, the decoding device obtains the fourth identifier by parsing the code stream in S1502, determines one second candidate set from the two second candidate sets according to the fourth identifier, and determines the target element based on one of the second candidate motion information in that second candidate set.
• If in S1502 the decoding device determines that the target element is first candidate motion information, S1503 is executed; if in S1502 the decoding device determines that the target element is a second candidate set, or is to be obtained according to multiple second candidate motion information, S1504 is executed.
  • the target motion information is used to predict the motion information of the image block to be processed.
  • the target motion information is used to predict the motion information of the image block to be processed, which can be specifically implemented as: using the target motion information as the motion information of the image block to be processed; or, using the target motion information as the predicted motion of the image block to be processed information.
  • a specific implementation of selecting target motion information to predict motion information of an image block to be processed may be selected according to actual requirements, which is not specifically limited here.
• In S1504, the code stream is parsed to obtain a second identifier, and the target motion information is determined based on one of the plurality of second candidate motion information. This may be specifically implemented as: parsing the code stream to obtain the second identifier, and determining the target motion information from the plurality of second candidate motion information according to the second identifier.
  • the second identifier may adopt a fixed-length encoding method.
• For example, the second identifier may be a 1-bit identifier, in which case the number of positions it can indicate is limited.
  • the second identifier may adopt a variable-length encoding method.
  • the second identifier may be a plurality of bit identifiers.
• Determining the target motion information may be achieved by one of the following feasible implementations, but is not limited thereto.
• When the first candidate motion information includes the first motion information, the second candidate motion information includes second motion information, and the second motion information is obtained based on the first motion information and a preset motion information offset.
• The second identifier may indicate the specific position of the target motion information in the second candidate set. The decoding device in S1504 determines the target motion information from the plurality of second candidate motion information according to the second identifier, which may be specifically implemented as: determining the second candidate motion information at the position indicated by the second identifier, in the second candidate set determined as the target element, as the target motion information.
• Alternatively, the second identifier may indicate the specific position of the target offset in the second candidate set.
• The decoding device in S1504 determines the target motion information from the plurality of second candidate motion information according to the second identifier, which may be specifically implemented as: determining the target offset from a plurality of preset motion information offsets according to the second identifier, and determining the target motion information based on the first motion information and the target offset.
• The method for decoding predicted motion information may further include: multiplying each of the plurality of preset motion information offsets by a preset coefficient to obtain a plurality of adjusted motion information offsets.
• Correspondingly, determining the target offset from the plurality of preset motion information offsets according to the second identifier includes: determining the target offset from the plurality of adjusted motion information offsets according to the second identifier.
• Determining the target motion information based on one of the plurality of second candidate motion information may alternatively be specifically implemented as: determining a motion information offset from the plurality of preset motion information offsets according to the second identifier and multiplying it by the preset coefficient to obtain the target offset; then determining the target motion information based on the first motion information and the target offset.
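The coefficient variant can be sketched as follows. The offsets and coefficient are illustrative; per the text, the coefficient may be fixed in the decoding device or carried in the code stream:

```python
# Sketch: each preset offset is multiplied by a preset coefficient before the
# second identifier selects the target offset; the target motion vector is the
# first motion vector plus the adjusted offset.
preset_offsets = [(1, 0), (0, -1), (-1, 0), (0, 1)]

def derive_target_with_coeff(first_mv, second_id, coeff):
    dx, dy = preset_offsets[second_id]
    target_off = (dx * coeff, dy * coeff)  # adjusted motion information offset
    return (first_mv[0] + target_off[0], first_mv[1] + target_off[1])
```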
  • the preset coefficient may be a fixed coefficient configured in the decoding device, or may be a coefficient carried in a code stream, which is not specifically limited in this embodiment of the present application.
  • the method for decoding prediction motion information provided in this application may further include S1505.
  • the third identifier includes a preset coefficient.
  • the elements in the first candidate set include the first candidate motion information and at least one second candidate set, or the elements in the first candidate set include the first candidate motion information and a plurality of second candidate motion information.
  • an entire set of candidate motion information can be added to the first candidate set as a single element; compared with adding each candidate motion information to the first candidate set directly, this greatly shortens the length of the first candidate set.
  • when the first candidate set is a candidate motion information list used for inter prediction, the length of the candidate motion information list can be well controlled even if more candidates are introduced, which facilitates the checking process and hardware implementation.
  • the candidate motion information corresponding to the first index 0-5 includes a motion vector and a reference image
  • the first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to the index 0 and a preset motion vector offset.
  • the candidate motion information corresponding to the first index 0 is forward prediction
  • the motion vector is (2, -3)
  • the reference frame POC is 2.
  • the preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1).
  • if the first index value obtained by decoding is 6, it indicates that the motion information used by the current image block is new motion information generated based on the candidate motion information corresponding to the first index 0 and a preset motion vector offset, and the code stream is then further decoded to obtain the second index value.
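The first example above can be sketched as a small decode step: first indexes 0-5 return ordinary Merge candidates, while first index 6 derives new motion information from the candidate at first index 0 plus a preset offset chosen by the second index. The data structures are illustrative assumptions, not the patent's normative representation.

```python
# Unidirectional example: base candidate MV (2, -3), forward prediction, ref POC 2.
base_mv = (2, -3)
offsets = [(1, 0), (0, -1), (-1, 0), (0, 1)]  # preset motion vector offsets

def decode_motion(first_index, second_index=None, candidates=None):
    """Return an ordinary Merge candidate for first indexes 0-5; for first
    index 6, add the offset selected by the second index to the base MV."""
    if first_index < 6:
        return candidates[first_index]          # ordinary Merge candidate
    dx, dy = offsets[second_index]              # second index selects the offset
    return (base_mv[0] + dx, base_mv[1] + dy)   # new motion information

print(decode_motion(6, second_index=3))  # -> (2, -2)
```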
  • the candidate motion information corresponding to the first index 0-5 includes a motion vector and a reference image
  • the first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to the first index 0 and a preset motion vector offset.
  • the motion information of the candidate corresponding to the first index 0 is bidirectional prediction
  • the forward motion vector is (2, -3)
  • the reference frame POC is 2
  • the backward motion vector is (-2, -1)
  • the preset motion vector offsets are (1,0), (0, -1), (-1, 0), (0, 1).
  • if the first index value obtained by decoding is 6, it indicates that the motion information used by the current image block is new motion information generated based on the candidate motion information corresponding to the first index 0 and a preset motion vector offset, and the code stream is then further decoded to obtain the second index value.
  • the motion information of the current image block is bidirectional prediction; when the current frame POC is 3, the forward and backward reference frames lie on opposite sides of the current frame in POC order.
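A hedged sketch of the bidirectional case above. The text states only that the forward and backward reference frames lie on opposite sides of the current frame; mirroring the offset for the backward motion vector in that case is an assumption here, following the common MMVD-style convention rather than anything this document specifies.

```python
def apply_offset_bidir(fwd_mv, bwd_mv, offset, opposite_sides=True):
    """Apply a motion vector offset in the bidirectional case. When the
    forward and backward reference frames lie on opposite sides of the
    current frame, the offset is mirrored for the backward motion vector
    (assumption: usual MMVD-style convention)."""
    dx, dy = offset
    new_fwd = (fwd_mv[0] + dx, fwd_mv[1] + dy)
    sign = -1 if opposite_sides else 1
    new_bwd = (bwd_mv[0] + sign * dx, bwd_mv[1] + sign * dy)
    return new_fwd, new_bwd

# Forward MV (2, -3), backward MV (-2, -1), offset (1, 0), current POC 3
print(apply_offset_bidir((2, -3), (-2, -1), (1, 0)))  # -> ((3, -3), (-3, -1))
```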
  • the candidate motion information corresponding to the first indexes 0-5 includes a motion vector and a reference image. It is assumed that the candidate motion information indicated by the first index 0 is composed of sub-block motion information, and that the candidate motion information corresponding to the first index 1 is not composed of sub-block motion information: it is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
  • the first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to the first index 1 and a preset motion vector offset; the preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1).
  • if the first index value obtained by decoding is 6, it indicates that the motion information used by the current image block is new motion information generated based on the candidate motion information corresponding to the first index 1 and a preset motion vector offset, and the code stream is then further decoded to obtain the second index value.
  • Let the maximum length of the Merge candidate list be 7.
  • the first index 0-6 indicates each candidate space in the Merge list.
  • the first index 6 indicates that the current block uses the motion information of the non-adjacent spatial candidate as the reference motion information of the current block.
  • Let the size of the non-adjacent spatial domain candidate set be 4; the available non-adjacent spatial domain candidates are put into the set in a preset checking order, and let the non-adjacent spatial domain candidate motion information in the set be as follows:
  • Second index 0 Candidate 0: Forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
  • Second index 1 Candidate 1: Forward prediction, the motion vector is (1, -3), and the reference frame POC is 4.
  • Second index 2 Candidate 2: Backward prediction, the motion vector is (2, -4), and the reference frame POC is 2.
  • Second index 3 Candidate 3: Bidirectional prediction, forward motion vector is (2, -3), reference frame POC is 2, backward motion vector is (2, -2), and reference frame POC is 4.
  • the first index value obtained by decoding is 6, it indicates that the current block uses the motion information of the non-adjacent spatial candidate as the reference motion information of the current block, and then is further decoded to obtain the second index value.
  • the second index value obtained by further decoding is 1, the motion information of candidate 1 in the non-adjacent spatial domain candidate set is used as the motion information of the current block.
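The two-level decoding just walked through (first index 6 selects the non-adjacent spatial candidate set as the target element; a second index is then parsed to select a candidate inside that set) can be sketched as follows, using the example candidate values. The field names and the callback for reading the second index are illustrative.

```python
# Example non-adjacent spatial candidate set from the text above.
non_adjacent_set = [
    {"dir": "fwd", "mv": (2, -3), "poc": 2},   # second index 0
    {"dir": "fwd", "mv": (1, -3), "poc": 4},   # second index 1
    {"dir": "bwd", "mv": (2, -4), "poc": 2},   # second index 2
    {"dir": "bi", "mv_fwd": (2, -3), "poc_fwd": 2,
     "mv_bwd": (2, -2), "poc_bwd": 4},         # second index 3
]

def decode(first_index, merge_list, read_second_index):
    if first_index < 6:
        return merge_list[first_index]          # ordinary Merge candidate
    # First index 6: the target element is the candidate *set*, so a
    # second index must be parsed from the code stream to pick the entry.
    return non_adjacent_set[read_second_index()]

chosen = decode(6, merge_list=[None] * 6, read_second_index=lambda: 1)
print(chosen["mv"])  # -> (1, -3)
```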
  • the candidate motion information corresponding to the first index 0 is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
  • the first index 6 indicates new motion information generated based on candidate motion information corresponding to the first index 0 or motion information using non-adjacent spatial domain candidates as reference motion information of the current block.
  • Let the size of the non-adjacent spatial domain candidate set be 4; the available non-adjacent spatial domain candidates are put into the set in a preset checking order, and let the non-adjacent spatial domain candidate motion information in the set be as follows:
  • Second index 0 candidate 0: forward prediction, the motion vector is (-5, -3), and the reference frame POC is 2.
  • Second index 1 Candidate 1: Forward prediction, the motion vector is (1, -3), and the reference frame POC is 4.
  • Second index 2 Candidate 2: Backward prediction, the motion vector is (2, -4), and the reference frame POC is 2.
  • Second index 3 Candidate 3: Bidirectional prediction, forward motion vector is (2, -3), reference frame POC is 2, backward motion vector is (2, -2), and reference frame POC is 4.
  • Second index 4 Candidate 4: Forward prediction, the motion vector is (2, -3) + (1,0), and the reference frame POC is 2.
  • Second index 5 Candidate 5: Forward prediction, the motion vector is (2, -3) + (0, -1), and the reference frame POC is 2.
  • Second index 6 candidate 6: forward prediction, the motion vector is (2, -3) + (-1,0), and the reference frame POC is 2.
  • Second index 7 candidate 7: forward prediction, the motion vector is (2, -3) + (0, 1), and the reference frame POC is 2.
  • the first index value obtained by decoding is 6, it indicates that the current block uses new motion information generated based on candidate motion information corresponding to the first index 0 or uses non-adjacent spatial candidate motion information as reference motion information of the current block. Then it is further decoded to obtain a second index value.
  • if the second index value obtained by further decoding is 0, the motion information of candidate 0 (forward prediction, motion vector (-5, -3), reference frame POC 2) in the non-adjacent spatial domain candidate set is used as the motion information of the current block.
  • if the second index value obtained by further decoding is 5, the motion vector offset candidate 5 (forward prediction, motion vector (2, -3) + (0, -1), reference frame POC 2) is used as the motion information of the current block.
  • the first index 0-6 indicates each candidate space in the Merge list.
  • the motion information of the candidate corresponding to the first index 0 is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
  • the first index 6 indicates that the motion information adopted by the current block is new motion information generated by offsetting the candidate motion information corresponding to the first index 0 according to a preset motion vector offset:
  • the second index value 0 indicates candidates with a spacing of 1, and the second index value 1 indicates candidates with a spacing of 2.
  • the third index value indicates which motion vector offset candidate is used.
  • the first index value obtained by decoding is 6, it indicates that the motion information used by the current block is new motion information generated based on the candidate motion information corresponding to the first index 0, and then further decoded to obtain a second index value.
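The spacing/direction signalling above can be sketched like this: the second index picks the offset magnitude (spacing 1 or 2) and the third index picks which preset offset is used as the direction. The direction table is assumed to be the four unit offsets used elsewhere in the examples.

```python
SPACINGS = [1, 2]                                # second index -> offset magnitude
DIRECTIONS = [(1, 0), (0, -1), (-1, 0), (0, 1)]  # third index -> offset direction

def offset_from_indices(second_index, third_index):
    """Combine spacing and direction into a motion vector offset."""
    s = SPACINGS[second_index]
    dx, dy = DIRECTIONS[third_index]
    return (dx * s, dy * s)

base_mv = (2, -3)                 # candidate at first index 0 in the example
off = offset_from_indices(1, 3)   # spacing 2, direction (0, 1)
print((base_mv[0] + off[0], base_mv[1] + off[1]))  # -> (2, -1)
```

Separating magnitude from direction in this way keeps each syntax element short while still covering spacing-times-direction combinations.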
  • the AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
  • Second index 0 AFFINE candidate 0;
  • Second index 1 AFFINE candidate 1;
  • Second index 2 AFFINE candidate 2
  • Second index 3 AFFINE candidate 3;
  • the first index value obtained by decoding is 6, it indicates that one of the candidates in the motion information candidate set obtained by AFFINE is the reference motion information, and then further decoded to obtain the second index value.
  • the second index value obtained by further decoding is 1, the motion information of the AFFINE candidate 1 is used as the motion information of the current block.
  • the neighboring spatial motion information candidate set includes four neighboring spatial motion information candidates:
  • Second index 0 neighboring spatial candidate 0;
  • Second index 1 neighboring spatial candidate 1;
  • Second index 2 neighboring spatial candidate 2;
  • Second index 3 neighboring spatial candidate 3;
  • the first index value obtained by decoding is 6, it indicates that one of the candidates in the motion information candidate set obtained by using the neighboring space for the current block is the reference motion information, and then further decoded to obtain the second index value.
  • the second index value obtained by further decoding is 1, the motion information of the neighboring spatial domain candidate 1 is used as the motion information of the current block.
  • the first index 0-6 indicates each candidate space in the Merge list.
  • the first index 6 indicates that one candidate in the motion information candidate set obtained by using the neighboring time domain for the current block is reference motion information.
  • the adjacent temporal motion information candidate set includes four adjacent temporal motion information candidates:
  • Second index 0 adjacent time domain candidate 0;
  • Second index 1 adjacent time domain candidate 1;
  • Second index 2 Adjacent time domain candidate 2
  • Second index 3 adjacent time domain candidate 3;
  • the first index value obtained by decoding is 6, it indicates that one of the candidates in the motion information candidate set obtained by the neighboring time domain is used as the reference motion information, and then further decoded to obtain the second index value.
  • the second index value obtained by further decoding is 1, the motion information of the neighboring time domain candidate 1 is used as the motion information of the current block.
  • the motion information candidate set composed of sub-block motion information includes AFFINE motion information candidates, ATMVP, and STMVP candidates:
  • Second index 0 AFFINE candidate
  • Second index 1 ATMVP candidate
  • Second index 2 STMVP candidates
  • the first index value obtained by decoding is 6, it indicates that the current block uses one candidate of the motion information candidate set composed of the sub-block motion information as the reference motion information, and then further decodes to obtain the second index value.
  • the second index value obtained by further decoding is 1, the motion information of the ATMVP candidate is used as the motion information of the current block.
  • spaces 0-5 in the list are motion information obtained by using Merge
  • space 6 is a motion information candidate set obtained by AFFINE.
  • the first index 0 indicates that the current block uses the motion information obtained by Merge as the reference motion information
  • the first index 1 indicates that one of the candidates in the motion information candidate set obtained by the current block using AFFINE is the reference motion information.
  • the AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
  • Second index 0 AFFINE candidate 0;
  • Second index 1 AFFINE candidate 1;
  • Second index 2 AFFINE candidate 2
  • Second index 3 AFFINE candidate 3;
  • when the first index value obtained by decoding is 1, it indicates that one of the candidates of the motion information candidate set obtained by AFFINE is used as the reference motion information, and the code stream is then further decoded to obtain the second identification value.
  • the second identification value obtained by further decoding is 1, the motion information of the AFFINE candidate 1 is used as the motion information of the current block;
  • when the first index value obtained by decoding is 0, it indicates that the current block uses the motion information obtained by Merge as the reference motion information, and the code stream is then further decoded to obtain a fourth index value.
  • the fourth index value obtained by further decoding is 2, the motion information of space 2 in the Merge candidate list is used as the motion information of the current block.
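The routing in this example, where first index 0 leads to the Merge list (indexed by a fourth index) and first index 1 leads to the AFFINE candidate set (indexed by a second identifier), can be sketched as follows with placeholder candidates.

```python
# Placeholder candidates; only the routing logic is the point of the sketch.
merge_list = [f"merge_{i}" for i in range(6)]   # spaces 0-5: Merge candidates
affine_set = [f"affine_{i}" for i in range(4)]  # space 6: AFFINE candidate set

def decode(first_index, sub_index):
    if first_index == 0:
        return merge_list[sub_index]   # fourth index into the Merge list
    if first_index == 1:
        return affine_set[sub_index]   # second identifier into the AFFINE set
    raise ValueError("unexpected first index")

print(decode(0, 2))  # -> merge_2
print(decode(1, 1))  # -> affine_1
```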
  • spaces 0-3 in the list are motion information obtained using Merge
  • space 4 is a motion information candidate set obtained using adjacent time domains
  • space 5 is a motion information candidate set composed of sub-block motion information
  • Space 6 is the candidate set of motion information obtained by AFFINE.
  • the first index 0 indicates that the current block uses the motion information obtained by Merge as the reference motion information
  • the first index 1 indicates that one of the candidates in the motion information candidate set obtained by the current block using AFFINE is the reference motion information
  • the first index 01 indicates that one of the candidates of the motion information candidate set obtained by using the adjacent time domain for the current block is the reference motion information
  • the first index 11 indicates that the current block uses one candidate of the motion information candidate set composed of sub-block motion information as the reference motion information.
  • AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
  • Second identification 0 AFFINE candidate 0;
  • Second identification 1 AFFINE candidate 1;
  • Second identifier 2 AFFINE candidate 2;
  • Second identification 3 AFFINE candidate 3;
  • the adjacent temporal motion information candidate set includes four adjacent temporal motion information candidates:
  • Second index 0 adjacent time domain candidate 0;
  • Second index 1 adjacent time domain candidate 1;
  • Second index 2 Adjacent time domain candidate 2
  • Second index 3 adjacent time domain candidate 3;
  • the motion information candidate set composed of sub-block motion information includes AFFINE motion information candidates, ATMVP, and STMVP candidates:
  • Second index 0 AFFINE candidate
  • Second index 1 ATMVP candidate
  • Second index 2 STMVP candidates
  • when the first index value obtained by decoding is 0, it indicates that the current block uses the motion information obtained by Merge as the reference motion information, and the code stream is then further decoded to obtain a fourth index value.
  • the fourth index value obtained by further decoding is 2, the motion information of space 2 in the Merge candidate list is used as the motion information of the current block.
  • when the first index value obtained by decoding is 1, it indicates that one of the candidates of the motion information candidate set obtained by AFFINE is used as the reference motion information, and the code stream is then further decoded to obtain the second identification value.
  • the second identification value obtained by further decoding is 1, the motion information of the AFFINE candidate 1 is used as the motion information of the current block.
  • when the first index value obtained by decoding is 01, it indicates that one of the candidates in the motion information candidate set obtained by the neighboring time domain is used as the reference motion information, and the code stream is then further decoded to obtain a second identification value.
  • the second identification value obtained by further decoding is 2, the motion information of the neighboring time-domain candidate 2 is used as the motion information of the current block.
  • when the first index value obtained by decoding is 11, it indicates that the current block uses one candidate of the motion information candidate set composed of the sub-block motion information as the reference motion information, and the code stream is then further decoded to obtain the second index value.
  • the second index value obtained by further decoding is 1, the motion information of the ATMVP candidate is used as the motion information of the current block.
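The dispatch in this last example can be sketched by treating the decoded first index values ("0", "1", "01", "11") as opaque codewords, exactly as listed in the text, each mapped to its candidate set; a second-level index then selects the candidate within the set. The set contents are placeholders.

```python
# Placeholder candidate sets matching the four routes in the example.
merge_list = [f"merge_{i}" for i in range(4)]      # spaces 0-3: Merge candidates
temporal_set = [f"temporal_{i}" for i in range(4)]  # adjacent time domain set
subblock_set = ["AFFINE", "ATMVP", "STMVP"]         # sub-block motion info set
affine_set = [f"affine_{i}" for i in range(4)]      # AFFINE candidate set

DISPATCH = {"0": merge_list, "1": affine_set,
            "01": temporal_set, "11": subblock_set}

def decode(first_index, sub_index):
    """Route on the first index codeword, then pick by the second-level index."""
    return DISPATCH[first_index][sub_index]

print(decode("0", 2))   # -> merge_2
print(decode("01", 2))  # -> temporal_2
print(decode("11", 1))  # -> ATMVP
```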
  • An embodiment of the present application provides a decoding device for predicting motion information.
  • the device may be a video decoder, a video encoder, or a decoder.
  • the decoding apparatus for predicting motion information is configured to perform the steps performed by the decoding apparatus in the decoding method for predicting motion information.
  • the decoding apparatus for predicting motion information provided in the embodiment of the present application may include a module corresponding to a corresponding step.
  • the functional modules of the prediction motion information decoding device may be divided according to the foregoing method example.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules may be implemented in the form of hardware or software functional modules.
  • the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
  • FIG. 17 illustrates a possible structural diagram of a decoding apparatus for predicting motion information involved in the foregoing embodiment.
  • the decoding apparatus 1700 for predicting motion information may include an analysis module 1701, a determination module 1702, and an assignment module 1703.
  • the functions of each module are as follows:
  • the analysis module 1701 is configured to parse a code stream to obtain a first identifier.
  • a determining module 1702 is configured to determine a target element from a first candidate set according to a first identifier, and the elements in the first candidate set include at least one first candidate motion information and at least one second candidate set.
  • the second candidate set includes a plurality of second candidate motion information; the first candidate motion information includes the first motion information, and the second candidate motion information includes a preset motion information offset.
  • the assignment module 1703 is configured to use the first candidate motion information as the target motion information when the target element is the first candidate motion information, and the target motion information is used to predict the motion information of the image block to be processed.
  • the analysis module 1701 is further configured to parse the code stream to obtain a second identifier when the target element is the second candidate set, and the determination module 1702 is further configured to determine the target motion information from the plurality of second candidate motion information according to the second identifier. Alternatively, the parsing module 1701 is configured to parse the code stream to obtain the second identifier when the target element is obtained according to the plurality of second candidate motion information, and the determination module 1702 determines the target motion information based on one of the plurality of second candidate motion information according to the second identifier.
  • the analysis module 1701 is configured to support the decoding device 1700 for predicting motion information to perform S1501, S1505, and the like in the above embodiments, and / or other processes used in the technology described herein.
  • the determining module 1702 is configured to support the decoding apparatus 1700 for predicting motion information to perform S1502 and the like in the above embodiments, and / or other processes used in the technology described herein.
  • the assignment module 1703 is configured to support the decoding device 1700 for predicting motion information to perform S1502 and the like in the above embodiments, and / or other processes used in the technology described herein.
  • the analysis module 1701 is further configured to parse the code stream to obtain a third identifier, where the third identifier includes a preset coefficient.
  • the decoding apparatus 1700 for predicting motion information may further include a calculation module 1704, configured to multiply the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
  • the determination module 1702 is specifically configured to determine the target offset from the plurality of adjusted motion information offsets according to the second identifier.
  • FIG. 18 is a schematic structural block diagram of a decoding device 1800 for predicting motion information in an embodiment of the present application.
  • the decoding device 1800 for predicting motion information includes: a processor 1801 and a memory 1802 coupled to the processor; the processor 1801 is configured to perform the functions of the embodiment shown in FIG. 17 and its various feasible implementations.
  • the processing module 1801 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA, or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the present disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination including one or more microprocessors, a combination of a DSP and a microprocessor, and so on.
  • the storage module 1802 may be a memory.
  • the above-mentioned prediction motion information decoding device 1700 and the prediction motion information decoding device 1800 may both execute the above-mentioned prediction motion information decoding method shown in FIG. 15.
  • the prediction motion information decoding device 1700 and the prediction motion information decoding device 1800 may specifically be a video decoding device or other equipment with a video codec function.
  • the decoding apparatus 1700 for predicting motion information and the decoding apparatus 1800 for predicting motion information may be used to perform image prediction in the decoding process.
  • An embodiment of the present application provides an inter prediction device.
  • the inter prediction device may be a video decoder, a video encoder, or a decoder.
  • the inter prediction apparatus is configured to perform the steps performed by the inter prediction apparatus in the above inter prediction method.
  • the inter prediction apparatus provided in the embodiment of the present application may include a module corresponding to a corresponding step.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules may be implemented in the form of hardware or software functional modules.
  • the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
  • the present application also provides a terminal, which includes: one or more processors, a memory, and a communication interface.
  • the memory and the communication interface are coupled to one or more processors; the memory is used to store computer program code, and the computer program code includes instructions.
  • when the one or more processors execute the instructions, the terminal performs the decoding method for predicted motion information in the embodiments of the present application.
  • the terminal here can be a video display device, a smart phone, a portable computer, and other devices that can process or play videos.
  • the present application also provides a video decoder including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program; the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the decoding method for predicted motion information in the embodiments of the present application.
  • the present application further provides a decoder, which includes a decoding apparatus for predicting motion information in the embodiment of the present application.
  • Another embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium includes one or more program codes, the one or more programs include instructions, and when a processor in a terminal executes the program code, the terminal performs the decoding method for predicted motion information shown in FIG. 15.
  • a computer program product includes computer-executable instructions stored in a computer-readable storage medium; at least one processor of the terminal may read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions to cause the terminal to perform the decoding method for predicted motion information shown in FIG. 15.
  • all or part of them may be implemented by software, hardware, firmware, or any combination thereof.
  • when implemented using a software program, the foregoing may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions according to the embodiments of the present application are generated.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), and the like.
  • the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code, and executed by a hardware-based processing unit.
  • the computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium such as a data storage medium, or a communication medium, which includes any medium that facilitates transfer of a computer program from one place to another, for example according to a communication protocol.
  • computer-readable media may illustratively correspond to (1) non-transitory, tangible computer-readable storage media, or (2) a communication medium such as a signal or carrier wave.
  • a data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and / or data structures used to implement the techniques described in this application.
  • the computer program product may include a computer-readable medium.
  • the computer-readable storage medium may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium.
  • For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but are instead directed to non-transitory, tangible storage media.
  • magnetic disks and optical discs, as used herein, include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks, and Blu-ray discs, where magnetic disks typically reproduce data magnetically, while optical discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits.
  • accordingly, the term "processor" as used herein may refer to any of the aforementioned structures or any other structure suitable for implementing the techniques described herein.
  • functionality described herein may be provided within dedicated hardware and / or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
  • the techniques of this application can be implemented in a wide variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or sets of ICs (e.g., a chipset).
  • Various components, modules, or units are described in this application to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application relate to a decoding method and device for predicted motion information. Said method comprises: parsing a code stream to obtain a first identifier; according to the first identifier, determining from a first candidate set a target element, an element in the first candidate set comprising at least one piece of first candidate motion information and a plurality of pieces of second candidate motion information, the first candidate motion information including first motion information, and the second candidate motion information including a preset motion information offset; when the target element is the first candidate motion information, using the first candidate motion information as target motion information, the target motion information being used to predict motion information concerning an image block to be processed; and when the target element is obtained according to the plurality of pieces of second candidate motion information, parsing the code stream to obtain a second identifier, and according to the second identifier and on the basis of one of the plurality of pieces of second candidate motion information, determining the target motion information.

Description

Decoding method and device for predicted motion information
This application claims priority to Chinese Patent Application No. 201811068957.4, filed with the State Intellectual Property Office on September 13, 2018 and entitled "Video encoding and decoding method and apparatus", and to Chinese Patent Application No. 201811264674.7, filed with the State Intellectual Property Office on October 26, 2018 and entitled "Decoding method and apparatus for predicted motion information", both of which are incorporated herein by reference in their entireties.
Technical field
This application relates to the field of video encoding and decoding technologies, and in particular, to a decoding method and apparatus for predicted motion information.
Background
Digital video technologies can be widely used in various apparatuses, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop computers, tablet computers, e-book readers, digital cameras, digital recording apparatuses, digital media players, video game apparatuses, video game consoles, cellular or satellite radio telephones, video teleconferencing apparatuses, video streaming apparatuses, and the like. Digital video apparatuses implement video decoding technologies to efficiently send, receive, encode, decode, and/or store digital video information.
Among video decoding technologies, video compression technologies are particularly important. Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in a video sequence. The basic principle of video compression is to remove redundancy as much as possible by exploiting correlations in the spatial domain, in the temporal domain, and between codewords. A currently popular approach is a block-based hybrid video coding framework, in which video compression is achieved through steps such as prediction (including intra prediction and inter prediction), transform, quantization, and entropy coding.
Inter prediction exploits temporal correlation in a video by using pixels of adjacent encoded pictures to predict pixels of a current picture, so as to effectively remove temporal redundancy. In inter prediction, predicted motion information of each picture block is determined from a candidate motion information list, and a prediction block of the picture block is then generated through a motion compensation process. The motion information includes reference picture information and a motion vector. The reference picture information includes unidirectional/bidirectional prediction information, a reference picture list, and a reference picture index corresponding to the reference picture list. The motion vector is a position offset in the horizontal and vertical directions.
Currently, there are many inter prediction modes, including a merge (Merge) mode, an affine merge (Affine Merge) mode, an advanced motion vector prediction (Advanced Motion Vector Prediction, AMVP) mode, an affine AMVP (Affine AMVP) mode, and so on.
To improve accuracy of inter prediction, introducing more candidates makes the candidate motion information list increasingly long, which is disadvantageous for the checking process and for hardware implementation.
Summary
Embodiments of this application provide a decoding method and apparatus for predicted motion information, which can effectively control the length of a candidate motion information list when more candidate motion information is introduced.
To achieve the foregoing objective, the embodiments of this application use the following technical solutions.
According to a first aspect of the embodiments of this application, a decoding method for predicted motion information is provided, including: parsing a bitstream to obtain a first identifier; determining a target element from a first candidate set according to the first identifier, where elements in the first candidate set include at least one piece of first candidate motion information and a plurality of pieces of second candidate motion information, the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset; when the target element is the first candidate motion information, using the first candidate motion information serving as the target element as target motion information, where the target motion information is used to predict motion information of a to-be-processed picture block; and when the target element is obtained according to the plurality of pieces of second candidate motion information, parsing the bitstream to obtain a second identifier, and determining the target motion information according to the second identifier and based on one of the plurality of pieces of second candidate motion information.
According to the decoding method for predicted motion information provided in this application, the elements in the first candidate set include the first candidate motion information and the plurality of pieces of second candidate motion information. With this multi-layer candidate set structure, when more candidates are introduced, a whole class of candidate motion information can be added to the first candidate set as a single element, which greatly shortens the length of the first candidate set compared with directly adding each piece of candidate motion information to the first candidate set. When the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, facilitating the checking process and hardware implementation.
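For illustration only (not part of the claims), the two-level decoding flow of the first aspect can be sketched as follows. All names, the offset values, and the `Bitstream` stand-in are hypothetical; real bitstream parsing uses entropy decoding rather than pre-parsed indices.

```python
# Hypothetical offsets of the second candidate motion information, in 1/4-pel units.
PRESET_OFFSETS = [(4, 0), (-4, 0), (0, 4), (0, -4)]

class Bitstream:
    """Minimal stand-in that yields pre-parsed identifier values."""
    def __init__(self, indices):
        self._it = iter(indices)
    def read_index(self):
        return next(self._it)

def decode_target_motion(bitstream, first_candidates):
    """first_candidates: list of (mvx, mvy) first candidate motion information.
    Conceptually, the first candidate set contains these candidates plus ONE
    element standing for all offset-based second candidates, so the first-level
    list stays short no matter how many offsets exist."""
    first_id = bitstream.read_index()            # first identifier
    if first_id < len(first_candidates):
        return first_candidates[first_id]        # target element is a first candidate
    # Target element is derived from the second candidates: parse the second
    # identifier and add the selected preset offset to the first motion
    # information (here taken as the first candidate in the list).
    second_id = bitstream.read_index()           # second identifier
    mvx, mvy = first_candidates[0]               # first motion information
    dx, dy = PRESET_OFFSETS[second_id]
    return (mvx + dx, mvy + dy)
```

For example, with two first candidates, a first identifier of 1 returns the second Merge candidate directly, whereas a first identifier of 2 triggers parsing of the second identifier and applies an offset to the base motion vector.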
In a feasible implementation of the first aspect, the first identifier may be a category identifier, used to indicate the category to which the target element belongs.
In a feasible implementation of the first aspect, the decoding method for predicted motion information provided in this embodiment of this application may further include: parsing the bitstream to obtain a fourth identifier, where the fourth identifier is an index of the target element, within the category indicated by the first identifier, in the first candidate set. In this implementation, the target element is uniquely determined by the fourth identifier in combination with the first identifier.
In a feasible implementation of the first aspect, the first candidate motion information includes motion information of a spatially adjacent picture block of the to-be-processed picture block.
In a feasible implementation of the first aspect, the first candidate motion information may be candidate motion information generated in the Merge mode.
In a feasible implementation of the first aspect, the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
In a feasible implementation of the first aspect, the determining the target motion information according to the second identifier and based on one of the plurality of pieces of second candidate motion information includes: determining a target offset from a plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
In a feasible implementation of the first aspect, among the at least one piece of first candidate motion information, the encoding codeword used to identify the first motion information is the shortest.
In a feasible implementation of the first aspect, when the target element is obtained according to the plurality of pieces of second candidate motion information, the decoding method for predicted motion information provided in this application may further include: parsing the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
In a feasible implementation of the first aspect, before the determining the target motion information according to the second identifier and based on one of the plurality of pieces of second candidate motion information, the method further includes: multiplying the plurality of preset motion information offsets by the preset coefficient, to obtain a plurality of adjusted motion information offsets.
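The offset-adjustment step above can be sketched as follows; this is an illustrative example only, and the offset values and coefficient are hypothetical, not values mandated by the method.

```python
# Hypothetical preset offsets, in 1/4-pel units.
PRESET_OFFSETS = [(4, 0), (-4, 0), (0, 4), (0, -4)]

def select_target_motion(first_motion, coeff, second_id):
    """Scale every preset offset by the preset coefficient carried in the
    third identifier, then use the second identifier to pick the target
    offset and add it to the first motion information."""
    adjusted = [(dx * coeff, dy * coeff) for dx, dy in PRESET_OFFSETS]
    dx, dy = adjusted[second_id]
    return (first_motion[0] + dx, first_motion[1] + dy)
```

With a coefficient of 2, the second offset (-4, 0) becomes (-8, 0), so a base motion vector of (8, 8) and a second identifier of 1 yield a target motion vector of (0, 8).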
In a feasible implementation of the first aspect, the target motion information being used to predict the motion information of the to-be-processed picture block includes: using the target motion information as the motion information of the to-be-processed picture block; or using the target motion information as predicted motion information of the to-be-processed picture block. After the motion information or the predicted motion information of the to-be-processed picture block is obtained, motion compensation is performed to generate the picture block or its prediction block.
In a feasible implementation of the first aspect, the second identifier may use fixed-length coding, which saves bytes occupied by the identifier.
In a feasible implementation of the first aspect, the second identifier may use variable-length coding, which allows more candidate motion information to be identified.
According to a second aspect of the embodiments of this application, another decoding method for predicted motion information is provided, including: parsing a bitstream to obtain a first identifier; determining a target element from a first candidate set according to the first identifier, where elements in the first candidate set include at least one piece of first candidate motion information and at least one second candidate set, and elements in the second candidate set include a plurality of pieces of second candidate motion information; when the target element is the first candidate motion information, using the first candidate motion information serving as the target element as target motion information, where the target motion information is used to predict motion information of a to-be-processed picture block; and when the target element is the second candidate set, parsing the bitstream to obtain a second identifier, and determining the target motion information from the plurality of pieces of second candidate motion information according to the second identifier.
According to the decoding method for predicted motion information provided in this application, the elements in the first candidate set include the first candidate motion information and the at least one second candidate set. With this multi-layer candidate set structure, when more candidates are introduced, a whole class of candidate motion information can be added to the first candidate set as a single element, which greatly shortens the length of the first candidate set compared with directly adding each piece of candidate motion information to the first candidate set. When the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, facilitating the checking process and hardware implementation.
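As a purely illustrative data layout (names and values hypothetical), the nested structure of the second aspect can be sketched as a first-level list that mixes individual first candidates with whole second candidate sets, so each nested set costs only one first-level entry:

```python
# First candidate set: two first candidates plus one second candidate set.
first_candidate_set = [
    ("mv", (10, -3)),                         # first candidate motion information
    ("mv", (2, 5)),                           # first candidate motion information
    ("set", [(12, -1), (9, 0), (11, -6)]),    # second candidate set (3 candidates)
]

def resolve(first_id, second_id=None):
    """Resolve the target motion information from the two identifiers."""
    kind, payload = first_candidate_set[first_id]
    if kind == "mv":
        return payload                # target element is a first candidate
    return payload[second_id]         # second identifier indexes into the set

# The first-level list length stays 3 even though 5 motion candidates are
# reachable, which is the length-control benefit described above.
```

Here `resolve(1)` returns the second first candidate directly, while `resolve(2, 1)` descends into the nested set and returns its second element.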
In a feasible implementation of the second aspect, the first identifier may be a category identifier, used to indicate the category to which the target element belongs.
In a feasible implementation of the second aspect, the decoding method for predicted motion information provided in this embodiment of this application may further include: parsing the bitstream to obtain a fourth identifier, where the fourth identifier is an index of the target element, within the category indicated by the first identifier, in the first candidate set. In this implementation, the target element is uniquely determined by the fourth identifier in combination with the first identifier.
In a feasible implementation of the second aspect, the first candidate motion information includes motion information of a spatially adjacent picture block of the to-be-processed picture block.
In a feasible implementation of the second aspect, the first candidate motion information may be candidate motion information generated in the Merge mode.
In a feasible implementation of the second aspect, the second candidate motion information includes motion information of a spatially non-adjacent picture block of the to-be-processed picture block.
In a feasible implementation of the second aspect, the second candidate motion information may be candidate motion information generated in the Affine Merge mode.
In a feasible implementation of the second aspect, the first candidate motion information includes first motion information, the second candidate motion information includes second motion information, and the second motion information is obtained based on the first motion information and a preset motion information offset.
In a feasible implementation of the second aspect, the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset; correspondingly, the determining the target motion information from the plurality of pieces of second candidate motion information according to the second identifier includes: determining a target offset from a plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
In a feasible implementation of the second aspect, the first candidate motion information includes first motion information, the at least one second candidate set included in the first candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one third candidate set and at least one fourth candidate set, where elements in the third candidate set include motion information of a plurality of spatially non-adjacent picture blocks of the to-be-processed picture block, and elements in the fourth candidate set include a plurality of pieces of motion information obtained based on the first motion information and preset motion information offsets.
In a feasible implementation of the second aspect, among the at least one piece of first candidate motion information, the encoding codeword used to identify the first motion information is the shortest.
In a feasible implementation of the second aspect, the first motion information does not include motion information obtained according to an alternative temporal motion vector prediction (ATMVP) mode.
In a feasible implementation of the second aspect, the at least one second candidate set included in the first candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one fifth candidate set and at least one sixth candidate set, where elements in the fifth candidate set include motion information of a plurality of spatially non-adjacent picture blocks of the to-be-processed picture block, and elements in the sixth candidate set include a plurality of preset motion information offsets.
In a feasible implementation of the second aspect, when the target element is the second candidate set, the decoding method for predicted motion information provided in this application may further include: parsing the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
In a feasible implementation of the second aspect, before the determining a target offset from a plurality of preset motion information offsets according to the second identifier, the method further includes: multiplying the plurality of preset motion information offsets by the preset coefficient included in the third identifier, to obtain a plurality of adjusted motion information offsets; correspondingly, the determining a target offset from a plurality of preset motion information offsets according to the second identifier includes: determining the target offset from the plurality of adjusted motion information offsets according to the second identifier.
In a feasible implementation of the second aspect, the second candidate motion information is different from the first candidate motion information. Specifically, the first candidate motion information and the second candidate motion information may be candidate motion information selected according to different inter prediction modes.
In a feasible implementation of the second aspect, the target motion information being used to predict the motion information of the to-be-processed picture block includes: using the target motion information as the motion information of the to-be-processed picture block; or using the target motion information as predicted motion information of the to-be-processed picture block. After the motion information or the predicted motion information of the to-be-processed picture block is obtained, motion compensation is performed to generate the picture block or its prediction block.
In a feasible implementation of the second aspect, the second identifier may use fixed-length coding, which saves bytes occupied by the identifier.
In a feasible implementation of the second aspect, the second identifier may use variable-length coding, which allows more candidate motion information to be identified.
It should be noted that the specific implementations of the decoding methods for predicted motion information provided in the foregoing first aspect and second aspect may refer to each other, and details are not described again one by one.
According to a third aspect of the embodiments of this application, a decoding apparatus for predicted motion information is provided, including: a parsing module, configured to parse a bitstream to obtain a first identifier; a determining module, configured to determine a target element from a first candidate set according to the first identifier, where elements in the first candidate set include at least one piece of first candidate motion information and a plurality of pieces of second candidate motion information, the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset; and an assignment module, configured to: when the target element is the first candidate motion information, use the first candidate motion information as target motion information, where the target motion information is used to predict motion information of a to-be-processed picture block. The parsing module is further configured to: when the target element is obtained according to the plurality of pieces of second candidate motion information, parse the bitstream to obtain a second identifier, and determine the target motion information according to the second identifier and based on one of the plurality of pieces of second candidate motion information.
According to the decoding apparatus for predicted motion information provided in this application, the elements in the first candidate set include the first candidate motion information and the plurality of pieces of second candidate motion information. With this multi-layer candidate set structure, when more candidates are introduced, a whole class of candidate motion information can be added to the first candidate set as a single element, which greatly shortens the length of the first candidate set compared with directly adding each piece of candidate motion information to the first candidate set. When the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, facilitating the checking process and hardware implementation.
In a feasible implementation of the third aspect, the first candidate motion information may include motion information of a spatially adjacent picture block of the to-be-processed picture block.
In a feasible implementation of the third aspect, the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
In a feasible implementation of the third aspect, the parsing module is specifically configured to: determine a target offset from a plurality of preset motion information offsets according to the second identifier; and determine the target motion information based on the first motion information and the target offset.
In a feasible implementation of the third aspect, among the at least one piece of first candidate motion information, the encoding codeword used to identify the first motion information is the shortest.
In a feasible implementation of the third aspect, when the target element is obtained according to the plurality of pieces of second candidate motion information, the parsing module is further configured to parse the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
In a feasible implementation of the third aspect, the apparatus further includes a calculation module, configured to multiply the plurality of preset motion information offsets by the preset coefficient, to obtain a plurality of adjusted motion information offsets.
In a feasible implementation of the third aspect, the determining module is specifically configured to: determine the target offset, according to the second identifier, from the plurality of adjusted motion information offsets obtained by the calculation module; and then determine the target motion information based on the first motion information and the target offset.
In a feasible implementation of the third aspect, the determining module is specifically configured to: use the target motion information as the motion information of the to-be-processed picture block; or use the target motion information as predicted motion information of the to-be-processed picture block.
In a feasible implementation of the third aspect, the second identifier uses fixed-length coding.
In a feasible implementation of the third aspect, the second identifier uses variable-length coding.
It should be noted that the decoding apparatus for predicted motion information provided in the third aspect of the embodiments of this application is configured to perform the decoding method for predicted motion information provided in the foregoing first aspect. The specific implementations are the same and are not described again one by one.
本申请实施例的第四方面,提供了一种预测运动信息的解码装置,包括:解析模块,用于解析码流以获得第一标识;确定模块,用于根据第一标识,从第一候选集合中确定目标元素,第一候选集合中的元素包括至少一个第一候选运动信息和至少一个第二候选集合,第二候选集合中的元素包括多个第二候选运动信息;赋值模块,当目标元素为第一候选运动信息时,用于将第一候选运动信息作为目标运动信息,该目标运动信息用来预测待处理图像块的运动信息;解析模块还用于,当目标元素为第二候选集合时,解析码流以获得第二标识,确定模块还用于根据第二标识,从所述多个第二候选运动信息中确定目标运动信息。According to a fourth aspect of the embodiments of the present application, a decoding apparatus for predicting motion information is provided, including: a parsing module for parsing a bitstream to obtain a first identifier; and a determining module for parsing a first candidate from the first candidate according to the first identifier. The target element is determined in the set. The elements in the first candidate set include at least one first candidate motion information and at least one second candidate set. The elements in the second candidate set include a plurality of second candidate motion information; the assignment module, when the target When the element is the first candidate motion information, it is used to use the first candidate motion information as the target motion information, and the target motion information is used to predict the motion information of the image block to be processed; the analysis module is further configured to, when the target element is the second candidate During assembly, the code streams are parsed to obtain a second identifier, and the determining module is further configured to determine target motion information from the plurality of second candidate motion information according to the second identifier.
通过本申请提供的预测运动信息的解码装置,第一候选集合中的元素包括了第一 候选运动信息以及至少一个第二候选集合,这样一来,多层候选集合的结构,当引入更多候选时,可以将一类候选运动信息的集合作为一个元素添加在第一候选集合中,相比于直接将候选运动信息加入第一候选集合,大大所选了第一候选集合的长度。当第一候选集合为帧间预测的候选运动信息列表时,即使引起更多的候选,也可以很好的控制候选运动信息列表的长度,为检测过程和硬件实现提供便利。Through the decoding apparatus for predicting motion information provided in this application, the elements in the first candidate set include the first candidate motion information and at least one second candidate set. In this way, the structure of the multi-layer candidate set, when more candidates are introduced In this case, a type of candidate motion information set can be added as an element to the first candidate set. Compared with directly adding the candidate motion information to the first candidate set, the length of the first candidate set is greatly selected. When the first candidate set is a candidate motion information list for inter prediction, even if more candidates are caused, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
在第四方面的一种可行的实施方式中,第一候选运动信息可以包括待处理图像块的空域相邻图像块的运动信息。In a feasible implementation manner of the fourth aspect, the first candidate motion information may include motion information of spatially adjacent image blocks of the image block to be processed.
在第四方面的一种可行的实施方式中,第二候选运动信息可以包括所述待处理图像块的空域非相邻图像块的运动信息。In a feasible implementation manner of the fourth aspect, the second candidate motion information may include motion information of a spatial domain non-adjacent image block of the image block to be processed.
在第四方面的一种可行的实施方式中，第一候选运动信息包括第一运动信息，第二候选运动信息包括第二运动信息，第二运动信息基于第一运动信息和预设的运动信息偏移量获得。In a feasible implementation manner of the fourth aspect, the first candidate motion information includes first motion information, the second candidate motion information includes second motion information, and the second motion information is obtained based on the first motion information and a preset motion information offset.
在第四方面的一种可行的实施方式中，第一候选运动信息包括第一运动信息，第二候选运动信息包括预设的运动信息偏移量；对应的，解析模块具体用于：根据第二标识从多个预设的运动信息偏移量中确定目标偏移量；基于第一运动信息和目标偏移量确定目标运动信息。In a feasible implementation manner of the fourth aspect, the first candidate motion information includes first motion information, and the second candidate motion information includes preset motion information offsets; correspondingly, the parsing module is specifically configured to: determine a target offset from a plurality of preset motion information offsets according to the second identifier, and determine the target motion information based on the first motion information and the target offset.
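A minimal sketch of this implementation manner (the names are illustrative assumptions, not taken from the claims): the second identifier selects one preset offset, which is then added to the first motion information, here modeled as a motion vector (x, y):

```python
def derive_target_motion_info(first_mv, preset_offsets, second_id):
    """Select the target offset by the second identifier and add it to the
    first motion information to obtain the target motion information."""
    dx, dy = preset_offsets[second_id]
    return (first_mv[0] + dx, first_mv[1] + dy)
```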
在第四方面的一种可行的实施方式中，第一候选运动信息包括第一运动信息，至少一个第二候选集合为多个第二候选集合，多个第二候选集合包括至少一个第三候选集合和至少一个第四候选集合，第三候选集合中的元素包括多个待处理图像块的空域非相邻图像块的运动信息，第四候选集合中的元素包括多个基于第一运动信息和预设的运动信息偏移量获得的运动信息。In a feasible implementation manner of the fourth aspect, the first candidate motion information includes first motion information, the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one third candidate set and at least one fourth candidate set, where the elements in the third candidate set include motion information of a plurality of spatially non-adjacent image blocks of the image block to be processed, and the elements in the fourth candidate set include a plurality of pieces of motion information obtained based on the first motion information and preset motion information offsets.
在第四方面的一种可行的实施方式中,在至少一个第一候选运动信息中,用于标识第一运动信息的编码码字最短。In a feasible implementation manner of the fourth aspect, among the at least one first candidate motion information, an encoding codeword for identifying the first motion information is shortest.
在第四方面的一种可行的实施方式中,第一运动信息不包括根据ATMVP模式获得的运动信息。In a feasible implementation manner of the fourth aspect, the first motion information does not include motion information obtained according to the ATMVP mode.
在第四方面的一种可行的实施方式中，至少一个第二候选集合为多个第二候选集合，多个第二候选集合包括至少一个第五候选集合和至少一个第六候选集合，第五候选集合中的元素包括多个待处理图像块的空域非相邻图像块的运动信息，第六候选集合中的元素包括多个预设的运动信息偏移量。In a feasible implementation manner of the fourth aspect, the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets include at least one fifth candidate set and at least one sixth candidate set, where the elements in the fifth candidate set include motion information of a plurality of spatially non-adjacent image blocks of the image block to be processed, and the elements in the sixth candidate set include a plurality of preset motion information offsets.
在第四方面的一种可行的实施方式中,当目标元素为第二候选集合时,解析模块还用于:解析码流以获得第三标识,第三标识包括预设系数。In a feasible implementation manner of the fourth aspect, when the target element is the second candidate set, the parsing module is further configured to parse the code stream to obtain a third identifier, and the third identifier includes a preset coefficient.
在第四方面的一种可行的实施方式中，还包括计算模块，用于将多个预设的运动信息偏移量和预设系数相乘，以得到多个调整后的运动信息偏移量；对应的，确定模块具体用于，根据第二标识从计算模块得到的多个调整后的运动信息偏移量中确定目标偏移量，再基于第一运动信息和目标偏移量确定目标运动信息。In a feasible implementation manner of the fourth aspect, the apparatus further includes a calculation module, configured to multiply a plurality of preset motion information offsets by a preset coefficient to obtain a plurality of adjusted motion information offsets; correspondingly, the determining module is specifically configured to determine a target offset from the plurality of adjusted motion information offsets obtained by the calculation module according to the second identifier, and then determine the target motion information based on the first motion information and the target offset.
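The variant with the calculation module can be sketched the same way (again purely illustrative): every preset offset is first multiplied by the preset coefficient carried by the third identifier, and the second identifier then selects from the adjusted offsets:

```python
def derive_with_coefficient(first_mv, preset_offsets, coeff, second_id):
    """Multiply all preset offsets by the preset coefficient, then select
    the target offset by the second identifier and apply it to the first
    motion information."""
    adjusted = [(dx * coeff, dy * coeff) for dx, dy in preset_offsets]
    dx, dy = adjusted[second_id]
    return (first_mv[0] + dx, first_mv[1] + dy)
```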
在第四方面的一种可行的实施方式中,第二候选运动信息和第一候选运动信息不相同。In a feasible implementation manner of the fourth aspect, the second candidate motion information and the first candidate motion information are different.
在第四方面的一种可行的实施方式中,确定模块具体用于,将目标运动信息作为待处理图像块的运动信息;或者,将目标运动信息作为待处理图像块的预测运动信息。In a feasible implementation manner of the fourth aspect, the determining module is specifically configured to use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
在第四方面的一种可行的实施方式中,第二标识采用定长编码方式。In a feasible implementation manner of the fourth aspect, the second identifier adopts a fixed-length encoding manner.
在第四方面的一种可行的实施方式中,第二标识采用变长编码方式。In a feasible implementation manner of the fourth aspect, the second identifier adopts a variable length coding method.
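To illustrate the difference between the two signaling options (these binarizations are common examples, not asserted to be the ones used by the embodiments): a fixed-length code spends the same number of bits on every index of the second identifier, while a variable-length code such as truncated unary spends fewer bits on smaller, more probable indices:

```python
import math

def fixed_length_code(index, num_candidates):
    """Fixed-length binarization: every index costs ceil(log2(N)) bits."""
    bits = max(1, math.ceil(math.log2(num_candidates)))
    return format(index, '0{}b'.format(bits))

def truncated_unary_code(index, num_candidates):
    """Truncated unary binarization: index k costs k+1 bits (k bits for the
    last index), so smaller indices get shorter codewords."""
    return '1' * index + ('' if index == num_candidates - 1 else '0')
```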
本申请实施例的第五方面，提供了一种预测运动信息的解码装置，包括：处理器和耦合于所述处理器的存储器；所述处理器用于执行上述第一方面或第二方面所述的预测运动信息的解码方法。According to a fifth aspect of the embodiments of the present application, a decoding apparatus for predicting motion information is provided, including a processor and a memory coupled to the processor, where the processor is configured to perform the decoding method for predicting motion information according to the first aspect or the second aspect.
本申请实施例的第六方面，提供一种视频解码器，包括非易失性存储介质以及中央处理器，所述非易失性存储介质存储有可执行程序，所述中央处理器与所述非易失性存储介质连接，并执行如上述第一方面和/或第二方面或任意一种可能的实现方式所述的预测运动信息的解码方法。According to a sixth aspect of the embodiments of the present application, a video decoder is provided, including a non-volatile storage medium and a central processing unit, where the non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and performs the decoding method for predicting motion information according to the first aspect and/or the second aspect or any possible implementation manner thereof.
本申请实施例的第七方面，提供了一种计算机可读存储介质，所述计算机可读存储介质中存储有指令，当所述指令在计算机上运行时，使得计算机执行上述第一方面或第二方面所述的预测运动信息的解码方法。According to a seventh aspect of the embodiments of the present application, a computer-readable storage medium is provided, where the computer-readable storage medium stores instructions, and when the instructions are run on a computer, the computer is caused to perform the decoding method for predicting motion information according to the first aspect or the second aspect.
本申请实施例的第八方面，提供了一种包含指令的计算机程序产品，当所述指令在计算机上运行时，使得计算机执行上述第一方面或第二方面所述的预测运动信息的解码方法。According to an eighth aspect of the embodiments of the present application, a computer program product including instructions is provided, and when the instructions are run on a computer, the computer is caused to perform the decoding method for predicting motion information according to the first aspect or the second aspect.
应理解,本申请的第三至八方面与本申请的第一方面或第二方面的技术方案一致,各方面及对应的可实施的设计方式所取得的有益效果相似,不再赘述。It should be understood that the third to eighth aspects of the present application are consistent with the technical solutions of the first aspect or the second aspect of this application, and the beneficial effects obtained by each aspect and the corresponding implementable design manner are similar and will not be described again.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
图1为示例性的可通过配置以用于本申请实施例的一种视频译码系统框图；FIG. 1 is an exemplary block diagram of a video decoding system that can be configured for use in an embodiment of the present application;
图2为示例性的可通过配置以用于本申请实施例的一种视频编码器的系统框图；FIG. 2 is an exemplary system block diagram of a video encoder that can be configured for use in an embodiment of the present application;
图3为示例性的可通过配置以用于本申请实施例的一种视频解码器的系统框图；FIG. 3 is an exemplary system block diagram of a video decoder that can be configured for use in embodiments of the present application;
图4为示例性的可通过配置以用于本申请实施例的一种帧间预测模块的框图;4 is a block diagram of an exemplary inter prediction module that can be configured for use in an embodiment of the present application;
图5为示例性的一种合并预测模式的实施流程图;5 is an exemplary implementation flowchart of a merge prediction mode;
图6为示例性的一种高级运动矢量预测模式的实施流程图;6 is an exemplary implementation flowchart of an advanced motion vector prediction mode;
图7为示例性的可通过配置以用于本申请实施例的一种由视频解码器执行的运动补偿的实施流程图;7 is an exemplary implementation flowchart of a motion compensation performed by a video decoder that can be configured for an embodiment of the present application;
图8为示例性的一种编码单元及与其关联的相邻位置图像块的示意图;FIG. 8 is a schematic diagram of an exemplary coding unit and adjacent position image blocks associated with the coding unit;
图9为示例性的一种构建候选预测运动矢量列表的实施流程图;9 is an exemplary implementation flowchart of constructing a candidate prediction motion vector list;
图10为示例性的一种将经过组合的候选运动矢量添加到合并模式候选预测运动矢量列表的实施示意图;10 is an exemplary implementation diagram of adding a combined candidate motion vector to a merge mode candidate prediction motion vector list;
图11为示例性的一种将经过缩放的候选运动矢量添加到合并模式候选预测运动矢量列表的实施示意图;11 is an exemplary implementation diagram of adding a scaled candidate motion vector to a merge mode candidate prediction motion vector list;
图12为示例性的一种将零运动矢量添加到合并模式候选预测运动矢量列表的实施示意图;12 is an exemplary implementation diagram of adding a zero motion vector to a merge mode candidate prediction motion vector list;
图13为示例性的另一种编码单元及与其关联的相邻位置图像块的示意图;FIG. 13 is a schematic diagram of another exemplary coding unit and adjacent position image blocks associated with the coding unit;
图14A为示例性的一种构建候选运动矢量集合方法的示意图;14A is a schematic diagram of an exemplary method for constructing a candidate motion vector set;
图14B为示例性的一种构建候选运动矢量集合方法的示意图;14B is a schematic diagram of an exemplary method for constructing a candidate motion vector set;
图15为本申请实施例中预测运动信息的解码方法的一个示意性流程图;15 is a schematic flowchart of a decoding method for predicting motion information according to an embodiment of the present application;
图16A为示例性的一种构建候选运动矢量集合方法的示意图;16A is a schematic diagram of an exemplary method for constructing a candidate motion vector set;
图16B为示例性的一种构建候选运动矢量集合方法的示意图;16B is a schematic diagram of an exemplary method for constructing a candidate motion vector set;
图16C为示例性的一种构建候选运动矢量集合方法的示意图;16C is a schematic diagram of an exemplary method for constructing a candidate motion vector set;
图17为本申请实施例中预测运动信息的解码装置的一个示意性框图；FIG. 17 is a schematic block diagram of a decoding apparatus for predicting motion information according to an embodiment of the present application;
图18为本申请实施例中预测运动信息的解码装置的一个示意性框图。FIG. 18 is a schematic block diagram of a decoding apparatus for predicting motion information according to an embodiment of the present application.
具体实施方式DETAILED DESCRIPTION
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”和“第四”等是用于区别不同对象,而不是用于限定特定顺序。The terms "first", "second", "third", and "fourth" in the description and claims of the present application and the above-mentioned drawings are used to distinguish different objects, rather than to define a specific order.
在本申请实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请实施例中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。In the embodiments of the present application, words such as "exemplary" or "for example" are used as examples, illustrations or illustrations. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the use of the words "exemplary" or "for example" is intended to present the relevant concept in a concrete manner.
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述。The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
图1为本申请实施例中所描述的一种实例的视频译码系统1的框图。如本文所使用，术语“视频译码器”一般是指视频编码器和视频解码器两者。在本申请中，术语“视频译码”或“译码”可一般地指代视频编码或视频解码。视频译码系统1的视频编码器100和视频解码器200用于根据多种新的帧间预测模式中的任一种预测当前经译码图像块或其子块的运动信息，例如运动矢量，使得预测出的运动矢量最大程度上接近使用运动估算方法得到的运动矢量，从而编码时无需传送运动矢量差值，从而进一步的改善编解码性能。FIG. 1 is a block diagram of an example video decoding system 1 described in the embodiments of the present application. As used herein, the term "video coder" generally refers to both video encoders and video decoders. In this application, the terms "video coding" or "coding" may generally refer to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video decoding system 1 are configured to predict motion information, for example a motion vector, of a currently coded image block or a sub-block thereof according to any one of multiple new inter prediction modes, so that the predicted motion vector is as close as possible to the motion vector obtained using a motion estimation method; it is therefore not necessary to transmit a motion vector difference during encoding, which further improves encoding and decoding performance.
如图1中所示,视频译码系统1包含源装置10和目的地装置20。源装置10产生经编码视频数据。因此,源装置10可被称为视频编码装置。目的地装置20可对由源装置10所产生的经编码的视频数据进行解码。因此,目的地装置20可被称为视频解码装置。源装置10、目的地装置20或两个的各种实施方案可包含一或多个处理器以及耦合到所述一或多个处理器的存储器。所述存储器可包含但不限于RAM、ROM、EEPROM、快闪存储器或可用于以可由计算机存取的指令或数据结构的形式存储所要的程序代码的任何其它媒体,如本文所描述。As shown in FIG. 1, the video decoding system 1 includes a source device 10 and a destination device 20. The source device 10 generates encoded video data. Therefore, the source device 10 may be referred to as a video encoding device. The destination device 20 may decode the encoded video data generated by the source device 10. Therefore, the destination device 20 may be referred to as a video decoding device. Various implementations of the source device 10, the destination device 20, or both may include one or more processors and a memory coupled to the one or more processors. The memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other media that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein.
源装置10和目的地装置20可以包括各种装置，包含桌上型计算机、移动计算装置、笔记型（例如，膝上型）计算机、平板计算机、机顶盒、例如所谓的“智能”电话等电话手持机、电视机、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机或其类似者。The source device 10 and the destination device 20 may include various devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
目的地装置20可经由链路30从源装置10接收经编码视频数据。链路30可包括能够将经编码视频数据从源装置10移动到目的地装置20的一或多个媒体或装置。在一个实例中，链路30可包括使得源装置10能够实时将经编码视频数据直接发射到目的地装置20的一或多个通信媒体。在此实例中，源装置10可根据通信标准（例如无线通信协议）来调制经编码视频数据，且可将经调制的视频数据发射到目的地装置20。所述一或多个通信媒体可包含无线和/或有线通信媒体，例如射频（radio frequency，RF）频谱或一或多个物理传输线。所述一或多个通信媒体可形成基于分组的网络的一部分，基于分组的网络例如为局域网、广域网或全球网络（例如，因特网）。所述一或多个通信媒体可包含路由器、交换器、基站或促进从源装置10到目的地装置20的通信的其它设备。The destination device 20 may receive the encoded video data from the source device 10 via the link 30. The link 30 may include one or more media or devices capable of moving the encoded video data from the source device 10 to the destination device 20. In one example, the link 30 may include one or more communication media enabling the source device 10 to directly transmit the encoded video data to the destination device 20 in real time. In this example, the source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to the destination device 20. The one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the Internet). The one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from the source device 10 to the destination device 20.
在另一实例中,可将经编码数据从输出接口140输出到存储装置40。类似地,可通过输入接口240从存储装置40存取经编码数据。存储装置40可包含多种分布式或本地存取的数据存储媒体中的任一者,例如硬盘驱动器、蓝光光盘、数字通用光盘(digital video disc,DVD)、只读光盘(compact disc read-only memory,CD-ROM)、快闪存储器、易失性或非易失性存储器,或用于存储经编码视频数据的任何其它合适的数字存储媒体。In another example, the encoded data may be output from the output interface 140 to the storage device 40. Similarly, the encoded data can be accessed from the storage device 40 through the input interface 240. The storage device 40 may include any of a variety of distributed or locally accessed data storage media, such as a hard disk drive, a Blu-ray disc, a digital video disc (DVD), and a compact disc-read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.
在另一实例中，存储装置40可对应于文件服务器或可保持由源装置10产生的经编码视频的另一中间存储装置。目的地装置20可经由流式传输或下载从存储装置40存取所存储的视频数据。文件服务器可为任何类型的能够存储经编码的视频数据并且将经编码的视频数据发射到目的地装置20的服务器。实例文件服务器包含网络服务器（例如，用于网站）、文件传输协议（file transfer protocol，FTP）服务器、网络附接式存储（network attached storage，NAS）装置或本地磁盘驱动器。目的地装置20可通过任何标准数据连接（包含因特网连接）来存取经编码视频数据。这可包含无线信道（例如，无线保真（wireless-fidelity，Wi-Fi）连接）、有线连接（例如，数字用户线路（digital subscriber line，DSL）、电缆调制解调器等），或适合于存取存储在文件服务器上的经编码视频数据的两者的组合。经编码视频数据从存储装置40的传输可为流式传输、下载传输或两者的组合。In another example, the storage device 40 may correspond to a file server or another intermediate storage device that may hold the encoded video produced by the source device 10. The destination device 20 may access the stored video data from the storage device 40 via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 20. Example file servers include a network server (for example, for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, or a local disk drive. The destination device 20 can access the encoded video data through any standard data connection, including an Internet connection. This may include wireless channels (e.g., wireless-fidelity (Wi-Fi) connections), wired connections (e.g., digital subscriber lines (DSL), cable modems, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of the encoded video data from the storage device 40 may be a streaming transmission, a download transmission, or a combination of the two.
本申请实施例提供的预测运动信息的解码方法可应用于视频编解码以支持多种多媒体应用,例如空中电视广播、有线电视发射、卫星电视发射、串流视频发射(例如,经由因特网)、用于存储于数据存储媒体上的视频数据的编码、存储在数据存储媒体上的视频数据的解码,或其它应用。在一些实例中,视频译码系统1可用于支持单向或双向视频传输以支持例如视频流式传输、视频回放、视频广播和/或视频电话等应用。The decoding method for predicting motion information provided in the embodiments of the present application can be applied to video encoding and decoding to support a variety of multimedia applications, such as air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), Encoding video data stored on a data storage medium, decoding video data stored on a data storage medium, or other applications. In some examples, the video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
图1中所说明的视频译码系统1仅为实例，并且本申请的技术可适用于未必包含编码装置与解码装置之间的任何数据通信的视频译码设置（例如，视频编码或视频解码）。在其它实例中，数据从本地存储器检索、在网络上流式传输等等。视频编码装置可对数据进行编码并且将数据存储到存储器，和/或视频解码装置可从存储器检索数据并且对数据进行解码。在许多实例中，由并不彼此通信而是仅编码数据到存储器和/或从存储器检索数据且解码数据的装置执行编码和解码。The video decoding system 1 illustrated in FIG. 1 is merely an example, and the techniques of the present application can be applied to a video decoding setting (for example, video encoding or video decoding) that does not necessarily include any data communication between the encoding device and the decoding device. In other examples, data is retrieved from local storage, streamed over a network, and so on. The video encoding device may encode the data and store the data to a memory, and/or the video decoding device may retrieve the data from the memory and decode the data. In many instances, encoding and decoding are performed by devices that do not communicate with each other, but only encode data to and/or retrieve data from memory and decode data.
在图1的实例中，源装置10包含视频源120、视频编码器100和输出接口140。在一些实例中，输出接口140可包含调制器/解调器（调制解调器）和/或发射器。视频源120可包括视频捕获装置（例如，摄像机）、含有先前捕获的视频数据的视频存档、用以从视频内容提供者接收视频数据的视频馈入接口，和/或用于产生视频数据的计算机图形系统，或视频数据的此些来源的组合。In the example of FIG. 1, the source device 10 includes a video source 120, a video encoder 100, and an output interface 140. In some examples, the output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. The video source 120 may include a video capture device (e.g., a camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these sources of video data.
视频编码器100可对来自视频源120的视频数据进行编码。在一些实例中,源装置10经由输出接口140将经编码视频数据直接发射到目的地装置20。在其它实例中,经编码视频数据还可存储到存储装置40上,供目的地装置20以后存取来用于解码和/或播放。The video encoder 100 may encode video data from the video source 120. In some examples, the source device 10 transmits the encoded video data directly to the destination device 20 via the output interface 140. In other examples, the encoded video data may also be stored on the storage device 40 for later access by the destination device 20 for decoding and / or playback.
在图1的实例中,目的地装置20包含输入接口240、视频解码器200和显示装置220。在一些实例中,输入接口240包含接收器和/或调制解调器。输入接口240可经由链路30和/或从存储装置40接收经编码视频数据。显示装置220可与目的地装置20集成或可在目的地装置20外部。一般来说,显示装置220显示经解码视频数据。显示装置220可包括多种显示装置,例如,液晶显示器(liquid crystal display,LCD)、等离子显示器、有机发光二极管(organic light-emitting diode,OLED)显示器或其它类型的显示装置。In the example of FIG. 1, the destination device 20 includes an input interface 240, a video decoder 200, and a display device 220. In some examples, the input interface 240 includes a receiver and / or a modem. The input interface 240 may receive the encoded video data via the link 30 and / or from the storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays decoded video data. The display device 220 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or other types of display devices.
尽管图1中未图示,但在一些方面,视频编码器100和视频解码器200可各自与音频编码器和解码器集成,且可包含适当的多路复用器-多路分用器单元或其它硬件和软件,以处置共同数据流或单独数据流中的音频和视频两者的编码。在一些实例中,如果适用的话,那么解复用器(MUX-DEMUX)单元可符合国际电信联盟(international telecommunication union,ITU)H.223多路复用器协议,或例如用户数据报协议(user datagram protocol,UDP)等其它协议。Although not illustrated in FIG. 1, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer unit Or other hardware and software to handle encoding of both audio and video in a common or separate data stream. In some examples, if applicable, the demultiplexer (MUX-DEMUX) unit may conform to the International Telecommunication Union (ITU) H.223 multiplexer protocol, or, for example, the user datagram protocol (user) datagram protocol, UDP) and other protocols.
视频编码器100和视频解码器200各自可实施为例如以下各项的多种电路中的任一者:一或多个微处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)、离散逻辑、硬件或其任何组合。如果部分地以软件来实施本申请,那么装置可将用于软件的指令存储在合适的非易失性计算机可读存储媒体中,且可使用一或多个处理器在硬件中执行所述指令从而实施本申请技术。前述内容(包含硬件、软件、硬件与软件的组合等)中的任一者可被视为一或多个处理器。视频编码器100和视频解码器200中的每一者可包含在一或多个编码器或解码器中,所述编码器或解码器中的任一者可集成为相应装置中的组合编码器/解码器(编码解码器)的一部分。Each of the video encoder 100 and the video decoder 200 may be implemented as any of a variety of circuits such as one or more microprocessors, digital signal processors (DSPs), and application specific integrated circuits. (application-specific integrated circuit (ASIC)), field programmable gate array (FPGA), discrete logic, hardware, or any combination thereof. If the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may use one or more processors to execute the instructions in hardware Thus implementing the technology of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered as one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, any of which may be integrated as a combined encoder in a corresponding device / Decoder (codec).
本申请可大体上将视频编码器100称为将某些信息“发信号通知”或“发射”到例如视频解码器200的另一装置。术语“发信号通知”或“发射”可大体上指代用以对经压缩视频数据进行解码的语法元素和/或其它数据的传送。此传送可实时或几乎实时地发生。替代地，此通信可经过一段时间后发生，例如可在编码时在经编码码流中将语法元素存储到计算机可读存储媒体时发生，解码装置接着可在所述语法元素存储到此媒体之后的任何时间检索所述语法元素。This application may generally refer to video encoder 100 as "signaling" or "transmitting" certain information to another device, such as video decoder 200. The terms "signaling" or "transmitting" may generally refer to the transmission of syntax elements and/or other data used to decode the compressed video data. This transfer can occur in real time or almost real time. Alternatively, this communication may occur over a period of time, for example when syntax elements are stored in an encoded bitstream to a computer-readable storage medium at encoding time, and the decoding device may then retrieve the syntax elements at any time after they are stored on this medium.
JCT-VC开发了H.265（高效率视频编码（high efficiency video coding，HEVC））标准。HEVC标准化基于称作HEVC测试模型（HEVC model，HM）的视频解码装置的演进模型。H.265的最新标准文档可从http://www.itu.int/rec/T-REC-H.265获得，最新版本的标准文档为H.265（12/16），该标准文档以全文引用的方式并入本文中。HM假设视频解码装置相对于ITU-TH.264/AVC的现有算法具有若干额外能力。例如，H.264提供9种帧内预测编码模式，而HM可提供多达35种帧内预测编码模式。JCT-VC has developed the H.265 (high efficiency video coding, HEVC) standard. The HEVC standardization is based on an evolution model of a video decoding device called the HEVC test model (HM). The latest standard document for H.265 is available from http://www.itu.int/rec/T-REC-H.265; the latest version of the standard document is H.265 (12/16), and that standard document is incorporated herein by reference in its entirety. HM assumes that video decoding devices have several additional capabilities over the existing algorithms of ITU-T H.264/AVC. For example, H.264 provides 9 intra-prediction encoding modes, while HM can provide up to 35 intra-prediction encoding modes.
JVET致力于开发H.266标准。H.266标准化的过程基于称作H.266测试模型的视频解码装置的演进模型。H.266的算法描述可从http://phenix.int-evry.fr/jvet获得，其中最新的算法描述包含于JVET-F1001-v2中，该算法描述文档以全文引用的方式并入本文中。同时，可从https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/获得JEM测试模型的参考软件，同样以全文引用的方式并入本文中。JVET is committed to developing the H.266 standard. The process of H.266 standardization is based on an evolution model of a video decoding device called the H.266 test model. The algorithm description of H.266 can be obtained from http://phenix.int-evry.fr/jvet; the latest algorithm description is included in JVET-F1001-v2, and that algorithm description document is incorporated herein by reference in its entirety. At the same time, the reference software for the JEM test model is available from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/ and is also incorporated herein by reference in its entirety.
一般来说，HM的工作模型描述可将视频帧或图像划分成包含亮度及色度样本两者的树块或最大编码单元（largest coding unit，LCU）的序列，LCU也被称为编码树单元（coding tree unit，CTU）。树块具有与H.264标准的宏块类似的目的。条带包含按解码次序的数个连续树块。可将视频帧或图像分割成一个或多个条带。可根据四叉树将每一树块分裂成编码单元。例如，可将作为四叉树的根节点的树块分裂成四个子节点，且每一子节点可又为母节点且被分裂成另外四个子节点。作为四叉树的叶节点的最终不可分裂的子节点包括解码节点，例如，经解码视频块。与经解码码流相关联的语法数据可定义树块可分裂的最大次数，且也可定义解码节点的最小大小。In general, the working model of HM describes that a video frame or image may be divided into a sequence of tree blocks, or largest coding units (LCUs), containing both luma and chroma samples; an LCU is also referred to as a coding tree unit (CTU). A tree block has a purpose similar to that of a macroblock of the H.264 standard. A slice contains several consecutive tree blocks in decoding order. A video frame or image can be split into one or more slices. Each tree block can be split into coding units according to a quadtree. For example, a tree block that is the root node of a quadtree may be split into four child nodes, and each child node may in turn be a parent node split into another four child nodes. The final, non-splittable child nodes, the leaf nodes of the quadtree, comprise decoding nodes, e.g., decoded video blocks. Syntax data associated with the decoded bitstream can define the maximum number of times a tree block can be split, and can also define the minimum size of a decoding node.
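The quadtree splitting described above can be sketched as a simple recursion (illustrative only; `should_split` stands in for the encoder's actual split decision):

```python
def quadtree_leaves(x, y, size, min_size, should_split):
    """Recursively split a square tree block: a node either stays a leaf
    (a decoding node / CU) or splits into four equal child squares."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_leaves(x + dx, y + dy, half, min_size, should_split)
    return leaves
```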
编码单元包含解码节点及预测块(prediction unit,PU)以及与解码节点相关联的变换单元(transform unit,TU)。CU的大小对应于解码节点的大小且形状必须为正方形。CU的大小的范围可为8×8像素直到最大64×64像素或更大的树块的大小。每一CU可含有一个或多个PU及一个或多个TU。例如,与CU相关联的语法数据可描述将CU分割成一个或多个PU的情形。分割模式在CU是被跳过或经直接模式编码、帧内预测模式编码或帧间预测模式编码的情形之间可为不同的。PU可经分割成形状为非正方形。例如,与CU相关联的语法数据也可描述根据四叉树将CU分割成一个或多个TU的情形。TU的形状可为正方形或非正方形。The coding unit includes a decoding node and a prediction unit (PU), and a transform unit (TU) associated with the decoding node. The size of the CU corresponds to the size of the decoding node and the shape must be square. The size of the CU can range from 8 × 8 pixels to a maximum 64 × 64 pixels or larger tree block size. Each CU may contain one or more PUs and one or more TUs. For example, the syntax data associated with a CU may describe a case where a CU is partitioned into one or more PUs. The partitioning mode may be different between cases where the CU is skipped or is encoded in direct mode, intra prediction mode, or inter prediction mode. The PU can be divided into non-square shapes. For example, the syntax data associated with a CU may also describe a case where a CU is partitioned into one or more TUs according to a quadtree. The shape of the TU can be square or non-square.
HEVC标准允许根据TU进行变换,TU对于不同CU来说可为不同的。TU通常基于针对经分割LCU定义的给定CU内的PU的大小而设定大小,但情况可能并非总是如此。TU的大小通常与PU相同或小于PU。在一些可行的实施方式中,可使用称作“残余四叉树”(residual qualtree,RQT)的四叉树结构将对应于CU的残余样本再分成较小单元。RQT的叶节点可被称作TU。可变换与TU相关联的像素差值以产生变换系数,变换系数可被量化。The HEVC standard allows transformation based on the TU, which can be different for different CUs. The TU is usually sized based on the size of the PUs within a given CU defined for the partitioned LCU, but this may not always be the case. The size of the TU is usually the same as or smaller than the PU. In some feasible implementations, a quad-tree structure called "residual quad-tree" (RQT) may be used to subdivide the residual samples corresponding to the CU into smaller units. The leaf node of RQT may be called TU. The pixel difference values associated with the TU may be transformed to produce a transformation coefficient, which may be quantized.
一般来说,PU包含与预测过程有关的数据。例如,在PU经帧内模式编码时,PU可包含描述PU的帧内预测模式的数据。作为另一可行的实施方式,在PU经帧间模式编码时,PU可包含界定PU的运动矢量的数据。例如,界定PU的运动矢量的数据可描述运动矢量的水平分量、运动矢量的垂直分量、运动矢量的分辨率(例如,四分之一像素精确度或八分之一像素精确度)、运动矢量所指向的参考图像,和/或运动矢量的参考图像列表(例如,列表0、列表1或列表C)。Generally speaking, the PU contains data related to the prediction process. For example, when a PU is intra-mode encoded, the PU may include data describing the intra-prediction mode of the PU. As another feasible implementation manner, when the PU is inter-mode encoded, the PU may include data defining a motion vector of the PU. For example, the data defining the motion vector of the PU may describe the horizontal component of the motion vector, the vertical component of the motion vector, the resolution of the motion vector (e.g., quarter-pixel accuracy or eighth-pixel accuracy), motion vector The reference image pointed to, and / or the reference image list of the motion vector (eg, list 0, list 1 or list C).
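The motion-vector fields listed above for an inter-coded PU can be grouped in a small record (field names are hypothetical, chosen only to mirror the description):

```python
from dataclasses import dataclass

@dataclass
class PuMotionData:
    """Illustrative container for the data defining a PU's motion vector."""
    mv_horizontal: int    # horizontal motion vector component (sub-pel units)
    mv_vertical: int      # vertical motion vector component
    mv_resolution: str    # e.g. 'quarter-pel' or 'eighth-pel'
    ref_pic_index: int    # reference picture the motion vector points to
    ref_pic_list: str     # 'list0', 'list1', or 'listC'
```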
一般来说,TU使用变换及量化过程。具有一个或多个PU的给定CU也可包含一个或多个TU。在预测之后,视频编码器100可计算对应于PU的残余值。残余值包括像素差值,像素差值可变换成变换系数、经量化且使用TU扫描以产生串行化变换系数以用于熵解码。本申请通常使用术语“视频块”来指CU的解码节点。在一些特定应用中,本申请也可使用术语“视频块”来指包含解码节点以及PU及TU的树块,例如,LCU或CU。Generally, TU uses transform and quantization processes. A given CU with one or more PUs may also contain one or more TUs. After prediction, video encoder 100 may calculate a residual value corresponding to the PU. The residual values include pixel differences that can be transformed into transform coefficients, quantized, and scanned using TU to generate serialized transform coefficients for entropy decoding. This application generally uses the term "video block" to refer to the decoding node of a CU. In some specific applications, the term “video block” may also be used in this application to refer to a tree block including a decoding node and a PU and a TU, such as an LCU or a CU.
视频序列通常包含一系列视频帧或图像。图像群组(group of picture,GOP)示例性地包括一系列、一个或多个视频图像。GOP可在GOP的头信息中、图像中的一者或多者的头信息中或在别处包含语法数据，语法数据描述包含于GOP中的图像的数目。图像的每一条带可包含描述相应图像的编码模式的条带语法数据。视频编码器100通常对个别视频条带内的视频块进行操作以便编码视频数据。视频块可对应于CU内的解码节点。视频块可具有固定或变化的大小，且可根据指定解码标准而在大小上不同。A video sequence usually contains a series of video frames or pictures. A group of pictures (GOP) illustratively includes a series of one or more video pictures. The GOP may include syntax data in the header information of the GOP, in the header information of one or more of the pictures, or elsewhere, and the syntax data describes the number of pictures included in the GOP. Each slice of a picture may contain slice syntax data describing the coding mode of the corresponding picture. The video encoder 100 typically operates on video blocks within individual video slices in order to encode the video data. A video block may correspond to a decoding node within a CU. Video blocks may have fixed or varying sizes, and may differ in size according to a specified decoding standard.
作为一种可行的实施方式，HM支持各种PU大小的预测。假定特定CU的大小为2N×2N，HM支持2N×2N或N×N的PU大小的帧内预测，及2N×2N、2N×N、N×2N或N×N的对称PU大小的帧间预测。HM也支持2N×nU、2N×nD、nL×2N及nR×2N的PU大小的帧间预测的不对称分割。在不对称分割中，CU的一方向未分割，而另一方向分割成25%及75%。对应于25%区段的CU的部分由“n”后跟着“上(Up)”、“下(Down)”、“左(Left)”或“右(Right)”的指示来指示。因此，例如，“2N×nU”指水平分割的2N×2N CU，其中2N×0.5N PU在上部且2N×1.5N PU在底部。As a feasible implementation, the HM supports prediction with various PU sizes. Assuming that the size of a specific CU is 2N×2N, the HM supports intra prediction with PU sizes of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. The HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of the CU is not partitioned, while the other direction is partitioned into 25% and 75%. The portion of the CU corresponding to the 25% partition is indicated by an "n" followed by an indication of "Up", "Down", "Left", or "Right". Thus, for example, "2N×nU" refers to a horizontally partitioned 2N×2N CU with a 2N×0.5N PU on top and a 2N×1.5N PU at the bottom.
在本申请中，“N×N”与“N乘N”可互换使用以指依照垂直维度及水平维度的视频块的像素尺寸，例如，16×16像素或16乘16像素。一般来说，16×16块将在垂直方向上具有16个像素(y=16)，且在水平方向上具有16个像素(x=16)。同样地，N×N块一般在垂直方向上具有N个像素，且在水平方向上具有N个像素，其中N表示非负整数值。可将块中的像素排列成行及列。此外，块未必需要在水平方向上与在垂直方向上具有相同数目个像素。例如，块可包括N×M个像素，其中M未必等于N。In this application, "N×N" and "N by N" may be used interchangeably to refer to the pixel dimensions of a video block in terms of its vertical and horizontal dimensions, for example, 16×16 pixels or 16 by 16 pixels. In general, a 16×16 block will have 16 pixels in the vertical direction (y=16) and 16 pixels in the horizontal direction (x=16). Likewise, an N×N block generally has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value. The pixels in a block may be arranged in rows and columns. Furthermore, a block need not have the same number of pixels in the horizontal direction as in the vertical direction. For example, a block may comprise N×M pixels, where M is not necessarily equal to N.
在使用CU的PU的帧内预测性或帧间预测性解码之后，视频编码器100可计算CU的TU的残余数据。PU可包括空间域(也称作像素域)中的像素数据，且TU可包括在将变换(例如，离散余弦变换(discrete cosine transform,DCT)、整数变换、小波变换或概念上类似的变换)应用于残余视频数据之后变换域中的系数。残余数据可对应于未经编码图像的像素与对应于PU的预测值之间的像素差。视频编码器100可形成包含CU的残余数据的TU，且接着变换TU以产生CU的变换系数。After intra-predictive or inter-predictive decoding using the PUs of a CU, the video encoder 100 may calculate residual data for the TUs of the CU. A PU may comprise pixel data in the spatial domain (also referred to as the pixel domain), and a TU may comprise coefficients in the transform domain after a transform (e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data. The residual data may correspond to pixel differences between the pixels of the unencoded picture and the prediction values corresponding to the PUs. The video encoder 100 may form TUs comprising the residual data of the CU, and then transform the TUs to produce transform coefficients for the CU.
在任何变换以产生变换系数之后，视频编码器100可执行变换系数的量化。量化示例性地指对系数进行量化以可能减少用以表示系数的数据的量从而提供进一步压缩的过程。量化过程可减少与系数中的一些或全部相关联的位深度。例如，可在量化期间将n位值降值舍位到m位值，其中n大于m。Following any transforms to produce transform coefficients, the video encoder 100 may perform quantization of the transform coefficients. Quantization illustratively refers to the process of quantizing the coefficients to possibly reduce the amount of data used to represent them, thereby providing further compression. The quantization process may reduce the bit depth associated with some or all of the coefficients. For example, an n-bit value may be rounded down to an m-bit value during quantization, where n is greater than m.
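The bit-depth reduction described above can be sketched as follows. This is a minimal illustration only; the function name and the exact rounding rule (simple truncation of the low bits) are our assumptions, not taken from the patent:

```python
def quantize_truncate(value: int, n: int, m: int) -> int:
    """Round an n-bit magnitude down to an m-bit representation
    by discarding the (n - m) least significant bits."""
    assert 0 <= value < (1 << n) and n > m
    return value >> (n - m)

# A 10-bit value 1023 truncated to a 6-bit value: keeps the top 6 bits.
q = quantize_truncate(1023, n=10, m=6)
```

Real codecs combine such bit-depth reduction with a quantization step size; only the n-to-m-bit rounding idea is shown here.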
JEM模型对视频图像的编码结构进行了进一步的改进,具体的,被称为“四叉树结合二叉树”(QTBT)的块编码结构被引入进来。QTBT结构摒弃了HEVC中的CU,PU,TU等概念,支持更灵活的CU划分形状,一个CU可以正方形,也可以是长方形。一个CTU首先进行四叉树划分,该四叉树的叶节点进一步进行二叉树划分。同时,在二叉树划分中存在两种划分模式,对称水平分割和对称竖直分割。二叉树的叶节点被称为CU,JEM的CU在预测和变换的过程中都不可以被进一步划分,也就是说JEM的CU,PU,TU具有相同的块大小。在现阶段的JEM中,CTU的最大尺寸为256×256亮度像素。The JEM model further improves the coding structure of video images. Specifically, a block coding structure called "Quad Tree Combined with Binary Tree" (QTBT) is introduced. The QTBT structure abandons the concepts of CU, PU, and TU in HEVC, and supports more flexible CU division shapes. A CU can be square or rectangular. A CTU first performs a quadtree partition, and the leaf nodes of the quadtree further perform a binary tree partition. At the same time, there are two partitioning modes in binary tree partitioning, symmetrical horizontal partitioning and symmetrical vertical partitioning. The leaf nodes of a binary tree are called CUs, and JEM's CUs cannot be further divided during the prediction and transformation process, that is, JEM's CU, PU, and TU have the same block size. In the current JEM, the maximum size of the CTU is 256 × 256 luminance pixels.
在一些可行的实施方式中，视频编码器100可利用预定义扫描次序来扫描经量化变换系数以产生可经熵编码的串行化向量。在其它可行的实施方式中，视频编码器100可执行自适应性扫描。在扫描经量化变换系数以形成一维向量之后，视频编码器100可根据上下文自适应性可变长度解码(context-based adaptive variable-length code,CAVLC)、上下文自适应性二进制算术解码(context-based adaptive binary arithmetic coding,CABAC)、基于语法的上下文自适应性二进制算术解码(syntax-based adaptive binary arithmetic coding,SBAC)、概率区间分割熵(probability interval partitioning entropy,PIPE)解码或其他熵解码方法来熵解码一维向量。视频编码器100也可熵编码与经编码视频数据相关联的语法元素以供视频解码器200用于解码视频数据。In some feasible implementations, the video encoder 100 may scan the quantized transform coefficients using a predefined scan order to produce a serialized vector that can be entropy encoded. In other feasible implementations, the video encoder 100 may perform an adaptive scan. After scanning the quantized transform coefficients to form a one-dimensional vector, the video encoder 100 may entropy decode the one-dimensional vector according to context-based adaptive variable-length coding (CAVLC), context-based adaptive binary arithmetic coding (CABAC), syntax-based adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) decoding, or another entropy decoding method. The video encoder 100 may also entropy encode syntax elements associated with the encoded video data for use by the video decoder 200 in decoding the video data.
为了执行CABAC，视频编码器100可将上下文模型内的上下文指派给待传输的符号。上下文可与符号的相邻值是否为非零有关。为了执行CAVLC，视频编码器100可选择待传输的符号的可变长度码。可变长度解码(variable-length code,VLC)中的码字可经构建以使得相对较短码对应于可能性较大的符号，而较长码对应于可能性较小的符号。以这个方式，VLC的使用可相对于针对待传输的每一符号使用相等长度码字达成节省码率的目的。基于指派给符号的上下文可以确定CABAC中的概率。To perform CABAC, the video encoder 100 may assign a context within a context model to a symbol to be transmitted. The context may relate to whether neighboring values of the symbol are non-zero. To perform CAVLC, the video encoder 100 may select a variable-length code for the symbol to be transmitted. Codewords in variable-length coding (VLC) may be constructed such that relatively short codes correspond to more probable symbols, while longer codes correspond to less probable symbols. In this way, the use of VLC may achieve a bit-rate saving relative to using equal-length codewords for each symbol to be transmitted. The probability in CABAC may be determined based on the context assigned to the symbol.
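As a toy illustration of the codeword-length principle above (the symbols and codewords here are hypothetical; real CAVLC tables are defined by the standard, not by this sketch):

```python
# Hypothetical prefix-free VLC table: shorter codewords for more
# probable symbols, longer codewords for less probable ones.
vlc_table = {"A": "0", "B": "10", "C": "110", "D": "111"}  # "A" most probable

def vlc_encode(symbols):
    # Concatenate the codeword of each symbol into one bit string.
    return "".join(vlc_table[s] for s in symbols)

# Three symbols dominated by the probable "A" cost 5 bits here,
# versus 6 bits with a fixed 2-bit code per symbol.
bits = vlc_encode(["A", "A", "C"])
```

Because the table is prefix-free, the bit string can be decoded unambiguously, which is what makes the unequal code lengths usable.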
在本申请实施例中，视频编码器可执行帧间预测以减少图像之间的时间冗余。如前文所描述，根据不同视频压缩编解码标准的规定，CU可具有一个或多个预测单元PU。换句话说，多个PU可属于CU，或者PU和CU的尺寸相同。在本文中当CU和PU尺寸相同时，CU的分割模式为不分割，或者即为分割为一个PU，且统一使用PU进行表述。当视频编码器执行帧间预测时，视频编码器可用信号通知视频解码器用于PU的运动信息。示例性的，PU的运动信息可以包括：参考图像索引、运动矢量和预测方向标识。运动矢量可指示PU的图像块(也称视频块、像素块、像素集合等)与PU的参考块之间的位移。PU的参考块可为类似于PU的图像块的参考图像的一部分。参考块可定位于由参考图像索引和预测方向标识指示的参考图像中。In this embodiment of the application, the video encoder may perform inter prediction to reduce temporal redundancy between pictures. As described above, a CU may have one or more prediction units (PUs) according to the provisions of different video compression codec standards. In other words, multiple PUs may belong to a CU, or the PU and the CU may be the same size. Herein, when the CU and the PU have the same size, the partitioning mode of the CU is no partitioning, or the CU is partitioned into a single PU; "PU" is used uniformly in the description. When the video encoder performs inter prediction, it may signal the motion information of the PU to the video decoder. Illustratively, the motion information of a PU may include a reference picture index, a motion vector, and a prediction direction identifier. A motion vector may indicate a displacement between an image block (also called a video block, pixel block, pixel set, etc.) of the PU and a reference block of the PU. The reference block of a PU may be a part of a reference picture that is similar to the image block of the PU. The reference block may be located in the reference picture indicated by the reference picture index and the prediction direction identifier.
为了减少表示PU的运动信息所需要的编码比特的数目，视频编码器可根据合并预测模式或高级运动矢量预测模式过程产生用于PU中的每一者的候选预测运动矢量(Motion Vector,MV)列表。用于PU的候选预测运动矢量列表中的每一候选预测运动矢量可指示运动信息，MV列表也可称之为候选运动信息列表。由候选预测运动矢量列表中的一些候选预测运动矢量指示的运动信息可基于其它PU的运动信息。如果候选预测运动矢量指示指定空间候选预测运动矢量位置或时间候选预测运动矢量位置中的一者的运动信息，则本申请可将所述候选预测运动矢量称作“原始”候选预测运动矢量。举例来说，对于合并(Merge)模式，在本文中也称为合并预测模式，可存在五个原始空间候选预测运动矢量位置和一个原始时间候选预测运动矢量位置。在一些实例中，视频编码器可通过组合来自不同原始候选预测运动矢量的部分运动矢量、修改原始候选预测运动矢量或仅插入零运动矢量作为候选预测运动矢量来产生额外候选预测运动矢量。这些额外候选预测运动矢量不被视为原始候选预测运动矢量且在本申请中可称作人工产生的候选预测运动矢量。To reduce the number of coding bits required to represent the motion information of a PU, the video encoder may generate a candidate prediction motion vector (MV) list for each of the PUs according to a merge prediction mode or advanced motion vector prediction mode process. Each candidate prediction motion vector in the candidate prediction motion vector list for a PU may indicate motion information; the MV list may also be referred to as a candidate motion information list. The motion information indicated by some candidate prediction motion vectors in the list may be based on the motion information of other PUs. If a candidate prediction motion vector indicates the motion information of one of the specified spatial candidate prediction motion vector positions or temporal candidate prediction motion vector positions, this application may refer to that candidate prediction motion vector as an "original" candidate prediction motion vector. For example, for the merge mode, also referred to herein as the merge prediction mode, there may be five original spatial candidate prediction motion vector positions and one original temporal candidate prediction motion vector position. In some examples, the video encoder may generate additional candidate prediction motion vectors by combining partial motion vectors from different original candidate prediction motion vectors, modifying original candidate prediction motion vectors, or simply inserting zero motion vectors as candidate prediction motion vectors. These additional candidate prediction motion vectors are not considered original candidate prediction motion vectors and may be referred to in this application as artificially generated candidate prediction motion vectors.
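The construction order described above (original candidates first, then artificially generated ones such as zero motion vectors) can be sketched roughly as follows. The target length of five and the zero-MV padding rule are illustrative assumptions, not the patent's normative procedure:

```python
def build_candidate_list(spatial, temporal, target_len=5):
    """Build a candidate MV list: spatial candidates, then temporal
    candidates, then artificial zero-MV candidates as padding."""
    candidates = []
    for mv in spatial + temporal:
        # Skip duplicates and stop once the list is full.
        if mv not in candidates and len(candidates) < target_len:
            candidates.append(mv)
    while len(candidates) < target_len:
        candidates.append((0, 0))  # artificially generated zero-MV candidate
    return candidates

lst = build_candidate_list(spatial=[(1, 2), (1, 2), (3, 4)], temporal=[(5, 6)])
# → [(1, 2), (3, 4), (5, 6), (0, 0), (0, 0)]
```

Because the encoder and decoder run the same deterministic procedure, both ends obtain identical lists without the list itself being transmitted.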
本申请的技术一般涉及用于在视频编码器处产生候选预测运动矢量列表的技术和用于在视频解码器处产生相同候选预测运动矢量列表的技术。视频编码器和视频解码器可通过实施用于构建候选预测运动矢量列表的相同技术来产生相同候选预测运动矢量列表。举例来说，视频编码器和视频解码器两者可构建具有相同数目的候选预测运动矢量(例如，五个候选预测运动矢量)的列表。视频编码器和解码器可首先考虑空间候选预测运动矢量(例如，同一图像中的相邻块)，接着考虑时间候选预测运动矢量(例如，不同图像中的候选预测运动矢量)，且最后可考虑人工产生的候选预测运动矢量直到将所要数目的候选预测运动矢量添加到列表为止。根据本申请的技术，可在候选预测运动矢量列表构建期间，在候选预测运动矢量列表中通过标识位指示一种类型的候选预测运动矢量，以控制候选预测运动矢量列表的长度。举例来说，将空间候选预测运动矢量集合和时间候选预测运动矢量可作为原始候选预测运动矢量，当将人工产生的候选预测运动矢量添加到候选预测运动矢量的列表时，可在候选预测运动矢量列表中增加一个标识位空间，以指示人工产生的候选预测运动矢量集合。在编解码时，当选中一个标识位时，从该标识位指示的候选预测运动矢量集合中选取预测运动矢量。The techniques of this application generally relate to techniques for generating a candidate prediction motion vector list at a video encoder and techniques for generating the same candidate prediction motion vector list at a video decoder. The video encoder and the video decoder may generate the same candidate prediction motion vector list by implementing the same techniques for constructing the list. For example, both the video encoder and the video decoder may construct lists with the same number of candidate prediction motion vectors (e.g., five candidate prediction motion vectors). The video encoder and decoder may first consider spatial candidate prediction motion vectors (e.g., neighboring blocks in the same picture), then consider temporal candidate prediction motion vectors (e.g., candidate prediction motion vectors in different pictures), and finally consider artificially generated candidate prediction motion vectors, until the desired number of candidate prediction motion vectors has been added to the list. According to the techniques of this application, during construction of the candidate prediction motion vector list, an identification bit may be used in the list to indicate one type of candidate prediction motion vector, so as to control the length of the candidate prediction motion vector list. For example, the set of spatial candidate prediction motion vectors and the temporal candidate prediction motion vector may serve as original candidate prediction motion vectors; when artificially generated candidate prediction motion vectors are added to the list of candidate prediction motion vectors, an identification-bit slot may be added to the candidate prediction motion vector list to indicate the set of artificially generated candidate prediction motion vectors. During encoding and decoding, when an identification bit is selected, a prediction motion vector is selected from the set of candidate prediction motion vectors indicated by that identification bit.
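A rough sketch of the identification-bit idea above: one list entry stands in for the whole set of artificially generated candidates, so the list itself stays short. All names and values here are illustrative, not taken from the patent:

```python
# The set of artificially generated candidates that the flag entry denotes.
artificial_set = [(0, 0), (1, 0), (0, 1)]

# Candidate list: two original candidates plus one flag entry; the flag
# occupies a single slot regardless of how large the set is.
candidate_list = [(1, 2), (3, 4), "ARTIFICIAL_SET"]

def select(index, sub_index=None):
    entry = candidate_list[index]
    if entry == "ARTIFICIAL_SET":
        # When the flag entry is selected, a second index picks within the set.
        return artificial_set[sub_index]
    return entry
```

Selecting index 0 or 1 needs no further information, while selecting index 2 requires the extra sub-index; this matches the two-level signaling described later in the text.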
在产生用于CU的PU的候选预测运动矢量列表之后，视频编码器可从候选预测运动矢量列表选择候选预测运动矢量且在码流中输出候选预测运动矢量索引。选定候选预测运动矢量可为具有产生最紧密地匹配正被解码的目标PU的预测子的运动矢量的候选预测运动矢量。候选预测运动矢量索引可指示在候选预测运动矢量列表中选定候选预测运动矢量的位置。视频编码器还可基于由PU的运动信息指示的参考块产生用于PU的预测性图像块。可基于由选定候选预测运动矢量指示的运动信息确定PU的运动信息。举例来说，在合并模式中，PU的运动信息可与由选定候选预测运动矢量指示的运动信息相同。在AMVP模式中，PU的运动信息可基于PU的运动矢量差和由选定候选预测运动矢量指示的运动信息确定。视频编码器可基于CU的PU的预测性图像块和用于CU的原始图像块产生用于CU的一或多个残余图像块。视频编码器可接着编码一或多个残余图像块且在码流中输出一或多个残余图像块。After generating the candidate prediction motion vector list for the PUs of a CU, the video encoder may select a candidate prediction motion vector from the list and output a candidate prediction motion vector index in the code stream. The selected candidate prediction motion vector may be the one whose motion vector produces the predictor that most closely matches the target PU being decoded. The candidate prediction motion vector index may indicate the position of the selected candidate prediction motion vector in the candidate prediction motion vector list. The video encoder may also generate a predictive image block for the PU based on the reference block indicated by the motion information of the PU. The motion information of the PU may be determined based on the motion information indicated by the selected candidate prediction motion vector. For example, in the merge mode, the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector. In the AMVP mode, the motion information of the PU may be determined based on the motion vector difference of the PU and the motion information indicated by the selected candidate prediction motion vector. The video encoder may generate one or more residual image blocks for the CU based on the predictive image blocks of the PUs of the CU and the original image block of the CU. The video encoder may then encode the one or more residual image blocks and output them in the code stream.
码流可包括识别PU的候选预测运动矢量列表中的选定候选预测运动矢量的数据，本文中称之为标识或者信号。该数据可以包括候选预测运动矢量列表中的索引，通过该索引确定出目标运动矢量；或者，通过该索引确定出目标运动矢量为某一类候选预测运动矢量，此时，该数据还包括指示选定候选预测运动矢量的数据在该类候选预测运动矢量中的具体位置的信息。视频解码器可解析码流，获取识别PU的候选预测运动矢量列表中的选定候选预测运动矢量的数据，根据该数据确定选定候选预测运动矢量的数据，基于由PU的候选预测运动矢量列表中的选定候选预测运动矢量指示的运动信息确定PU的运动信息。视频解码器可基于PU的运动信息识别用于PU的一或多个参考块。在识别PU的一或多个参考块之后，视频解码器可基于PU的一或多个参考块产生用于PU的预测性图像块。视频解码器可基于用于CU的PU的预测性图像块和用于CU的一或多个残余图像块来重构用于CU的图像块。The code stream may include data identifying the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU, referred to herein as an identifier or signal. The data may include an index into the candidate prediction motion vector list, through which the target motion vector is determined; or the index may determine that the target motion vector belongs to a certain type of candidate prediction motion vector, in which case the data further includes information indicating the specific position of the selected candidate prediction motion vector within that type of candidate prediction motion vectors. The video decoder may parse the code stream to obtain the data identifying the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU, determine the selected candidate prediction motion vector from that data, and determine the motion information of the PU based on the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU. The video decoder may identify one or more reference blocks for the PU based on the motion information of the PU. After identifying the one or more reference blocks of the PU, the video decoder may generate a predictive image block for the PU based on them. The video decoder may reconstruct the image block for the CU based on the predictive image blocks of the PUs of the CU and the one or more residual image blocks of the CU.
为了易于解释，本申请可将位置或图像块描述为与CU或PU具有各种空间关系。此描述可解释为是指位置或图像块和与CU或PU相关联的图像块具有各种空间关系。此外，本申请可将视频解码器当前在解码的PU称作当前PU，也称为当前待处理图像块。本申请可将视频解码器当前在解码的CU称作当前CU。本申请可将视频解码器当前在解码的图像称作当前图像。应理解，本申请同时适用于PU和CU具有相同尺寸，或者PU即为CU的情况，统一使用PU来表示。For ease of explanation, this application may describe a position or an image block as having various spatial relationships with a CU or a PU. This description may be interpreted to mean that the position or image block has various spatial relationships with the image block associated with the CU or PU. In addition, this application may refer to the PU currently being decoded by the video decoder as the current PU, also called the current to-be-processed image block. This application may refer to the CU currently being decoded by the video decoder as the current CU, and to the picture currently being decoded as the current picture. It should be understood that this application applies equally to cases where the PU and the CU have the same size, or where the PU is the CU; "PU" is used uniformly to represent both.
如前文简短地描述，视频编码器100可使用帧间预测以产生用于CU的PU的预测性图像块和运动信息。在许多例子中，给定PU的运动信息可能与一或多个附近PU(即，其图像块在空间上或时间上在给定PU的图像块附近的PU)的运动信息相同或类似。因为附近PU经常具有类似运动信息，所以视频编码器100可参考附近PU的运动信息来编码给定PU的运动信息。参考附近PU的运动信息来编码给定PU的运动信息可减少码流中指示给定PU的运动信息所需要的编码比特的数目。As briefly described above, the video encoder 100 may use inter prediction to generate predictive image blocks and motion information for the PUs of a CU. In many examples, the motion information of a given PU may be the same as or similar to the motion information of one or more nearby PUs (i.e., PUs whose image blocks are spatially or temporally near the image block of the given PU). Because nearby PUs often have similar motion information, the video encoder 100 may encode the motion information of a given PU with reference to the motion information of nearby PUs. Encoding the motion information of a given PU with reference to the motion information of nearby PUs can reduce the number of coding bits required in the code stream to indicate the motion information of the given PU.
视频编码器100可以各种方式参考附近PU的运动信息来编码给定PU的运动信息。举例来说，视频编码器100可指示给定PU的运动信息与附近PU的运动信息相同。本申请可使用合并模式来指代指示给定PU的运动信息与附近PU的运动信息相同或可从附近PU的运动信息导出。在另一可行的实施方式中，视频编码器100可计算用于给定PU的运动矢量差(Motion Vector Difference,MVD)。MVD指示给定PU的运动矢量与附近PU的运动矢量之间的差。视频编码器100可将MVD而非给定PU的运动矢量包括于给定PU的运动信息中。在码流中表示MVD比表示给定PU的运动矢量所需要的编码比特少。本申请可使用高级运动矢量预测模式指代通过使用MVD和识别候选者运动矢量的索引值来用信号通知解码端给定PU的运动信息。The video encoder 100 may encode the motion information of a given PU with reference to the motion information of nearby PUs in various ways. For example, the video encoder 100 may indicate that the motion information of the given PU is the same as the motion information of a nearby PU. This application may use the merge mode to refer to indicating that the motion information of a given PU is the same as, or can be derived from, the motion information of a nearby PU. In another feasible implementation, the video encoder 100 may calculate a motion vector difference (MVD) for the given PU. The MVD indicates the difference between the motion vector of the given PU and the motion vector of a nearby PU. The video encoder 100 may include the MVD, rather than the motion vector of the given PU, in the motion information of the given PU. Representing the MVD in the code stream requires fewer coding bits than representing the motion vector of the given PU. This application may use the advanced motion vector prediction mode to refer to signaling the motion information of a given PU to the decoder by using an MVD and an index value identifying a candidate motion vector.
为了使用合并模式或AMVP模式来用信号通知解码端给定PU的运动信息，视频编码器100可产生用于给定PU的候选预测运动矢量列表。候选预测运动矢量列表可包括一或多个候选预测运动矢量。用于给定PU的候选预测运动矢量列表中的候选预测运动矢量中的每一者可指定运动信息。由每一候选预测运动矢量指示的运动信息可包括运动矢量、参考图像索引和预测方向标识。候选预测运动矢量列表中的候选预测运动矢量可包括“原始”候选预测运动矢量，其中每一者指示不同于给定PU的PU内的指定候选预测运动矢量位置中的一者的运动信息。In order to signal the motion information of a given PU to the decoder using the merge mode or the AMVP mode, the video encoder 100 may generate a candidate prediction motion vector list for the given PU. The candidate prediction motion vector list may include one or more candidate prediction motion vectors. Each of the candidate prediction motion vectors in the list for the given PU may specify motion information. The motion information indicated by each candidate prediction motion vector may include a motion vector, a reference picture index, and a prediction direction identifier. The candidate prediction motion vectors in the list may include "original" candidate prediction motion vectors, each of which indicates motion information of one of the specified candidate prediction motion vector positions within a PU other than the given PU.
在产生用于PU的候选预测运动矢量列表之后,视频编码器100可从用于PU的候选预测运动矢量列表选择候选预测运动矢量中的一者。举例来说,视频编码器可比较每一候选预测运动矢量与正被解码的PU且可选择具有所要码率-失真代价的候选预测运动矢量。视频编码器100可输出用于PU的候选预测运动矢量索引。候选预测运动矢量索引可识别选定候选预测运动矢量在候选预测运动矢量列表中的位置。After generating the candidate prediction motion vector list for the PU, the video encoder 100 may select one of the candidate prediction motion vectors from the candidate prediction motion vector list for the PU. For example, a video encoder may compare each candidate prediction motion vector with the PU being decoded and may select a candidate prediction motion vector with a desired code rate-distortion cost. Video encoder 100 may output a candidate prediction motion vector index for a PU. The candidate prediction motion vector index may identify the position of the selected candidate prediction motion vector in the candidate prediction motion vector list.
此外,视频编码器100可基于由PU的运动信息指示的参考块产生用于PU的预测性图像块。可基于由用于PU的候选预测运动矢量列表中的选定候选预测运动矢量指示的运动信息确定PU的运动信息。举例来说,在合并模式中,PU的运动信息可与由选定候选预测运动矢量指示的运动信息相同。在AMVP模式中,可基于用于PU的运动矢量差和由选定候选预测运动矢量指示的运动信息确定PU的运动信息。视频编码器100可如前文所描述处理用于PU的预测性图像块。In addition, the video encoder 100 may generate a predictive image block for a PU based on a reference block indicated by motion information of the PU. The motion information of the PU may be determined based on the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list for the PU. For example, in the merge mode, the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector. In the AMVP mode, motion information of a PU may be determined based on a motion vector difference for the PU and motion information indicated by a selected candidate prediction motion vector. Video encoder 100 may process predictive image blocks for a PU as described previously.
如前述,候选预测运动矢量列表中可以使用标识位指示一类候选预测运动矢量,以控制候选预测运动矢量列表的长度。此处不再进行赘述。As mentioned above, an identifier bit may be used in the candidate prediction motion vector list to indicate a type of candidate prediction motion vector to control the length of the candidate prediction motion vector list. I will not repeat them here.
当视频解码器200接收到码流时,视频解码器200可产生用于CU的PU中的每一者的候选预测运动矢量列表。由视频解码器200针对PU产生的候选预测运动矢量列表可与由视频编码器100针对PU产生的候选预测运动矢量列表相同。视频解码器200从码流中解析得到的语法元素可指示在PU的候选预测运动矢量列表中选定候选预测运动矢量的位置。在产生用于PU的候选预测运动矢量列表之后,视频解码器200可基于由PU的运动信息指示的一或多个参考块产生用于PU的预测性图像块。视频解码器200可基于解析码流获取的语法元素,从用于PU的候选预测运动矢量列表中的选定候选预测运动矢量指示的运动信息确定PU的运动信息。视频解码器200可基于用于PU的预测性图像块和用于CU的残余图像块重构用于CU的图像块。When video decoder 200 receives a code stream, video decoder 200 may generate a list of candidate predicted motion vectors for each of the PUs of the CU. The candidate prediction motion vector list generated by the video decoder 200 for the PU may be the same as the candidate prediction motion vector list generated by the video encoder 100 for the PU. The syntax element parsed by the video decoder 200 from the bitstream may indicate the position of the candidate prediction motion vector selected in the candidate prediction motion vector list of the PU. After generating a list of candidate prediction motion vectors for the PU, the video decoder 200 may generate predictive image blocks for the PU based on one or more reference blocks indicated by the motion information of the PU. The video decoder 200 may determine the motion information of the PU from the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list for the PU based on the syntax element obtained by parsing the bitstream. Video decoder 200 may reconstruct an image block for a CU based on a predictive image block for a PU and a residual image block for a CU.
如前述，候选预测运动矢量列表中可以使用标识位指示一类候选预测运动矢量，在此情况下，视频解码器200在接收到码流后，先解析码流获取第一标识，第一标识指示PU的候选预测运动矢量列表中选定候选预测运动矢量的位置。其中，PU的候选预测运动矢量列表中包括至少一个第一候选运动矢量及至少一个第二候选集合，第二候选集合包括至少一个第二候选运动矢量。视频解码器200根据第一标识，从PU的候选预测运动矢量列表中确定第一标识对应的目标元素。若该目标元素为第一候选运动矢量，则视频解码器200将目标元素确定为该PU的目标运动矢量，采用目标运动信息来预测待处理图像块(PU)的运动信息进行后续的解码流程。若该目标元素为第二候选集合时，则视频解码器200解析码流以获得第二标识，第二标识用于标识选定的候选预测运动矢量在第一标识指示的第二候选集合中的位置；视频解码器200根据第二标识，从第一标识指示的第二候选集合中的多个第二候选运动矢量中，确定目标运动信息，采用目标运动信息来预测待处理图像块(PU)的运动信息进行后续的解码流程。As mentioned above, an identification bit may be used in the candidate prediction motion vector list to indicate a type of candidate prediction motion vector. In this case, after receiving the code stream, the video decoder 200 first parses the code stream to obtain a first identifier, which indicates the position of the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU. The candidate prediction motion vector list of the PU includes at least one first candidate motion vector and at least one second candidate set, and the second candidate set includes at least one second candidate motion vector. According to the first identifier, the video decoder 200 determines the target element corresponding to the first identifier from the candidate prediction motion vector list of the PU. If the target element is a first candidate motion vector, the video decoder 200 determines the target element as the target motion vector of the PU, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in the subsequent decoding process. If the target element is a second candidate set, the video decoder 200 parses the code stream to obtain a second identifier, which identifies the position of the selected candidate prediction motion vector within the second candidate set indicated by the first identifier; according to the second identifier, the video decoder 200 determines the target motion information from the plurality of second candidate motion vectors in the second candidate set indicated by the first identifier, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in the subsequent decoding process.
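The two-level parsing just described can be sketched as below. Here `read_second_id` stands in for parsing the second identifier from the code stream, invoked only when the first identifier points at a second candidate set; the function and variable names are ours, not the patent's:

```python
def decode_motion_vector(candidate_list, first_id, read_second_id):
    """Resolve the target MV: the first identifier selects a list element;
    if that element is a second candidate set, a second identifier
    selects a motion vector within it."""
    target = candidate_list[first_id]
    if isinstance(target, list):      # element is a second candidate set
        second_id = read_second_id()  # parsed from the code stream only now
        return target[second_id]
    return target                     # element is a first candidate MV

# Last entry models a second candidate set embedded in the list.
mvp_list = [(1, 1), (2, 2), [(3, 3), (4, 4)]]
mv = decode_motion_vector(mvp_list, first_id=2, read_second_id=lambda: 1)
# → (4, 4)
```

Note that the second identifier costs bits only when the flag element is actually selected, which is the point of keeping the set behind a single list slot.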
As described above, a flag bit in the candidate predicted motion vector list may indicate a class of candidate predicted motion vectors. In this case, after receiving the bitstream, the video decoder 200 first parses the bitstream to obtain a first identifier, where the first identifier indicates the position of the selected candidate predicted motion vector in the candidate predicted motion vector list of the PU. The candidate predicted motion vector list of the PU includes at least one first candidate motion vector and a plurality of pieces of second candidate motion information, where the first candidate motion information includes first motion information, and the second candidate motion information includes preset motion information offsets. According to the first identifier, the video decoder 200 determines, in the candidate predicted motion vector list of the PU, the target element corresponding to the first identifier. If the target element is a first candidate motion vector, the video decoder 200 determines the target element as the target motion vector of the PU, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in the subsequent decoding process. If the target element is obtained from the plurality of pieces of second candidate motion information, the video decoder 200 parses the bitstream to obtain a second identifier, determines, according to the second identifier, the target motion information based on one of the plurality of pieces of second candidate motion information, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) in the subsequent decoding process.
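The two-level parsing flow just described can be sketched as follows. All names and the list layout are hypothetical illustrations, not the actual bitstream syntax: a candidate list mixes plain motion vectors (first candidates) with sets of preset offsets (second candidates), and a second identifier is parsed only when the first identifier selects a second-candidate set.

```python
from dataclasses import dataclass

@dataclass
class MotionVector:
    x: int
    y: int

def resolve_target_motion(candidate_list, first_id, parse_second_id):
    """Return the target motion vector for the PU.

    candidate_list mixes plain MotionVector entries (first candidates)
    with (base, offsets) pairs standing in for second-candidate sets.
    """
    entry = candidate_list[first_id]
    if isinstance(entry, MotionVector):
        # First candidate: used directly, no second identifier is parsed.
        return entry
    # Second-candidate set: a further identifier selects one preset
    # offset, applied to a base motion vector to form the target.
    base, offsets = entry
    second_id = parse_second_id()      # would be read from the bitstream
    off = offsets[second_id]
    return MotionVector(base.x + off[0], base.y + off[1])

# Example: first_id selects the second-candidate set; second_id picks offset 0.
target = resolve_target_motion(
    [MotionVector(4, -2), (MotionVector(4, -2), [(1, 0), (0, 1)])],
    first_id=1, parse_second_id=lambda: 0)
```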
It should be noted that the candidate motion vectors in the candidate predicted motion vector list may be obtained according to different modes; this is not specifically limited in this application.
It should be understood that, in a feasible implementation, at the decoding end, constructing the candidate predicted motion vector list and parsing, from the bitstream, the position of the selected candidate predicted motion vector in that list are independent of each other, and may be performed in any order or in parallel.
In another feasible implementation, at the decoding end, the position of the selected candidate predicted motion vector in the candidate predicted motion vector list is first parsed from the bitstream, and the candidate predicted motion vector list is constructed based on the parsed position. In this implementation, it is not necessary to construct the entire candidate predicted motion vector list; the list only needs to be constructed up to the parsed position, that is, it suffices that the candidate predicted motion vector at that position can be determined. For example, when parsing the bitstream yields that the selected candidate predicted motion vector is the candidate with index 3 in the candidate predicted motion vector list, only the candidates from index 0 to index 3 need to be constructed to determine the candidate predicted motion vector with index 3, which achieves the technical effect of reducing complexity and improving decoding efficiency.
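The early-termination construction described above can be sketched as follows. The zero-argument "candidate sources" are a hypothetical stand-in for deriving candidates from spatial or temporal neighbours; the point is only that candidates beyond the parsed index are never derived.

```python
def build_candidate_up_to(index, candidate_sources):
    """Build the candidate list only as far as the parsed position.

    candidate_sources is an ordered iterable of zero-argument callables,
    each deriving one candidate. Only index + 1 of them are ever invoked.
    """
    lst = []
    for derive in candidate_sources:
        lst.append(derive())
        if len(lst) > index:   # stop once the parsed index is reachable
            break
    return lst[index]

calls = []
def source(i):
    def derive():
        calls.append(i)        # record which candidates were derived
        return f"cand{i}"
    return derive

# Parsed index 3: only candidates 0..3 are constructed, 4..9 are skipped.
selected = build_candidate_up_to(3, [source(i) for i in range(10)])
```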
FIG. 2 is a block diagram of an example video encoder 100 described in the embodiments of this application. The video encoder 100 is configured to output video to a post-processing entity 41. The post-processing entity 41 represents an example of a video entity that can process encoded video data from the video encoder 100, for example, a media-aware network element (MANE) or a splicing/editing apparatus. In some cases, the post-processing entity 41 may be an instance of a network entity. In some video encoding systems, the post-processing entity 41 and the video encoder 100 may be parts of separate apparatuses, while in other cases, the functionality described with respect to the post-processing entity 41 may be performed by the same apparatus that includes the video encoder 100. In an example, the post-processing entity 41 is an instance of the storage apparatus 40 in FIG. 1.
In the example of FIG. 2, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a decoded picture buffer (DPB) 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. The filter unit 106 is intended to represent one or more loop filters, for example, a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter. Although the filter unit 106 is shown as an in-loop filter in FIG. 2, in other implementations, the filter unit 106 may be implemented as a post-loop filter. In an example, the video encoder 100 may further include a video data memory and a partitioning unit (not shown in the figure).
The video data memory may store video data to be encoded by the components of the video encoder 100. The video data stored in the video data memory may be obtained from the video source 120. The DPB 107 may be a reference picture memory that stores reference video data used by the video encoder 100 to encode video data in intra and inter coding modes. The video data memory and the DPB 107 may be formed by any of a variety of memory apparatuses, for example, a dynamic random access memory (DRAM) including a synchronous dynamic random access memory (SDRAM), a magnetoresistive RAM (MRAM), a resistive RAM (RRAM), or another type of memory apparatus. The video data memory and the DPB 107 may be provided by a same memory apparatus or by separate memory apparatuses. In various examples, the video data memory may be on-chip together with other components of the video encoder 100, or off-chip relative to those components.
As shown in FIG. 2, the video encoder 100 receives video data and stores the video data in the video data memory. The partitioning unit partitions the video data into several image blocks, and these image blocks may be further partitioned into smaller blocks, for example, by image block partitioning based on a quadtree structure or a binary tree structure. The partitioning may further include partitioning into slices, tiles, or other larger units. The video encoder 100 generally illustrates the components that encode image blocks within a video slice to be encoded. A slice may be divided into multiple image blocks (and possibly into sets of image blocks referred to as tiles). The prediction processing unit 108 may select one of multiple possible coding modes for the current image block, for example, one of multiple intra coding modes or one of multiple inter coding modes. The prediction processing unit 108 may provide the resulting intra- or inter-coded block to the summer 112 to generate a residual block, and to the summer 111 to reconstruct the encoded block used as a reference picture.
The intra predictor 109 in the prediction processing unit 108 may perform intra predictive encoding of the current image block relative to one or more neighboring blocks that are in the same frame or slice as the current block to be encoded, to remove spatial redundancy. The inter predictor 110 in the prediction processing unit 108 may perform inter predictive encoding of the current image block relative to one or more prediction blocks in one or more reference pictures, to remove temporal redundancy.
Specifically, the inter predictor 110 may be configured to determine an inter prediction mode used to encode the current image block. For example, the inter predictor 110 may use rate-distortion analysis to calculate rate-distortion values of the various inter prediction modes in a candidate inter prediction mode set, and select, from the set, the inter prediction mode with the best rate-distortion characteristics. Rate-distortion analysis generally determines the amount of distortion (or error) between an encoded block and the original unencoded block that was encoded to produce the encoded block, as well as the bit rate (that is, the number of bits) used to produce the encoded block. For example, the inter predictor 110 may determine, in the candidate inter prediction mode set, the inter prediction mode with the lowest rate-distortion cost for encoding the current image block as the inter prediction mode used for inter prediction of the current image block.
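The mode decision above can be sketched as a minimum over a Lagrangian cost J = D + λ·R. The mode names and the distortion and rate figures below are invented for illustration; a real encoder would measure distortion (for example, SSE or SAD) against the reconstructed block and count the actual signaling bits.

```python
def select_inter_mode(modes, distortion, rate, lam):
    """Return the mode with the smallest RD cost J = D + lam * R."""
    return min(modes, key=lambda m: distortion[m] + lam * rate[m])

modes = ["merge", "amvp", "affine"]
D = {"merge": 120.0, "amvp": 90.0, "affine": 95.0}   # illustrative distortion
R = {"merge": 4, "amvp": 30, "affine": 28}           # illustrative bits
best = select_inter_mode(modes, D, R, lam=2.0)       # costs: 128, 150, 151
```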
The inter predictor 110 is configured to predict motion information (for example, a motion vector) of one or more subblocks in the current image block based on the determined inter prediction mode, and obtain or generate a prediction block of the current image block by using the motion information (for example, the motion vector) of the one or more subblocks. The inter predictor 110 may locate, in one of the reference picture lists, the prediction block to which the motion vector points. The inter predictor 110 may further generate syntax elements associated with the image blocks and the video slice, for use by the video decoder 200 when decoding the image blocks of the video slice. Alternatively, in an example, the inter predictor 110 performs a motion compensation process by using the motion information of each subblock to generate a prediction block of each subblock, thereby obtaining the prediction block of the current image block. It should be understood that the inter predictor 110 here performs motion estimation and motion compensation processes.
Specifically, after selecting an inter prediction mode for the current image block, the inter predictor 110 may provide information indicating the selected inter prediction mode of the current image block to the entropy encoder 103, so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
The intra predictor 109 may perform intra prediction on the current image block. Specifically, the intra predictor 109 may determine an intra prediction mode used to encode the current block. For example, the intra predictor 109 may use rate-distortion analysis to calculate rate-distortion values of the various intra prediction modes to be tested, and select, from the modes to be tested, the intra prediction mode with the best rate-distortion characteristics. In any case, after selecting an intra prediction mode for the image block, the intra predictor 109 may provide information indicating the selected intra prediction mode of the current image block to the entropy encoder 103, so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
After the prediction processing unit 108 generates the prediction block of the current image block through inter prediction or intra prediction, the video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. The summer 112 represents the component or components that perform this subtraction operation. The residual video data in the residual block may be included in one or more transform units (TUs) and applied to the transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients by using a transform such as a discrete cosine transform (DCT) or a conceptually similar transform. The transformer 101 may convert the residual video data from a pixel value domain to a transform domain, for example, a frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. The quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, the quantizer 102 may then scan the matrix containing the quantized transform coefficients. Alternatively, the entropy encoder 103 may perform the scan.
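A minimal sketch of the scalar quantization step just described, assuming a single illustrative step size and Python's default rounding; a real encoder derives the step from a quantization parameter and applies per-coefficient scaling, and the matching inverse quantization (dequantization) is what the decoder's inverse quantizer performs.

```python
def quantize(coeffs, qstep):
    # Divide each transform coefficient by the step and round to a level;
    # the rounding convention here is illustrative only.
    return [int(round(c / qstep)) for c in coeffs]

def dequantize(levels, qstep):
    # Inverse quantization: scale the levels back; the difference from the
    # original coefficients is the (lossy) quantization distortion.
    return [l * qstep for l in levels]

coeffs = [100.0, -7.0, 3.0, 0.4]
levels = quantize(coeffs, qstep=4.0)
recon = dequantize(levels, qstep=4.0)
```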
After quantization, the entropy encoder 103 entropy-encodes the quantized transform coefficients. For example, the entropy encoder 103 may perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique. After entropy encoding by the entropy encoder 103, the encoded bitstream may be transmitted to the video decoder 200, or archived for later transmission or retrieval by the video decoder 200. The entropy encoder 103 may further entropy-encode the syntax elements of the current image block to be encoded.
The inverse quantizer 104 and the inverse transformer 105 respectively apply inverse quantization and inverse transform to reconstruct the residual block in the pixel domain, for example, for later use as a reference block of a reference picture. The summer 111 adds the reconstructed residual block to the prediction block generated by the inter predictor 110 or the intra predictor 109, to generate a reconstructed image block. The filter unit 106 may be applied to the reconstructed image block to reduce distortion such as block artifacts. The reconstructed image block is then stored as a reference block in the decoded picture buffer 107, and may be used by the inter predictor 110 as a reference block for inter prediction of blocks in subsequent video frames or pictures.
It should be understood that other structural variations of the video encoder 100 may be used to encode the video stream. For example, for some image blocks or image frames, the video encoder 100 may directly quantize the residual signal without processing by the transformer 101, and correspondingly without processing by the inverse transformer 105. Alternatively, for some image blocks or image frames, the video encoder 100 does not generate residual data, and correspondingly no processing by the transformer 101, the quantizer 102, the inverse quantizer 104, or the inverse transformer 105 is needed. Alternatively, the video encoder 100 may store the reconstructed image block directly as a reference block without processing by the filter unit 106. Alternatively, the quantizer 102 and the inverse quantizer 104 in the video encoder 100 may be combined.
FIG. 3 is a block diagram of an example video decoder 200 described in the embodiments of this application. In the example of FIG. 3, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a DPB 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, the video decoder 200 may perform a decoding process that is substantially the inverse of the encoding process described with respect to the video encoder 100 in FIG. 2.
During decoding, the video decoder 200 receives, from the video encoder 100, an encoded video bitstream representing the image blocks of an encoded video slice and the associated syntax elements. The video decoder 200 may receive video data from the network entity 42, and optionally may further store the video data in a video data memory (not shown in the figure). The video data memory may store video data to be decoded by the components of the video decoder 200, for example, the encoded video bitstream. The video data stored in the video data memory may be obtained, for example, from the storage apparatus 40, from a local video source such as a camera, through wired or wireless network communication of the video data, or by accessing a physical data storage medium. The video data memory may serve as a coded picture buffer (CPB) for storing the encoded video data from the encoded video bitstream. Therefore, although the video data memory is not shown in FIG. 3, the video data memory and the DPB 207 may be a same memory, or may be separately disposed memories. The video data memory and the DPB 207 may be formed by any of a variety of memory apparatuses, for example, a dynamic random access memory (DRAM) including a synchronous DRAM (SDRAM), a magnetoresistive RAM (MRAM), a resistive RAM (RRAM), or another type of memory apparatus. In various examples, the video data memory may be integrated on a chip together with other components of the video decoder 200, or disposed off-chip relative to those components.
The network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or another such apparatus for implementing one or more of the techniques described above. The network entity 42 may or may not include a video encoder, for example, the video encoder 100. Before the network entity 42 sends the encoded video bitstream to the video decoder 200, the network entity 42 may implement some of the techniques described in this application. In some video decoding systems, the network entity 42 and the video decoder 200 may be parts of separate apparatuses, while in other cases, the functionality described with respect to the network entity 42 may be performed by the same apparatus that includes the video decoder 200. In some cases, the network entity 42 may be an instance of the storage apparatus 40 in FIG. 1.
The entropy decoder 203 of the video decoder 200 entropy-decodes the bitstream to produce quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. The video decoder 200 may receive the syntax elements at the video slice level and/or the image block level.
When a video slice is decoded as an intra-decoded (I) slice, the intra predictor 209 of the prediction processing unit 208 may generate a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When a video slice is decoded as an inter-decoded (that is, B or P) slice, the inter predictor 210 of the prediction processing unit 208 may determine, based on the syntax elements received from the entropy decoder 203, an inter prediction mode used to decode the current image block of the current video slice, and decode the current image block (for example, perform inter prediction) based on the determined inter prediction mode. Specifically, the inter predictor 210 may determine whether a new inter prediction mode is used to predict the current image block of the current video slice. If the syntax elements indicate that a new inter prediction mode is used to predict the current image block, the inter predictor 210 predicts, based on the new inter prediction mode (for example, a new inter prediction mode specified by a syntax element, or a default new inter prediction mode), the motion information of the current image block or a subblock of the current image block of the current video slice, and then uses the predicted motion information of the current image block or the subblock to obtain or generate, through a motion compensation process, a prediction block of the current image block or the subblock. The motion information here may include reference picture information and a motion vector, where the reference picture information may include but is not limited to unidirectional/bidirectional prediction information, a reference picture list number, and a reference picture index corresponding to the reference picture list. For inter prediction, the prediction block may be generated from one of the reference pictures in one of the reference picture lists. The video decoder 200 may construct the reference picture lists, namely list 0 and list 1, based on the reference pictures stored in the DPB 207. The reference frame index of the current picture may be included in one or more of reference frame list 0 and list 1. In some examples, the video encoder 100 may signal a specific syntax element indicating whether a new inter prediction mode is used to decode a specific block, or may signal specific syntax elements indicating whether a new inter prediction mode is used and which new inter prediction mode is specifically used to decode the specific block. It should be understood that the inter predictor 210 here performs a motion compensation process.
The inverse quantizer 204 inversely quantizes, that is, dequantizes, the quantized transform coefficients provided in the bitstream and decoded by the entropy decoder 203. The inverse quantization process may include: using a quantization parameter calculated by the video encoder 100 for each image block in the video slice to determine the degree of quantization that should be applied, and likewise determine the degree of inverse quantization that should be applied. The inverse transformer 205 applies an inverse transform to the transform coefficients, for example, an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to generate a residual block in the pixel domain.
After the inter predictor 210 generates the prediction block for the current image block or a subblock of the current image block, the video decoder 200 sums the residual block from the inverse transformer 205 and the corresponding prediction block generated by the inter predictor 210 to obtain a reconstructed block, that is, a decoded image block. The summer 211 represents the component that performs this summing operation. When needed, a loop filter (in the decoding loop or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality. The filter unit 206 may represent one or more loop filters, for example, a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter. Although the filter unit 206 is shown as an in-loop filter in FIG. 3, in other implementations, the filter unit 206 may be implemented as a post-loop filter. In an example, the filter unit 206 is applied to the reconstructed block to reduce block distortion, and the result is output as a decoded video stream. In addition, the decoded image blocks of a given frame or picture may also be stored in the decoded picture buffer 207, and the DPB 207 stores the reference pictures used for subsequent motion compensation. The DPB 207 may be part of a memory that may further store decoded video for later presentation on a display apparatus (for example, the display apparatus 220 in FIG. 1), or may be separate from such a memory.
It should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video bitstream. For example, the video decoder 200 may generate the output video stream without processing by the filter unit 206. Alternatively, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients, and correspondingly no processing by the inverse quantizer 204 and the inverse transformer 205 is needed.
As noted above, the techniques of this application exemplarily involve inter decoding. It should be understood that the techniques of this application may be performed by any of the video coders described in this application, including, for example, the video encoder 100 and the video decoder 200 as shown and described with respect to FIG. 1 to FIG. 3. That is, in a feasible implementation, the inter predictor 110 described with respect to FIG. 2 may perform the specific techniques described below when performing inter prediction during encoding of a block of video data. In another feasible implementation, the inter predictor 210 described with respect to FIG. 3 may perform the specific techniques described below when performing inter prediction during decoding of a block of video data. Therefore, a reference to a generic "video encoder" or "video decoder" may include the video encoder 100, the video decoder 200, or another video encoding or decoding unit.
It should be understood that, in the video encoder 100 and the video decoder 200 of this application, the processing result of a particular step may be further processed before being output to the next step. For example, after steps such as interpolation filtering, motion vector derivation, or loop filtering, an operation such as clipping or shifting is further performed on the processing result of the corresponding step.
For example, the motion vector of a control point of the current image block, derived from the motion vector of a neighboring affine coding block, may be further processed; this is not limited in this application. For example, the value range of the motion vector is constrained so that it is within a certain bit width. Assuming that the allowed bit width of the motion vector is bitDepth, the range of the motion vector is -2^(bitDepth-1) to 2^(bitDepth-1)-1, where the "^" symbol represents exponentiation. If bitDepth is 16, the value range is -32768 to 32767. If bitDepth is 18, the value range is -131072 to 131071. The constraint may be applied in the following two ways:
Method 1: remove the overflowing high-order bits of the motion vector:
ux = ( vx + 2^bitDepth ) % 2^bitDepth

vx = ( ux >= 2^(bitDepth-1) ) ? ( ux - 2^bitDepth ) : ux

uy = ( vy + 2^bitDepth ) % 2^bitDepth

vy = ( uy >= 2^(bitDepth-1) ) ? ( uy - 2^bitDepth ) : uy
For example, if the value of vx is -32769, the value obtained by the above formulas is 32767. In a computer, values are stored in two's complement form; the two's complement of -32769 is 1,0111,1111,1111,1111 (17 bits). The computer handles the overflow by discarding the high-order bit, so the value of vx becomes 0111,1111,1111,1111, that is, 32767, consistent with the result obtained by the formulas.
Method 2: clip the motion vector, as shown in the following formulas:
vx = Clip3( -2^(bitDepth-1), 2^(bitDepth-1) - 1, vx )

vy = Clip3( -2^(bitDepth-1), 2^(bitDepth-1) - 1, vy )
where Clip3 is defined as clamping the value of z to the interval [x, y]:

Clip3( x, y, z ) = x, if z < x; y, if z > y; z, otherwise
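The two constraint methods above can be transcribed directly, assuming the usual two's-complement interpretation: Method 1 wraps an overflowing value around by discarding high-order bits, while Method 2 (Clip3) clamps it to the nearest representable value, so the two methods can give different results for out-of-range inputs.

```python
def wrap_mv(v, bit_depth):
    """Method 1: keep the low bit_depth bits, reinterpreted as signed."""
    u = (v + (1 << bit_depth)) % (1 << bit_depth)
    return u - (1 << bit_depth) if u >= (1 << (bit_depth - 1)) else u

def clip3(x, y, z):
    """Clamp z to the interval [x, y]."""
    return x if z < x else y if z > y else z

def clip_mv(v, bit_depth):
    """Method 2: clamp to [-2^(bitDepth-1), 2^(bitDepth-1) - 1]."""
    return clip3(-(1 << (bit_depth - 1)), (1 << (bit_depth - 1)) - 1, v)

# For vx = -32769 with bitDepth = 16, the two methods diverge:
wrapped = wrap_mv(-32769, 16)   # wraps to 32767, as in the worked example
clipped = clip_mv(-32769, 16)   # clamps to -32768, the range minimum
```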
FIG. 4 is a schematic block diagram of the inter prediction module 121 in an embodiment of this application. The inter prediction module 121 may include, for example, a motion estimation unit and a motion compensation unit. The relationship between PUs and CUs differs among video compression coding standards. The inter prediction module 121 may partition the current CU into PUs according to multiple partitioning modes, for example the 2N×2N, 2N×N, N×2N, and N×N partitioning modes. In other embodiments, the current CU is itself the current PU; this is not limited.
The inter prediction module 121 may perform integer motion estimation (IME) and then fractional motion estimation (FME) on each of the PUs. When the inter prediction module 121 performs IME on a PU, it may search one or more reference images for a reference block for the PU. After finding the reference block, the inter prediction module 121 may generate, with integer precision, a motion vector indicating the spatial displacement between the PU and its reference block. When the inter prediction module 121 performs FME on the PU, it may refine the motion vector produced by IME; a motion vector produced by FME may have sub-integer precision (for example, 1/2-pixel or 1/4-pixel precision). After generating a motion vector for the PU, the inter prediction module 121 may use it to generate a predictive image block for the PU.
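As a rough illustration of the two-stage IME-then-FME search described above, the following deliberately simplified sketch uses one-dimensional sample rows, an exhaustive SAD search, and linear interpolation; real codecs operate on 2-D blocks with longer interpolation filters, and all names here are illustrative:

```python
def sad(block, ref, offset):
    """Sum of absolute differences at an integer offset."""
    return sum(abs(b - r) for b, r in zip(block, ref[offset:offset + len(block)]))

def interp(ref, pos_qpel):
    """Sample ref at a quarter-pel position using linear interpolation."""
    i, f = pos_qpel // 4, (pos_qpel % 4) / 4.0
    return ref[i] * (1 - f) + ref[min(i + 1, len(ref) - 1)] * f

def motion_estimate(block, ref):
    """IME: exhaustive integer search; FME: quarter-pel refinement around it.
    Returns the motion vector in quarter-pel units."""
    best_int = min(range(len(ref) - len(block) + 1),
                   key=lambda o: sad(block, ref, o))
    def frac_cost(q):
        return sum(abs(block[k] - interp(ref, q + 4 * k))
                   for k in range(len(block)))
    candidates = [4 * best_int + d for d in range(-3, 4)
                  if 0 <= 4 * best_int + d <= 4 * (len(ref) - len(block))]
    return min(candidates, key=frac_cost)

print(motion_estimate([5, 7, 9], [0, 1, 5, 7, 9, 2, 0]))  # 8, i.e. 2 full pixels
print(motion_estimate([3, 5, 7], [0, 2, 4, 6, 8]))        # 6, i.e. 1.5 pixels
```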
In some feasible implementations in which the inter prediction module 121 uses the AMVP mode to signal the motion information of a PU to the decoding end, the inter prediction module 121 may generate a candidate predicted motion vector list for the PU. The list may include one or more original candidate predicted motion vectors and one or more additional candidates derived from them. After generating the list, the inter prediction module 121 may select a candidate from it and generate a motion vector difference (MVD) for the PU. The MVD for the PU may indicate the difference between the motion vector indicated by the selected candidate and the motion vector generated for the PU using IME and FME. In these implementations, the inter prediction module 121 may output a candidate index identifying the position of the selected candidate in the list, and may also output the MVD of the PU. A feasible implementation of the advanced motion vector prediction (AMVP) mode in an embodiment of this application is described in detail below with reference to FIG. 6.
In addition to generating motion information for PUs by performing IME and FME, the inter prediction module 121 may also perform a merge operation on each of the PUs. When performing a merge operation on a PU, the inter prediction module 121 may generate a candidate predicted motion vector list for the PU, which may include one or more original candidates and one or more additional candidates derived from them. The original candidates may include one or more spatial candidates and temporal candidates. A spatial candidate may indicate the motion information of another PU in the current image. A temporal candidate may be based on the motion information of a co-located PU in an image other than the current image; it may also be referred to as a temporal motion vector prediction (TMVP).

After generating the candidate list, the inter prediction module 121 may select one of the candidates, and may then generate a predictive image block for the PU based on the reference block indicated by the PU's motion information. In merge mode, the motion information of the PU may be the same as the motion information indicated by the selected candidate. FIG. 5, described below, shows an exemplary flowchart of the merge operation.
According to the technology of this application, during construction of the candidate predicted motion vector list, the original candidates may be included in the list directly while an identifier indicates one type of additional candidates, so as to control the length of the list. In particular, different identifiers indicate different types of additional candidates. During encoding or decoding, when an identifier is selected, the predicted motion vector is chosen from the set of additional candidates indicated by that identifier. The candidates indicated by an identifier may be preset motion information offsets.
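The list construction described above might be sketched as follows; this is a hypothetical illustration, not the normative procedure, and all names and the example offset set are invented for this sketch. The list stores original candidates directly, plus one identifier entry per type of additional candidates, and selecting an identifier resolves to a predictor from the set it indicates:

```python
def build_candidate_list(original_mvs, offset_sets):
    """The list holds original candidates directly, plus one identifier
    entry per type of extra candidates, keeping the list short."""
    cand_list = [("mv", mv) for mv in original_mvs]
    for set_id in offset_sets:                # one identifier per set
        cand_list.append(("flag", set_id))
    return cand_list

def resolve(cand_list, index, base_mv, offset_sets, offset_index=0):
    """Selecting an identifier entry picks a predictor from the set it
    indicates, modeled here as preset offsets applied to a base MV."""
    kind, value = cand_list[index]
    if kind == "mv":
        return value
    dx, dy = offset_sets[value][offset_index]
    return (base_mv[0] + dx, base_mv[1] + dy)

offset_sets = {"quarter_pel": [(1, 0), (-1, 0), (0, 1), (0, -1)]}
lst = build_candidate_list([(5, -2), (3, 3)], offset_sets)
print(resolve(lst, 2, (5, -2), offset_sets, 1))  # identifier entry -> (4, -2)
```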
After generating a predictive image block for a PU based on IME and FME and a predictive image block based on the merge operation, the inter prediction module 121 may select either the predictive image block produced by the FME operation or the one produced by the merge operation. In some feasible implementations, it may make this selection based on a rate-distortion cost analysis of the two predictive image blocks.

After the inter prediction module 121 has selected predictive image blocks for the PUs produced by partitioning the current CU according to each of the partitioning modes (in some implementations, once a coding tree unit (CTU) has been split into CUs it is not further split into smaller PUs, in which case the PU is equivalent to the CU), the inter prediction module 121 may select a partitioning mode for the current CU. In some implementations, it may do so based on a rate-distortion cost analysis of the selected predictive image blocks of the PUs produced under each partitioning mode. The inter prediction module 121 may output the predictive image blocks associated with the PUs of the selected partitioning mode to the residual generation module 102, and may output syntax elements indicating the motion information of those PUs to the entropy encoding module.
In the schematic diagram of FIG. 4, the inter prediction module 121 includes IME modules 180A to 180N (collectively, "IME modules 180"), FME modules 182A to 182N (collectively, "FME modules 182"), merge modules 184A to 184N (collectively, "merge modules 184"), PU mode decision modules 186A to 186N (collectively, "PU mode decision modules 186"), and a CU mode decision module 188 (which may also perform the mode decision process from CTU to CU).

The IME modules 180, FME modules 182, and merge modules 184 may perform IME, FME, and merge operations on the PUs of the current CU. FIG. 4 illustrates the inter prediction module 121 as including a separate IME module 180, FME module 182, and merge module 184 for each PU of each partitioning mode of the CU. In other feasible implementations, the inter prediction module 121 does not include separate IME, FME, and merge modules for each PU of each partitioning mode.
As illustrated in the schematic diagram of FIG. 4, the IME module 180A, FME module 182A, and merge module 184A may perform IME, FME, and merge operations on the PU produced by partitioning the CU according to the 2N×2N partitioning mode. The PU mode decision module 186A may select one of the predictive image blocks generated by the IME module 180A, FME module 182A, and merge module 184A.

The IME module 180B, FME module 182B, and merge module 184B may perform IME, FME, and merge operations on the left PU produced by partitioning the CU according to the N×2N partitioning mode. The PU mode decision module 186B may select one of the predictive image blocks generated by the IME module 180B, FME module 182B, and merge module 184B.

The IME module 180C, FME module 182C, and merge module 184C may perform IME, FME, and merge operations on the right PU produced by partitioning the CU according to the N×2N partitioning mode. The PU mode decision module 186C may select one of the predictive image blocks generated by the IME module 180C, FME module 182C, and merge module 184C.

The IME module 180N, FME module 182N, and merge module 184N may perform IME, FME, and merge operations on the lower-right PU produced by partitioning the CU according to the N×N partitioning mode. The PU mode decision module 186N may select one of the predictive image blocks generated by the IME module 180N, FME module 182N, and merge module 184N.
The PU mode decision modules 186 may select a predictive image block based on a rate-distortion cost analysis of multiple possible predictive image blocks, choosing the one that provides the best rate-distortion cost for a given decoding situation. For example, for bandwidth-constrained applications a PU mode decision module 186 may prefer predictive image blocks that increase the compression ratio, while for other applications it may prefer predictive image blocks that increase reconstructed video quality. After the PU mode decision modules 186 select predictive image blocks for the PUs of the current CU, the CU mode decision module 188 selects a partitioning mode for the current CU and outputs the predictive image blocks and motion information of the PUs belonging to the selected partitioning mode.
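The rate-distortion selection can be sketched with the usual Lagrangian cost J = D + λ·R, where a larger λ weights signaling bits more heavily (the bandwidth-constrained bias mentioned above). A minimal illustration, with invented numbers:

```python
def select_best(candidates, lam):
    """Pick the candidate with minimal rate-distortion cost J = D + lambda * R."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["bits"])

blocks = [
    {"name": "fme",   "distortion": 120.0, "bits": 30},  # better fidelity
    {"name": "merge", "distortion": 150.0, "bits": 4},   # cheaper to signal
]
# A large lambda (bandwidth-constrained) favors the cheap-to-signal merge block;
# a small lambda favors the higher-fidelity FME block.
print(select_best(blocks, lam=5.0)["name"])   # merge: 150+20=170 < 120+150=270
print(select_best(blocks, lam=0.1)["name"])   # fme:   120+3=123 < 150+0.4=150.4
```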
FIG. 5 is a flowchart of an implementation of the merge mode in an embodiment of this application. A video encoder (for example, video encoder 20) may perform a merge operation 200. The merge operation 200 may include: 202, generating a candidate list for the current prediction unit; 204, generating predictive video blocks associated with candidates in the candidate list; 206, selecting a candidate from the candidate list; and 208, outputting the candidate. Here, a candidate refers to a candidate motion vector or candidate motion information.

In other feasible implementations, the video encoder may perform a merge operation different from the merge operation 200. For example, it may perform a merge operation with more, fewer, or different steps than the merge operation 200, or perform the steps of the merge operation 200 in a different order or in parallel. The encoder may also perform the merge operation 200 on a PU encoded in skip mode.
After the video encoder starts the merge operation 200, it may generate a candidate predicted motion vector list for the current PU (202). The video encoder may do so in various ways, for example according to one of the example techniques described below with respect to FIGS. 8 to 12. According to the technology of this application, the candidate predicted motion vector list for the current PU includes at least one first candidate motion vector and an identifier of at least one second candidate motion vector set.
As mentioned above, the candidate predicted motion vector list for the current PU may include a temporal candidate, which may indicate the motion information of a co-located PU in the time domain. The co-located PU may occupy the same spatial position in the image frame as the current PU, but in a reference image rather than the current image. This application may refer to the reference image containing the co-located PU as the related reference image, and to its reference image index as the related reference image index. As described above, the current image may be associated with one or more reference image lists (for example, list 0, list 1). A reference image index indicates a reference image by indicating its position in a particular reference image list. In some feasible implementations, the current image may be associated with a combined reference image list.

In some video encoders, the related reference image index is the reference image index of the PU covering the reference index source position associated with the current PU. In these video encoders, the reference index source position associated with the current PU is adjacent to the left of or above the current PU. In this application, a PU "covers" a particular position if the image block associated with the PU includes that position. In these video encoders, if the reference index source position is unavailable, the video encoder may use a reference image index of zero.

However, there may be cases in which the reference index source position associated with the current PU is within the current CU. In these cases, the PU covering that position may be considered available if it is above or to the left of the current CU. However, the video encoder may then need to access the motion information of another PU of the current CU in order to determine the reference image containing the co-located PU. Therefore, these video encoders may use the motion information (that is, the reference image index) of a PU belonging to the current CU to generate the temporal candidate for the current PU. In other words, these video encoders may generate the temporal candidate using motion information of PUs belonging to the current CU. As a result, the video encoder may be unable to generate, in parallel, the candidate lists for the current PU and for the PU covering the reference index source position associated with the current PU.
The video encoder may explicitly set the related reference image index without referring to the reference image index of any other PU. This may enable the video encoder to generate the candidate lists for the current PU and the other PUs of the current CU in parallel. Because the related reference image index is set explicitly, it is not based on the motion information of any other PU of the current CU. In some feasible implementations in which it is set explicitly, the video encoder may always set the related reference image index to a fixed, predefined preset reference image index (for example, 0). In this way, the video encoder may generate a temporal candidate based on the motion information of the co-located PU in the reference frame indicated by the preset reference image index, and may include the temporal candidate in the candidate list of the current CU.

In feasible implementations in which the related reference image index is set explicitly, the video encoder may signal it explicitly in a syntax structure (for example, an image header, a slice header, an APS, or another syntax structure). In these implementations, the video encoder may signal to the decoding end the related reference image index for each LCU (that is, CTU), CU, PU, TU, or other type of sub-block. For example, the video encoder may signal that the related reference image index for each PU of a CU is equal to "1".

In some feasible implementations, the related reference image index may be set implicitly rather than explicitly. In these implementations, the video encoder may generate each temporal candidate in the candidate lists of the PUs of the current CU using the motion information of PUs in the reference images indicated by the reference image indices of PUs covering positions outside the current CU, even if those positions are not strictly adjacent to the current PU.
After generating the candidate predicted motion vector list for the current PU, the video encoder may generate the predictive image blocks associated with the candidates in the list (204). It may do so by determining the motion information of the current PU based on the motion information of the indicated candidate and then generating a predictive image block based on the one or more reference blocks indicated by the motion information of the current PU. The video encoder may then select one of the candidates from the list (206). It may do so in various ways, for example based on a rate-distortion cost analysis of each of the predictive image blocks associated with the candidates.

After selecting a candidate, the video encoder may output a candidate index (208). The candidate index may indicate the position of the selected candidate in the candidate list. In some feasible implementations, the candidate index may be denoted "merge_idx".
FIG. 6 is a flowchart of an implementation of the advanced motion vector prediction (AMVP) mode in an embodiment of this application. A video encoder (for example, video encoder 20) may perform an AMVP operation 210. The AMVP operation 210 may include: 211, generating one or more motion vectors for the current prediction unit; 212, generating a predictive video block for the current prediction unit; 213, generating a candidate list for the current prediction unit; 214, generating motion vector differences; 215, selecting a candidate from the candidate list; and 216, outputting a reference picture index, a candidate index, and the motion vector difference for the selected candidate. Here, a candidate refers to a candidate motion vector or candidate motion information.
After the video encoder starts the AMVP operation 210, it may generate one or more motion vectors for the current PU (211). The video encoder may perform integer motion estimation and fractional motion estimation to generate these motion vectors. As described above, the current image may be associated with two reference image lists (list 0 and list 1). If the current PU is unidirectionally predicted, the video encoder may generate a list-0 motion vector or a list-1 motion vector for the current PU. The list-0 motion vector may indicate the spatial displacement between the image block of the current PU and a reference block in a reference image in list 0; the list-1 motion vector may indicate the spatial displacement between the image block of the current PU and a reference block in a reference image in list 1. If the current PU is bidirectionally predicted, the video encoder may generate both a list-0 motion vector and a list-1 motion vector for the current PU.

After generating the one or more motion vectors for the current PU, the video encoder may generate a predictive image block for the current PU (212), based on the one or more reference blocks indicated by those motion vectors.
In addition, the video encoder may generate a candidate predicted motion vector list for the current PU (213). It may do so in various ways, for example according to one or more of the feasible implementations described below with respect to FIGS. 8 to 12. In some feasible implementations, when the video encoder generates the candidate list in the AMVP operation 210, the list may be limited to two candidates. In contrast, when the video encoder generates the candidate list in a merge operation, the list may include more candidates (for example, five).
After generating the candidate predicted motion vector list for the current PU, the video encoder may generate one or more motion vector differences (MVDs) for each candidate in the list (214). It may do so by determining the difference between the motion vector indicated by the candidate and the corresponding motion vector of the current PU.

If the current PU is unidirectionally predicted, the video encoder may generate a single MVD for each candidate. If the current PU is bidirectionally predicted, it may generate two MVDs for each candidate: the first MVD may indicate the difference between the candidate's motion vector and the list-0 motion vector of the current PU, and the second MVD may indicate the difference between the candidate's motion vector and the list-1 motion vector of the current PU.
The video encoder may select one or more of the candidates from the candidate list (215). It may do so in various ways. For example, it may select the candidate whose associated motion vector matches the motion vector to be encoded with minimal error, which may reduce the number of bits needed to represent the MVD for the candidate.
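A minimal sketch of this selection step (215), approximating the bit cost of an MVD by the sum of the absolute values of its components; all names and numbers here are illustrative:

```python
def best_predictor(mv, candidates):
    """Choose the candidate predictor whose MVD costs the fewest bits,
    approximated here by the sum of absolute MVD components."""
    def mvd(c):
        return (mv[0] - c[0], mv[1] - c[1])
    idx = min(range(len(candidates)),
              key=lambda i: abs(mvd(candidates[i])[0]) + abs(mvd(candidates[i])[1]))
    return idx, mvd(candidates[idx])

# AMVP list limited to two candidates, as noted above
mv = (13, -7)                  # motion vector found by IME/FME
cands = [(12, -8), (0, 0)]
idx, d = best_predictor(mv, cands)
print(idx, d)  # 0 (1, 1): signal candidate index 0 and MVD (1, 1)
```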
After selecting one or more candidates, the video encoder may output, for the current PU, one or more reference image indices, one or more candidate indices, and the one or more MVDs for the selected candidates (216).
在当前图像与两个参考图像列表(列表0和列表1)相关联且当前PU经单向预测的 例子中,视频编码器可输出用于列表0的参考图像索引(“ref_idx_10”)或用于列表1的参考图像索引(“ref_idx_11”)。视频编码器还可输出指示用于当前PU的列表0运动矢量的选定候选预测运动矢量在候选预测运动矢量列表中的位置的候选预测运动矢量索引(“mvp_10_flag”)。或者,视频编码器可输出指示用于当前PU的列表1运动矢量的选定候选预测运动矢量在候选预测运动矢量列表中的位置的候选预测运动矢量索引(“mvp_11_flag”)。视频编码器还可输出用于当前PU的列表0运动矢量或列表1运动矢量的MVD。In examples where the current picture is associated with two reference picture lists (List 0 and List 1) and the current PU is unidirectionally predicted, the video encoder may output a reference picture index ("ref_idx_10") for List 0 or for Reference image index of list 1 ("ref_idx_11"). The video encoder may also output a candidate prediction motion vector index ("mvp_10_flag") indicating the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list. Alternatively, the video encoder may output a candidate prediction motion vector index ("mvp_11_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list. The video encoder may also output a list 0 motion vector or a list 1 motion vector MVD for the current PU.
In an example where the current picture is associated with two reference picture lists (list 0 and list 1) and the current PU is bi-directionally predicted, the video encoder may output a reference picture index for list 0 ("ref_idx_l0") and a reference picture index for list 1 ("ref_idx_l1"). The video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position, in the candidate prediction motion vector list, of the selected candidate prediction motion vector for the list 0 motion vector of the current PU. In addition, the video encoder may output a candidate prediction motion vector index ("mvp_l1_flag") indicating the position, in the candidate prediction motion vector list, of the selected candidate prediction motion vector for the list 1 motion vector of the current PU. The video encoder may also output an MVD for the list 0 motion vector of the current PU and an MVD for the list 1 motion vector of the current PU.
FIG. 7 is an implementation flowchart of motion compensation performed by a video decoder (for example, video decoder 30) in an embodiment of this application.
When the video decoder performs motion compensation operation 220, the video decoder may receive an indication of the selected candidate prediction motion vector for the current PU (222). For example, the video decoder may receive a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within the candidate prediction motion vector list of the current PU.
If the motion information of the current PU is encoded using the AMVP mode and the current PU is bi-directionally predicted, the video decoder may receive a first candidate prediction motion vector index and a second candidate prediction motion vector index. The first candidate prediction motion vector index indicates the position, in the candidate prediction motion vector list, of the selected candidate prediction motion vector for the list 0 motion vector of the current PU. The second candidate prediction motion vector index indicates the position, in the candidate prediction motion vector list, of the selected candidate prediction motion vector for the list 1 motion vector of the current PU. In some feasible implementations, a single syntax element may be used to identify the two candidate prediction motion vector indices.
In some feasible implementations, if the candidate prediction motion vector list is constructed according to the technology of this application, the video decoder may receive a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within the candidate prediction motion vector list of the current PU; alternatively, it may receive an identifier indicating the position, within the candidate prediction motion vector list of the current PU, of the class to which the selected candidate prediction motion vector belongs, together with a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within that class.
In addition, the video decoder may generate a candidate prediction motion vector list for the current PU (224). The video decoder may generate this candidate prediction motion vector list for the current PU in various ways. For example, the video decoder may use the techniques described below with reference to FIG. 8 to FIG. 12 to generate the candidate prediction motion vector list for the current PU. When the video decoder generates a temporal candidate prediction motion vector for the candidate prediction motion vector list, the video decoder may explicitly or implicitly set the reference picture index identifying the reference picture that includes the co-located PU, as described above with reference to FIG. 5. According to the technology of this application, during construction of the candidate prediction motion vector list, one type of candidate prediction motion vector may be indicated in the candidate prediction motion vector list by an identifier bit, so as to control the length of the candidate prediction motion vector list.
After generating the candidate prediction motion vector list for the current PU, the video decoder may determine the motion information of the current PU based on the motion information indicated by the one or more selected candidate prediction motion vectors in the candidate prediction motion vector list for the current PU (225). For example, if the motion information of the current PU is encoded using the merge mode, the motion information of the current PU may be the same as the motion information indicated by the selected candidate prediction motion vector. If the motion information of the current PU is encoded using the AMVP mode, the video decoder may use the one or more motion vectors indicated by the selected candidate prediction motion vector or vectors and the one or more MVDs indicated in the bitstream to reconstruct the one or more motion vectors of the current PU. The reference picture index and the prediction direction identifier of the current PU may be the same as the reference picture index and the prediction direction identifier of the one or more selected candidate prediction motion vectors. After determining the motion information of the current PU, the video decoder may generate a predictive picture block for the current PU based on the one or more reference blocks indicated by the motion information of the current PU (226).
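The AMVP reconstruction step described above (the selected candidate predictor from the list plus the signalled MVD, per reference list) can be sketched as follows. This is an illustrative sketch with hypothetical names and toy integer values, not the codec's normative arithmetic:

```python
def reconstruct_amvp_mv(candidate_list, mvp_index, mvd):
    """Reconstruct a motion vector in AMVP mode: the selected candidate
    prediction motion vector from the list plus the signalled motion
    vector difference (MVD), applied per component."""
    mvp = candidate_list[mvp_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# A bi-predicted PU reconstructs one motion vector per reference list.
list0_candidates = [(4, -2), (0, 1)]
list1_candidates = [(-3, 5), (2, 2)]
mv_l0 = reconstruct_amvp_mv(list0_candidates, mvp_index=0, mvd=(1, 1))
mv_l1 = reconstruct_amvp_mv(list1_candidates, mvp_index=1, mvd=(-1, 0))
```

In this sketch, `mv_l0` evaluates to `(5, -1)` and `mv_l1` to `(1, 2)`; the reference picture index and prediction direction identifier would be inherited from the selected candidates, as described above.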
FIG. 8 is an exemplary schematic diagram of a coding unit (CU) and the adjacent image blocks associated with it in an embodiment of this application, illustrating CU 250 and the schematic candidate prediction motion vector positions 252A to 252E associated with CU 250. This application may collectively refer to candidate prediction motion vector positions 252A to 252E as candidate prediction motion vector positions 252. The candidate prediction motion vector positions 252 represent spatial candidate prediction motion vectors in the same picture as CU 250. Candidate prediction motion vector position 252A is located to the left of CU 250. Candidate prediction motion vector position 252B is located above CU 250. Candidate prediction motion vector position 252C is located to the upper right of CU 250. Candidate prediction motion vector position 252D is located to the lower left of CU 250. Candidate prediction motion vector position 252E is located to the upper left of CU 250. FIG. 8 is a schematic implementation illustrating the manner in which the inter prediction module 121 and the motion compensation module may generate candidate prediction motion vector lists. The implementations below are explained with reference to the inter prediction module 121, but it should be understood that the motion compensation module may implement the same techniques and thus generate the same candidate prediction motion vector list.
FIG. 9 is an implementation flowchart of constructing a candidate prediction motion vector list in an embodiment of this application. The technique of FIG. 9 is described with reference to a list including five candidate prediction motion vectors, but the techniques described herein may also be used with lists of other sizes. The five candidate prediction motion vectors may each have an index (for example, 0 to 4). The technique of FIG. 9 is described with reference to a generic video decoder. The generic video decoder may be, for example, a video encoder (such as video encoder 20) or a video decoder (such as video decoder 30). The candidate prediction motion vector list constructed based on the technology of this application is described in detail in the embodiments below and is not repeated here.
To construct the candidate prediction motion vector list according to the implementation of FIG. 9, the video decoder first considers four spatial candidate prediction motion vectors (902). The four spatial candidate prediction motion vectors may include candidate prediction motion vector positions 252A, 252B, 252C, and 252D. The four spatial candidate prediction motion vectors correspond to the motion information of four PUs in the same picture as the current CU (for example, CU 250). The video decoder may consider the four spatial candidate prediction motion vectors in the list in a particular order. For example, candidate prediction motion vector position 252A may be considered first. If candidate prediction motion vector position 252A is available, it may be assigned to index 0. If candidate prediction motion vector position 252A is unavailable, the video decoder may not include it in the candidate prediction motion vector list. A candidate prediction motion vector position may be unavailable for various reasons. For example, a candidate prediction motion vector position may be unavailable if it is not within the current picture. In another feasible implementation, a candidate prediction motion vector position may be unavailable if it is intra predicted. In another feasible implementation, a candidate prediction motion vector position may be unavailable if it is in a slice different from that of the current CU.
After considering candidate prediction motion vector position 252A, the video decoder may next consider candidate prediction motion vector position 252B. If candidate prediction motion vector position 252B is available and different from candidate prediction motion vector position 252A, the video decoder may add candidate prediction motion vector position 252B to the candidate prediction motion vector list. In this particular context, the terms "same" and "different" refer to the motion information associated with the candidate prediction motion vector positions. Therefore, two candidate prediction motion vector positions are considered the same if they have the same motion information and are considered different if they have different motion information. If candidate prediction motion vector position 252A is unavailable, the video decoder may assign candidate prediction motion vector position 252B to index 0. If candidate prediction motion vector position 252A is available, the video decoder may assign candidate prediction motion vector position 252B to index 1. If candidate prediction motion vector position 252B is unavailable or is the same as candidate prediction motion vector position 252A, the video decoder skips candidate prediction motion vector position 252B and does not include it in the candidate prediction motion vector list.
Candidate prediction motion vector position 252C is similarly considered by the video decoder for inclusion in the list. If candidate prediction motion vector position 252C is available and not the same as candidate prediction motion vector positions 252B and 252A, the video decoder assigns candidate prediction motion vector position 252C to the next available index. If candidate prediction motion vector position 252C is unavailable or is not different from at least one of candidate prediction motion vector positions 252A and 252B, the video decoder does not include candidate prediction motion vector position 252C in the candidate prediction motion vector list. Next, the video decoder considers candidate prediction motion vector position 252D. If candidate prediction motion vector position 252D is available and not the same as candidate prediction motion vector positions 252A, 252B, and 252C, the video decoder assigns candidate prediction motion vector position 252D to the next available index. If candidate prediction motion vector position 252D is unavailable or is not different from at least one of candidate prediction motion vector positions 252A, 252B, and 252C, the video decoder does not include candidate prediction motion vector position 252D in the candidate prediction motion vector list. The implementations above generally describe considering candidate prediction motion vectors 252A to 252D individually for inclusion in the candidate prediction motion vector list, but in some implementations, all candidate prediction motion vectors 252A to 252D may first be added to the candidate prediction motion vector list, with duplicates removed from the list afterwards.
After the video decoder considers the first four spatial candidate prediction motion vectors, the candidate prediction motion vector list may include four spatial candidate prediction motion vectors, or the list may include fewer than four spatial candidate prediction motion vectors. If the list includes four spatial candidate prediction motion vectors (904, yes), the video decoder considers a temporal candidate prediction motion vector (906). The temporal candidate prediction motion vector may correspond to the motion information of a co-located PU of a picture different from the current picture. If the temporal candidate prediction motion vector is available and different from the first four spatial candidate prediction motion vectors, the video decoder assigns the temporal candidate prediction motion vector to index 4. If the temporal candidate prediction motion vector is unavailable or is the same as one of the first four spatial candidate prediction motion vectors, the video decoder does not include the temporal candidate prediction motion vector in the candidate prediction motion vector list. Therefore, after the video decoder considers the temporal candidate prediction motion vector (906), the candidate prediction motion vector list may include five candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902 and the temporal candidate prediction motion vector considered at block 906) or may include four candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902). If the candidate prediction motion vector list includes five candidate prediction motion vectors (908, yes), the video decoder completes building the list.
If the candidate prediction motion vector list includes four candidate prediction motion vectors (908, no), the video decoder may consider a fifth spatial candidate prediction motion vector (910). The fifth spatial candidate prediction motion vector may, for example, correspond to candidate prediction motion vector position 252E. If the candidate prediction motion vector at position 252E is available and different from the candidate prediction motion vectors at positions 252A, 252B, 252C, and 252D, the video decoder may add the fifth spatial candidate prediction motion vector to the candidate prediction motion vector list, assigned to index 4. If the candidate prediction motion vector at position 252E is unavailable or is not different from the candidate prediction motion vectors at candidate prediction motion vector positions 252A, 252B, 252C, and 252D, the video decoder may not include the candidate prediction motion vector at position 252E in the candidate prediction motion vector list. Therefore, after the fifth spatial candidate prediction motion vector is considered (910), the list may include five candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902 and the fifth spatial candidate prediction motion vector considered at block 910) or may include four candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902).
If the candidate prediction motion vector list includes five candidate prediction motion vectors (912, yes), the video decoder completes generating the candidate prediction motion vector list. If the candidate prediction motion vector list includes four candidate prediction motion vectors (912, no), the video decoder adds artificially generated candidate prediction motion vectors (914) until the list includes five candidate prediction motion vectors (916, yes).
If, after the video decoder considers the first four spatial candidate prediction motion vectors, the list includes fewer than four spatial candidate prediction motion vectors (904, no), the video decoder may consider the fifth spatial candidate prediction motion vector (918). The fifth spatial candidate prediction motion vector may, for example, correspond to candidate prediction motion vector position 252E. If the candidate prediction motion vector at position 252E is available and different from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may add the fifth spatial candidate prediction motion vector to the candidate prediction motion vector list, assigned to the next available index. If the candidate prediction motion vector at position 252E is unavailable or is not different from one of the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may not include the candidate prediction motion vector at position 252E in the candidate prediction motion vector list. The video decoder may then consider the temporal candidate prediction motion vector (920). If the temporal candidate prediction motion vector is available and different from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may add the temporal candidate prediction motion vector to the candidate prediction motion vector list, assigned to the next available index. If the temporal candidate prediction motion vector is unavailable or is not different from one of the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may not include the temporal candidate prediction motion vector in the candidate prediction motion vector list.
If, after considering the fifth spatial candidate prediction motion vector (block 918) and the temporal candidate prediction motion vector (block 920), the candidate prediction motion vector list includes five candidate prediction motion vectors (922, yes), the video decoder completes generating the candidate prediction motion vector list. If the candidate prediction motion vector list includes fewer than five candidate prediction motion vectors (922, no), the video decoder adds artificially generated candidate prediction motion vectors (914) until the list includes five candidate prediction motion vectors (916, yes).
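The list-construction flow of FIG. 9 can be summarized in a simplified sketch. It keeps the overall order of the flowchart (spatial candidates considered in order with duplicates skipped, then the temporal candidate, then artificial fill at block 914) but glosses over the exact interleaving of the fifth spatial candidate between blocks 910 and 918; the names and the zero-vector fill value are illustrative assumptions, not the patent's normative derivation:

```python
def build_merge_candidate_list(spatial, temporal, list_size=5):
    """Sketch of the FIG. 9 flow. Each entry of `spatial` is either None
    (unavailable position) or a motion-information tuple; `temporal` is the
    co-located candidate or None. Duplicates are skipped, and the list is
    padded with artificially generated candidates up to `list_size`."""
    candidates = []
    for cand in spatial:                      # positions 252A..252E in order
        if cand is not None and cand not in candidates:
            candidates.append(cand)
        if len(candidates) == list_size:
            return candidates
    if temporal is not None and temporal not in candidates:
        candidates.append(temporal)
    while len(candidates) < list_size:        # block 914: artificial fill
        candidates.append(('zero', (0, 0)))
    return candidates[:list_size]
```

For example, with two duplicate spatial candidates and one unavailable position, the temporal candidate lands at the next free index and the remainder is filled artificially.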
In a possible implementation, additional merge candidate prediction motion vectors may be artificially generated after the spatial candidate prediction motion vectors and the temporal candidate prediction motion vector, so that the size of the merge candidate prediction motion vector list is fixed at a specified number of merge candidate prediction motion vectors (for example, five, as in the feasible implementation of FIG. 9 above). The additional merge candidate prediction motion vectors may include, for example, a combined bi-predictive merge candidate prediction motion vector (candidate prediction motion vector 1), a scaled bi-predictive merge candidate prediction motion vector (candidate prediction motion vector 2), and a zero-vector merge/AMVP candidate prediction motion vector (candidate prediction motion vector 3). According to the technology of this application, the spatial candidate prediction motion vectors and the temporal candidate prediction motion vector may be included directly in the candidate prediction motion vector list, while the artificially generated additional merge candidate prediction motion vectors are indicated in the candidate prediction motion vector list by an identifier bit.
FIG. 10 is an exemplary schematic diagram of adding a combined candidate motion vector to the merge-mode candidate prediction motion vector list in an embodiment of this application. A combined bi-predictive merge candidate prediction motion vector may be generated by combining original merge candidate prediction motion vectors. Specifically, two of the original candidate prediction motion vectors (having mvL0_A and ref0, or mvL1_B and ref0) may be used to generate a bi-predictive merge candidate prediction motion vector. In FIG. 10, two candidate prediction motion vectors are included in the original merge candidate prediction motion vector list. The prediction type of one candidate prediction motion vector is list 0 uni-prediction, and the prediction type of the other candidate prediction motion vector is list 1 uni-prediction. In this feasible implementation, mvL0_A and ref0 are taken from list 0, and mvL1_B and ref0 are taken from list 1; a bi-predictive merge candidate prediction motion vector (having mvL0_A and ref0 in list 0 and mvL1_B and ref0 in list 1) may then be generated and checked to determine whether it differs from the candidate prediction motion vectors already included in the candidate prediction motion vector list. If it differs, the video decoder may include the bi-predictive merge candidate prediction motion vector in the candidate prediction motion vector list.
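The FIG. 10 combination step can be sketched as follows, with candidates modelled as hypothetical dictionaries holding a (motion vector, reference index) pair per reference list; the duplicate check mirrors the "check whether it differs" step described above. This is an illustrative sketch, not the patent's normative derivation:

```python
def combine_bi_predictive(l0_part, l1_part, candidate_list):
    """Form a combined bi-predictive merge candidate from the list 0 part
    of one uni-predictive candidate and the list 1 part of another, and
    append it only if it differs from every candidate already in the list."""
    combined = {'l0': l0_part, 'l1': l1_part}
    if combined not in candidate_list:
        candidate_list.append(combined)
    return candidate_list

candidates = [
    {'l0': ((4, -2), 0), 'l1': None},   # list 0 uni-prediction: mvL0_A, ref0
    {'l0': None, 'l1': ((-3, 5), 0)},   # list 1 uni-prediction: mvL1_B, ref0
]
candidates = combine_bi_predictive(candidates[0]['l0'],
                                   candidates[1]['l1'], candidates)
```

After the call, the list holds a third, bi-predictive candidate carrying mvL0_A/ref0 in list 0 and mvL1_B/ref0 in list 1.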
FIG. 11 is an exemplary schematic diagram of adding a scaled candidate motion vector to the merge-mode candidate prediction motion vector list in an embodiment of this application. A scaled bi-predictive merge candidate prediction motion vector may be generated by scaling an original merge candidate prediction motion vector. Specifically, one candidate prediction motion vector from the original candidate prediction motion vectors (which may have mvL0_A and ref0, or mvL1_A and ref1) may be used to generate a bi-predictive merge candidate prediction motion vector. In the feasible implementation of FIG. 11, two candidate prediction motion vectors are included in the original merge candidate prediction motion vector list. The prediction type of one candidate prediction motion vector is list 0 uni-prediction, and the prediction type of the other candidate prediction motion vector is list 1 uni-prediction. In this feasible implementation, mvL0_A and ref0 may be taken from list 0, and ref0 may be copied to the reference index ref0′ in list 1. Then, mvL0′_A may be calculated by scaling mvL0_A with ref0 and ref0′. The scaling may depend on the POC (picture order count) distance. Next, a bi-predictive merge candidate prediction motion vector (having mvL0_A and ref0 in list 0 and mvL0′_A and ref0′ in list 1) may be generated and checked to determine whether it is a duplicate. If it is not a duplicate, it may be added to the merge candidate prediction motion vector list.
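The POC-distance scaling of FIG. 11 can be sketched as a simple ratio of picture-order-count distances. Real codecs perform this in fixed-point arithmetic with clipping and rounding; this floating-point version only illustrates the proportionality, and the POC values are made up for the example:

```python
def scale_mv(mv, poc_cur, poc_ref_src, poc_ref_dst):
    """Scale a motion vector by the ratio of POC distances: the distance
    from the current picture to the target reference (ref0') over the
    distance to the source reference (ref0). Yields mvL0'_A from mvL0_A."""
    num = poc_cur - poc_ref_dst
    den = poc_cur - poc_ref_src
    return (mv[0] * num / den, mv[1] * num / den)

# mvL0_A points to a reference 2 pictures away; the list 1 reference at the
# copied index ref0' is assumed to be 4 pictures away on the same side.
mv_l1 = scale_mv((6, -2), poc_cur=8, poc_ref_src=6, poc_ref_dst=4)
```

Here the target reference is twice as far, so each component is doubled: `mv_l1` evaluates to `(12.0, -4.0)`.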
FIG. 12 is an exemplary schematic diagram of adding a zero motion vector to the merge-mode candidate prediction motion vector list in an embodiment of this application. A zero-vector merge candidate prediction motion vector may be generated by combining a zero vector with a reference index that can be referenced. If the zero-vector candidate prediction motion vector is not a duplicate, it may be added to the merge candidate prediction motion vector list. For each generated merge candidate prediction motion vector, its motion information may be compared with the motion information of the preceding candidate prediction motion vectors in the list.
In a feasible implementation, if a newly generated candidate prediction motion vector differs from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the generated candidate prediction motion vector is added to the merge candidate prediction motion vector list. The process of determining whether a candidate prediction motion vector differs from the candidate prediction motion vectors already included in the candidate prediction motion vector list is sometimes called pruning. With pruning, each newly generated candidate prediction motion vector may be compared with the existing candidate prediction motion vectors in the list. In some feasible implementations, the pruning operation may include comparing one or more new candidate prediction motion vectors with the candidate prediction motion vectors already in the candidate prediction motion vector list, and not adding any new candidate prediction motion vector that duplicates a candidate prediction motion vector already in the list. In other feasible implementations, the pruning operation may include adding one or more new candidate prediction motion vectors to the candidate prediction motion vector list and later removing duplicate candidate prediction motion vectors from the list.
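Both pruning variants described above can be sketched directly; candidates are modelled as comparable motion-information tuples, which is an illustrative simplification:

```python
def add_with_pruning(candidate_list, new_cand):
    """Variant 1: compare the new candidate against the existing list and
    add it only if it is not a duplicate."""
    if new_cand not in candidate_list:
        candidate_list.append(new_cand)
    return candidate_list

def prune(candidate_list):
    """Variant 2: add everything first, then remove duplicates afterwards,
    keeping the first occurrence of each entry and preserving order."""
    kept = []
    for cand in candidate_list:
        if cand not in kept:
            kept.append(cand)
    return kept
```

For example, `add_with_pruning([(1, 1)], (1, 1))` leaves the list unchanged, while `prune([(1, 1), (2, 2), (1, 1)])` collapses the trailing duplicate.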
In the various feasible implementations of FIG. 10 to FIG. 12 above, based on the technology of this application, the newly generated candidate predicted motion vectors may be treated as one category of candidate motion vectors, and an identification bit in the original candidate predicted motion vector list may indicate that category. During encoding, when the selected candidate motion vector is one of the newly generated candidate predicted motion vectors, the bitstream includes an identifier 1 indicating the category of newly generated candidates and an identifier 2 indicating the position of the selected candidate within that category. During decoding, the selected candidate motion vector is determined from the candidate predicted motion vector list according to identifier 1 and identifier 2, and the subsequent decoding process is performed.
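A minimal sketch of this two-level signalling, assuming a plain list representation on both sides; the function names and tuple values are illustrative, not taken from this application:

```python
def encode_selection(original_list, generated_list, selected):
    """Return (id1, id2): id1 = 0 for an original candidate, 1 for a
    newly generated one; id2 = position within the chosen category."""
    if selected in original_list:
        return 0, original_list.index(selected)
    return 1, generated_list.index(selected)

def decode_selection(original_list, generated_list, id1, id2):
    """Recover the selected candidate from the two identifiers."""
    return original_list[id2] if id1 == 0 else generated_list[id2]

orig = [(1, 2), (3, 4)]          # original candidates
gen = [(2, 3), (5, 5)]           # newly generated candidates
id1, id2 = encode_selection(orig, gen, (5, 5))
recovered = decode_selection(orig, gen, id1, id2)   # → (5, 5)
```

Both sides must build the two lists identically for the identifiers to resolve to the same candidate, which is why the list construction order is fixed between encoder and decoder.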
In the various feasible implementations of FIG. 5 to FIG. 7 and FIG. 9 to FIG. 12 above, the spatial candidate prediction modes exemplarily come from the five positions 252A to 252E shown in FIG. 8, that is, positions adjacent to the to-be-processed image block. On that basis, in some feasible implementations, the spatial candidate prediction modes may exemplarily further include positions that are within a preset distance of the to-be-processed image block but are not adjacent to it. Exemplarily, such positions may be shown as 252F to 252J in FIG. 13. It should be understood that FIG. 13 is an exemplary schematic diagram of a coding unit and the neighboring-position image blocks associated with it in an embodiment of this application. The positions of image blocks that are in the same image frame as the to-be-processed image block, that have been reconstructed by the time the to-be-processed image block is processed, and that are not adjacent to the to-be-processed image block all fall within the range of such positions.
Such positions may be referred to as spatially non-adjacent image blocks. Assume that a first spatially non-adjacent image block, a second spatially non-adjacent image block, and a third spatially non-adjacent image block are available, where the physical meaning of "available" is described above and is not repeated here. Also assume that, when the spatial candidate prediction modes are taken from the prediction modes of the positions shown in FIG. 8, the candidate prediction mode list is checked and constructed in the following order; it should be understood that the checking includes the "availability" check and the pruning process mentioned above, which are not repeated here. The candidate prediction mode list includes: the motion vector of the image block at position 252A, the motion vector of the image block at position 252B, the motion vector of the image block at position 252C, the motion vector of the image block at position 252D, the motion vector obtained by the alternative temporal motion vector prediction (ATMVP) technique, the motion vector of the image block at position 252E, and the motion vector obtained by the spatial-temporal motion vector prediction (STMVP) technique. The ATMVP and STMVP techniques are described in detail in sections 2.3.1.1 and 2.3.1.2 of JVET-G1001-v1, which is incorporated herein in its entirety and not repeated here. It should be understood that, exemplarily, the candidate prediction mode list includes the above seven predicted motion vectors; depending on the specific implementation, the list may contain fewer than seven predicted motion vectors, for example only the first five, or it may additionally include the motion vectors constructed in the feasible implementations of FIG. 10 to FIG. 12 described above so that it contains more predicted motion vectors. In a feasible implementation, the above first, second, and third spatially non-adjacent image blocks may be added to the candidate prediction mode list as predicted motion vectors of the to-be-processed image block. Further, denote the motion vector of the image block at position 252A, the motion vector of the image block at position 252B, the motion vector of the image block at position 252C, the motion vector of the image block at position 252D, the motion vector obtained by the ATMVP technique, the motion vector of the image block at position 252E, and the motion vector obtained by the STMVP technique as MVL, MVU, MVUR, MVDL, MVA, MVUL, and MVS, respectively, and denote the motion vectors of the first, second, and third spatially non-adjacent image blocks as MV0, MV1, and MV2, respectively. The candidate predicted motion vector list may then be checked and constructed in the following order:
Example 1: MVL, MVU, MVUR, MVDL, MV0, MV1, MV2, MVA, MVUL, MVS;
Example 2: MVL, MVU, MVUR, MVDL, MVA, MV0, MV1, MV2, MVUL, MVS;
Example 3: MVL, MVU, MVUR, MVDL, MVA, MVUL, MV0, MV1, MV2, MVS;
Example 4: MVL, MVU, MVUR, MVDL, MVA, MVUL, MVS, MV0, MV1, MV2;
Example 5: MVL, MVU, MVUR, MVDL, MVA, MV0, MVUL, MV1, MVS, MV2;
Example 6: MVL, MVU, MVUR, MVDL, MVA, MV0, MVUL, MV1, MV2, MVS;
Example 7: MVL, MVU, MVUR, MVDL, MVA, MVUL, MV0, MV1, MV2, MVS;
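The checking and construction described above — scan the sources in a fixed order, skip unavailable ones, prune duplicates, and stop at the preset list length — can be sketched as follows. The availability flags and motion vector values below are illustrative, not from any standard.

```python
def build_candidate_list(ordered_sources, max_len):
    """ordered_sources: list of (name, mv_or_None) pairs in checking
    order; None marks an unavailable source. Duplicate MVs are pruned,
    and the list is truncated at the preset length max_len."""
    out = []
    for name, mv in ordered_sources:
        if mv is None:                      # availability check
            continue
        if mv in [m for _, m in out]:       # pruning: duplicate MV
            continue
        out.append((name, mv))
        if len(out) == max_len:             # preset list length reached
            break
    return out

# Checking order of Example 1, with illustrative values:
sources = [("MVL", (1, 0)), ("MVU", (1, 0)),   # MVU duplicates MVL
           ("MVUR", None),                      # unavailable
           ("MVDL", (0, 2)), ("MV0", (4, 4)),
           ("MV1", (2, 2)), ("MV2", (7, 1)),
           ("MVA", (3, 3)), ("MVUL", (0, 1)), ("MVS", (6, 6))]
lst = build_candidate_list(sources, 5)
# → [("MVL",(1,0)), ("MVDL",(0,2)), ("MV0",(4,4)), ("MV1",(2,2)), ("MV2",(7,1))]
```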
It should be understood that the above candidate predicted motion vector list may be used in the Merge mode or the AMVP mode described above, or in other prediction modes for obtaining the predicted motion vector of the to-be-processed image block. It may be used at the encoder side, or at the decoder side in a manner consistent with the corresponding encoder side, without limitation. Likewise, the number of candidates in the candidate predicted motion vector list is preset and kept consistent at the encoder and decoder sides; the specific number is not limited.
It should be understood that Examples 1 to 7 exemplarily give several feasible compositions of the candidate predicted motion vector list. Based on the motion vectors of spatially non-adjacent image blocks, other compositions of the list and other arrangements of the candidates within the list are also possible, without limitation.
An embodiment of this application provides another method for constructing a candidate predicted motion vector list. Compared with the list-construction methods of Examples 1 to 7, this embodiment combines a candidate predicted motion vector determined in other embodiments with a preset vector difference to form a new candidate predicted motion vector, which overcomes the defect of low prediction accuracy of the predicted motion vector and improves coding efficiency.
In a feasible implementation of this application, as shown in FIG. 14A, the candidate predicted motion vector list of the to-be-processed image block includes two sub-lists: a first motion vector set and a vector difference set. The composition of the first motion vector set may follow any of the compositions described in the foregoing embodiments of this application, for example the composition of the candidate motion vector set in the Merge mode or the AMVP mode specified in the H.265 standard. The vector difference set includes one or more preset vector differences.
In some feasible implementations, each vector difference in the vector difference set is added to an original target motion vector determined from the first motion vector set, and the resulting vectors, together with the original target motion vector, form a new motion vector set.
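A sketch of this expansion, under assumed 2-D vector tuples and an illustrative preset vector difference set (the specific offsets below are not from this application):

```python
def expand_with_differences(original_mv, vector_diffs):
    """Return the new motion vector set: the original target MV plus
    one new MV per preset vector difference."""
    new_set = [original_mv]
    for dx, dy in vector_diffs:
        new_set.append((original_mv[0] + dx, original_mv[1] + dy))
    return new_set

diffs = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # illustrative preset set
new_mvs = expand_with_differences((4, -2), diffs)
# → [(4, -2), (5, -2), (3, -2), (4, -1), (4, -3)]
```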
In some feasible implementations, the vector difference set may be included in the candidate predicted motion vector list of FIG. 14A as a subset. An identification bit in the list (the MV computed from the vector difference set) indicates the vector difference set, and each vector difference within the set is indicated by an index. The candidate predicted motion vector list constructed in this way is shown in FIG. 14B.
It should be understood that the manner, provided by the technology of this application, of indicating one category of candidate motion vectors by an identification bit in the predicted motion vector list may be used in the Merge mode or the AMVP mode described above, or in other prediction modes for obtaining the predicted motion vector of the to-be-processed image block. It may be used at the encoder side, or at the decoder side in a manner consistent with the corresponding encoder side, without limitation. Likewise, the number of candidates in the candidate predicted motion vector list is preset and kept consistent at the encoder and decoder sides; the specific number is not limited.
The decoding method for predicted motion information provided in the embodiments of this application is described in detail below with reference to the accompanying drawings. According to the technology of the embodiments of this application, when the encoder side or the decoder side constructs a candidate motion information list, an identification bit in the list indicates one category of candidate motion information so as to control the length of the list. The decoding method for predicted motion information provided in the embodiments of this application is developed on this basis. The method is performed by a decoding apparatus, which may be the video decoder 200 in the video coding system 1 shown in FIG. 1, or a functional unit in the video decoder 200; this application does not specifically limit this.
FIG. 15 is a schematic flowchart of an embodiment of this application, relating to a decoding method for predicted motion information, which may specifically include the following steps.
S1501: The decoding apparatus parses a bitstream to obtain a first identifier.
As described above, the bitstream is sent by the encoder side after encoding the current image block. The first identifier indicates the position of the candidate motion information selected by the encoder side when encoding the current image block. The first identifier is used by the decoding apparatus to determine the selected candidate motion information and then predict the motion information of the to-be-processed image block.
In one possible implementation, the first identifier may be the specific index of the selected candidate motion information. In this case, the first identifier alone uniquely determines one piece of candidate motion information.
In another possible implementation, the first identifier may be the identifier of the category to which the selected candidate motion information belongs. In this case, the bitstream further includes a fourth identifier to indicate the specific position of the selected candidate motion information within its category.
It should be noted that the specific implementation of parsing the bitstream to obtain the identifiers is not specifically limited in this application, and the positions and forms of the first identifier and the fourth identifier in the bitstream are likewise not specifically limited in the embodiments of this application.
Optionally, the first identifier may use fixed-length coding. For example, the first identifier may be a 1-bit identifier, which can indicate only a limited number of categories.
Optionally, the first identifier may use variable-length coding.
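To illustrate the contrast between the two options, a truncated-unary code is one common variable-length scheme in which smaller (more probable) values get shorter codewords, unlike the equal-length codewords of fixed-length coding. This is a generic example, not the codeword assignment of this application:

```python
def truncated_unary(value, max_value):
    """Truncated unary code: 'value' one-bits followed by a terminating
    zero, with the terminator omitted when value == max_value."""
    bits = "1" * value
    if value < max_value:
        bits += "0"
    return bits

codes = [truncated_unary(v, 3) for v in range(4)]
# → ["0", "10", "110", "111"]: index 0 costs 1 bit, index 3 costs 3 bits,
# whereas a fixed-length code would spend 2 bits on every index.
```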
S1502: The decoding apparatus determines a target element from a first candidate set according to the first identifier.
Specifically, the content of the first candidate set may take either of the following two possible implementations.
Possible implementation 1: The elements of the first candidate set include at least one piece of first candidate motion information and at least one second candidate set, and the elements of the second candidate set include multiple pieces of second candidate motion information.
Possible implementation 2: The elements of the first candidate set may include at least one piece of first candidate motion information and multiple pieces of second candidate motion information, where the first candidate motion information includes first motion information and the second candidate motion information includes preset motion information offsets. New motion information can be generated from the first motion information and a preset motion information offset.
The first candidate set may be a constructed candidate motion information list. The first candidate set directly contains the at least one piece of first candidate motion information, while the multiple pieces of second candidate motion information are contained in the first candidate set in the form of a second candidate set.
In a feasible implementation, the second candidate motion information is different from the first candidate motion information.
Exemplarily, the first candidate motion information and the second candidate motion information included in each second candidate set may be candidate motion information determined using different MV prediction modes, or may be candidate motion information of different types; this is not specifically limited in the embodiments of this application.
For example, the first candidate motion information may be motion information obtained in the Merge manner, and the second candidate motion information may be motion information obtained in the Affine Merge manner.
For example, the first candidate motion information may be original candidate motion information, and the second candidate motion information may be motion information generated from the original candidate motion information.
Exemplarily, FIG. 16A and FIG. 16B illustrate two Merge candidate lists. In the Merge candidate list of FIG. 16A or FIG. 16B, an identification bit in the list is used to indicate a candidate motion information set. The identification bit may be located at any position in the list, which is not specifically limited in the embodiments of this application. For example, the identification bit may be located at the end of the list as shown in FIG. 16A, or in the middle of the list as shown in FIG. 16B. When the first identifier in the bitstream indicates the identification bit, the target element is determined to be the candidate motion information set indicated by that identification bit, where the set includes multiple pieces of second candidate motion information. Then, within the candidate motion information set pointed to by the identification bit, one piece of candidate motion information is selected according to a further identifier (the second identifier in S1504) as the target motion information, which is used to predict the motion information of the to-be-processed image block.
Exemplarily, again with reference to the two Merge candidate lists of FIG. 16A and FIG. 16B, when the first identifier in the bitstream indicates the identification bit, the target element is determined to be the multiple pieces of second candidate motion information indicated by that identification bit, where the second candidate motion information includes preset motion information offsets. Then, among the multiple pieces of second candidate motion information pointed to by the identification bit, one piece is selected according to a further identifier (the second identifier in S1504), and the target motion information used to predict the motion information of the to-be-processed image block is determined based on the selected second candidate motion information.
In another feasible implementation, as shown in FIG. 16C, more than one identification bit is added to the Merge candidate list, and each identification bit points to a specific candidate motion information set or to multiple pieces of motion information that include preset motion information offsets. When the first identifier in the bitstream indicates a certain identification bit, the target element is determined to be candidate motion information in the candidate motion information set indicated by that identification bit, or the target motion information is determined according to one of the multiple pieces of candidate motion information (including preset motion information offsets) indicated by that identification bit.
By introducing identification bits (pointers) into the Merge list, FIG. 16A, FIG. 16B, and FIG. 16C introduce candidates in the form of subsets. When many candidates are introduced, this greatly reduces the length of the candidate list and the complexity of list reconstruction, which helps simplify hardware implementation.
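The flag-bit (pointer) mechanism of FIGS. 16A to 16C can be sketched as a two-stage lookup: the first identifier indexes the short Merge list, and only when it lands on a flag entry is a second identifier needed to select within the set that the flag points to. The entry encoding and the sample values here are assumptions for illustration:

```python
def resolve(merge_list, subsets, first_id, second_id=None):
    """merge_list entries are either ('mv', motion_info) or
    ('flag', subset_name); subsets maps each name to its candidate set."""
    kind, payload = merge_list[first_id]
    if kind == "mv":
        return payload                      # direct candidate (cf. S1503)
    return subsets[payload][second_id]      # flag entry (cf. S1504)

# A 3-entry list standing in for many candidates: two direct candidates
# plus one flag pointing at a 3-element subset.
merge_list = [("mv", (1, 1)), ("mv", (2, 0)), ("flag", "offsets")]
subsets = {"offsets": [(3, 3), (4, 4), (5, 5)]}
a = resolve(merge_list, subsets, 0)         # → (1, 1)
b = resolve(merge_list, subsets, 2, 1)      # → (4, 4)
```

Five selectable candidates are reachable through a list of length three, which is the length reduction the paragraph above describes.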
In a feasible implementation, the first candidate motion information may include motion information of spatially adjacent image blocks of the to-be-processed image block. It should be noted that the definition of the motion information of spatially adjacent image blocks has been given above and is not repeated here.
In a feasible implementation, the second candidate motion information may include motion information of spatially non-adjacent image blocks of the to-be-processed image block. It should be noted that the definition of the motion information of spatially non-adjacent image blocks has been given above and is not repeated here.
The manner of obtaining the first motion information may be chosen according to actual requirements and is not specifically limited in the embodiments of this application. The value of the preset motion information offset used to obtain the second motion information may be a fixed value or a value selected from a set; the embodiments of this application do not specifically limit the content or form of the preset motion information offset.
In a feasible implementation, the first candidate motion information includes first motion information, the at least one second candidate set is multiple second candidate sets, and the multiple second candidate sets include at least one third candidate set and at least one fourth candidate set. The elements of the third candidate set include motion information of multiple spatially non-adjacent image blocks of the to-be-processed image block, and the elements of the fourth candidate set include multiple pieces of motion information obtained based on the first motion information and preset motion information offsets.
In a feasible implementation, the at least one second candidate set is multiple second candidate sets, and the multiple second candidate sets include at least one fifth candidate set and at least one sixth candidate set. The elements of the fifth candidate set include motion information of multiple spatially non-adjacent image blocks of the to-be-processed image block, and the elements of the sixth candidate set include multiple preset motion information offsets.
In a feasible implementation, among the at least one piece of first candidate motion information, the codeword used to identify the first motion information is the shortest.
In a feasible implementation, the first motion information does not include motion information obtained according to the ATMVP mode.
As described in S1501, the first identifier may be an index into the first candidate set, or may be the identifier of the category of the motion information. Depending on the specific content, S1502 can be implemented in the following two cases.
Case 1: The first identifier is an index into the first candidate set.
In one possible implementation of Case 1, in S1502 the decoding apparatus may determine the element at the position indicated by the first identifier in the first candidate set as the target element. Because the first candidate set includes at least one piece of first candidate motion information and at least one second candidate set, the target element determined according to the first identifier may be first candidate motion information or may be a second candidate set, depending on what occupies the position indicated by the first identifier.
In another possible implementation of Case 1, in S1502 the decoding apparatus may determine the element at the position indicated by the first identifier in the first candidate set as the target element. Because the first candidate set includes at least one piece of first candidate motion information and multiple pieces of second candidate motion information, the target element determined according to the first identifier may be first candidate motion information, or may be obtained from the multiple pieces of second candidate motion information, depending on what occupies the position indicated by the first identifier.
Case 2: The first identifier is the identifier of a candidate motion information category.
In Case 2, in S1502 the decoding apparatus determines the category to which the target element belongs according to the first identifier. The decoding apparatus then parses the bitstream to obtain a fourth identifier, which indicates the specific position of the target element within its category, and uniquely determines the target element within that category according to the fourth identifier. Specifically, if the first identifier indicates that the target element belongs to the category of first candidate motion information, one piece of first candidate motion information is determined as the target element among the at least one piece of first candidate motion information according to the fourth identifier. If the first identifier indicates that the target element belongs to some category of second candidate motion information, a second candidate set or a piece of second candidate motion information is determined as the target element according to the fourth identifier.
Exemplarily, assume the first candidate motion information is Merge motion information and the first candidate set includes two second candidate sets, where the second candidate motion information in one second candidate set is Affine Merge motion information of a first type and the second candidate motion information in the other second candidate set is Affine Merge motion information of a second type. An identifier value of 0 is configured to indicate Merge motion information, and 1 to indicate Affine Merge motion information. If the first identifier obtained by the decoding apparatus by parsing the bitstream in S1501 is 0, then in S1502 the decoding apparatus further parses the bitstream to obtain a fourth identifier and, according to the fourth identifier, determines one piece of Merge motion information as the target element among the at least one piece of Merge motion information in the first candidate set. If the first identifier obtained in S1501 is 1, then in S1502 the decoding apparatus further parses the bitstream to obtain a fourth identifier and, according to the fourth identifier, determines one of the two second candidate sets as the target element.
Exemplarily, assume the first candidate motion information is Merge motion information and the first candidate set includes two second candidate sets, where the second candidate motion information in one second candidate set is the preset motion information offsets corresponding to Affine Merge motion information of a first type, and the second candidate motion information in the other second candidate set is the preset motion information offsets of Affine Merge motion information of a second type. An identifier value of 0 is configured to indicate Merge motion information, and 1 to indicate Affine Merge motion information. If the first identifier obtained by the decoding apparatus by parsing the bitstream in S1501 is 0, then in S1502 the decoding apparatus further parses the bitstream to obtain a fourth identifier and, according to the fourth identifier, determines one piece of Merge motion information as the target element among the at least one piece of Merge motion information in the first candidate set. If the first identifier obtained in S1501 is 1, then in S1502 the decoding apparatus further parses the bitstream to obtain a fourth identifier, determines one of the two second candidate sets according to the fourth identifier, and determines the target element based on one piece of second candidate motion information in the determined second candidate set.
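The Case 2 resolution in the two examples above — a category flag (0 for Merge, 1 for Affine Merge) followed by a fourth identifier giving the position within the chosen category — can be sketched as follows. The list contents are placeholders, not real motion information:

```python
def determine_target(first_candidates, second_sets, first_id, fourth_id):
    """first_candidates: the Merge motion information in the first
    candidate set; second_sets: the second candidate sets. first_id
    selects the category, fourth_id the position within it."""
    if first_id == 0:                      # category: Merge motion info
        return first_candidates[fourth_id]
    return second_sets[fourth_id]          # category: Affine Merge sets

merges = [(1, 0), (0, 2)]                          # illustrative values
affine_sets = [["affine-type-1"], ["affine-type-2"]]
x = determine_target(merges, affine_sets, 0, 1)    # → (0, 2)
y = determine_target(merges, affine_sets, 1, 0)    # → ["affine-type-1"]
```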
Optionally, in S1502, if the decoding apparatus determines that the target element is the first candidate motion information, S1503 is performed; if the decoding apparatus determines that the target element is a second candidate set, or is to be obtained from multiple pieces of second candidate motion information, S1504 is performed.
S1503: When the target element is the first candidate motion information, use the first candidate motion information as the target motion information.
The target motion information is used to predict the motion information of the to-be-processed image block.
Optionally, using the target motion information to predict the motion information of the to-be-processed image block may be specifically implemented as: using the target motion information as the motion information of the to-be-processed image block; or using the target motion information as the predicted motion information of the to-be-processed image block. In practical applications, the specific implementation may be selected according to actual requirements, which is not specifically limited here.
Further, the subsequent processing of the to-be-processed image block has been described in detail above and is not repeated here.
S1504: Parse the bitstream to obtain a second identifier, and determine the target motion information based on one of the multiple pieces of second candidate motion information according to the second identifier.
S1504 may be specifically implemented as: parsing the bitstream to obtain the second identifier, and determining, according to the second identifier, the target motion information from the multiple pieces of second candidate motion information.
It should be noted that the specific implementation of parsing the bitstream to obtain an identifier is not specifically limited in this application, and neither the position nor the form of the second identifier in the bitstream is specifically limited in the embodiments of this application.
Optionally, the second identifier may use fixed-length coding. For example, the second identifier may be a 1-bit identifier, which can indicate only a limited number of cases.
Optionally, the second identifier may use variable-length coding. For example, the second identifier may consist of multiple bits.
Optionally, depending on the content of the second candidate motion information, determining the target motion information in S1504 according to the second identifier, based on one of the multiple pieces of second candidate motion information, may be implemented in one of the following feasible manners, but is not limited thereto.
In a feasible implementation, the first candidate motion information includes first motion information, the second candidate motion information includes second motion information, and the second motion information is obtained based on the first motion information and a preset motion information offset. In this manner, the second identifier may indicate the specific position of the target motion information in the second candidate set, and determining the target motion information from the multiple pieces of second candidate motion information in S1504 may be specifically implemented as: determining, as the target motion information, the second candidate motion information at the position indicated by the second identifier in the second candidate set serving as the target element.
In a feasible implementation, the first candidate motion information includes first motion information, and the second candidate motion information includes preset motion information offsets. In this manner, the second identifier indicates the specific position of a target offset in the second candidate set, and S1504 may be specifically implemented as: determining the target offset from the multiple preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
In a feasible implementation, when the first candidate motion information includes first motion information and the second candidate motion information includes preset motion information offsets, before the target offset is determined from the multiple preset motion information offsets according to the second identifier, the decoding method for predicted motion information provided in this application may further include: multiplying the multiple preset motion information offsets by a preset coefficient to obtain multiple adjusted motion information offsets. Correspondingly, determining the target offset from the multiple preset motion information offsets according to the second identifier includes: determining the target offset from the multiple adjusted motion information offsets according to the second identifier.
In a feasible implementation, when the first candidate motion information includes first motion information and the second candidate motion information includes preset motion information offsets, S1504 may be specifically implemented as: determining one motion information offset from the multiple preset motion information offsets according to the second identifier and multiplying it by a preset coefficient to obtain the target offset; and determining the target motion information based on the first motion information and the target offset.
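The last two implementations above (selecting a preset offset by the second identifier and scaling it by a preset coefficient) can be sketched as follows. This is a minimal illustration only; the names, the offset table, and the coefficient handling are assumptions, not the application's normative procedure:

```python
# Illustrative sketch of S1504 with a preset coefficient (hypothetical names):
# the second identifier selects one preset offset, the preset coefficient
# scales it, and the target motion information is the first motion
# information plus the scaled offset.

PRESET_OFFSETS = [(1, 0), (0, -1), (-1, 0), (0, 1)]

def derive_target_motion(first_mv, second_id, coeff=1):
    """Pick the offset indicated by second_id, scale it, add it to first_mv."""
    off_x, off_y = PRESET_OFFSETS[second_id]
    return (first_mv[0] + off_x * coeff, first_mv[1] + off_y * coeff)

# second identifier 1 selects (0, -1); with coefficient 2 the target motion
# vector becomes (2, -3) + (0, -2) = (2, -5)
mv = derive_target_motion((2, -3), 1, coeff=2)
```

With `coeff=1` this reduces to the unscaled implementation; whether the coefficient is fixed in the decoder or carried in the bitstream (the third identifier below) does not change the arithmetic.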
It should be noted that the preset coefficient may be a fixed coefficient configured in the decoding apparatus, or may be a coefficient carried in the bitstream, which is not specifically limited in the embodiments of this application.
Further optionally, when the preset coefficient is carried in the bitstream, the decoding method for predicted motion information provided in this application may further include S1505.
S1505: Parse the bitstream to obtain a third identifier.
The third identifier includes the preset coefficient.
With the decoding method for predicted motion information provided in this application, the elements in the first candidate set include the first candidate motion information and at least one second candidate set, or the first candidate motion information and multiple pieces of second candidate motion information. With this multi-layer candidate set structure, when more candidates are introduced, a whole class of candidate motion information can be added to the first candidate set as a single element, which greatly shortens the first candidate set compared with adding each piece of candidate motion information to it directly. When the first candidate set is a candidate motion information list for inter prediction, the length of the list can be well controlled even when more candidates are introduced, which facilitates the checking process and hardware implementation.
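As a rough illustration of the two-level structure (the names and the 7-entry layout are assumptions for the example, not the application's normative design), a first-level slot can hold an entire second-level candidate set:

```python
# One first-level slot holds a whole second-level candidate set, so four
# extra AFFINE candidates cost only one entry of first-level list length.
first_level = ["merge0", "merge1", "merge2", "merge3", "merge4", "merge5",
               ["affine0", "affine1", "affine2", "affine3"]]

def resolve(first_idx, second_idx=None):
    entry = first_level[first_idx]
    if isinstance(entry, list):      # target element is a second candidate set
        return entry[second_idx]     # second identifier picks within the set
    return entry                     # target element is first candidate motion info
```

The first-level list length stays 7 even though 10 candidates are reachable, which is the length control the paragraph above describes.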
Exemplarily, the following are several specific implementations of the embodiments of this application:
Embodiment 1:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. The candidate motion information corresponding to first indices 0-5 includes a motion vector and a reference picture, and first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to index 0 and a preset motion vector offset. Assume that the candidate motion information corresponding to first index 0 is forward prediction, with motion vector (2, -3) and reference frame POC 2. The preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1). When the first index value obtained by parsing the bitstream is 6, indicating that the motion information of the current image block is new motion information generated based on the candidate motion information corresponding to index 0 and a preset motion vector offset, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of the current image block is forward prediction, the motion vector is (2, -3) + (0, -1) = (2, -4), and the reference frame POC is 2.
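The arithmetic of Embodiment 1 can be reproduced with a short sketch; the data layout and names are illustrative assumptions (only the slot actually used is populated):

```python
PRESET_OFFSETS = [(1, 0), (0, -1), (-1, 0), (0, 1)]
merge_list = [{"mv": (2, -3), "ref_poc": 2}]  # first index 0; slots 1-5 omitted

def decode_embodiment1(first_index, second_index=None):
    if first_index <= 5:
        return merge_list[first_index]    # ordinary Merge candidate
    base = merge_list[0]                  # first index 6: offset applied to index 0
    ox, oy = PRESET_OFFSETS[second_index]
    return {"mv": (base["mv"][0] + ox, base["mv"][1] + oy),
            "ref_poc": base["ref_poc"]}

# first index 6, second index 1 -> (2, -3) + (0, -1) = (2, -4), ref POC 2
info = decode_embodiment1(6, 1)
```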
Embodiment 2:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. The candidate motion information corresponding to first indices 0-5 includes a motion vector and a reference picture, and first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to first index 0 and a preset motion vector offset. Assume that the candidate motion information corresponding to first index 0 is bidirectional prediction, with forward motion vector (2, -3) and reference frame POC 2, and backward motion vector (-2, -1) and reference frame POC 4. The preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1). When the first index value obtained by parsing the bitstream is 6, indicating that the motion information of the current image block is new motion information generated based on the candidate motion information corresponding to index 0 and a preset motion vector offset, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 0, the motion information of the current image block is bidirectional prediction. When the POC of the current frame is 3, the forward and backward reference frame POCs lie on opposite sides of the current frame POC: the forward motion vector is (2, -3) + (1, 0) = (3, -3) with reference frame POC 2, and the backward motion vector is (-2, -1) - (1, 0) = (-3, -1) with reference frame POC 4. When the POC of the current frame is 6, the forward and backward reference frame POCs lie on the same side of the current frame POC: the forward motion vector is (2, -3) + (1, 0) = (3, -3) with reference frame POC 2, and the backward motion vector is (-2, -1) + (1, 0) = (-1, -1) with reference frame POC 4.
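The POC-dependent sign handling in Embodiment 2 can be sketched as follows. This is one simplified reading of the example (the offset is mirrored onto the backward motion vector when the two references straddle the current picture, and copied when they lie on the same side); the function name and data shapes are hypothetical:

```python
def apply_bi_offset(fwd_mv, fwd_poc, bwd_mv, bwd_poc, cur_poc, off):
    """Add the offset to the forward MV; mirror or copy it onto the backward
    MV depending on whether both references lie on the same side of cur_poc."""
    new_fwd = (fwd_mv[0] + off[0], fwd_mv[1] + off[1])
    same_side = (fwd_poc - cur_poc) * (bwd_poc - cur_poc) > 0
    sign = 1 if same_side else -1
    new_bwd = (bwd_mv[0] + sign * off[0], bwd_mv[1] + sign * off[1])
    return new_fwd, new_bwd

# current POC 3: references at POC 2 and 4 straddle it -> mirrored offset
f, b = apply_bi_offset((2, -3), 2, (-2, -1), 4, 3, (1, 0))
# current POC 6: both references precede it -> same-sign offset
f2, b2 = apply_bi_offset((2, -3), 2, (-2, -1), 4, 6, (1, 0))
```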
Embodiment 3:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. The candidate motion information corresponding to first indices 0-5 includes a motion vector and a reference picture. Assume that the candidate motion information indicated by first index 0 is composed of sub-block motion information, while the candidate motion information corresponding to first index 1 is not composed of sub-block motion information and is forward prediction, with motion vector (2, -3) and reference frame POC 2. First index 6 corresponds to new motion information generated based on the candidate motion information corresponding to first index 1 and a preset motion vector offset; the preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1). When the first index value obtained by parsing the bitstream is 6, indicating that the motion information of the current image block is new motion information generated based on the candidate motion information corresponding to first index 1 and a preset motion vector offset, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of the current block is forward prediction, the motion vector is (2, -3) + (0, -1) = (2, -4), and the reference frame POC is 2.
Embodiment 4:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. First index 6 indicates that the current block uses the motion information of a non-adjacent spatial candidate as its reference motion information. Assume that the size of the non-adjacent spatial candidate set is 4, that the available non-adjacent spatial candidates are placed into the set in a preset checking order, and that the non-adjacent spatial candidate motion information in the set is as follows:
Second index 0: candidate 0: forward prediction, motion vector (2, -3), reference frame POC 2.
Second index 1: candidate 1: forward prediction, motion vector (1, -3), reference frame POC 4.
Second index 2: candidate 2: backward prediction, motion vector (2, -4), reference frame POC 2.
Second index 3: candidate 3: bidirectional prediction, forward motion vector (2, -3) with reference frame POC 2, backward motion vector (2, -2) with reference frame POC 4.
When the first index value obtained by decoding is 6, indicating that the current block uses the motion information of a non-adjacent spatial candidate as its reference motion information, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of candidate 1 in the non-adjacent spatial candidate set is used as the motion information of the current block.
Embodiment 5:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. The candidate motion information corresponding to first index 0 is forward prediction, with motion vector (2, -3) and reference frame POC 2. First index 6 indicates that the reference motion information of the current block is either new motion information generated based on the candidate motion information corresponding to first index 0, or the motion information of a non-adjacent spatial candidate. Assume that the size of the non-adjacent spatial candidate set is 4, that the available non-adjacent spatial candidates are placed into the set in a preset checking order, and that the non-adjacent spatial candidate motion information in the set is as follows:
Second index 0: candidate 0: forward prediction, motion vector (-5, -3), reference frame POC 2.
Second index 1: candidate 1: forward prediction, motion vector (1, -3), reference frame POC 4.
Second index 2: candidate 2: backward prediction, motion vector (2, -4), reference frame POC 2.
Second index 3: candidate 3: bidirectional prediction, forward motion vector (2, -3) with reference frame POC 2, backward motion vector (2, -2) with reference frame POC 4.
From the candidate motion information corresponding to first index 0 and the preset motion vector offsets (1, 0), (0, -1), (-1, 0), (0, 1), another four candidates are obtained:
Second index 4: candidate 4: forward prediction, motion vector (2, -3) + (1, 0), reference frame POC 2.
Second index 5: candidate 5: forward prediction, motion vector (2, -3) + (0, -1), reference frame POC 2.
Second index 6: candidate 6: forward prediction, motion vector (2, -3) + (-1, 0), reference frame POC 2.
Second index 7: candidate 7: forward prediction, motion vector (2, -3) + (0, 1), reference frame POC 2.
When the first index value obtained by decoding is 6, indicating that the current block uses, as its reference motion information, either new motion information generated based on the candidate motion information corresponding to first index 0 or the motion information of a non-adjacent spatial candidate, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 0, the motion information of candidate 0 in the non-adjacent spatial candidate set (forward prediction, motion vector (-5, -3), reference frame POC 2) is used as the motion information of the current block. When the second index value obtained by further decoding is 5, motion-vector-offset candidate 5 (forward prediction, motion vector (2, -3) + (0, -1), reference frame POC 2) is used as the motion information of the current block.
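A sketch of the combined second-level set in Embodiment 5, where second indices 0-3 select non-adjacent spatial candidates and 4-7 select offset-derived candidates. Only forward components are modeled (candidate 3's backward part is omitted for brevity) and all names are illustrative:

```python
BASE = {"mv": (2, -3), "ref_poc": 2}             # first-index-0 candidate
NON_ADJACENT = [{"mv": (-5, -3), "ref_poc": 2},  # second indices 0-3
                {"mv": (1, -3),  "ref_poc": 4},
                {"mv": (2, -4),  "ref_poc": 2},
                {"mv": (2, -3),  "ref_poc": 2}]  # fwd part of bi candidate 3
OFFSETS = [(1, 0), (0, -1), (-1, 0), (0, 1)]     # second indices 4-7

def second_level(second_idx):
    if second_idx < len(NON_ADJACENT):
        return NON_ADJACENT[second_idx]          # non-adjacent spatial candidate
    ox, oy = OFFSETS[second_idx - len(NON_ADJACENT)]
    return {"mv": (BASE["mv"][0] + ox, BASE["mv"][1] + oy),
            "ref_poc": BASE["ref_poc"]}

# second index 0 -> mv (-5, -3); second index 5 -> (2, -3) + (0, -1) = (2, -4)
```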
Embodiment 6:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. The candidate motion information corresponding to first index 0 is forward prediction, with motion vector (2, -3) and reference frame POC 2. First index 6 indicates that the motion information of the current block is new motion information generated based on the candidate motion information corresponding to first index 0, according to the preset motion vector offsets:
(1, 0), (0, -1), (-1, 0), (0, 1);
(2, 0), (0, -2), (-2, 0), (0, 2);
A second index value of 0 indicates the candidates with spacing 1, and a value of 1 indicates the candidates with spacing 2; the third index value indicates the candidate index of the motion vector offset. When the first index value obtained by decoding is 6, indicating that the motion information of the current block is new motion information generated based on the candidate motion information corresponding to first index 0, decoding continues to obtain the second and third index values. When the second and third index values obtained by further decoding are 1 and 3 respectively, the offset motion vector with spacing 2 and index 2, i.e. (-2, 0), is selected. The motion information of the current block is then forward prediction, the motion vector is (2, -3) + (-2, 0) = (0, -3), and the reference frame POC is 2.
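The two-index offset selection of Embodiment 6 can be sketched as follows. The 0-based mapping of the third index onto the direction list is an assumption made for illustration (the embodiment's own index convention is not fully spelled out), so the example uses direction index 2 to reach the offset (-2, 0):

```python
DIRECTIONS = [(1, 0), (0, -1), (-1, 0), (0, 1)]

def select_offset(second_idx, third_idx):
    spacing = 1 << second_idx        # second index 0 -> spacing 1, 1 -> spacing 2
    dx, dy = DIRECTIONS[third_idx]   # third index picks the direction (assumed 0-based)
    return (dx * spacing, dy * spacing)

def apply(base_mv, second_idx, third_idx):
    ox, oy = select_offset(second_idx, third_idx)
    return (base_mv[0] + ox, base_mv[1] + oy)

# spacing 2, direction (-1, 0) gives offset (-2, 0): (2, -3) + (-2, 0) = (0, -3)
mv = apply((2, -3), 1, 2)
```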
Embodiment 7:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. First index 6 indicates that the current block uses one of the candidates in the AFFINE-derived motion information candidate set as its reference motion information. Assume that the AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
Second index 0: AFFINE candidate 0;
Second index 1: AFFINE candidate 1;
Second index 2: AFFINE candidate 2;
Second index 3: AFFINE candidate 3;
When the first index value obtained by decoding is 6, indicating that the current block uses one of the candidates in the AFFINE-derived motion information candidate set as its reference motion information, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of AFFINE candidate 1 is used as the motion information of the current block.
Embodiment 8:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. First index 6 indicates that the current block uses one of the candidates in the motion information candidate set obtained from adjacent spatial candidates as its reference motion information. Assume that the adjacent spatial motion information candidate set includes 4 adjacent spatial motion information candidates:
Second index 0: adjacent spatial candidate 0;
Second index 1: adjacent spatial candidate 1;
Second index 2: adjacent spatial candidate 2;
Second index 3: adjacent spatial candidate 3;
When the first index value obtained by decoding is 6, indicating that the current block uses one of the candidates in the motion information candidate set obtained from adjacent spatial candidates as its reference motion information, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of adjacent spatial candidate 1 is used as the motion information of the current block.
Embodiment 9:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. First index 6 indicates that the current block uses one of the candidates in the motion information candidate set obtained from adjacent temporal candidates as its reference motion information. Assume that the adjacent temporal motion information candidate set includes 4 adjacent temporal motion information candidates:
Second index 0: adjacent temporal candidate 0;
Second index 1: adjacent temporal candidate 1;
Second index 2: adjacent temporal candidate 2;
Second index 3: adjacent temporal candidate 3;
When the first index value obtained by decoding is 6, indicating that the current block uses one of the candidates in the motion information candidate set obtained from adjacent temporal candidates as its reference motion information, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of adjacent temporal candidate 1 is used as the motion information of the current block.
Embodiment 10:
Assume that the maximum length of the Merge candidate list is 7, and first indices 0-6 indicate the candidate slots in the Merge list. First index 6 indicates that the current block uses one of the candidates in the motion information candidate set composed of sub-block motion information as its reference motion information. Assume that the motion information candidate set composed of sub-block motion information includes AFFINE, ATMVP, and STMVP candidates:
Second index 0: AFFINE candidate;
Second index 1: ATMVP candidate;
Second index 2: STMVP candidate;
When the first index value obtained by decoding is 6, indicating that the current block uses one of the candidates in the motion information candidate set composed of sub-block motion information as its reference motion information, decoding continues to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of the ATMVP candidate is used as the motion information of the current block.
Embodiment 11:
In the Merge candidate space, slots 0-5 of the list hold motion information obtained by Merge, and slot 6 holds the AFFINE-derived motion information candidate set. Assume that first index 0 indicates that the current block uses motion information obtained by Merge as its reference motion information, and first index 1 indicates that the current block uses one of the candidates in the AFFINE-derived motion information candidate set as its reference motion information. Assume that the AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
Second index 0: AFFINE candidate 0;
Second index 1: AFFINE candidate 1;
Second index 2: AFFINE candidate 2;
Second index 3: AFFINE candidate 3;
In one case, when the first index value obtained by decoding is 1, indicating that the current block uses one of the candidates in the AFFINE-derived motion information candidate set as its reference motion information, decoding continues to obtain a second identifier value. When the second identifier value obtained by further decoding is 1, the motion information of AFFINE candidate 1 is used as the motion information of the current block.
In another case, when the first index value obtained by decoding is 0, indicating that the current block uses motion information obtained by Merge as its reference motion information, decoding continues to obtain a fourth index. When the fourth index value obtained by further decoding is 2, the motion information in slot 2 of the Merge candidate list is used as the motion information of the current block.
实施例12:Example 12:
在Merge候选空间中,该列表中空间0-3为采用Merge得到的运动信息,空间4为采用相邻时域得到的运动信息候选集合,空间5为由子块运动信息所构成的运动信息候选集合,空间6为AFFINE得到的运动信息候选集合。设第一索引0指示当前块采用Merge得到的运动信息为参考运动信息,第一索引1指示了指示当前块采用 AFFINE得到的运动信息候选集合中的其中一个候选为参考运动信息,第一索引01指示了指示当前块采用相邻时域得到的运动信息候选集合中的其中一个候选为参考运动信息;第一索引11指示了当前块采用由子块运动信息所构成的运动信息候选集合中的其中一个候选为参考运动信息。In the Merge candidate space, spaces 0-3 in the list are motion information obtained using Merge, space 4 is a motion information candidate set obtained using adjacent time domains, and space 5 is a motion information candidate set composed of sub-block motion information. Space 6 is the candidate set of motion information obtained by AFFINE. Let the first index 0 indicate that the current block uses the motion information obtained by Merge as the reference motion information, and the first index 1 indicates that one of the candidates in the motion information candidate set obtained by the current block using AFFINE is the reference motion information, and the first index 01 Indicates that one of the candidates of the motion information candidate set obtained by using the adjacent time domain to the current block is reference motion information; the first index 11 indicates that the current block uses one of the motion information candidate sets composed of the sub-block motion information; Candidates are reference motion information.
设AFFINE运动信息候选集合包括4个AFFINE运动信息候选:Suppose the AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
第二标识0:AFFINE候选0;Second identification 0: AFFINE candidate 0;
第二标识1:AFFINE候选1;Second identification 1: AFFINE candidate 1;
第二标识2:AFFINE候选2;Second identifier 2: AFFINE candidate 2;
第二标识3:AFFINE候选3;Second identification 3: AFFINE candidate 3;
设相邻时域运动信息候选集合包括4个相邻时域运动信息候选:Assume that the adjacent temporal motion information candidate set includes four adjacent temporal motion information candidates:
第二索引0:相邻时域候选0;Second index 0: adjacent time domain candidate 0;
第二索引1:相邻时域候选1;Second index 1: adjacent time domain candidate 1;
第二索引2:相邻时域候选2;Second index 2: Adjacent time domain candidate 2;
第二索引3:相邻时域候选3;Second index 3: adjacent time domain candidate 3;
设由子块运动信息所构成的运动信息候选集合包括AFFINE运动信息候选、ATMVP、STMVP候选:It is assumed that the motion information candidate set composed of sub-block motion information includes AFFINE motion information candidates, ATMVP, and STMVP candidates:
第二索引0:AFFINE候选;Second index 0: AFFINE candidate;
第二索引1:ATMVP候选;Second index 1: ATMVP candidate;
第二索引2:STMVP候选;Second index 2: STMVP candidates;
一种情况，当解码得到第一索引值为0时，表明当前块采用Merge得到的运动信息为参考运动信息，则进一步解码以获取第四索引。进一步解码得到的第四索引值为2时，则将Merge候选列表中空间2的运动信息作为当前块的运动信息。In one case, when the first index value obtained by decoding is 0, it indicates that the current block uses the motion information obtained by Merge as the reference motion information, and the bitstream is further decoded to obtain a fourth index. When the fourth index value obtained by further decoding is 2, the motion information of space 2 in the Merge candidate list is used as the motion information of the current block.
一种情况，当解码得到第一索引值为1时，表明当前块采用AFFINE得到的运动信息候选集合中的其中一个候选为参考运动信息，则进一步解码以获取第二标识值。进一步解码得到的第二标识值为1时，则将AFFINE候选1的运动信息作为当前块的运动信息。In one case, when the first index value obtained by decoding is 1, it indicates that one of the candidates in the motion information candidate set obtained by AFFINE is used as the reference motion information of the current block, and the bitstream is further decoded to obtain a second identifier value. When the second identifier value obtained by further decoding is 1, the motion information of AFFINE candidate 1 is used as the motion information of the current block.
一种情况，当解码得到第一索引值为01时，表明当前块采用相邻时域得到的运动信息候选集合中的其中一个候选为参考运动信息，则进一步解码以获取第二标识值。进一步解码得到的第二标识值为2时，则将相邻时域候选2的运动信息作为当前块的运动信息。In one case, when the first index value obtained by decoding is 01, it indicates that one of the candidates in the motion information candidate set obtained from adjacent temporal blocks is used as the reference motion information of the current block, and the bitstream is further decoded to obtain a second identifier value. When the second identifier value obtained by further decoding is 2, the motion information of adjacent temporal candidate 2 is used as the motion information of the current block.
一种情况，当解码得到第一索引值为11时，表明当前块采用由子块运动信息所构成的运动信息候选集合中的其中一个候选为参考运动信息，则进一步解码以获取第二索引值。进一步解码得到的第二索引值为1时，则将ATMVP候选的运动信息作为当前块的运动信息。In one case, when the first index value obtained by decoding is 11, it indicates that the current block uses one of the candidates in the motion information candidate set composed of sub-block motion information as the reference motion information, and the bitstream is further decoded to obtain a second index value. When the second index value obtained by further decoding is 1, the motion information of the ATMVP candidate is used as the motion information of the current block.
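The Embodiment 12 mapping from the first index to a candidate source, and from the second identifier or index to a candidate within that source, can be tabulated as in the following non-authoritative sketch; the candidate names are placeholders, not values from the embodiment.

```python
# Each first-index codeword selects a candidate source; the second identifier
# or second index then selects one candidate within that source.
CANDIDATE_SOURCES = {
    "0":  ["space 0", "space 1", "space 2", "space 3"],            # Merge list
    "1":  ["AFFINE 0", "AFFINE 1", "AFFINE 2", "AFFINE 3"],        # AFFINE set
    "01": ["temporal 0", "temporal 1", "temporal 2", "temporal 3"],  # adjacent temporal set
    "11": ["AFFINE", "ATMVP", "STMVP"],                            # sub-block set
}

def decode_reference(first_index, second_value):
    return CANDIDATE_SOURCES[first_index][second_value]
```

This reproduces the four decoded cases above, e.g. `decode_reference("0", 2)` selects space 2 of the Merge list and `decode_reference("11", 1)` selects the ATMVP candidate.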
本申请实施例提供一种预测运动信息的解码装置,该装置可以为视频解码器,也可以为视频编码器,还可以为解码器。具体的,预测运动信息的解码装置用于执行以上预测运动信息的解码方法中的解码装置所执行的步骤。本申请实施例提供的预测运动信息的解码装置可以包括相应步骤所对应的模块。An embodiment of the present application provides a decoding device for predicting motion information. The device may be a video decoder, a video encoder, or a decoder. Specifically, the decoding apparatus for predicting motion information is configured to perform the steps performed by the decoding apparatus in the decoding method for predicting motion information. The decoding apparatus for predicting motion information provided in the embodiment of the present application may include a module corresponding to a corresponding step.
本申请实施例可以根据上述方法示例对预测运动信息的解码装置进行功能模块的划分，例如，可以对应各个功能划分各个功能模块，也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现，也可以采用软件功能模块的形式实现。本申请实施例中对模块的划分是示意性的，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式。In the embodiments of the present application, the decoding apparatus for predicted motion information may be divided into functional modules according to the foregoing method examples. For example, each functional module may be obtained through division corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware, or in the form of a software functional module. The division of the modules in the embodiments of the present application is illustrative and is merely logical function division; there may be other division manners in actual implementation.
在采用对应各个功能划分各个功能模块的情况下，图17示出上述实施例中所涉及的预测运动信息的解码装置的一种可能的结构示意图。如图17所示，预测运动信息的解码装置1700可以包括解析模块1701、确定模块1702、赋值模块1703。具体的，各模块功能如下：When each functional module is obtained through division corresponding to each function, FIG. 17 is a schematic diagram of a possible structure of the decoding apparatus for predicted motion information involved in the foregoing embodiments. As shown in FIG. 17, the decoding apparatus 1700 for predicted motion information may include a parsing module 1701, a determination module 1702, and an assignment module 1703. Specifically, the functions of the modules are as follows:
解析模块1701，用于解析码流以获得第一标识。The parsing module 1701 is configured to parse a code stream to obtain a first identifier.
确定模块1702，用于根据第一标识，从第一候选集合中确定目标元素，第一候选集合中的元素包括至少一个第一候选运动信息和至少一个第二候选集合，第二候选集合中的元素包括多个第二候选运动信息，或者，第一候选运动信息包括第一运动信息，第二候选运动信息包括预设的运动信息偏移量。The determination module 1702 is configured to determine a target element from a first candidate set according to the first identifier, where the elements in the first candidate set include at least one first candidate motion information and at least one second candidate set, and the elements in the second candidate set include a plurality of second candidate motion information; or, the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset.
赋值模块1703,当目标元素为第一候选运动信息时,用于将第一候选运动信息作为目标运动信息,目标运动信息用来预测待处理图像块的运动信息。The assignment module 1703 is configured to use the first candidate motion information as the target motion information when the target element is the first candidate motion information, and the target motion information is used to predict the motion information of the image block to be processed.
解析模块1701，当目标元素为第二候选集合时，还用于解析码流以获得第二标识，根据第二标识，确定模块1702还用于从多个第二候选运动信息中确定目标运动信息。或者，解析模块1701用于当目标元素根据多个第二候选运动信息获得时，解析码流以获得第二标识，根据第二标识，基于多个第二候选运动信息中的一个，确定目标运动信息。When the target element is the second candidate set, the parsing module 1701 is further configured to parse the code stream to obtain a second identifier, and the determination module 1702 is further configured to determine the target motion information from the plurality of second candidate motion information according to the second identifier. Alternatively, the parsing module 1701 is configured to: when the target element is obtained according to the plurality of second candidate motion information, parse the code stream to obtain a second identifier, and the target motion information is determined, according to the second identifier, based on one of the plurality of second candidate motion information.
其中，解析模块1701用于支持该预测运动信息的解码装置1700执行上述实施例中的S1501及S1505等，和/或用于本文所描述的技术的其它过程。确定模块1702用于支持该预测运动信息的解码装置1700执行上述实施例中的S1502等，和/或用于本文所描述的技术的其它过程。赋值模块1703用于支持该预测运动信息的解码装置1700执行上述实施例中的S1502等，和/或用于本文所描述的技术的其它过程。The parsing module 1701 is configured to support the decoding apparatus 1700 for predicted motion information in performing S1501, S1505, and the like in the foregoing embodiments, and/or other processes of the techniques described herein. The determination module 1702 is configured to support the decoding apparatus 1700 for predicted motion information in performing S1502 and the like in the foregoing embodiments, and/or other processes of the techniques described herein. The assignment module 1703 is configured to support the decoding apparatus 1700 for predicted motion information in performing S1502 and the like in the foregoing embodiments, and/or other processes of the techniques described herein.
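A minimal sketch of how the module split described above could look in code; the class and method names are invented for illustration, the entropy decoding is mocked as plain values, and the candidates are placeholder objects.

```python
class PredictedMotionInfoDecoder:
    """Mirrors the parsing (1701), determination (1702), and assignment
    (1703) modules; candidates are plain placeholder values here."""

    def __init__(self, first_candidates, second_candidates):
        self.first_candidates = first_candidates
        self.second_candidates = second_candidates

    def decode(self, first_id, second_id=None):
        # Determination module: the first identifier selects the target element.
        if first_id < len(self.first_candidates):
            # Assignment module: a first candidate is used directly as the
            # target motion information.
            return self.first_candidates[first_id]
        # Otherwise the parsing module supplies a second identifier, which
        # selects among the second candidate motion information.
        return self.second_candidates[second_id]
```

For example, with one first candidate and two second candidates, `decode(0)` returns the first candidate directly, while `decode(1, 1)` falls through to the second candidate set.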
在一种可行的实施方式中，当确定模块1702确定的目标元素为第二候选集合时，或者，当目标元素根据所述多个第二候选运动信息获得时，解析模块1701还用于：解析码流以获得第三标识，第三标识包括预设系数。In a feasible implementation, when the target element determined by the determination module 1702 is the second candidate set, or when the target element is obtained according to the plurality of second candidate motion information, the parsing module 1701 is further configured to parse the code stream to obtain a third identifier, where the third identifier includes a preset coefficient.
进一步的，如图17所示，预测运动信息的解码装置1700还可以包括计算模块1704，用于将所述多个预设的运动信息偏移量和所述预设系数相乘，以得到多个调整后的运动信息偏移量。对应的，确定模块1702，具体用于根据所述第二标识从所述多个调整后的运动信息偏移量中确定所述目标偏移量。Further, as shown in FIG. 17, the decoding apparatus 1700 for predicted motion information may further include a calculation module 1704, configured to multiply the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets. Correspondingly, the determination module 1702 is specifically configured to determine the target offset from the plurality of adjusted motion information offsets according to the second identifier.
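The scaling step performed by the calculation module 1704 can be illustrated as follows. The offset values and the coefficient here are assumptions for the sketch, not values from the embodiments, and the motion information is reduced to a single motion vector.

```python
# The preset offsets are multiplied by the preset coefficient carried in the
# third identifier; the second identifier then selects the target offset,
# which is added to the first motion information (a motion vector here).
PRESET_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # assumed (dx, dy) values

def derive_target_mv(base_mv, second_id, coeff):
    adjusted = [(dx * coeff, dy * coeff) for dx, dy in PRESET_OFFSETS]
    dx, dy = adjusted[second_id]
    return (base_mv[0] + dx, base_mv[1] + dy)
```

For instance, with base vector (4, -2), second identifier 2 and coefficient 2, the adjusted offset is (0, 2) and the target motion vector is (4, 0).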
其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。Wherein, all relevant content of each step involved in the above method embodiment can be referred to the functional description of the corresponding functional module, which will not be repeated here.
虽然关于视频编码器100及视频解码器200已描述本申请的特定方面，但应理解，本申请的技术可通过许多其它视频编码和/或解码单元、处理器、处理单元、例如编码器/解码器（CODEC）的基于硬件的编码单元及类似者来应用。此外，应理解，仅作为可行的实施方式而提供关于图17所展示及描述的步骤。即，图17的可行的实施方式中所展示的步骤无需必定按图17中所展示的次序执行，且可执行更少、额外或替代步骤。Although specific aspects of the present application have been described with respect to the video encoder 100 and the video decoder 200, it should be understood that the techniques of the present application may be applied by many other video encoding and/or decoding units, processors, processing units, hardware-based coding units such as encoders/decoders (CODECs), and the like. Furthermore, it should be understood that the steps shown and described with respect to FIG. 17 are provided only as a feasible implementation. That is, the steps shown in the feasible implementation of FIG. 17 need not necessarily be performed in the order shown in FIG. 17, and fewer, additional, or alternative steps may be performed.
在采用集成的单元的情况下,图18为本申请实施例中的预测运动信息的解码设备1800的一种示意性结构框图。具体的,预测运动信息的解码装置1800包括:处理器1801和耦合于所述处理器的存储器1802;所述处理器1801用于执行图17所示的实施例以及各种可行的实施方式。In the case of using an integrated unit, FIG. 18 is a schematic structural block diagram of a decoding device 1800 for predicting motion information in an embodiment of the present application. Specifically, the decoding device 1800 for predicting motion information includes: a processor 1801 and a memory 1802 coupled to the processor; the processor 1801 is configured to execute the embodiment shown in FIG. 17 and various feasible implementations.
其中，处理模块1801可以是处理器或控制器，例如可以是中央处理器（Central Processing Unit，CPU），通用处理器，数字信号处理器（Digital Signal Processor，DSP），ASIC，FPGA或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框，模块和电路。所述处理器也可以是实现计算功能的组合，例如包含一个或多个微处理器组合，DSP和微处理器的组合等等。存储模块1802可以是存储器。The processing module 1801 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules, and circuits described with reference to the disclosure of the present application. Alternatively, the processor may be a combination implementing a computing function, for example, a combination including one or more microprocessors, or a combination of a DSP and a microprocessor. The storage module 1802 may be a memory.
其中,上述方法实施例涉及的各场景的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。Wherein, all relevant content of each scenario involved in the foregoing method embodiment can be referred to the functional description of the corresponding functional module, which will not be repeated here.
上述预测运动信息的解码装置1700和预测运动信息的解码装置1800均可执行上述图15所示的预测运动信息的解码方法，预测运动信息的解码装置1700和预测运动信息的解码装置1800具体可以是视频解码装置或者其他具有视频编解码功能的设备。预测运动信息的解码装置1700和预测运动信息的解码装置1800可以用于在解码过程中进行图像预测。Both the decoding apparatus 1700 for predicted motion information and the decoding device 1800 for predicted motion information may perform the decoding method for predicted motion information shown in FIG. 15 above. The decoding apparatus 1700 and the decoding device 1800 may specifically be video decoding apparatuses or other devices having video encoding and decoding functions, and may be used to perform image prediction in a decoding process.
本申请实施例提供一种帧间预测装置,该帧间预测装置可以为视频解码器,也可以为视频编码器,还可以为解码器。具体的,帧间预测装置用于执行以上帧间预测方法中的帧间预测装置所执行的步骤。本申请实施例提供的帧间预测装置可以包括相应步骤所对应的模块。An embodiment of the present application provides an inter prediction device. The inter prediction device may be a video decoder, a video encoder, or a decoder. Specifically, the inter prediction apparatus is configured to perform the steps performed by the inter prediction apparatus in the above inter prediction method. The inter prediction apparatus provided in the embodiment of the present application may include a module corresponding to a corresponding step.
本申请实施例可以根据上述方法示例对帧间预测装置进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。In the embodiment of the present application, functional modules of the inter prediction device may be divided according to the foregoing method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The above integrated modules may be implemented in the form of hardware or software functional modules. The division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
本申请还提供一种终端，该终端包括：一个或多个处理器、存储器、通信接口。该存储器、通信接口与一个或多个处理器耦合；存储器用于存储计算机程序代码，计算机程序代码包括指令，当一个或多个处理器执行指令时，终端执行本申请实施例的预测运动信息的解码方法。The present application further provides a terminal, including one or more processors, a memory, and a communication interface. The memory and the communication interface are coupled to the one or more processors; the memory is configured to store computer program code, the computer program code includes instructions, and when the one or more processors execute the instructions, the terminal performs the decoding method for predicted motion information in the embodiments of the present application.
这里的终端可以是视频显示设备,智能手机,便携式电脑以及其它可以处理视频或者播放视频的设备。The terminal here can be a video display device, a smart phone, a portable computer, and other devices that can process or play videos.
本申请还提供一种视频解码器，包括非易失性存储介质，以及中央处理器，所述非易失性存储介质存储有可执行程序，所述中央处理器与所述非易失性存储介质连接，并执行所述可执行程序以实现本申请实施例的预测运动信息的解码方法。The present application further provides a video decoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the decoding method for predicted motion information in the embodiments of the present application.
本申请还提供一种解码器,所述解码器包括本申请实施例中的预测运动信息的解码装置。The present application further provides a decoder, which includes a decoding apparatus for predicting motion information in the embodiment of the present application.
本申请另一实施例还提供一种计算机可读存储介质，该计算机可读存储介质包括一个或多个程序代码，该一个或多个程序包括指令，当终端中的处理器在执行该程序代码时，该终端执行如图15所示的预测运动信息的解码方法。Another embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium includes one or more pieces of program code, the one or more programs include instructions, and when a processor in a terminal executes the program code, the terminal performs the decoding method for predicted motion information shown in FIG. 15.
在本申请的另一实施例中，还提供一种计算机程序产品，该计算机程序产品包括计算机执行指令，该计算机执行指令存储在计算机可读存储介质中；终端的至少一个处理器可以从计算机可读存储介质读取该计算机执行指令，至少一个处理器执行该计算机执行指令使得终端实施执行如图15所示的预测运动信息的解码方法。In another embodiment of the present application, a computer program product is further provided. The computer program product includes computer-executable instructions stored in a computer-readable storage medium; at least one processor of the terminal may read the computer-executable instructions from the computer-readable storage medium, and the at least one processor executes the computer-executable instructions to cause the terminal to perform the decoding method for predicted motion information shown in FIG. 15.
在上述实施例中，可以全部或部分地通过软件，硬件，固件或者其任意组合来实现。当使用软件程序实现时，可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时，全部或部分地产生按照本申请实施例所述的流程或功能。All or some of the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When implemented by a software program, the embodiments may be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are generated entirely or partially.
所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中，或者从一个计算机可读存储介质向另一个计算机可读存储介质传输，例如，所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线（例如同轴电缆、光纤、数字用户线（DSL））或无线（例如红外、无线、微波等）方式向另一个网站站点、计算机、服务器或数据中心传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质（例如软盘、硬盘、磁带）、光介质（例如DVD）或者半导体介质（例如固态硬盘（Solid State Disk，SSD））等。The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid-state disk (SSD)), or the like.
此外，应理解，取决于可行的实施方式，本文中所描述的方法中的任一者的特定动作或事件可按不同序列执行，可经添加、合并或一起省去（例如，并非所有所描述的动作或事件为实践方法所必要的）。此外，在特定可行的实施方式中，动作或事件可（例如）经由多线程处理、中断处理或多个处理器来同时而非顺序地执行。另外，虽然出于清楚的目的将本申请的特定方面描述为通过单一模块或单元执行，但应理解，本申请的技术可通过与视频解码器相关联的单元或模块的组合执行。In addition, it should be understood that, depending on the feasible implementation, particular actions or events of any of the methods described herein may be performed in a different sequence, and may be added, merged, or omitted altogether (for example, not all described actions or events are necessary to practice the methods). Furthermore, in certain feasible implementations, actions or events may be performed simultaneously rather than sequentially, for example, through multi-threaded processing, interrupt processing, or multiple processors. In addition, although specific aspects of the present application are described, for clarity, as being performed by a single module or unit, it should be understood that the techniques of the present application may be performed by a combination of units or modules associated with a video decoder.
在一个或多个可行的实施方式中，所描述的功能可以硬件、软件、固件或其任何组合来实施。如果以软件来实施，那么功能可作为一个或多个指令或代码而存储于计算机可读媒体上或经由计算机可读媒体来传输，且通过基于硬件的处理单元来执行。计算机可读媒体可包含计算机可读存储媒体或通信媒体，计算机可读存储媒体对应于例如数据存储媒体的有形媒体，通信媒体包含促进计算机程序（例如）根据通信协议从一处传送到另一处的任何媒体。In one or more feasible implementations, the described functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code, and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any media that facilitate transfer of a computer program from one place to another, for example, according to a communication protocol.
以这个方式,计算机可读媒体示例性地可对应于(1)非暂时性的有形计算机可读存储媒体,或(2)例如信号或载波的通信媒体。数据存储媒体可为可由一个或多个计算机或一个或多个处理器存取以检索用于实施本申请中所描述的技术的指令、代码和/或数据结构的任何可用媒体。计算机程序产品可包含计算机可读媒体。In this manner, computer-readable media may illustratively correspond to (1) non-transitory, tangible computer-readable storage media, or (2) a communication medium such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and / or data structures used to implement the techniques described in this application. The computer program product may include a computer-readable medium.
作为可行的实施方式而非限制，此计算机可读存储媒体可包括RAM、ROM、EEPROM、CD-ROM或其它光盘存储装置、磁盘存储装置或其它磁性存储装置、快闪存储器或可用于存储呈指令或数据结构的形式的所要代码且可由计算机存取的任何其它媒体。同样，任何连接可适当地称作计算机可读媒体。例如，如果使用同轴缆线、光纤缆线、双绞线、数字订户线（DSL），或例如红外线、无线电及微波的无线技术而从网站、服务器或其它远端源传输指令，那么同轴缆线、光纤缆线、双绞线、DSL，或例如红外线、无线电及微波的无线技术包含于媒体的定义中。As a feasible implementation and not by way of limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
然而，应理解，计算机可读存储媒体及数据存储媒体不包含连接、载波、信号或其它暂时性媒体，而替代地针对非暂时性有形存储媒体。如本文中所使用，磁盘及光盘包含紧密光盘（CD）、雷射光盘、光盘、数字多功能光盘（DVD）、软性磁盘及蓝光光盘，其中磁盘通常以磁性方式再现数据，而光盘通过雷射以光学方式再现数据。以上各物的组合也应包含于计算机可读媒体的范围内。It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. As used herein, disks and discs include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks, and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
可通过例如一个或多个数字信号处理器（DSP）、通用微处理器、专用集成电路（ASIC）、现场可编程门阵列（FPGA）或其它等效集成或离散逻辑电路的一个或多个处理器来执行指令。因此，如本文中所使用，术语“处理器”可指前述结构或适于实施本文中所描述的技术的任何其它结构中的任一者。另外，在一些方面中，可将本文所描述的功能性提供于经配置以用于编码及解码的专用硬件和/或软件模块内，或并入于组合式编码解码器中。同样，技术可完全实施于一个或多个电路或逻辑元件中。The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits. Accordingly, as used herein, the term "processor" may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
本申请的技术可实施于广泛多种装置或设备中，包含无线手机、集成电路（IC）或IC的集合（例如，芯片组）。本申请中描述各种组件、模块或单元以强调经配置以执行所揭示的技术的装置的功能方面，但未必需要通过不同硬件单元实现。更确切来说，如前文所描述，各种单元可组合于编码解码器硬件单元中或由互操作的硬件单元（包含如前文所描述的一个或多个处理器）结合合适软件和/或固件的集合来提供。The techniques of the present application may be implemented in a wide variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or sets of ICs (for example, chipsets). Various components, modules, or units are described in this application to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.
以上所述，仅为本申请示例性的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到的变化或替换，都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应该以权利要求的保护范围为准。The foregoing descriptions are merely exemplary specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. 一种预测运动信息的解码方法,其特征在于,包括:A decoding method for predicting motion information, comprising:
    解析码流以获得第一标识;Parse the code stream to obtain a first identifier;
    根据所述第一标识,从第一候选集合中确定目标元素,所述第一候选集合中的元素包括至少一个第一候选运动信息和多个第二候选运动信息,所述第一候选运动信息包括第一运动信息,所述第二候选运动信息包括预设的运动信息偏移量;Determining a target element from a first candidate set according to the first identifier, the elements in the first candidate set including at least one first candidate motion information and a plurality of second candidate motion information, the first candidate motion information Including first motion information, and the second candidate motion information includes a preset motion information offset;
    当所述目标元素为所述第一候选运动信息时,将所述第一候选运动信息作为目标运动信息,所述目标运动信息用来预测待处理图像块的运动信息;When the target element is the first candidate motion information, use the first candidate motion information as target motion information, and the target motion information is used to predict motion information of an image block to be processed;
    当所述目标元素根据所述多个第二候选运动信息获得时，解析所述码流以获得第二标识，根据所述第二标识，基于所述多个第二候选运动信息中的一个，确定所述目标运动信息。When the target element is obtained according to the plurality of second candidate motion information, parsing the code stream to obtain a second identifier, and determining, according to the second identifier, the target motion information based on one of the plurality of second candidate motion information.
  2. 根据权利要求1所述的方法,其特征在于,所述第一候选运动信息包括所述待处理图像块的空域相邻图像块的运动信息。The method according to claim 1, wherein the first candidate motion information includes motion information of a spatially adjacent image block of the image block to be processed.
  3. 根据权利要求1或2所述的方法,其特征在于,所述第二候选运动信息基于所述第一运动信息和预设的运动信息偏移量获得。The method according to claim 1 or 2, wherein the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
  4. 根据权利要求1或2所述的方法，其特征在于，所述根据所述第二标识，基于所述多个第二候选运动信息中的一个，确定所述目标运动信息，包括：The method according to claim 1 or 2, wherein the determining, according to the second identifier, the target motion information based on one of the plurality of second candidate motion information comprises:
    根据所述第二标识从多个预设的运动信息偏移量中确定目标偏移量;Determining a target offset from a plurality of preset motion information offsets according to the second identifier;
    基于所述第一运动信息和所述目标偏移量确定所述目标运动信息。The target motion information is determined based on the first motion information and the target offset.
  5. 根据权利要求1至4任一项所述的方法,其特征在于,在所述至少一个第一候选运动信息中,用于标识所述第一运动信息的编码码字最短。The method according to any one of claims 1 to 4, characterized in that, among the at least one first candidate motion information, an encoding codeword for identifying the first motion information is shortest.
  6. 根据权利要求1至5任一项所述的方法,其特征在于,当所述目标元素根据所述多个第二候选运动信息获得时,所述方法还包括:The method according to any one of claims 1 to 5, wherein when the target element is obtained according to the plurality of second candidate motion information, the method further comprises:
    解析所述码流以获得第三标识,所述第三标识包括预设系数。Parse the code stream to obtain a third identifier, where the third identifier includes a preset coefficient.
  7. 根据权利要求6所述的方法,其特征在于,在根据所述第二标识,基于所述多个第二候选运动信息中的一个,确定所述目标运动信息之前,所述方法还包括:The method according to claim 6, wherein before the determining the target motion information based on the second identifier based on one of the plurality of second candidate motion information, the method further comprises:
    将多个预设的运动信息偏移量和预设系数相乘,以得到多个调整后的运动信息偏移量。Multiply a plurality of preset motion information offsets by a preset coefficient to obtain a plurality of adjusted motion information offsets.
  8. 根据权利要求1至7任一项所述的方法,其特征在于,所述目标运动信息用来预测待处理图像块的运动信息,包括:The method according to any one of claims 1 to 7, wherein the target motion information is used to predict motion information of an image block to be processed, and includes:
    将所述目标运动信息作为所述待处理图像块的运动信息;或者,将所述目标运动信息作为所述待处理图像块的预测运动信息。Use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
  9. 根据权利要求1至8任一项所述的方法,其特征在于,所述第二标识采用定长编码方式。The method according to any one of claims 1 to 8, wherein the second identifier adopts a fixed-length encoding method.
  10. 根据权利要求1至8任一项所述的方法,其特征在于,所述第二标识采用变长编码方式。The method according to any one of claims 1 to 8, wherein the second identifier adopts a variable length coding method.
  11. 一种预测运动信息的解码装置,其特征在于,包括:A decoding device for predicting motion information, comprising:
    解析模块,用于解析码流以获得第一标识;A parsing module, configured to parse a code stream to obtain a first identifier;
    确定模块，用于根据所述第一标识，从第一候选集合中确定目标元素，所述第一候选集合中的元素包括至少一个第一候选运动信息和多个第二候选运动信息，所述第一候选运动信息包括第一运动信息，所述第二候选运动信息包括预设的运动信息偏移量；A determining module, configured to determine a target element from a first candidate set according to the first identifier, where the elements in the first candidate set include at least one first candidate motion information and a plurality of second candidate motion information, the first candidate motion information includes first motion information, and the second candidate motion information includes a preset motion information offset;
    赋值模块,用于当所述目标元素为所述第一候选运动信息时,将所述第一候选运动信息作为目标运动信息,所述目标运动信息用来预测待处理图像块的运动信息;An assignment module, configured to use the first candidate motion information as target motion information when the target element is the first candidate motion information, and the target motion information is used to predict motion information of an image block to be processed;
    所述解析模块还用于，当所述目标元素根据所述多个第二候选运动信息获得时，解析所述码流以获得第二标识，根据所述第二标识，基于所述多个第二候选运动信息中的一个，确定所述目标运动信息。The parsing module is further configured to: when the target element is obtained according to the plurality of second candidate motion information, parse the code stream to obtain a second identifier, and determine, according to the second identifier, the target motion information based on one of the plurality of second candidate motion information.
  12. The apparatus according to claim 11, wherein the first candidate motion information includes motion information of a spatially adjacent image block of the image block to be processed.
  13. The apparatus according to claim 11 or 12, wherein the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
  14. The apparatus according to claim 11 or 12, wherein the parsing module is specifically configured to:
    determine a target offset from a plurality of preset motion information offsets according to the second identifier; and
    determine the target motion information based on the first motion information and the target offset.
  15. The apparatus according to any one of claims 11 to 14, wherein, among the at least one piece of first candidate motion information, the coded codeword identifying the first motion information is the shortest.
  16. The apparatus according to any one of claims 11 to 15, wherein, when the target element is obtained based on the plurality of pieces of second candidate motion information, the parsing module is further configured to:
    parse the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
  17. The apparatus according to claim 16, wherein the apparatus further comprises:
    a calculation module, configured to multiply a plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
  18. The apparatus according to any one of claims 11 to 17, wherein the determining module is specifically configured to:
    use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
  19. The apparatus according to any one of claims 11 to 18, wherein the second identifier is encoded using fixed-length coding.
  20. The apparatus according to any one of claims 11 to 18, wherein the second identifier is encoded using variable-length coding.
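The selection flow recited in claims 11 to 17 — choose a target element by the first identifier; if it is offset-based, parse a second identifier to pick a preset offset, optionally scale all offsets by a coefficient carried in a third identifier, and add the chosen offset to the first motion information — can be sketched as below. This is an illustrative reconstruction only, not the patented implementation: the offset table values, the `bitstream` reader interface, and all function names are assumptions, and the actual entropy coding of the identifiers is defined by the claims and description, not by this sketch.

```python
# Illustrative sketch of the candidate-selection flow of claims 11-17.
# PRESET_OFFSETS plays the role of the "plurality of pieces of second
# candidate motion information"; the (dx, dy) values are hypothetical.
PRESET_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def decode_target_motion_info(bitstream, first_candidates):
    """Return the target motion information as an (mvx, mvy) pair.

    first_candidates: list of (mvx, mvy) first candidate motion information,
    e.g. from spatially adjacent blocks (claim 12); index 0 is the "first
    motion information", which per claim 15 gets the shortest codeword.
    """
    first_id = bitstream.read_first_identifier()
    if first_id < len(first_candidates):
        # Target element is a first candidate: use it directly as the
        # target motion information (claim 11).
        return first_candidates[first_id]
    # Target element is obtained from the second candidate motion
    # information: parse the second identifier and select a target offset
    # from the preset offsets (claim 14).
    second_id = bitstream.read_second_identifier()
    dx, dy = PRESET_OFFSETS[second_id]
    # Third identifier carries a preset coefficient that scales the
    # preset offsets (claims 16-17).
    coeff = bitstream.read_third_identifier()
    base_x, base_y = first_candidates[0]  # the first motion information
    return (base_x + dx * coeff, base_y + dy * coeff)
```

In this reading, the second identifier only needs to index a small fixed offset table, which is why claims 9/10 and 19/20 can commit it to either fixed-length or variable-length coding independently of the candidate list size.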
PCT/CN2019/105711 2018-09-13 2019-09-12 Decoding method and device for predicted motion information WO2020052653A1 (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
EP19860217.9A EP3843404A4 (en) 2018-09-13 2019-09-12 Decoding method and device for predicted motion information
SG11202102362UA SG11202102362UA (en) 2018-09-13 2019-09-12 Decoding method and decoding apparatus for predicting motion information
BR112021004429-9A BR112021004429A2 (en) 2018-09-13 2019-09-12 decoding method and decoding apparatus for predicting motion information
KR1020247028818A KR20240135033A (en) 2018-09-13 2019-09-12 Decoding method and decoding apparatus for predicting motion information
KR1020217010321A KR102701208B1 (en) 2018-09-13 2019-09-12 Decoding method and decoding device for predicting motion information
JP2021513418A JP7294576B2 (en) 2018-09-13 2019-09-12 Decoding method and decoding device for predicting motion information
CA3112289A CA3112289A1 (en) 2018-09-13 2019-09-12 Decoding method and decoding apparatus for predicting motion information
US17/198,544 US20210203944A1 (en) 2018-09-13 2021-03-11 Decoding method and decoding apparatus for predicting motion information
ZA2021/01890A ZA202101890B (en) 2018-09-13 2021-03-19 Decoding method and decoding apparatus for predicting motion information

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201811068957.4 2018-09-13
CN201811068957 2018-09-13
CN201811264674.7 2018-10-26
CN201811264674.7A CN110896485B (en) 2018-09-13 2018-10-26 Decoding method and device for predicting motion information

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/198,544 Continuation US20210203944A1 (en) 2018-09-13 2021-03-11 Decoding method and decoding apparatus for predicting motion information

Publications (1)

Publication Number Publication Date
WO2020052653A1 true WO2020052653A1 (en) 2020-03-19

Family

ID=69778170

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/105711 WO2020052653A1 (en) 2018-09-13 2019-09-12 Decoding method and device for predicted motion information

Country Status (1)

Country Link
WO (1) WO2020052653A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103907346A (en) * 2011-10-11 2014-07-02 联发科技股份有限公司 Method and apparatus of motion and disparity vector derivation for 3D video coding and HEVC
CN104126302A (en) * 2011-11-07 2014-10-29 高通股份有限公司 Generating additional merge candidates
CN105308967A (en) * 2013-04-05 2016-02-03 三星电子株式会社 Video stream coding method according to prediction structure for multi-view video and device therefor, and video stream decoding method according to prediction structure for multi-view video and device therefor
EP3062518A1 (en) * 2013-10-24 2016-08-31 Electronics and Telecommunications Research Institute Video encoding/decoding method and apparatus


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
See also references of EP3843404A4 *
XU CHEN, NA ZHANG, JIANHUA ZHENG: "CE4: Enhanced Merge Mode (Test 4.2.15)", Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 11th Meeting, no. JVET-K0198, 18 July 2018 (2018-07-18), Ljubljana, SI, pages 1-8, XP030199235 *

Similar Documents

Publication Publication Date Title
US10609423B2 (en) Tree-type coding for video coding
CN109996081B (en) Image prediction method, device and coder-decoder
TWI544783B (en) Method,device and computer-readable storage medium of coding video data
KR20210024165A (en) Inter prediction method and apparatus
US20130272409A1 (en) Bandwidth reduction in video coding through applying the same reference index
US20130272380A1 (en) Grouping bypass coded syntax elements in video coding
US11172212B2 (en) Decoder-side refinement tool on/off control
KR102542196B1 (en) Video coding method and apparatus
US20210185325A1 (en) Motion vector obtaining method and apparatus, computer device, and storage medium
CN111200735B (en) Inter-frame prediction method and device
US20210203944A1 (en) Decoding method and decoding apparatus for predicting motion information
JP6224851B2 (en) System and method for low complexity coding and background detection
US11394996B2 (en) Video coding method and apparatus
JP7331105B2 (en) INTER-FRAME PREDICTION METHOD AND RELATED APPARATUS
WO2019000443A1 (en) Inter-frame prediction method and device
WO2020042758A1 (en) Interframe prediction method and device
WO2020052653A1 (en) Decoding method and device for predicted motion information
CN110855993A (en) Method and device for predicting motion information of image block
WO2020024275A1 (en) Inter-frame prediction method and device
WO2020038232A1 (en) Method and apparatus for predicting movement information of image block

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19860217

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 3112289

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2021513418

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112021004429

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 2019860217

Country of ref document: EP

Effective date: 20210323

ENP Entry into the national phase

Ref document number: 20217010321

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 112021004429

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20210309