WO2020052653A1 - Decoding method and device for predicted motion information
- Publication number
- WO2020052653A1 (PCT/CN2019/105711, application CN2019105711W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- motion information
- candidate
- motion vector
- identifier
- list
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
Definitions
- the present application relates to the technical field of video encoding and decoding, and in particular, to a decoding method and device for predicted motion information.
- Digital video technology is widely used in various devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), notebook computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, video streaming devices, and the like.
- Digital video devices implement video coding techniques to send, receive, encode, decode, and/or store digital video information more efficiently.
- Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove the redundant information inherent in a video sequence.
- the basic principle of video compression is to remove redundancy as much as possible by exploiting correlation in the spatial domain, in the temporal domain, and between codewords.
- the currently popular approach uses a block-based hybrid video coding framework, which achieves video compression through prediction (including intra prediction and inter prediction), transformation, quantization, and entropy coding.
- inter prediction exploits the temporal correlation of video: the pixels of the current picture are predicted from the pixels of adjacent coded pictures, which effectively removes temporal redundancy.
- the predicted motion information of each image block is determined from the candidate motion information list, thereby generating its prediction block through a motion compensation process.
- the motion information includes reference image information and motion vectors.
- the reference picture information includes unidirectional/bidirectional prediction information, a reference picture list, and a reference picture index corresponding to the reference picture list.
- a motion vector indicates the horizontal and vertical positional displacement.
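- as a rough illustration (all field names below are hypothetical, not taken from the patent), the motion information described above can be modeled as a small structure:

```python
from dataclasses import dataclass

@dataclass
class MotionVector:
    dx: int  # horizontal displacement, e.g. in quarter-pel units
    dy: int  # vertical displacement

@dataclass
class MotionInfo:
    bidirectional: bool  # unidirectional/bidirectional prediction information
    ref_list: int        # reference picture list (0 or 1)
    ref_idx: int         # reference picture index within that list
    mv: MotionVector     # the motion vector itself
```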
- the embodiments of the present application provide a decoding method and device for predicting motion information, which can effectively control the length of the candidate motion information list when more candidate motion information is introduced.
- a first aspect of the embodiments of the present application provides a method for decoding predicted motion information, including: parsing a code stream to obtain a first identifier; and determining a target element from a first candidate set according to the first identifier.
- the elements in the first candidate set include at least one piece of first candidate motion information and a plurality of pieces of second candidate motion information, where the first candidate motion information includes first motion information, and the second candidate motion information includes preset motion information offsets. When the target element is the first candidate motion information, the first candidate motion information serving as the target element is used as the target motion information, and the target motion information is used to predict the motion information of the image block to be processed. When the target element is obtained based on the plurality of second candidate motion information, the code stream is parsed to obtain a second identifier, and the target motion information is determined from one of the plurality of second candidate motion information according to the second identifier.
- the elements in the first candidate set include the first candidate motion information and a plurality of second candidate motion information.
- with the structure of the multi-layer candidate set, when more candidates are introduced, a set of candidate motion information can be added as a single element to the first candidate set.
- the length of the first candidate set is greatly shortened.
- when the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
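- a minimal sketch of the parsing flow of this first aspect (the identifier parsing, candidate layout, and offset values are illustrative assumptions, not the patent's actual syntax):

```python
def decode_target_motion_info(read_identifier, first_candidate_set, first_mv):
    # The first identifier selects the target element from the first candidate set.
    kind, payload = first_candidate_set[read_identifier()]

    if kind == "first_candidate":
        # The target element is first candidate motion information:
        # it is used directly as the target motion information.
        return payload

    # Otherwise the target element is based on the preset motion information
    # offsets: a second identifier selects the target offset, which is then
    # applied to the first motion information.
    dx, dy = payload[read_identifier()]
    return (first_mv[0] + dx, first_mv[1] + dy)

# toy demo with pre-parsed identifiers [1, 2]: offset set, then third offset
ids = iter([1, 2])
candidates = [("first_candidate", (3, -1)),
              ("offset_set", [(1, 0), (-1, 0), (0, 1), (0, -1)])]
print(decode_target_motion_info(lambda: next(ids), candidates, (3, -1)))
# -> (3, 0): the first motion vector (3, -1) plus the offset (0, 1)
```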
- the first identifier may be a category identifier, which is used to indicate a category to which the target element belongs.
- the method for decoding predicted motion information provided in the embodiment of the present application may further include: parsing the bitstream to obtain a fourth identifier, where the fourth identifier is the index of the target element within the category indicated by the first identifier in the first candidate set.
- the target element is uniquely determined by combining the fourth identifier with the first identifier.
- the first candidate motion information includes motion information of spatially adjacent image blocks of the image block to be processed.
- the first candidate motion information may be candidate motion information generated by a Merge mode.
- the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
- determining the target motion information from one of the plurality of second candidate motion information according to the second identifier includes: determining a target offset from the plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
- the codeword used to identify the first motion information is the shortest one.
- the method for decoding predicted motion information provided in the present application may further include: parsing the bitstream to obtain a third identifier, where the third identifier includes a preset coefficient.
- before determining the target motion information from one of the plurality of second candidate motion information according to the second identifier, the method further includes: multiplying the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
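- for instance (the offset values and coefficient below are illustrative only), the adjustment can be sketched as:

```python
def adjust_offsets(preset_offsets, coefficient):
    # Multiply every preset motion information offset by the preset
    # coefficient carried in the third identifier; the second identifier
    # then selects among the adjusted offsets.
    return [(coefficient * dx, coefficient * dy) for (dx, dy) in preset_offsets]

# four quarter-pel offsets scaled by a parsed coefficient of 2:
print(adjust_offsets([(1, 0), (-1, 0), (0, 1), (0, -1)], 2))
# -> [(2, 0), (-2, 0), (0, 2), (0, -2)]
```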
- the target motion information is used to predict the motion information of the image block to be processed, which includes: using the target motion information as the motion information of the image block to be processed; or using the target motion information as the predicted motion information of the image block to be processed. After the motion information or predicted motion information of the image block to be processed is obtained, motion compensation is performed to generate its image block or prediction block.
- the second identifier may adopt a fixed-length encoding method, which can reduce the number of bytes occupied by the identifier.
- the second identifier may adopt a variable-length encoding manner, so that more candidate motion information can be identified.
- a second aspect of the embodiments of the present application provides another decoding method for predicted motion information, including: parsing a code stream to obtain a first identifier; and determining a target element from a first candidate set according to the first identifier.
- the elements in the first candidate set include at least one first candidate motion information and at least one second candidate set.
- the elements in the second candidate set include multiple second candidate motion information.
- when the target element is the first candidate motion information, the first candidate motion information serving as the target element is used as the target motion information, and the target motion information is used to predict the motion information of the image block to be processed.
- when the target element is a second candidate set, the code stream is parsed to obtain a second identifier, and the target motion information is determined from the plurality of second candidate motion information according to the second identifier.
- the elements in the first candidate set include the first candidate motion information and at least one second candidate set.
- with the structure of the multi-layer candidate set, when more candidates are introduced, a type of candidate motion information set can be added as a single element to the first candidate set.
- the length of the first candidate set is greatly shortened.
- when the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
- the first identifier may be a category identifier, which is used to indicate a category to which the target element belongs.
- the method for decoding predicted motion information provided in the embodiment of the present application may further include: parsing the bitstream to obtain a fourth identifier, where the fourth identifier is the index of the target element within the category indicated by the first identifier in the first candidate set.
- the target element is uniquely determined by combining the fourth identifier with the first identifier.
- the first candidate motion information includes motion information of spatially adjacent image blocks of the image block to be processed.
- the first candidate motion information may be candidate motion information generated by a Merge mode.
- the second candidate motion information includes motion information of a spatial domain non-adjacent image block of the image block to be processed.
- the second candidate motion information may be candidate motion information generated by the Affine Merge mode.
- the first candidate motion information includes first motion information
- the second candidate motion information includes second motion information
- the second motion information is obtained based on the first motion information and a preset motion information offset.
- the first candidate motion information includes the first motion information
- the second candidate motion information includes a preset motion information offset
- determining the target motion information from the plurality of second candidate motion information includes: determining a target offset from a plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
- the first candidate motion information includes first motion information
- the at least one second candidate set included in the first candidate set is a plurality of second candidate sets, and the plurality of second candidate sets includes at least one third candidate set and at least one fourth candidate set.
- the elements in the third candidate set include motion information of multiple spatially non-adjacent image blocks of the image block to be processed
- the elements in the fourth candidate set include multiple pieces of motion information obtained based on the first motion information and preset motion information offsets.
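- purely as an illustration of this multi-layer layout (all values hypothetical), the first candidate set can be pictured as a short list whose entries are either plain motion information or whole sub-sets:

```python
# Each nested set contributes only one top-level entry, so the length of the
# first candidate set stays short however many candidates the third and
# fourth candidate sets hold.
first_candidate_set = [
    ("first_mv", (3, -1)),                               # first candidate motion information
    ("third_set", [(-7, 2), (5, 4)]),                    # spatially non-adjacent MVs
    ("fourth_set", [(1, 0), (-1, 0), (0, 1), (0, -1)]),  # first MV plus preset offsets
]
print(len(first_candidate_set))  # -> 3, regardless of the sub-set sizes
```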
- the codeword used to identify the first motion information is the shortest one.
- the first motion information does not include motion information obtained according to the alternative temporal motion vector prediction (ATMVP) mode.
- the at least one second candidate set included in the first candidate set is a plurality of second candidate sets, and the plurality of second candidate sets includes at least one fifth candidate set and at least one sixth candidate set
- the elements in the fifth candidate set include motion information of multiple spatially non-adjacent image blocks of the image block to be processed
- the elements in the sixth candidate set include a plurality of preset motion information offsets.
- the decoding method for predicted motion information provided in this application may further include: parsing the code stream to obtain a third identifier, where the third identifier includes a preset coefficient.
- before determining the target offset from the plurality of preset motion information offsets according to the second identifier, the method further includes: multiplying the plurality of preset motion information offsets by the preset coefficient included in the third identifier to obtain a plurality of adjusted motion information offsets; correspondingly, determining the target offset from the plurality of preset motion information offsets according to the second identifier includes: determining the target offset from the plurality of adjusted motion information offsets according to the second identifier.
- the second candidate motion information and the first candidate motion information are different.
- the first candidate motion information and the second candidate motion information may be candidate motion information selected according to different inter prediction modes.
- the target motion information is used to predict the motion information of the image block to be processed, which includes: using the target motion information as the motion information of the image block to be processed; or using the target motion information as the predicted motion information of the image block to be processed. After the motion information or predicted motion information of the image block to be processed is obtained, motion compensation is performed to generate its image block or prediction block.
- the second identifier may adopt a fixed-length encoding method, which can reduce the number of bytes occupied by the identifier.
- the second identifier may adopt a variable-length encoding manner, so that more candidate motion information can be identified.
- a third aspect of the embodiments of the present application provides a decoding apparatus for predicted motion information, including: a parsing module configured to parse a bitstream to obtain a first identifier; and a determining module configured to determine a target element from a first candidate set according to the first identifier.
- the elements in the first candidate set include at least one first candidate motion information and a plurality of second candidate motion information.
- the first candidate motion information includes the first motion information
- the second candidate motion information includes a preset motion information offset.
- an assignment module configured to use the first candidate motion information as the target motion information when the target element is the first candidate motion information, where the target motion information is used to predict the motion information of the image block to be processed;
- the parsing module is further configured to, when the target element is obtained according to the plurality of second candidate motion information, parse the code stream to obtain a second identifier, and the determining module is further configured to determine the target motion information from one of the plurality of second candidate motion information according to the second identifier.
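- a compact sketch of how the modules of this apparatus might map onto code (the class, method, and attribute names are hypothetical):

```python
class PredictedMotionInfoDecoder:
    def __init__(self, first_candidate_set):
        self.first_candidate_set = first_candidate_set

    def parse(self, bitstream):
        # parsing module: read an identifier from the code stream
        return bitstream.parse_identifier()

    def determine(self, first_id):
        # determining module: map the first identifier to the target element
        return self.first_candidate_set[first_id]

    def assign(self, target_motion_info, block):
        # assignment module: hand the target motion information to the
        # image block to be processed
        block.motion_info = target_motion_info
```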
- the elements in the first candidate set include the first candidate motion information and a plurality of second candidate motion information.
- with the structure of the multi-layer candidate set, when more candidates are introduced, a type of candidate motion information set can be added as a single element to the first candidate set.
- the length of the first candidate set is greatly shortened.
- when the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
- the first candidate motion information may include motion information of a spatially adjacent image block of the image block to be processed.
- the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
- the determining module is specifically configured to determine a target offset from a plurality of preset motion information offsets according to the second identifier, and to determine the target motion information based on the first motion information and the target offset.
- the codeword used to identify the first motion information is the shortest one.
- the parsing module is further configured to parse the code stream to obtain a third identifier, and the third identifier includes a preset coefficient.
- the device further includes a calculation module configured to multiply the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
- the determining module is specifically configured to determine a target offset from the plurality of adjusted motion information offsets obtained by the calculation module according to the second identifier, and then determine the target motion information based on the first motion information and the target offset.
- the determining module is specifically configured to use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
- the second identifier adopts a fixed-length encoding manner.
- the second identifier adopts a variable length coding method.
- the apparatus for decoding predicted motion information provided in the third aspect of the embodiments of the present application is configured to execute the method for decoding predicted motion information provided in the first aspect; the specific implementations are the same and are not repeated here.
- a fourth aspect of the embodiments of the present application provides a decoding apparatus for predicted motion information, including: a parsing module configured to parse a bitstream to obtain a first identifier; and a determining module configured to determine a target element from a first candidate set according to the first identifier.
- the elements in the first candidate set include at least one first candidate motion information and at least one second candidate set.
- the elements in the second candidate set include a plurality of second candidate motion information; the assignment module is configured to, when the target element is the first candidate motion information, use the first candidate motion information as the target motion information, where the target motion information is used to predict the motion information of the image block to be processed; the parsing module is further configured to, when the target element is a second candidate set, parse the code stream to obtain a second identifier; and the determining module is further configured to determine the target motion information from the plurality of second candidate motion information according to the second identifier.
- the elements in the first candidate set include the first candidate motion information and at least one second candidate set.
- with the structure of the multi-layer candidate set, when more candidates are introduced, a type of candidate motion information set can be added as a single element to the first candidate set.
- the length of the first candidate set is greatly shortened.
- when the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the detection process and hardware implementation.
- the first candidate motion information may include motion information of spatially adjacent image blocks of the image block to be processed.
- the second candidate motion information may include motion information of a spatial domain non-adjacent image block of the image block to be processed.
- the first candidate motion information includes first motion information
- the second candidate motion information includes second motion information
- the second motion information is obtained based on the first motion information and a preset motion information offset.
- the first candidate motion information includes first motion information
- the second candidate motion information includes a preset motion information offset
- the determining module is specifically configured to: determine a target offset from a plurality of preset motion information offsets according to the second identifier; and determine the target motion information based on the first motion information and the target offset.
- the first candidate motion information includes first motion information
- the at least one second candidate set is a plurality of second candidate sets
- the plurality of second candidate sets includes at least one third candidate set and at least one fourth candidate set
- the elements in the third candidate set include motion information of multiple spatially non-adjacent image blocks of the image block to be processed
- the elements in the fourth candidate set include multiple pieces of motion information obtained based on the first motion information and preset motion information offsets.
- the codeword used to identify the first motion information is the shortest one.
- the first motion information does not include motion information obtained according to the ATMVP mode.
- the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets includes at least one fifth candidate set and at least one sixth candidate set.
- the elements in the fifth candidate set include motion information of multiple spatially non-adjacent image blocks of the image block to be processed, and the elements in the sixth candidate set include multiple preset motion information offsets.
- the parsing module is further configured to parse the code stream to obtain a third identifier, and the third identifier includes a preset coefficient.
- in the fourth aspect, the device further includes a calculation module configured to multiply the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
- the determining module is specifically configured to determine a target offset from the plurality of adjusted motion information offsets obtained by the calculation module according to the second identifier, and then determine the target motion information based on the first motion information and the target offset.
- the second candidate motion information and the first candidate motion information are different.
- the determining module is specifically configured to use the target motion information as the motion information of the image block to be processed; or use the target motion information as the predicted motion information of the image block to be processed.
- the second identifier adopts a fixed-length encoding manner.
- the second identifier adopts a variable length coding method.
- a fifth aspect of the embodiments of the present application provides a decoding apparatus for predicted motion information, including a processor and a memory coupled to the processor; the processor is configured to execute the decoding method for predicted motion information according to the first aspect or the second aspect.
- a video decoder is provided, which includes a non-volatile storage medium and a central processing unit.
- the non-volatile storage medium stores an executable program; the central processing unit is connected to the non-volatile storage medium and executes the decoding method for predicted motion information according to the first aspect and/or the second aspect, or any one of their possible implementations.
- a computer-readable storage medium is provided, which stores instructions; when the instructions are run on a computer, the computer is caused to execute the decoding method for predicted motion information according to the first aspect or the second aspect.
- a computer program product including instructions is provided; when the instructions are run on a computer, the computer is caused to execute the decoding method for predicted motion information described in the first or second aspect.
- FIG. 1 is an exemplary block diagram of a video decoding system that can be configured for use in an embodiment of the present application
- FIG. 2 is an exemplary system block diagram of a video encoder that can be configured for use in an embodiment of the present application
- FIG. 3 is an exemplary system block diagram of a video decoder that can be configured for use in embodiments of the present application
- FIG. 4 is a block diagram of an exemplary inter prediction module that can be configured for use in an embodiment of the present application
- FIG. 5 is an exemplary implementation flowchart of a merge prediction mode
- FIG. 6 is an exemplary implementation flowchart of an advanced motion vector prediction mode
- FIG. 7 is an exemplary implementation flowchart of a motion compensation performed by a video decoder that can be configured for an embodiment of the present application
- FIG. 8 is a schematic diagram of an exemplary coding unit and adjacent position image blocks associated with the coding unit
- FIG. 9 is an exemplary implementation flowchart of constructing a candidate prediction motion vector list
- FIG. 10 is an exemplary implementation diagram of adding a combined candidate motion vector to a merge mode candidate prediction motion vector list
- FIG. 11 is an exemplary implementation diagram of adding a scaled candidate motion vector to a merge mode candidate prediction motion vector list
- FIG. 12 is an exemplary implementation diagram of adding a zero motion vector to a merge mode candidate prediction motion vector list
- FIG. 13 is a schematic diagram of another exemplary coding unit and adjacent position image blocks associated with the coding unit
- FIG. 14A is a schematic diagram of an exemplary method for constructing a candidate motion vector set
- FIG. 14B is a schematic diagram of an exemplary method for constructing a candidate motion vector set
- FIG. 15 is a schematic flowchart of a decoding method for predicting motion information according to an embodiment of the present application.
- FIG. 16A is a schematic diagram of an exemplary method for constructing a candidate motion vector set
- FIG. 16B is a schematic diagram of an exemplary method for constructing a candidate motion vector set
- FIG. 16C is a schematic diagram of an exemplary method for constructing a candidate motion vector set
- FIG. 17 is a schematic block diagram of a decoding apparatus for predicting motion information according to an embodiment of the present application.
- FIG. 18 is a schematic block diagram of a decoding apparatus for predicting motion information according to an embodiment of the present application.
- words such as “exemplary” or “for example” are used to serve as examples, illustrations, or explanations. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the use of the words “exemplary” or “for example” is intended to present the relevant concepts in a concrete manner.
- FIG. 1 is a block diagram of a video decoding system 1 according to an example described in the embodiment of the present application.
- “video coder” generally refers to both video encoders and video decoders.
- “video coding” or “coding” may generally refer to video encoding or video decoding.
- the video encoder 100 and the video decoder 200 of the video decoding system 1 are configured to predict the motion information, such as the motion vector, of the currently decoded image block or its sub-blocks according to any one of multiple new inter prediction modes,
- so that the predicted motion vector is as close as possible to the motion vector obtained with a motion estimation method; the motion vector difference then need not be transmitted during encoding, which further improves the encoding and decoding performance.
- the video decoding system 1 includes a source device 10 and a destination device 20.
- the source device 10 generates encoded video data. Therefore, the source device 10 may be referred to as a video encoding device.
- the destination device 20 may decode the encoded video data generated by the source device 10. Therefore, the destination device 20 may be referred to as a video decoding device.
- Various implementations of the source device 10, the destination device 20, or both may include one or more processors and a memory coupled to the one or more processors.
- the memory may include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other media that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein.
- the source device 10 and the destination device 20 may include various devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called “smart” phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.
- the destination device 20 may receive the encoded video data from the source device 10 via the link 30.
- the link 30 may include one or more media or devices capable of moving the encoded video data from the source device 10 to the destination device 20.
- the link 30 may include one or more communication media enabling the source device 10 to directly transmit the encoded video data to the destination device 20 in real time.
- the source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to the destination device 20.
- the one or more communication media may include wireless and / or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
- the one or more communication media may include routers, switches, base stations, or other devices that facilitate communication from the source device 10 to the destination device 20.
- the encoded data may be output from the output interface 140 to the storage device 40.
- the encoded data can be accessed from the storage device 40 through the input interface 240.
- the storage device 40 may include any of a variety of distributed or locally accessed data storage media, such as a hard disk drive, a Blu-ray disc, a digital video disc (DVD), and a compact disc-read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded video data.
- the storage device 40 may correspond to a file server or another intermediate storage device that may hold the encoded video produced by the source device 10.
- the destination device 20 may access the stored video data from the storage device 40 via streaming or download.
- the file server may be any type of server capable of storing encoded video data and transmitting the encoded video data to the destination device 20.
- Example file servers include a network server (for example, for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, or a local disk drive.
- the destination device 20 can access the encoded video data through any standard data connection, including an Internet connection.
- This may include wireless channels (e.g., wireless-fidelity (Wi-Fi) connections), wired connections (e.g., digital subscriber lines (DSL), cable modems, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server.
- the transmission of the encoded video data from the storage device 40 may be a streaming transmission, a download transmission, or a combination of the two.
- the decoding method for predicting motion information can be applied to video encoding and decoding to support a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (for example, via the Internet), encoding of video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
- the video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
- the video decoding system 1 illustrated in FIG. 1 is merely an example, and the techniques of the present application can be applied to video decoding settings (for example, video encoding or video decoding) that do not necessarily include any data communication between the encoding device and the decoding device.
- data is retrieved from local storage, streamed over a network, and so on.
- the video encoding device may encode the data and store the data to a memory, and / or the video decoding device may retrieve the data from the memory and decode the data.
- encoding and decoding are performed by devices that do not communicate with each other, but only encode data to and / or retrieve data from memory and decode data.
- the source device 10 includes a video source 120, a video encoder 100, and an output interface 140.
- the output interface 140 may include a modulator/demodulator (modem) and/or a transmitter.
- Video source 120 may include a video capture device (e.g., a camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of these sources of video data.
- the video encoder 100 may encode video data from the video source 120.
- the source device 10 transmits the encoded video data directly to the destination device 20 via the output interface 140.
- the encoded video data may also be stored on the storage device 40 for later access by the destination device 20 for decoding and / or playback.
- the destination device 20 includes an input interface 240, a video decoder 200, and a display device 220.
- the input interface 240 includes a receiver and / or a modem.
- the input interface 240 may receive the encoded video data via the link 30 and / or from the storage device 40.
- the display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays decoded video data.
- the display device 220 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or other types of display devices.
- video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer unit or other hardware and software to handle encoding of both audio and video in a common or separate data stream.
- the multiplexer-demultiplexer (MUX-DEMUX) unit may conform to the International Telecommunication Union (ITU) H.223 multiplexer protocol, or to other protocols such as the user datagram protocol (UDP).
- Each of the video encoder 100 and the video decoder 200 may be implemented as any of a variety of circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium, and may use one or more processors to execute the instructions in hardware, thereby implementing the technology of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered as one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a corresponding device.
- This application may generally refer to video encoder 100 as “signaling” or “transmitting” certain information to another device, such as video decoder 200.
- the terms “signaling” or “transmitting” may generally refer to the transmission of syntax elements and/or other data used to decode the compressed video data. This transfer can occur in real time or almost in real time. Alternatively, this communication may occur over a period of time, such as when syntax elements are stored to a computer-readable storage medium in an encoded bitstream at the time of encoding; the decoding device may then retrieve the syntax elements at any time after they are stored on this medium.
- H.265 High Efficiency Video Coding (HEVC)
- the HEVC standardization is based on an evolution model of a video decoding device called the HEVC test model (HM).
- the latest standard document of H.265 can be obtained from http://www.itu.int/rec/T-REC-H.265.
- the latest version of the standard document is H.265 (12/16).
- the standard document is incorporated herein by reference in its entirety.
- HM assumes that video decoding devices have several additional capabilities relative to the existing algorithms of ITU-T H.264/AVC. For example, H.264 provides 9 intra-prediction encoding modes, while HM provides up to 35 intra-prediction encoding modes.
- the H.266 standardization is based on an evolution model of a video decoding device called the H.266 test model.
- the algorithm description of H.266 can be obtained from http://phenix.int-evry.fr/jvet. The latest algorithm description is included in JVET-F1001-v2.
- the algorithm description document is incorporated herein by reference in its entirety.
- the reference software for the JEM test model is available from https://jvet.hhi.fraunhofer.de/svn/svn_HMJEMSoftware/ and is also incorporated herein by reference in its entirety.
- HM can divide a video frame or image into a sequence of tree blocks or largest coding units (LCUs) that contain both luma and chroma samples.
- an LCU is also called a coding tree unit (CTU).
- a tree block has a purpose similar to that of a macroblock in the H.264 standard.
- a slice contains several consecutive tree blocks in decoding order.
- a video frame or image can be split into one or more slices.
- Each tree block can be split into coding units according to a quadtree. For example, a tree block that is a root node of the quadtree may be split into four child nodes, and each child node may in turn be a parent node that is split into another four child nodes.
- the final indivisible child nodes, which are leaf nodes of the quadtree, include decoding nodes, such as decoded video blocks.
- the syntax data associated with the decoded codestream can define the maximum number of times a tree block can be split, and can also define the minimum size of a decoding node.
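- a minimal sketch of this recursive quadtree split (the should_split callback stands in for the syntax elements that signal splitting, and min_size models the minimum decoding-node size):

```python
def quadtree_leaves(x, y, size, should_split, min_size=8):
    # A node becomes a leaf (a decoding node) once it reaches the minimum
    # size or the split decision says stop; otherwise it is split into four
    # equally sized child nodes.
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for cx, cy in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
        leaves += quadtree_leaves(cx, cy, half, should_split, min_size)
    return leaves

# split a 64x64 tree block one level and then stop:
print(quadtree_leaves(0, 0, 64, lambda x, y, s: s > 32))
# -> [(0, 0, 32), (32, 0, 32), (0, 32, 32), (32, 32, 32)]
```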
- a coding unit (CU) includes a decoding node, as well as prediction units (PUs) and transform units (TUs) associated with the decoding node.
- the size of the CU corresponds to the size of the decoding node and the shape must be square.
- the size of the CU can range from 8×8 pixels up to a maximum tree block size of 64×64 pixels or larger.
- Each CU may contain one or more PUs and one or more TUs.
- the syntax data associated with a CU may describe a case where a CU is partitioned into one or more PUs.
- the partitioning mode may be different between cases where the CU is skipped or is encoded in direct mode, intra prediction mode, or inter prediction mode.
- the PU can be divided into non-square shapes.
- the syntax data associated with a CU may also describe a case where a CU is partitioned into one or more TUs according to a quadtree.
- the shape of the TU can be square or non-square.
- the HEVC standard allows transformation based on the TU, which can be different for different CUs.
- the TU is usually sized based on the size of the PUs within a given CU defined for the partitioned LCU, but this may not always be the case.
- the size of the TU is usually the same as or smaller than the PU.
- a quad-tree structure called "residual quad-tree" (RQT) may be used to subdivide the residual samples corresponding to the CU into smaller units.
- the leaf node of RQT may be called TU.
- the pixel difference values associated with the TU may be transformed to produce a transformation coefficient, which may be quantized.
- the PU contains data related to the prediction process.
- the PU may include data describing the intra-prediction mode of the PU.
- the PU may include data defining a motion vector of the PU.
- the data defining the motion vector of the PU may describe the horizontal component of the motion vector, the vertical component of the motion vector, the resolution of the motion vector (e.g., quarter-pixel precision or eighth-pixel precision), the reference image to which the motion vector points, and/or the reference image list of the motion vector (e.g., list 0, list 1, or list C).
- a TU is used for the transform and quantization processes.
- a given CU with one or more PUs may also contain one or more TUs.
- video encoder 100 may calculate a residual value corresponding to the PU.
- the residual values include pixel differences that can be transformed into transform coefficients, quantized, and scanned using the TUs to generate serialized transform coefficients for entropy decoding.
- This application generally uses the term "video block" to refer to the decoding node of a CU.
- the term “video block” may also be used in this application to refer to a tree block including a decoding node and a PU and a TU, such as an LCU or a CU.
- a video sequence usually contains a series of video frames or images.
- a group of pictures (GOP) exemplarily includes a series of one or more video pictures.
- the GOP may include syntax data in the header information of the GOP, the header information of one or more of the pictures, or elsewhere, and the syntax data describes the number of pictures included in the GOP.
- Each slice of the image may contain slice syntax data describing the coding mode of the corresponding image.
- Video encoder 100 typically operates on video blocks within individual video slices to encode video data.
- a video block may correspond to a decoding node within a CU.
- Video blocks may have fixed or varying sizes, and may differ in size according to a specified decoding standard.
- HM supports prediction with various PU sizes. Assuming the size of a specific CU is 2N×2N, HM supports intra prediction with PU sizes of 2N×2N or N×N, and inter prediction with symmetric PU sizes of 2N×2N, 2N×N, N×2N, or N×N. HM also supports asymmetric partitioning for inter prediction with PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N. In asymmetric partitioning, one direction of the CU is not partitioned, and the other direction is partitioned into 25% and 75%.
- 2N×nU refers to a horizontally divided 2N×2N CU, with a 2N×0.5N PU at the top and a 2N×1.5N PU at the bottom.
- “N×N” and “N times N” are used interchangeably to refer to the pixel size of a video block in terms of vertical and horizontal dimensions, for example, 16×16 pixels or 16 times 16 pixels.
- an N×N block has N pixels in the vertical direction and N pixels in the horizontal direction, where N represents a non-negative integer value.
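- checking the 2N×nU arithmetic above for a 64×64 CU (N = 32), purely as an illustration:

```python
N = 32
top_pu = (2 * N, N // 2)         # 2N x 0.5N -> (64, 16): the 25% part
bottom_pu = (2 * N, 3 * N // 2)  # 2N x 1.5N -> (64, 48): the 75% part
assert top_pu[1] + bottom_pu[1] == 2 * N  # the two PUs tile the CU height
print(top_pu, bottom_pu)  # -> (64, 16) (64, 48)
```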
- Pixels in a block can be arranged in rows and columns.
- the block does not necessarily need to have the same number of pixels in the horizontal direction as in the vertical direction.
- a block may include N ⁇ M pixels, where M is not necessarily equal to N.
- the video encoder 100 may calculate the residual data of the TU of the CU.
- a PU may include pixel data in the spatial domain (also referred to as the pixel domain), and a TU may include coefficients in the transform domain after a transform (e.g., a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform) is applied to the residual video data.
- the residual data may correspond to a pixel difference between a pixel of an uncoded image and a prediction value corresponding to a PU.
- the video encoder 100 may form a TU including residual data of a CU, and then transform the TU to generate a transform coefficient of the CU.
- video encoder 100 may perform quantization of the transform coefficients.
- Quantization exemplarily refers to the process of quantizing coefficients to possibly reduce the amount of data used to represent the coefficients to provide further compression.
- the quantization process may reduce the bit depth associated with some or all of the coefficients. For example, n-bit values may be rounded down to m-bit values during quantization, where n is greater than m.
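- that bit-depth reduction can be sketched as rounding an n-bit value down to m bits with a right shift (a simplification; real quantization divides by a quantization step):

```python
def round_down_to_m_bits(value, n, m):
    # Drop the (n - m) least significant bits of an n-bit value.
    assert n > m
    return value >> (n - m)

# a 10-bit coefficient reduced to 8 bits:
print(round_down_to_m_bits(731, n=10, m=8))  # -> 182
```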
- the JEM model further improves the coding structure of video images.
- a block coding structure called “quad tree combined with binary tree” (QTBT) is introduced.
- a CU can be square or rectangular.
- a CTU is first partitioned by a quadtree, and the leaf nodes of the quadtree are further partitioned by a binary tree.
- there are two partitioning modes in binary tree partitioning: symmetric horizontal partitioning and symmetric vertical partitioning.
- the leaf nodes of a binary tree are called CUs, and JEM's CUs cannot be further divided during the prediction and transformation process, that is, JEM's CU, PU, and TU have the same block size.
- the maximum size of the CTU is 256×256 luminance pixels.
- the video encoder 100 may utilize a predefined scan order to scan the quantized transform coefficients to generate a serialized vector that can be entropy encoded.
- the video encoder 100 may perform adaptive scanning. After scanning the quantized transform coefficients to form a one-dimensional vector, the video encoder 100 may entropy code the one-dimensional vector using context-based adaptive variable-length coding (CAVLC), context-based adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method.
- Video encoder 100 may also entropy encode syntax elements associated with the encoded video data for use by video decoder 200 to decode the video data.
- video encoder 100 may assign a context within a context model to a symbol to be transmitted. The context can be related to whether adjacent values of the symbol are non-zero.
- the video encoder 100 may select a variable-length code for a symbol to be transmitted. Codewords in variable-length coding (VLC) may be constructed such that relatively short codes correspond to more likely symbols, and longer codes correspond to less likely symbols. In this way, using VLC can save code rate relative to using equal-length codewords for each symbol to be transmitted.
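- as one concrete code of this kind (the passage above does not name a specific code, but exponential-Golomb codes are widely used in video coding), smaller, more likely symbol values receive shorter codewords:

```python
def exp_golomb(v):
    # Unsigned exponential-Golomb: write v + 1 in binary, prefixed by as
    # many zeros as there are bits after the leading one.
    bits = bin(v + 1)[2:]
    return "0" * (len(bits) - 1) + bits

for v in range(4):
    print(v, exp_golomb(v))
# prints: 0 1 / 1 010 / 2 011 / 3 00100
```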
- the probability in CABAC can be determined based on the context assigned to the symbol.
- the video encoder may perform inter prediction to reduce temporal redundancy between images.
- a CU may have one or more prediction units PU according to the provisions of different video compression codec standards.
- multiple PUs may belong to a CU, or PUs and CUs are the same size.
- in this application, when the CU's partitioning mode is no further division, or division into exactly one PU, the PU is uniformly used for description.
- the video encoder may signal motion information for the PU to the video decoder.
- the motion information of the PU may include: a reference image index, a motion vector, and a prediction direction identifier.
- a motion vector may indicate a displacement between an image block (also called a video block, a pixel block, a pixel set, etc.) of a PU and a reference block of the PU.
- the reference block of the PU may be a part of the reference picture similar to the image block of the PU.
- the reference block may be located in a reference image indicated by a reference image index and a prediction direction identifier.
- the video encoder may generate a candidate prediction motion vector (motion vector, MV) list for each PU according to the merge prediction mode or the advanced motion vector prediction mode process.
- Each candidate prediction motion vector in the candidate prediction motion vector list for the PU may indicate motion information, and the MV list may also be referred to as a candidate motion information list.
- the motion information indicated by some candidate prediction motion vectors in the candidate prediction motion vector list may be based on the motion information of other PUs. If the candidate prediction motion vector indicates motion information specifying one of a spatial candidate prediction motion vector position or a temporal candidate prediction motion vector position, the present application may refer to the candidate prediction motion vector as an "original" candidate prediction motion vector.
- a merge mode is also referred to herein as a merge prediction mode.
- the video encoder may generate additional candidate prediction motion vectors by combining partial motion vectors from different original candidate prediction motion vectors, modifying the original candidate prediction motion vectors, or inserting only zero motion vectors as candidate prediction motion vectors. These additional candidate prediction motion vectors are not considered as original candidate prediction motion vectors and may be referred to as artificially generated candidate prediction motion vectors in this application.
- the techniques of this application generally relate to a technique for generating a list of candidate prediction motion vectors at a video encoder and a technique for generating the same list of candidate prediction motion vectors at a video decoder.
- the video encoder and video decoder may generate the same candidate prediction motion vector list by implementing the same techniques used to construct the candidate prediction motion vector list. For example, both a video encoder and a video decoder may build a list with the same number of candidate prediction motion vectors (eg, five candidate prediction motion vectors).
- Video encoders and decoders may first consider spatial candidate prediction motion vectors (e.g., from neighboring blocks in the same image), then consider temporal candidate prediction motion vectors (e.g., candidate prediction motion vectors in different images), and finally consider artificially generated candidate prediction motion vectors, until the desired number of candidate prediction motion vectors has been added to the list.
- a type of candidate prediction motion vector may be indicated in the candidate prediction motion vector list through an identification bit to control the length of the candidate prediction motion vector list.
- the spatial candidate prediction motion vector set and the temporal candidate prediction motion vectors can be used as the original candidate prediction motion vectors.
- an identification bit is added to the candidate prediction motion vector list to indicate an artificially generated candidate prediction motion vector set.
- a prediction motion vector is then selected from the set of candidate prediction motion vectors indicated by the identification bit.
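- an illustrative layout of such a list (all motion vector values hypothetical): the original spatial and temporal candidates appear individually, while the artificially generated set occupies a single identified entry:

```python
candidate_list = [
    ("spatial", (4, -2)),
    ("spatial", (3, -1)),
    ("temporal", (6, 0)),
    # one identification entry stands in for the whole artificial set:
    ("artificial_set", [(0, 0), (1, 1), (-1, -1)]),
]
# a further index then selects a specific vector inside the identified set,
# so the list length stays at 4 however large that set grows.
print(len(candidate_list))  # -> 4
```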
- the video encoder may select the candidate prediction motion vector from the candidate prediction motion vector list and output the candidate prediction motion vector index in the code stream.
- the selected candidate prediction motion vector may be a candidate prediction motion vector having a motion vector that most closely matches the predictor of the target PU being decoded.
- the candidate prediction motion vector index may indicate a position where a candidate prediction motion vector is selected in the candidate prediction motion vector list.
- the video encoder may also generate a predictive image block for the PU based on a reference block indicated by the motion information of the PU. The motion information of the PU may be determined based on the motion information indicated by the selected candidate prediction motion vector.
- the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector.
- the motion information of the PU may be determined based on the motion vector difference of the PU and the motion information indicated by the selected candidate prediction motion vector.
- the video encoder may generate one or more residual image blocks for the CU based on the predictive image blocks of the PU of the CU and the original image blocks for the CU. The video encoder may then encode one or more residual image blocks and output one or more residual image blocks in a code stream.
- the bitstream may include data identifying a selected candidate prediction motion vector in the candidate prediction motion vector list of the PU, which is referred to herein as an identifier or signal.
- the data may include an index into the candidate prediction motion vector list, through which the target motion vector is determined; or the index may indicate that the target motion vector belongs to a certain type of candidate prediction motion vector.
- the data may further include information indicating the specific position of the selected candidate prediction motion vector within that type of candidate prediction motion vectors.
- the video decoder can parse the bitstream to obtain the data identifying the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU, determine the selected candidate prediction motion vector based on the data, and determine the motion information of the PU from the motion information indicated by the selected candidate prediction motion vector.
- the video decoder may identify one or more reference blocks for the PU based on the motion information of the PU. After identifying one or more reference blocks of the PU, the video decoder may generate predictive image blocks for the PU based on the one or more reference blocks of the PU. The video decoder may reconstruct an image block for a CU based on a predictive image block for a PU of the CU and one or more residual image blocks for the CU.
- the present application may describe a position or an image block as having various spatial relationships with a CU or a PU. This description can be interpreted to mean that the position or image block and the image block associated with the CU or PU have various spatial relationships.
- a PU currently being decoded by a video decoder may be referred to as a current PU, and may also be referred to as a current image block to be processed.
- This application may refer to the CU that the video decoder is currently decoding as the current CU.
- This application may refer to the image currently being decoded by the video decoder as the current image. It should be understood that this application is applicable to the case where the PU and the CU have the same size, or the PU is the CU; in such cases, PU is used to represent both.
- video encoder 100 may use inter prediction to generate predictive image blocks and motion information for a PU of a CU.
- the motion information of a given PU may be the same or similar to the motion information of one or more nearby PUs (ie, PUs whose image blocks are spatially or temporally near the image blocks of the given PU). Because nearby PUs often have similar motion information, video encoder 100 may refer to the motion information of nearby PUs to encode motion information for a given PU. Encoding the motion information of a given PU with reference to the motion information of a nearby PU can reduce the number of encoded bits required to indicate the motion information of a given PU in the code stream.
- Video encoder 100 may refer to motion information of nearby PUs in various ways to encode motion information for a given PU.
- video encoder 100 may indicate that the motion information of a given PU is the same as the motion information of nearby PUs.
- This application may use the term merge mode to refer to indicating that the motion information of a given PU is the same as, or can be derived from, the motion information of a nearby PU.
- the video encoder 100 may calculate a Motion Vector Difference (MVD) for a given PU.
- MVD indicates the difference between the motion vector of a given PU and the motion vector of a nearby PU.
- Video encoder 100 may include MVD instead of a motion vector of a given PU in the motion information of a given PU. Representing MVD in the codestream requires fewer coding bits than representing the motion vector of a given PU.
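- As a non-normative illustration only, the following minimal sketch (with hypothetical names and simple integer motion vectors assumed) shows how an MVD is formed at the encoder and how the motion vector is reconstructed at the decoder:

```python
# Minimal sketch of MVD signaling (hypothetical names, integer MVs assumed).

def encode_mvd(mv_current, mv_predictor):
    """Encoder side: signal only the difference between the PU's motion
    vector and the predictor taken from a nearby PU."""
    return (mv_current[0] - mv_predictor[0], mv_current[1] - mv_predictor[1])

def decode_mv(mvd, mv_predictor):
    """Decoder side: reconstruct the motion vector from the parsed MVD
    and the same predictor."""
    return (mvd[0] + mv_predictor[0], mvd[1] + mv_predictor[1])

mv_pu = (130, -7)        # motion vector of the given PU
mv_nearby = (128, -8)    # motion vector of a nearby PU
mvd = encode_mvd(mv_pu, mv_nearby)   # (2, 1): small values need fewer bits
assert decode_mv(mvd, mv_nearby) == mv_pu
```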
- This application may use the term advanced motion vector prediction mode to refer to signaling the motion information of a given PU by using an MVD and an index value identifying a candidate motion vector.
- the video encoder 100 may generate a list of candidate predicted motion vectors for a given PU.
- the candidate prediction motion vector list may include one or more candidate prediction motion vectors.
- Each of the candidate prediction motion vectors in the candidate prediction motion vector list for a given PU may specify motion information.
- the motion information indicated by each candidate prediction motion vector may include a motion vector, a reference image index, and a prediction direction identifier.
- the candidate prediction motion vectors in the candidate prediction motion vector list may include "original" candidate prediction motion vectors, each of which indicates the motion information of one of the specified candidate prediction motion vector positions within a PU different from the given PU.
- the video encoder 100 may select one of the candidate prediction motion vectors from the candidate prediction motion vector list for the PU. For example, a video encoder may compare each candidate prediction motion vector with the PU being decoded and may select a candidate prediction motion vector with a desired code rate-distortion cost. Video encoder 100 may output a candidate prediction motion vector index for a PU. The candidate prediction motion vector index may identify the position of the selected candidate prediction motion vector in the candidate prediction motion vector list.
- the video encoder 100 may generate a predictive image block for a PU based on a reference block indicated by motion information of the PU.
- the motion information of the PU may be determined based on the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list for the PU.
- the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector.
- motion information of a PU may be determined based on a motion vector difference for the PU and motion information indicated by a selected candidate prediction motion vector.
- Video encoder 100 may process predictive image blocks for a PU as described previously.
- an identifier bit may be used in the candidate prediction motion vector list to indicate a type of candidate prediction motion vector, so as to control the length of the candidate prediction motion vector list. Details are not repeated here.
- video decoder 200 may generate a list of candidate predicted motion vectors for each of the PUs of the CU.
- the candidate prediction motion vector list generated by the video decoder 200 for the PU may be the same as the candidate prediction motion vector list generated by the video encoder 100 for the PU.
- the syntax element parsed by the video decoder 200 from the bitstream may indicate the position of the candidate prediction motion vector selected in the candidate prediction motion vector list of the PU.
- the video decoder 200 may generate predictive image blocks for the PU based on one or more reference blocks indicated by the motion information of the PU.
- the video decoder 200 may determine the motion information of the PU from the motion information indicated by the selected candidate prediction motion vector in the candidate prediction motion vector list for the PU based on the syntax element obtained by parsing the bitstream. Video decoder 200 may reconstruct an image block for a CU based on a predictive image block for a PU and a residual image block for a CU.
- the candidate prediction motion vector list may use a flag bit to indicate a type of candidate prediction motion vector.
- the video decoder 200 first parses the code stream to obtain a first identifier, and the first identifier indicates the position of the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU.
- the candidate prediction motion vector list of the PU includes at least one first candidate motion vector and at least one second candidate set, and the second candidate set includes at least one second candidate motion vector.
- Video decoder 200 determines a target element corresponding to the first identifier from a list of candidate predicted motion vectors of the PU according to the first identifier.
- if the target element is a first candidate motion vector, the video decoder 200 determines the target element as the target motion vector of the PU, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) for subsequent decoding processes. If the target element is a second candidate set, the video decoder 200 parses the bitstream to obtain a second identifier, and the second identifier is used to identify a selected candidate prediction motion vector in the second candidate set indicated by the first identifier.
- video decoder 200 then determines the target motion information from the plurality of second candidate motion vectors in the second candidate set indicated by the first identifier according to the second identifier, and uses the target motion information to predict the motion information of the to-be-processed image block (PU) for subsequent decoding processes.
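- As a non-normative illustration of this two-level parsing, the following minimal sketch assumes a list whose elements are either a single candidate motion vector or a second candidate set (all names are hypothetical, not part of the standard):

```python
# Minimal sketch of two-level identifier parsing (assumed data structures).

def resolve_target(candidate_list, first_id, parse_second_id):
    """candidate_list elements are either a single candidate motion vector
    or a set (list) of second candidate motion vectors represented by one
    entry, so the list length stays short."""
    target = candidate_list[first_id]
    if isinstance(target, list):        # target element is a second candidate set
        second_id = parse_second_id()   # parsed from the bitstream only when needed
        return target[second_id]
    return target                       # target element is a first candidate motion vector
```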
- the candidate prediction motion vector list may use a flag bit to indicate a type of candidate prediction motion vector.
- the video decoder 200 first parses the code stream to obtain a first identifier, and the first identifier indicates the position of the selected candidate prediction motion vector in the candidate prediction motion vector list of the PU.
- the candidate prediction motion vector list of the PU includes at least one piece of first candidate motion information and a plurality of pieces of second candidate motion information; the first candidate motion information includes first motion information, and the second candidate motion information includes the first motion information and a preset motion information offset.
- Video decoder 200 determines a target element corresponding to the first identifier from a list of candidate predicted motion vectors of the PU according to the first identifier.
- if the target element is the first candidate motion information, the video decoder 200 determines the target element as the target motion information of the PU, and uses the target motion information to predict the motion information of the image block (PU) to be processed for subsequent decoding processes. If the target element is obtained according to the plurality of pieces of second candidate motion information, the video decoder 200 parses the bitstream to obtain a second identifier, determines the target motion information based on one of the plurality of pieces of second candidate motion information according to the second identifier, and uses the target motion information to predict the motion information of the image block (PU) to be processed for subsequent decoding processes.
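- For the offset-based variant, a minimal sketch under the assumption that each second candidate is the first motion vector plus one of a preset set of offsets (the offset values below are illustrative only, not the normative values):

```python
# Minimal sketch of offset-derived second candidate motion information.
# The offset set below is illustrative, not normative.
PRESET_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def derive_target(first_mv, second_id):
    """Derive the target motion information by adding the preset offset
    selected by the second identifier to the first motion information."""
    dx, dy = PRESET_OFFSETS[second_id]
    return (first_mv[0] + dx, first_mv[1] + dy)
```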
- candidate motion vectors in the candidate prediction motion vector list may be obtained according to different modes, which are not specifically limited in this application.
- the construction of the candidate prediction motion vector list and the parsing of the selected candidate prediction motion vector in the candidate prediction motion vector list from the code stream are independent of each other, and may be performed in any order or in parallel.
- in some feasible implementations, the position of the selected candidate prediction motion vector in the candidate prediction motion vector list is first parsed from the code stream, and the candidate prediction motion vector list is then constructed only as far as that position. For example, when parsing the bitstream reveals that the selected candidate prediction motion vector has index 3 in the candidate prediction motion vector list, only the candidates from index 0 to index 3 need to be constructed to determine the candidate with index 3, which achieves the technical effect of reducing complexity and improving decoding efficiency.
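- The complexity saving can be sketched as follows, assuming a hypothetical generator that yields candidates in list order (a non-normative illustration):

```python
# Minimal sketch: parse the index first, then build the list only up to it.

def build_partial_list(candidate_generator, parsed_index):
    """Construct candidates 0..parsed_index only; later entries are never
    derived, which reduces decoder complexity."""
    partial = []
    for candidate in candidate_generator:
        partial.append(candidate)
        if len(partial) > parsed_index:
            break
    return partial[parsed_index]
```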
- FIG. 2 is a block diagram of a video encoder 100 according to an example described in the embodiment of the present application.
- the video encoder 100 is configured to output a video to the post-processing entity 41.
- the post-processing entity 41 represents an example of a video entity that can process the encoded video data from the video encoder 100, such as a media-aware network element (MANE) or a stitching / editing device.
- the post-processing entity 41 may be an instance of a network entity.
- the post-processing entity 41 and the video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to the post-processing entity 41 may be performed by the same device that includes the video encoder 100.
- the post-processing entity 41 is an example of the storage device 40 of FIG. 1.
- the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a decoded picture buffer (DPB) 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103.
- the prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109.
- the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111.
- the filter unit 106 is intended to represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
- the filter unit 106 is shown as an in-loop filter in FIG. 2, in other implementations, the filter unit 106 may be implemented as a post-loop filter.
- the video encoder 100 may further include a video data memory and a segmentation unit (not shown in the figure).
- the video data memory may store video data to be encoded by the components of the video encoder 100.
- the video data stored in the video data storage may be obtained from the video source 120.
- the DPB 107 may be a reference image memory that stores reference video data used by the video encoder 100 to encode video data in an intra-frame or inter-frame decoding mode.
- the video data memory and the DPB 107 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM) including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices.
- Video data storage and DPB 107 can be provided by the same storage device or separate storage devices.
- the video data memory may be on-chip with other components of video encoder 100 or off-chip relative to those components.
- the video encoder 100 receives video data and stores the video data in a video data memory.
- the segmentation unit divides the video data into several image blocks, and these image blocks can be further divided into smaller blocks, such as image block segmentation based on a quad tree structure or a binary tree structure. This segmentation may also include segmentation into slices, tiles, or other larger units.
- Video encoder 100 typically illustrates components that encode image blocks within a video slice to be encoded.
- a slice can be divided into multiple image blocks (and possibly into sets of image blocks referred to as tiles).
- the prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes.
- the prediction processing unit 108 may provide the resulting intra- or inter-coded block to the summer 112 to generate a residual block, and to the summer 111 to reconstruct the encoded block for use as part of a reference image.
- the intra predictor 109 within the prediction processing unit 108 may perform intra predictive encoding of the current image block with respect to one or more neighboring blocks in the same frame or slice as the current block to be encoded to remove spatial redundancy.
- the inter predictor 110 within the prediction processing unit 108 may perform inter predictive coding of the current image block with respect to one or more prediction blocks in the one or more reference images to remove temporal redundancy.
- the inter predictor 110 may be configured to determine an inter prediction mode for encoding the current image block. For example, the inter predictor 110 may use rate-distortion analysis to calculate the rate-distortion values of the various inter prediction modes in the set of candidate inter prediction modes, and select the inter prediction mode with the best rate-distortion characteristics from among them. Rate-distortion analysis generally determines the amount of distortion (or error) between the coded block and the original uncoded block from which the coded block was produced, as well as the bit rate (that is, the number of bits) used to produce the coded block. For example, the inter predictor 110 may determine that the inter prediction mode in the candidate inter prediction mode set with the lowest rate-distortion cost for encoding the current image block is the inter prediction mode used for inter prediction of the current image block.
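- The rate-distortion selection described above is commonly expressed as minimizing the cost J = D + λ·R; the following minimal sketch (with a hypothetical candidate structure and example numbers) illustrates the selection criterion:

```python
# Minimal sketch of rate-distortion mode selection: J = D + lambda * R.

def select_mode(candidates, lam):
    """Each candidate carries a distortion (e.g., SAD or SSE between the
    coded block and the original block) and a rate in bits; pick the mode
    with the lowest cost J."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["bits"])

modes = [
    {"name": "merge", "distortion": 1500, "bits": 4},
    {"name": "amvp",  "distortion": 1200, "bits": 22},
]
best = select_mode(modes, lam=20.0)   # merge: J = 1580 < amvp: J = 1640
```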
- the inter predictor 110 is configured to predict the motion information (such as a motion vector) of one or more sub-blocks in the current image block based on the determined inter prediction mode, and to use the motion information (such as the motion vector) of the one or more sub-blocks to obtain or generate a prediction block of the current image block.
- the inter predictor 110 may locate a prediction block pointed to by the motion vector in one of the reference image lists.
- the inter predictor 110 may also generate syntax elements associated with image blocks and video slices for use by the video decoder 200 when decoding image blocks of the video slice.
- the inter predictor 110 uses the motion information of each sub-block to perform a motion compensation process to generate a prediction block of each sub-block, thereby obtaining a prediction block of the current image block. It should be understood that the inter predictor 110 here performs motion estimation and motion compensation processes.
- the inter predictor 110 may provide information indicating the selected inter prediction mode of the current image block to the entropy encoder 103, so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
- the intra predictor 109 may perform intra prediction on the current image block.
- the intra predictor 109 may determine an intra prediction mode used to encode the current block.
- the intra predictor 109 may use rate-distortion analysis to calculate the rate-distortion values of the various intra prediction modes to be tested, and select the intra prediction mode with the best rate-distortion characteristics from among the modes to be tested. In any case, after an intra prediction mode is selected for an image block, the intra predictor 109 may provide information indicating the selected intra prediction mode of the current image block to the entropy encoder 103, so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
- the video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded.
- the summer 112 represents one or more components that perform this subtraction operation.
- the residual video data in the residual block may be included in one or more transform units (TUs) and applied to the transformer 101.
- the transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a discrete cosine transform (DCT) or a conceptually similar transform.
- the transformer 101 may transform the residual video data from a pixel value domain to a transform domain, such as a frequency domain.
- DCT discrete cosine transform
- the transformer 101 may send the obtained transform coefficients to a quantizer 102.
- the quantizer 102 quantizes the transform coefficients to further reduce the bit rate.
- the quantizer 102 may then perform a scan of a matrix containing the quantized transform coefficients.
- the entropy encoder 103 may perform scanning.
- after quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 can perform context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding method or technique.
- the encoded code stream may be transmitted to the video decoder 200, or archived for later transmission or later retrieval by the video decoder 200.
- the entropy encoder 103 may also perform entropy encoding on the syntax elements of the current image block to be encoded.
- the inverse quantizer 104 and the inverse transformer 105 respectively apply inverse quantization and inverse transform to reconstruct the residual block in the pixel domain, for example, for later use as a reference block of a reference image.
- the summer 111 adds the reconstructed residual block to a prediction block generated by the inter predictor 110 or the intra predictor 109 to generate a reconstructed image block.
- the filter unit 106 may be applied to the reconstructed image block to reduce distortion, such as block artifacts. The reconstructed image block is then stored as a reference block in the decoded picture buffer 107, and can be used by the inter predictor 110 as a reference block for inter prediction of blocks in subsequent video frames or images.
- the video encoder 100 may directly quantize the residual signal without processing by the transformer 101, and correspondingly without processing by the inverse transformer 105; or, for some image blocks or image frames, the video encoder 100 does not generate residual data, and accordingly does not need processing by the transformer 101, quantizer 102, inverse quantizer 104, and inverse transformer 105; or, the video encoder 100 may store the reconstructed image block directly as a reference block without processing by the filter unit 106; or, the quantizer 102 and the inverse quantizer 104 in the video encoder 100 may be merged together.
- FIG. 3 is a block diagram of an example video decoder 200 described in the embodiment of the present application.
- the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a DPB 207.
- the prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209.
- video decoder 200 may perform a decoding process that is substantially inverse to the encoding process described with respect to video encoder 100 from FIG. 2.
- the video decoder 200 receives from the video encoder 100 an encoded video codestream representing image blocks of the encoded video slice and associated syntax elements.
- the video decoder 200 may receive video data from the network entity 42, optionally, the video data may also be stored in a video data storage (not shown in the figure).
- the video data memory may store video data, such as an encoded video code stream, to be decoded by components of the video decoder 200.
- the video data stored in the video data memory can be obtained, for example, from the storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium.
- the video data memory may serve as a coded picture buffer (CPB) that stores encoded video data from the encoded video bitstream. Therefore, although the video data memory is not shown in FIG. 3, the video data memory and the DPB 207 may be the same memory, or may be separately provided memories. The video data memory and the DPB 207 can be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM) including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. In various examples, the video data memory may be integrated on-chip with other components of the video decoder 200, or provided off-chip relative to those components.
- the network entity 42 may be, for example, a server, a MANE, a video editor / splicer, or other such device for implementing one or more of the techniques described above.
- the network entity 42 may or may not include a video encoder, such as video encoder 100.
- the network entity 42 may implement some of the techniques described in this application.
- the network entity 42 and the video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to the network entity 42 may be performed by the same device including the video decoder 200.
- the network entity 42 may be an example of the storage device 40 of FIG. 1.
- the entropy decoder 203 of the video decoder 200 entropy decodes the code stream to produce quantized coefficients and some syntax elements.
- the entropy decoder 203 forwards the syntax elements to the prediction processing unit 208.
- Video decoder 200 may receive syntax elements at a video slice level and / or an image block level.
- the intra predictor 209 of the prediction processing unit 208 may generate prediction blocks for the image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or image.
- the inter predictor 210 of the prediction processing unit 208 may determine, based on the syntax elements received from the entropy decoder 203, an inter prediction mode for decoding the current image block of the current video slice, and decode the current image block (for example, perform inter prediction) based on the determined inter prediction mode.
- specifically, the inter predictor 210 may determine whether a new inter prediction mode is used to predict the current image block of the current video slice. If the syntax elements indicate that a new inter prediction mode is used to predict the current image block, the inter predictor 210 predicts the motion information of the current image block or a sub-block of the current image block based on the new inter prediction mode (for example, a new inter prediction mode specified by a syntax element or a default new inter prediction mode), and uses the predicted motion information of the current image block or its sub-block to obtain or generate a prediction block of the current image block or its sub-block through a motion compensation process.
- the motion information here may include reference image information and motion vectors, where the reference image information may include but is not limited to unidirectional / bidirectional prediction information, a reference image list number, and a reference image index corresponding to the reference image list.
- a prediction block may be generated from one of the reference pictures within one of the reference picture lists. The video decoder 200 may construct the reference image lists, that is, list 0 and list 1, based on the reference images stored in the DPB 207.
- the reference frame index of the current image may be included in one or more of the reference frame list 0 and list 1.
- the video encoder 100 may signal a specific syntax element indicating whether a new inter prediction mode is used to decode a specific block, or may signal syntax elements indicating whether a new inter prediction mode is used and which new inter prediction mode is used to decode a specific block. It should be understood that the inter predictor 210 here performs a motion compensation process.
- the inverse quantizer 204 inverse quantizes, that is, dequantizes, the quantized transform coefficients provided in the code stream and decoded by the entropy decoder 203.
- the inverse quantization process may include using a quantization parameter calculated by the video encoder 100 for each image block in the video slice to determine the degree of quantization that should be applied and similarly to determine the degree of inverse quantization that should be applied.
- the inverse transformer 205 applies an inverse transform to transform coefficients, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process to generate a residual block in the pixel domain.
- the video decoder 200 sums the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210 to obtain a reconstructed block, that is, a decoded image block.
- the summer 211 represents a component that performs this summing operation.
- if needed, a loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality.
- the filter unit 206 may represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
- the filter unit 206 is shown as an in-loop filter in FIG. 3, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
- the filter unit 206 is applied to the reconstructed block to reduce block distortion, and the result is output as a decoded video stream.
- the decoded image blocks in a given frame or image are also stored in the decoded picture buffer 207, which stores reference images used for subsequent motion compensation.
- the DPB 207 may be part of the memory, which may also store the decoded video for later presentation on a display device (such as the display device 220 of FIG. 1), or may be separate from such memory.
- the video decoder 200 may generate an output video stream without processing by the filter unit 206; or, for certain image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not obtain quantized coefficients through decoding, and accordingly the data does not need processing by the inverse quantizer 204 and the inverse transformer 205.
- the techniques of this application exemplarily involve inter-frame decoding. It should be understood that the techniques of this application may be performed by any of the video decoders described in this application.
- the video decoder includes, for example, the video encoder 100 and the video decoder 200 shown and described with respect to FIGS. 1 to 3. That is, in one feasible implementation, the inter predictor 110 described with respect to FIG. 2 may perform specific techniques described below when performing inter prediction during encoding of a block of video data. In another feasible implementation, the inter predictor 210 described with respect to FIG. 3 may perform specific techniques described below when performing inter prediction during decoding of a block of video data.
- a reference to a generic "video encoder" or "video decoder" may include the video encoder 100, the video decoder 200, or another video encoding or decoding unit.
- the processing result of a certain step may be further processed and then output to the next step; for example, after steps such as interpolation filtering, motion vector derivation, or loop filtering, the result of the corresponding step is further clipped or shifted.
- the motion vector of the control point of the current image block derived according to the motion vector of the adjacent affine coding block may be further processed, which is not limited in this application.
- the value range of the motion vector is restricted so that it is within a certain bit width. Assuming that the allowed bit width of the motion vector is bitDepth, the range of the motion vector is -2^(bitDepth-1) to 2^(bitDepth-1)-1, where the "^" symbol represents exponentiation. If bitDepth is 16, the value ranges from -32768 to 32767; if bitDepth is 18, the value ranges from -131072 to 131071. The constraint can be implemented in either of the following two ways:
- Method 1: remove the overflow high-order bits of the motion vector (the second formula of each pair restores the two's complement interpretation and is reconstructed here from the worked example below):
- ux = (vx + 2^bitDepth) % 2^bitDepth
- vx = (ux >= 2^(bitDepth-1)) ? (ux - 2^bitDepth) : ux
- and likewise for the vertical component vy. For example, if the value of vx is -32769 and bitDepth is 16, the value obtained by the above formulas is 32767. In a computer, values are stored in two's complement form; the two's complement of -32769 is 1,0111,1111,1111,1111 (17 bits), and the computer handles the overflow by discarding the high-order bits, so the stored value of vx is 0111,1111,1111,1111, that is, 32767, which is consistent with the result obtained by the formulas.
- Method 2: clip the motion vector:
- vx = Clip3(-2^(bitDepth-1), 2^(bitDepth-1)-1, vx)
- vy = Clip3(-2^(bitDepth-1), 2^(bitDepth-1)-1, vy)
- where Clip3(x, y, z) clamps the value of z to the interval [x, y]: Clip3(x, y, z) = x if z < x; y if z > y; and z otherwise.
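- The following sketch mirrors the two constraint methods above (a non-normative illustration):

```python
# Sketch of the two motion vector bit-width constraints described above.

def wrap_mv(v, bit_depth):
    """Method 1: discard overflow high-order bits (two's complement wrap)."""
    u = (v + (1 << bit_depth)) % (1 << bit_depth)
    return u - (1 << bit_depth) if u >= (1 << (bit_depth - 1)) else u

def clip3(x, y, z):
    """Clamp z to the interval [x, y]."""
    return x if z < x else y if z > y else z

def clip_mv(v, bit_depth):
    """Method 2: clip to [-2^(bitDepth-1), 2^(bitDepth-1) - 1]."""
    return clip3(-(1 << (bit_depth - 1)), (1 << (bit_depth - 1)) - 1, v)

assert wrap_mv(-32769, 16) == 32767   # matches the worked example above
assert clip_mv(-32769, 16) == -32768
```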
- FIG. 4 is a schematic block diagram of an inter prediction module 121 according to an embodiment of the present application.
- the inter prediction module 121 may include a motion estimation unit and a motion compensation unit. The relationship between PU and CU is different in different video compression codecs.
- the inter prediction module 121 may partition a current CU into a PU according to a plurality of partitioning modes. For example, the inter prediction module 121 may partition a current CU into a PU according to 2N ⁇ 2N, 2N ⁇ N, N ⁇ 2N, and N ⁇ N partition modes. In other embodiments, the current CU is the current PU, which is not limited.
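- For illustration, the PU dimensions produced by these partitioning modes can be sketched as follows (a 2N×2N CU is assumed; this is not a normative derivation):

```python
# Minimal sketch: PU sizes produced by the partitioning modes of a 2Nx2N CU.

def pu_sizes(cu_size, mode):
    """Return the (width, height) of each PU for a given partition mode."""
    n = cu_size // 2
    return {
        "2Nx2N": [(cu_size, cu_size)],   # one PU
        "2NxN":  [(cu_size, n)] * 2,     # two horizontal PUs
        "Nx2N":  [(n, cu_size)] * 2,     # two vertical PUs
        "NxN":   [(n, n)] * 4,           # four square PUs
    }[mode]

assert pu_sizes(16, "Nx2N") == [(8, 16), (8, 16)]
```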
- the inter prediction module 121 may perform integer motion estimation (IME) and then perform fractional motion estimation (FME) on each of the PUs.
- the inter prediction module 121 may search a reference block for a PU in one or more reference images. After the reference block for the PU is found, the inter prediction module 121 may generate a motion vector indicating the spatial displacement between the PU and the reference block for the PU with integer precision.
- the inter prediction module 121 may refine, by performing FME on the PU, the motion vector generated by performing IME on the PU.
- a motion vector generated by performing FME on a PU may have sub-integer precision (eg, 1/2 pixel precision, 1/4 pixel precision, etc.).
- the inter prediction module 121 may use the motion vector for the PU to generate a predictive image block for the PU.
- the inter prediction module 121 may generate a list of candidate prediction motion vectors for the PU.
- the candidate prediction motion vector list may include one or more original candidate prediction motion vectors and one or more additional candidate prediction motion vectors derived from the original candidate prediction motion vectors.
- the inter prediction module 121 may select the candidate prediction motion vector from the candidate prediction motion vector list and generate a motion vector difference (MVD) for the PU.
- the MVD for a PU may indicate a difference between a motion vector indicated by a selected candidate prediction motion vector and a motion vector generated for the PU using IME and FME.
- the inter prediction module 121 may output a candidate prediction motion vector index that identifies the position of the selected candidate prediction motion vector in the candidate prediction motion vector list.
- the inter prediction module 121 may also output the MVD of the PU.
- a detailed implementation of the advanced motion vector prediction (AMVP) mode in the embodiment of the present application is described below with reference to FIG. 6.
- the inter prediction module 121 may also perform a merge operation on each of the PUs.
- the inter prediction module 121 may generate a list of candidate prediction motion vectors for the PU.
- the candidate prediction motion vector list for the PU may include one or more original candidate prediction motion vectors and one or more additional candidate prediction motion vectors derived from the original candidate prediction motion vectors.
- the original candidate prediction motion vector in the candidate prediction motion vector list may include one or more spatial candidate prediction motion vectors and temporal candidate prediction motion vectors.
- the spatial candidate prediction motion vector may indicate motion information of other PUs in the current image.
- the temporal candidate prediction motion vector may be based on motion information of a corresponding PU different from the current picture.
- the temporal candidate prediction motion vector may also be referred to as temporal motion vector prediction (TMVP).
- the inter prediction module 121 may select one of the candidate prediction motion vectors from the candidate prediction motion vector list. The inter prediction module 121 may then generate a predictive image block for the PU based on the reference block indicated by the motion information of the PU. In the merge mode, the motion information of the PU may be the same as the motion information indicated by the selected candidate prediction motion vector.
- FIG. 5, described below, illustrates an exemplary flowchart of the merge mode.
- in this embodiment of the present application, the original candidate prediction motion vectors may be directly included in the candidate prediction motion vector list, and a type of additional candidate prediction motion vector may be indicated through an identification bit, so as to control the length of the candidate prediction motion vector list. In particular, different types of additional candidate prediction motion vectors may be indicated by different identification bits, and a prediction motion vector is selected from the set of additional candidate prediction motion vectors indicated by an identification bit. The candidate prediction motion vectors indicated by an identification bit may be preset motion information offsets.
- the inter prediction module 121 may select a predictive image block from among the predictive image blocks generated through the FME operation and the predictive image blocks generated through the merge operation. In some feasible implementations, the inter prediction module 121 may select a predictive image block for the PU based on a rate-distortion cost analysis of the predictive image blocks generated by the FME operation and the predictive image blocks generated by the merge operation.
- the inter prediction module 121 may select a partitioning mode for the current CU. In some embodiments, the inter prediction module 121 may select the partitioning mode for the current CU based on a rate-distortion cost analysis of the selected predictive image blocks of the PUs generated by partitioning the current CU according to each of the partitioning modes.
- the inter prediction module 121 may output a predictive image block associated with a PU belonging to the selected partition mode to the residual generation module 102.
- the inter prediction module 121 may output a syntax element indicating motion information of a PU belonging to the selected partition mode to the entropy encoding module.
- the inter prediction module 121 includes IME modules 180A to 180N (collectively referred to as "IME module 180"), FME modules 182A to 182N (collectively referred to as "FME module 182"), merge modules 184A to 184N (collectively referred to as "merge module 184"), PU mode decision modules 186A to 186N (collectively referred to as "PU mode decision module 186"), and a CU mode decision module 188 (which may also perform a mode decision process from CTU to CU).
- the IME module 180, the FME module 182, and the merge module 184 may perform an IME operation, an FME operation, and a merge operation on a PU of the current CU.
- the inter prediction module 121 is illustrated in the schematic diagram of FIG. 4 as including a separate IME module 180, an FME module 182, and a merging module 184 for each PU of each partitioning mode of the CU. In other feasible implementations, the inter prediction module 121 does not include a separate IME module 180, an FME module 182, and a merge module 184 for each PU of each partitioning mode of the CU.
- the IME module 180A, the FME module 182A, and the merge module 184A may perform IME operations, FME operations, and merge operations on a PU generated by dividing a CU according to a 2N ⁇ 2N split mode.
- the PU mode decision module 186A may select one of the predictive image blocks generated by the IME module 180A, the FME module 182A, and the merge module 184A.
- the IME module 180B, the FME module 182B, and the merge module 184B may perform an IME operation, an FME operation, and a merge operation on a left PU generated by dividing a CU according to an N ⁇ 2N division mode.
- the PU mode decision module 186B may select one of the predictive image blocks generated by the IME module 180B, the FME module 182B, and the merge module 184B.
- the IME module 180C, the FME module 182C, and the merge module 184C may perform an IME operation, an FME operation, and a merge operation on a right PU generated by dividing a CU according to an N ⁇ 2N division mode.
- the PU mode decision module 186C may select one of the predictive image blocks generated by the IME module 180C, the FME module 182C, and the merge module 184C.
- the IME module 180N, the FME module 182N, and the merge module 184N may perform an IME operation, an FME operation, and a merge operation on a lower right PU generated by dividing a CU according to an N×N division mode.
- the PU mode decision module 186N may select one of the predictive image blocks generated by the IME module 180N, the FME module 182N, and the merge module 184N.
- the PU mode decision module 186 may select a predictive image block based on a rate-distortion cost analysis of a plurality of possible predictive image blocks, selecting the predictive image block that provides the best rate-distortion cost for a given decoding situation. For example, for bandwidth-constrained applications, the PU mode decision module 186 may prefer predictive image blocks that increase the compression ratio, while for other applications, it may prefer predictive image blocks that increase the quality of the reconstructed video.
- the CU mode decision module 188 selects a partition mode for the current CU and outputs the predictive image blocks and motion information of the PUs belonging to the selected partition mode.
- FIG. 5 is an implementation flowchart of a merge mode performed by a video encoder (e.g., video encoder 20) in an embodiment of the present application.
- the merging operation 200 may include: 202: generate a candidate list for the current prediction unit; 204: generate predictive video blocks associated with candidates in the candidate list; 206: select a candidate from the candidate list; 208: output the selected candidate.
- the candidate refers to a candidate motion vector or candidate motion information.
- the video encoder may perform a merge operation different from the merge operation 200.
- the video encoder may perform a merge operation, where the video encoder performs more or fewer steps than the merge operation 200 or steps different from the merge operation 200.
- the video encoder may perform the steps of the merge operation 200 in a different order or in parallel.
- the encoder may also perform a merge operation 200 on a PU encoded in a skip mode.
- the video encoder may generate a list of candidate predicted motion vectors for the current PU (202).
- the video encoder may generate a list of candidate prediction motion vectors for the current PU in various ways.
- the video encoder may generate a list of candidate prediction motion vectors for the current PU according to one of the example techniques described below with respect to FIGS. 8-12.
- the candidate prediction motion vector list for the current PU includes at least one first candidate motion vector and at least one second candidate motion vector set identifier.
- the candidate prediction motion vector list for the current PU may include a temporal candidate prediction motion vector.
- the temporal candidate prediction motion vector may indicate motion information of a co-located PU in the time domain.
- a co-located PU may be spatially in the same position in the image frame as the current PU, but in a reference picture instead of the current picture.
- a reference picture that includes the co-located PU may be referred to as a related reference picture in this application.
- a reference image index of a related reference image may be referred to as a related reference image index in this application.
- the current image may be associated with one or more reference image lists (eg, list 0, list 1, etc.).
- the reference image index may indicate a reference image by indicating a position in a reference image list of the reference image.
- the current image may be associated with a combined reference image list.
- the related reference picture index is the reference picture index of the PU covering the reference index source position associated with the current PU.
- the reference index source location associated with the current PU is adjacent to the left of the current PU or above the current PU.
- in some feasible implementations, a PU is said to "cover" a specific location if the image block associated with the PU includes that location.
- the video encoder can use a zero reference picture index.
- the reference index source location associated with the current PU is within the current CU.
- in these examples, the video encoder may need to access the motion information of another PU of the current CU in order to determine the reference picture containing the co-located PU. Therefore, these video encoders may use the motion information (i.e., the reference picture index) of a PU belonging to the current CU to generate the temporal candidate prediction motion vector for the current PU. In other words, these video encoders may use the motion information of a PU belonging to the current CU to generate the temporal candidate prediction motion vector. Therefore, the video encoder may not be able to generate candidate prediction motion vector lists in parallel for the current PU and for the PU covering the reference index source position associated with the current PU.
- the video encoder may explicitly set the relevant reference picture index without referring to the reference picture index of any other PU. This may enable the video encoder to generate candidate prediction motion vector lists for the current PU and other PUs of the current CU in parallel. Because the video encoder explicitly sets the relevant reference picture index, the relevant reference picture index is not based on the motion information of any other PU of the current CU. In some feasible implementations where the video encoder explicitly sets the relevant reference picture index, the video encoder may always set the relevant reference picture index to a fixed, predefined preset reference picture index (eg, 0).
- the video encoder may generate the temporal candidate prediction motion vector based on the motion information of the co-located PU in the reference frame indicated by the preset reference picture index, and may include the temporal candidate prediction motion vector in the candidate prediction motion vector list of the current CU.
- the video encoder may explicitly signal the related reference picture index in a syntax structure (e.g., an image header, a slice header, an APS, or another syntax structure).
- the video encoder may signal the decoder to the relevant reference picture index for each LCU (ie, CTU), CU, PU, TU, or other type of sub-block. For example, the video encoder may signal that the relevant reference picture index for each PU of the CU is equal to "1".
- the relevant reference image index may be set implicitly rather than explicitly.
- the video encoder may use the motion information of PUs in the reference images indicated by the reference picture indexes of PUs covering locations outside the current CU to generate each temporal candidate prediction motion vector in the candidate prediction motion vector lists for the PUs of the current CU, even if these locations are not strictly adjacent to the current PU.
- the video encoder may generate predictive image blocks associated with the candidate prediction motion vectors in the candidate prediction motion vector list (204).
- the video encoder may generate the predictive image block associated with a candidate prediction motion vector by determining the motion information of the current PU based on the motion information indicated by that candidate prediction motion vector, and then generating the predictive image block based on one or more reference blocks indicated by the motion information of the current PU.
- the video encoder may then select one of the candidate prediction motion vectors from the candidate prediction motion vector list (206).
- the video encoder can select candidate prediction motion vectors in various ways. For example, a video encoder may select one of the candidate prediction motion vectors based on a code rate-distortion cost analysis of each of the predictive image blocks associated with the candidate prediction motion vector.
- the video encoder may output a candidate prediction motion vector index (208).
- the candidate prediction motion vector index may indicate the position of the selected candidate prediction motion vector in the candidate prediction motion vector list.
- the candidate prediction motion vector index may be represented as "merge_idx".
- FIG. 6 is an implementation flowchart of an advanced motion vector prediction (AMVP) mode performed by a video encoder (e.g., video encoder 20) in an embodiment of the present application.
- the AMVP operation 210 may include: 211: generate one or more motion vectors for the current prediction unit; 212: generate a predictive video block for the current prediction unit; 213: generate a candidate list for the current prediction unit; 214: generate motion vector differences; 215: select a candidate from the candidate list; 216: output a reference picture index, a candidate index, and the motion vector difference of the selected candidate.
- the candidate refers to a candidate motion vector or candidate motion information.
- the video encoder may generate one or more motion vectors for the current PU (211).
- the video encoder may perform integer motion estimation and fractional motion estimation to generate motion vectors for the current PU.
- the current image may be associated with two reference image lists (List 0 and List 1).
- the video encoder may generate a list 0 motion vector or a list 1 motion vector for the current PU.
- the list 0 motion vector may indicate a spatial displacement between an image block of the current PU and a reference block in a reference image in list 0.
- the list 1 motion vector may indicate a spatial displacement between an image block of the current PU and a reference block in a reference image in list 1.
- the video encoder may generate a list 0 motion vector and a list 1 motion vector for the current PU.
- the video encoder may generate predictive image blocks for the current PU (212).
- the video encoder may generate predictive image blocks for the current PU based on one or more reference blocks indicated by one or more motion vectors for the current PU.
- the video encoder may generate a list of candidate predicted motion vectors for the current PU (213).
- the video encoder may generate the candidate prediction motion vector list for the current PU in various ways.
- the video encoder may generate a list of candidate prediction motion vectors for the current PU according to one or more of the possible implementations described below with respect to FIGS. 8 to 12.
- the list of candidate prediction motion vectors may be limited to two candidate prediction motion vectors.
- the list of candidate prediction motion vectors may include more candidate prediction motion vectors (eg, five candidate prediction motion vectors).
- the video encoder may generate one or more motion vector differences (MVD) for each candidate prediction motion vector in the list of candidate prediction motion vectors (214).
- the video encoder may generate a motion vector difference for the candidate prediction motion vector by determining a difference between the motion vector indicated by the candidate prediction motion vector and a corresponding motion vector of the current PU.
- if the current PU is unidirectionally predicted, the video encoder may generate a single MVD for each candidate prediction motion vector. If the current PU is bidirectionally predicted, the video encoder may generate two MVDs for each candidate prediction motion vector.
- the first MVD may indicate a difference between the motion vector of the candidate prediction motion vector and the list 0 motion vector of the current PU.
- the second MVD may indicate a difference between the motion vector of the candidate prediction motion vector and the list 1 motion vector of the current PU.
- the video encoder may select one or more of the candidate prediction motion vectors from the candidate prediction motion vector list (215).
- the video encoder may select one or more candidate prediction motion vectors in various ways. For example, a video encoder may select a candidate prediction motion vector with an associated motion vector that matches the motion vector to be encoded with minimal error, which may reduce the number of bits required to represent the motion vector difference for the candidate prediction motion vector.
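- A minimal sketch of this predictor choice follows; the bit-cost model is a hypothetical stand-in for real entropy coding, used only to illustrate the selection criterion:

```python
# Sketch: choose the AMVP candidate whose MVD is cheapest to code.
# The bit-cost model below is a hypothetical stand-in for entropy coding.

def mvd_bits(mvd):
    """Rough proxy: larger components need more bits."""
    return sum(abs(c).bit_length() + 1 for c in mvd)

def select_amvp_candidate(candidates, mv_target):
    """Pick the candidate predictor minimizing the cost of coding the MVD."""
    def cost(cand):
        mvd = (mv_target[0] - cand[0], mv_target[1] - cand[1])
        return mvd_bits(mvd)
    best_index = min(range(len(candidates)), key=lambda i: cost(candidates[i]))
    mvd = (mv_target[0] - candidates[best_index][0],
           mv_target[1] - candidates[best_index][1])
    return best_index, mvd   # index and MVD are both written to the bitstream
```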
- the video encoder may output one or more reference image indexes for the current PU, one or more candidate prediction motion vector indexes, and the one or more motion vector differences of the one or more selected candidate prediction motion vectors (216).
- if the current PU is unidirectionally predicted, the video encoder may output the reference picture index for list 0 ("ref_idx_l0") or the reference picture index for list 1 ("ref_idx_l1").
- the video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list.
- the video encoder may output a candidate prediction motion vector index ("mvp_l1_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list.
- the video encoder may also output the MVD of the list 0 motion vector or the list 1 motion vector for the current PU.
- if the current PU is bidirectionally predicted, the video encoder may output the reference picture index for list 0 ("ref_idx_l0") and the reference picture index for list 1 ("ref_idx_l1").
- the video encoder may also output a candidate prediction motion vector index ("mvp_l0_flag") indicating the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list.
- the video encoder may output a candidate prediction motion vector index ("mvp_l1_flag") indicating the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list.
- the video encoder may also output the MVD of the list 0 motion vector for the current PU and the MVD of the list 1 motion vector for the current PU.
- FIG. 7 is an implementation flowchart of motion compensation performed by a video decoder (such as video decoder 30) in an embodiment of the present application.
- the video decoder may receive an indication of the selected candidate prediction motion vector for the current PU (222). For example, the video decoder may receive a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within the candidate prediction motion vector list of the current PU.
- the video decoder may receive the first candidate prediction motion vector index and the second candidate prediction motion vector index.
- the first candidate prediction motion vector index indicates the position of the selected candidate prediction motion vector for the list 0 motion vector of the current PU in the candidate prediction motion vector list.
- the second candidate prediction motion vector index indicates the position of the selected candidate prediction motion vector for the list 1 motion vector of the current PU in the candidate prediction motion vector list.
- a single syntax element may be used to identify two candidate prediction motion vector indexes.
- in this embodiment of the present application, the video decoder may receive a candidate prediction motion vector index indicating the position of the selected candidate prediction motion vector within the candidate prediction motion vector list of the current PU, or may receive an identifier indicating the position in the candidate prediction motion vector list of the current PU of the class to which the selected candidate prediction motion vector belongs, together with an index indicating the position of the selected candidate prediction motion vector within that class.
- the video decoder may generate a list of candidate predicted motion vectors for the current PU (224).
- the video decoder may generate this candidate prediction motion vector list for the current PU in various ways.
- the video decoder may use the techniques described below with reference to FIGS. 8 to 12 to generate a list of candidate prediction motion vectors for the current PU.
- the video decoder may explicitly or implicitly set a reference image index identifying the reference image that includes the co-located PU, as described above with respect to FIG. 5.
- a type of candidate prediction motion vector may be indicated by an identification bit in the candidate prediction motion vector list to control the length of the candidate prediction motion vector list.
- the video decoder may determine the motion information of the current PU based on the motion information indicated by one or more selected candidate prediction motion vectors in the candidate prediction motion vector list for the current PU (225). For example, if the motion information of the current PU is encoded using the merge mode, the motion information of the current PU may be the same as the motion information indicated by the selected candidate prediction motion vector. If the motion information of the current PU is encoded using the AMVP mode, the video decoder may reconstruct one or more motion vectors of the current PU using the one or more motion vectors indicated by the selected candidate prediction motion vector and the one or more MVDs indicated in the code stream.
- the reference image index and prediction direction identifier of the current PU may be the same as the reference image index and prediction direction identifier of the one or more selected candidate prediction motion vectors.
- the video decoder may generate a predictive image block for the current PU based on one or more reference blocks indicated by the motion information of the current PU (226).
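- To make the merge/AMVP distinction in steps 225-226 concrete, the following is a minimal C++ sketch of how a decoder might turn the parsed candidate index and, in AMVP mode, the parsed MVD into the motion information used for motion compensation; the type and function names are illustrative assumptions, not part of any standard.

```cpp
#include <vector>

struct MotionVector { int x = 0; int y = 0; };

struct MotionInfo {
    MotionVector mv;   // list 0 motion vector (unidirectional case for brevity)
    int refIdx = 0;    // reference picture index
};

// Step 225: derive the motion information of the current PU from the selected
// candidate; in AMVP mode the candidate is only a predictor, so the signalled
// MVD is added before the motion vector is used for motion compensation (226).
MotionInfo reconstructMotionInfo(const std::vector<MotionInfo>& candList,
                                 int candIdx, bool mergeMode,
                                 const MotionVector& mvd) {
    MotionInfo info = candList[candIdx];
    if (!mergeMode) {
        info.mv.x += mvd.x;
        info.mv.y += mvd.y;
    }
    return info;
}
```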
- FIG. 8 is an exemplary schematic diagram of a coding unit (CU) and an adjacent position image block associated with the coding unit (CU) in the embodiment of the present application, illustrating CU250 and schematic candidate prediction motion vector positions 252A to 252E associated with CU250.
- This application may collectively refer to the candidate prediction motion vector positions 252A to 252E as the candidate prediction motion vector positions 252.
- the candidate prediction motion vector position 252 indicates a spatial candidate prediction motion vector in the same image as the CU 250.
- the candidate prediction motion vector position 252A is positioned to the left of CU250.
- the candidate prediction motion vector position 252B is positioned above the CU250.
- the candidate prediction motion vector position 252C is positioned at the upper right of CU250.
- the candidate prediction motion vector position 252D is positioned at the lower left of CU250.
- the candidate prediction motion vector position 252E is positioned at the upper left of the CU250.
- FIG. 8 is a schematic embodiment of a manner for providing a list of candidate prediction motion vectors that the inter prediction module 121 and the motion compensation module can generate. The embodiments will be explained below with reference to the inter prediction module 121, but it should be understood that the motion compensation module can implement the same technique and thus generate the same candidate prediction motion vector list.
- FIG. 9 is a flowchart of constructing a candidate prediction motion vector list according to an embodiment of the present application.
- the technique of FIG. 9 will be described with reference to a list including five candidate prediction motion vectors, but the techniques described herein may also be used with lists of other sizes.
- the five candidate prediction motion vectors may each have an index (eg, 0 to 4).
- the technique of FIG. 9 will be described with reference to a general video decoder.
- a general video decoder may be, for example, a video encoder (such as video encoder 20) or a video decoder (such as video decoder 30).
- the selected prediction motion vector list constructed based on the technology of the present application is described in detail in the following embodiments, and will not be repeated here.
- the video decoder first considers four spatial candidate prediction motion vectors (902).
- the four spatial candidate prediction motion vectors may include candidate prediction motion vector positions 252A, 252B, 252C, and 252D.
- the four spatial candidate prediction motion vectors correspond to motion information of four PUs in the same image as the current CU (for example, CU250).
- the video decoder may consider the four spatial candidate prediction motion vectors in the list in a particular order. For example, the candidate prediction motion vector position 252A may be considered first. If the candidate prediction motion vector position 252A is available, the candidate prediction motion vector position 252A may be assigned to index 0.
- if the candidate prediction motion vector position 252A is not available, the video decoder may not include the candidate prediction motion vector position 252A in the candidate prediction motion vector list.
- Candidate prediction motion vector positions may be unavailable for various reasons. For example, if the candidate prediction motion vector position is not within the current image, the candidate prediction motion vector position may not be available. In another feasible implementation, if the candidate prediction motion vector position is intra-predicted, the candidate prediction motion vector position may not be available. In another feasible implementation, if the candidate prediction motion vector position is in a slice different from the current CU, the candidate prediction motion vector position may not be available.
- the video decoder may next consider the candidate prediction motion vector position 252B. If the candidate prediction motion vector position 252B is available and different from the candidate prediction motion vector position 252A, the video decoder may add the candidate prediction motion vector position 252B to the candidate prediction motion vector list.
- the terms "same” and “different” refer to motion information associated with candidate predicted motion vector locations. Therefore, two candidate prediction motion vector positions are considered the same if they have the same motion information, and are considered different if they have different motion information. If the candidate prediction motion vector position 252A is not available, the video decoder may assign the candidate prediction motion vector position 252B to index 0.
- if the candidate prediction motion vector position 252A is available, the video decoder may assign the candidate prediction motion vector position 252B to index 1. If the candidate prediction motion vector position 252B is not available or is the same as the candidate prediction motion vector position 252A, the video decoder skips the candidate prediction motion vector position 252B and does not include it in the candidate prediction motion vector list.
- the candidate prediction motion vector position 252C is similarly considered by the video decoder for inclusion in the list. If the candidate prediction motion vector position 252C is available and not the same as the candidate prediction motion vector positions 252B and 252A, the video decoder assigns the candidate prediction motion vector position 252C to the next available index. If the candidate prediction motion vector position 252C is unavailable or is the same as at least one of the candidate prediction motion vector positions 252A and 252B, the video decoder does not include the candidate prediction motion vector position 252C in the candidate prediction motion vector list. Next, the video decoder considers the candidate prediction motion vector position 252D.
- if the candidate prediction motion vector position 252D is available and not the same as the candidate prediction motion vector positions 252A, 252B, and 252C, the video decoder assigns the candidate prediction motion vector position 252D to the next available index. If the candidate prediction motion vector position 252D is unavailable or is the same as at least one of the candidate prediction motion vector positions 252A, 252B, and 252C, the video decoder does not include the candidate prediction motion vector position 252D in the candidate prediction motion vector list.
- this example describes considering the candidate prediction motion vectors 252A to 252D individually for inclusion in the candidate prediction motion vector list, but in some embodiments all candidate prediction motion vectors 252A to 252D may first be added to the candidate prediction motion vector list, with duplicates removed from the list afterwards.
- the candidate prediction motion vector list may include four spatial candidate prediction motion vectors or the list may include less than four spatial candidate prediction motion vectors. If the list includes four spatial candidate prediction motion vectors (904, Yes), the video decoder considers temporal candidate prediction motion vectors (906).
- the temporal candidate prediction motion vector may correspond to motion information of a co-located PU of a picture different from the current picture. If a temporal candidate prediction motion vector is available and different from the first four spatial candidate prediction motion vectors, the video decoder assigns the temporal candidate prediction motion vector to index 4.
- if the temporal candidate prediction motion vector is unavailable or is the same as one of the first four spatial candidate prediction motion vectors, the video decoder does not include the temporal candidate prediction motion vector in the candidate prediction motion vector list. Therefore, after the video decoder considers the temporal candidate prediction motion vector (906), the candidate prediction motion vector list may include five candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902 and the temporal candidate prediction motion vector) or may include four candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902). If the candidate prediction motion vector list includes five candidate prediction motion vectors (908, Yes), the video decoder completes building the list.
- if the candidate prediction motion vector list includes four candidate prediction motion vectors (908, No), the video decoder may consider a fifth spatial candidate prediction motion vector (910).
- the fifth spatial candidate prediction motion vector may, for example, correspond to the candidate prediction motion vector position 252E. If the candidate prediction motion vector at position 252E is available and different from the candidate prediction motion vectors at positions 252A, 252B, 252C, and 252D, the video decoder may add a fifth spatial candidate prediction motion vector to the candidate prediction motion vector list.
- the fifth spatial candidate prediction motion vector is assigned to index 4.
- if the candidate prediction motion vector at position 252E is unavailable or is the same as one of the candidate prediction motion vectors at positions 252A, 252B, 252C, and 252D, the video decoder may not include the candidate prediction motion vector at position 252E in the candidate prediction motion vector list. So after considering the fifth spatial candidate prediction motion vector (910), the list may include five candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902 and the fifth spatial candidate prediction motion vector considered at block 910) or may include four candidate prediction motion vectors (the first four spatial candidate prediction motion vectors considered at block 902).
- if the candidate prediction motion vector list includes five candidate prediction motion vectors (912, Yes), the video decoder finishes generating the candidate prediction motion vector list. If the candidate prediction motion vector list includes four candidate prediction motion vectors (912, No), the video decoder adds artificially generated candidate prediction motion vectors (914) until the list includes five candidate prediction motion vectors (916, Yes).
- if the list includes fewer than four spatial candidate prediction motion vectors (904, No), the video decoder may consider the fifth spatial candidate prediction motion vector (918).
- the fifth spatial candidate prediction motion vector may, for example, correspond to the candidate prediction motion vector position 252E. If the candidate prediction motion vector at position 252E is available and different from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may add the fifth spatial candidate prediction motion vector to the candidate prediction motion vector list, with the fifth spatial candidate prediction motion vector assigned to the next available index.
- if the candidate prediction motion vector at position 252E is unavailable or is the same as one of the candidate prediction motion vectors already included, the video decoder may not include the candidate prediction motion vector at position 252E in the candidate prediction motion vector list.
- the video decoder may then consider the temporal candidate prediction motion vector (920). If the temporal candidate prediction motion vector is available and different from the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may add the temporal candidate prediction motion vector to the candidate prediction motion vector list, with the temporal candidate prediction motion vector assigned to the next available index. If the temporal candidate prediction motion vector is unavailable or is the same as one of the candidate prediction motion vectors already included in the candidate prediction motion vector list, the video decoder may not include the temporal candidate prediction motion vector in the candidate prediction motion vector list.
- if the candidate prediction motion vector list includes five candidate prediction motion vectors (922, Yes), the video decoder finishes generating the candidate prediction motion vector list. If the candidate prediction motion vector list includes fewer than five candidate prediction motion vectors (922, No), the video decoder adds artificially generated candidate prediction motion vectors (914) until the list includes five candidate prediction motion vectors (916, Yes).
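- As a hedged illustration of the FIG. 9 flow above, the following C++ sketch builds a five-entry list from the spatial positions 252A-252E and the temporal candidate, applying the availability and pruning checks described; for brevity it uses one fixed insertion order rather than the two branches at block 904, and all names are illustrative assumptions.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

struct Cand {
    int mvx, mvy, refIdx;
    bool operator==(const Cand& o) const {
        return mvx == o.mvx && mvy == o.mvy && refIdx == o.refIdx;
    }
};

// Add a candidate only if it exists ("available"), the list is not full,
// and its motion information differs from every entry already in the list.
void addIfNew(std::vector<Cand>& list, const std::optional<Cand>& c,
              std::size_t maxSize = 5) {
    if (!c || list.size() >= maxSize) return;
    for (const Cand& m : list)
        if (m == *c) return;  // pruning: identical motion information
    list.push_back(*c);
}

std::vector<Cand> buildMergeList(const std::optional<Cand> spatial[5],
                                 const std::optional<Cand>& temporal) {
    std::vector<Cand> list;
    for (int i = 0; i < 4; ++i) addIfNew(list, spatial[i]); // 252A-252D (902)
    addIfNew(list, temporal);                               // co-located PU (906/920)
    addIfNew(list, spatial[4]);                             // 252E (910/918)
    while (list.size() < 5)                                 // artificial fill (914)
        list.push_back(Cand{0, 0, static_cast<int>(list.size())});
    return list;
}
```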
- an additional merge candidate prediction motion vector may be artificially generated after the spatial candidate prediction motion vectors and the temporal candidate prediction motion vector, so that the size of the merge candidate prediction motion vector list is fixed to a specified number of merge candidate prediction motion vectors (for example, five in the possible implementation of FIG. 9 above).
- the additional merge candidate prediction motion vectors may include an exemplary combined bi-predictive merge candidate prediction motion vector (candidate prediction motion vector 1), a scaled bi-predictive merge candidate prediction motion vector (candidate prediction motion vector 2), and a zero-vector merge/AMVP candidate prediction motion vector (candidate prediction motion vector 3).
- a spatial candidate prediction motion vector and a temporal candidate prediction motion vector may be directly included in the candidate prediction motion vector list, and an artificially generated additional merge candidate prediction motion vector is indicated in the candidate prediction motion vector list through an identification bit.
- FIG. 10 is an exemplary schematic diagram of adding a combined candidate motion vector to a merge mode candidate prediction motion vector list in an embodiment of the present application.
- the combined bi-directional predictive merge candidate prediction motion vector may be generated by combining the original merge candidate prediction motion vector.
- two candidate prediction motion vectors among the original candidate prediction motion vectors (one having mvL0_A and ref0, the other having mvL1_B and ref0) may be used to generate a bi-predictive merge candidate prediction motion vector.
- two candidate prediction motion vectors are included in the original merge candidate prediction motion vector list.
- the prediction type of one candidate prediction motion vector is List 0 unidirectional prediction
- the prediction type of the other candidate prediction motion vector is List 1 unidirectional prediction.
- mvL0_A and ref0 are picked from list 0
- mvL1_B and ref0 are picked from list 1
- a bi-predictive merge candidate prediction motion vector (which has mvL0_A and ref0 in list 0, and mvL1_B and ref0 in list 1) can be generated, and the video decoder can check whether it is different from the candidate prediction motion vectors already included in the candidate prediction motion vector list. If it is different, the video decoder may include the bi-predictive merge candidate prediction motion vector in the candidate prediction motion vector list.
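- A minimal sketch of the FIG. 10 combination, assuming one candidate carries only list 0 motion information and the other only list 1 motion information; the field and function names are illustrative.

```cpp
struct UniMv { int mvx = 0, mvy = 0, refIdx = 0; };

struct MergeCand {
    bool hasL0 = false, hasL1 = false;
    UniMv l0, l1;
};

// Combine the list 0 part of candidate a (e.g. mvL0_A, ref0) with the list 1
// part of candidate b (e.g. mvL1_B, ref0) into one bi-predictive candidate.
// The caller must still prune the result against the existing list.
MergeCand combineBiPredictive(const MergeCand& a, const MergeCand& b) {
    MergeCand bi;
    if (a.hasL0 && b.hasL1) {
        bi.hasL0 = bi.hasL1 = true;
        bi.l0 = a.l0;
        bi.l1 = b.l1;
    }
    return bi;
}
```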
- FIG. 11 is an exemplary schematic diagram of adding a scaled candidate motion vector to a merge mode candidate prediction motion vector list in an embodiment of the present application.
- the scaled bi-directional predictive merge candidate prediction motion vector may be generated by scaling the original merge candidate prediction motion vector.
- a candidate prediction motion vector (which may have mvL0_A and ref0 or mvL1_A and ref1) from the original candidate prediction motion vector may be used to generate a bidirectional predictive merge candidate prediction motion vector.
- two candidate prediction motion vectors are included in the original merge candidate prediction motion vector list.
- the prediction type of one candidate prediction motion vector is List 0 unidirectional prediction
- the prediction type of the other candidate prediction motion vector is List 1 unidirectional prediction.
- mvL0_A and ref0 may be picked from list 0, and ref0 may be copied to the reference index ref0′ in list 1. Then, mvL0′_A may be calculated by scaling mvL0_A with ref0 and ref0′. The scaling may depend on the POC distance.
- a bi-directional predictive merge candidate prediction motion vector (which has mvL0_A and ref0 in list 0 and mvL0'_A and ref0 'in list 1) can be generated and checked if it is a duplicate. If it is not duplicate, it can be added to the merge candidate prediction motion vector list.
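- The POC-distance scaling mentioned above can be sketched as follows; real codecs use clipped fixed-point arithmetic (for example the HEVC tb/td weighting) rather than this plain integer division, so treat this as an illustration of the ratio only, with illustrative names.

```cpp
// Scale one motion vector component from the distance (pocCur - pocRef) to the
// distance (pocCur - pocRefPrime), as when deriving mvL0'_A from mvL0_A.
int scaleMvComponent(int mv, int pocCur, int pocRef, int pocRefPrime) {
    int distOrig   = pocCur - pocRef;       // distance to ref0
    int distScaled = pocCur - pocRefPrime;  // distance to the copied ref0'
    if (distOrig == 0) return mv;           // guard against division by zero
    return mv * distScaled / distOrig;
}
```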
- FIG. 12 is an exemplary schematic diagram of adding a zero motion vector to a merge mode candidate prediction motion vector list in an embodiment of the present application.
- the zero vector merge candidate prediction motion vector may be generated by combining the zero vector with a reference index that can be referred to. If the zero vector candidate prediction motion vector is not duplicated, it can be added to the merge candidate prediction motion vector list. For each generated merge candidate prediction motion vector, the motion information may be compared with the motion information of the previous candidate prediction motion vector in the list.
- the pruning operation may include comparing one or more new candidate prediction motion vectors with the candidate prediction motion vectors already in the candidate prediction motion vector list, and not adding a new candidate prediction motion vector that duplicates a candidate prediction motion vector already in the list.
- the pruning operation may include adding one or more new candidate prediction motion vectors to a list of candidate prediction motion vectors and removing duplicate candidate prediction motion vectors from the list later.
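- The two pruning variants above can be sketched in a few lines of C++: `pruneBeforeInsert` implements the first variant (compare before adding) and `dedupAfterInsert` the second (append first, remove duplicates later). The names are illustrative, and the candidate type is assumed to support equality comparison.

```cpp
#include <algorithm>
#include <vector>

// Variant 1: only insert a new candidate if it is not already in the list.
template <typename Cand>
void pruneBeforeInsert(std::vector<Cand>& list, const Cand& cand) {
    if (std::find(list.begin(), list.end(), cand) == list.end())
        list.push_back(cand);
}

// Variant 2: append freely, then remove duplicates in one pass, keeping the
// first occurrence of each candidate (order is preserved).
template <typename Cand>
void dedupAfterInsert(std::vector<Cand>& list) {
    std::vector<Cand> unique;
    for (const Cand& c : list)
        if (std::find(unique.begin(), unique.end(), c) == unique.end())
            unique.push_back(c);
    list = std::move(unique);
}
```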
- a newly generated candidate prediction motion vector may be treated as a type of candidate motion vector, and the newly generated candidate prediction motion vectors are indicated by an identification bit in the original candidate prediction motion vector list.
- the code stream includes an identifier 1 indicating the category of the newly generated candidate prediction motion vectors, and an identifier 2 indicating the position of the selected candidate motion vector within the category of newly generated candidate prediction motion vectors.
- the selected candidate motion vector is determined from the candidate prediction motion vector list according to the identifier 1 and the identifier 2 and a subsequent decoding process is performed.
- the spatial candidate prediction positions are exemplified above by the five positions 252A to 252E shown in FIG. 8, that is, positions adjacent to the image block to be processed. The spatial candidate prediction positions may further include, for example, positions that are within a preset distance from the image block to be processed but are not adjacent to it. Exemplarily, such positions may be shown as 252F to 252J in FIG. 13.
- FIG. 13 is an exemplary schematic diagram of a coding unit and the adjacent-position image blocks associated with it in an embodiment of the present application. Positions in image blocks that are in the same image frame as the image block to be processed, that have been reconstructed by the time the image block to be processed is processed, and that are not adjacent to the image block to be processed fall within the range of such positions.
- This type of location may be referred to as a non-adjacent image block in the spatial domain, and the first non-adjacent image block, the second non-adjacent image block, and the third non-adjacent image block in the spatial domain may be available.
- the physical meaning of "available" is as described above and is not repeated here.
- the candidate prediction motion mode list is checked and constructed in the following order. It should be understood that the check includes the "availability" check mentioned above; the processes of checking and pruning are not repeated here.
- the candidate prediction mode list includes: the motion vector of the 252A-position image block, the motion vector of the 252B-position image block, the motion vector of the 252C-position image block, the motion vector of the 252D-position image block, a motion vector obtained by the alternative temporal motion vector prediction (ATMVP) technique, the motion vector of the 252E-position image block, and a motion vector obtained by the spatio-temporal motion vector prediction (STMVP) technique.
- the ATMVP and STMVP techniques are detailed in sections 2.3.1.1 and 2.3.1.2 of JVET-G1001-v1, which is incorporated herein by reference in its entirety and is not repeated here.
- the candidate prediction mode list includes the above seven predicted motion vectors. The number of predicted motion vectors included in the candidate prediction mode list may also be less than 7; for example, the first five may be taken to form the candidate prediction mode list. The motion vectors constructed in the feasible embodiments described in FIGS. 10 to 12 may also be added to the candidate prediction mode list so that it contains more predicted motion vectors.
- the motion vectors of the first spatial-domain non-adjacent image block, the second spatial-domain non-adjacent image block, and the third spatial-domain non-adjacent image block may also be added to the candidate prediction mode list as predicted motion vectors of the image block to be processed.
- let the motion vectors of the 252A-, 252B-, 252C-, and 252D-position image blocks, the motion vector obtained by the ATMVP technique, the motion vector of the 252E-position image block, and the motion vector obtained by the STMVP technique be denoted MVL, MVU, MVUR, MVDL, MVA, MVUL, and MVS, respectively, and let the motion vectors of the first, second, and third spatial-domain non-adjacent image blocks be MV0, MV1, and MV2, respectively. The candidate prediction motion vector list may then be checked and constructed in one of the following orders:
- Example 1: MVL, MVU, MVUR, MVDL, MV0, MV1, MV2, MVA, MVUL, MVS;
- Example 2: MVL, MVU, MVUR, MVDL, MVA, MV0, MV1, MV2, MVUL, MVS;
- Example 3: MVL, MVU, MVUR, MVDL, MVA, MVUL, MV0, MV1, MV2, MVS;
- Example 4: MVL, MVU, MVUR, MVDL, MVA, MVUL, MVS, MV0, MV1, MV2;
- Example 5: MVL, MVU, MVUR, MVDL, MVA, MV0, MVUL, MV1, MVS, MV2;
- Example 6: MVL, MVU, MVUR, MVDL, MVA, MV0, MVUL, MV1, MV2, MVS;
- Example 7: MVL, MVU, MVUR, MVDL, MVA, MVUL, MV0, MV1, MV2, MVS;
- these candidate predicted motion vectors may be used in the Merge mode or AMVP mode described above, or in other prediction modes that obtain the predicted motion vector of the image block to be processed; they may be used at the encoding end, and may also be used at the decoding end consistently with the corresponding encoding end, without limitation.
- the number of candidate predicted motion vectors in the candidate prediction motion vector list is preset and is consistent at the encoding and decoding ends; the specific number is not limited.
- examples 1 to 7 give examples of the composition of several feasible candidate prediction motion vector lists. Based on the motion vectors of non-contiguous image blocks in the spatial domain, there may be other composition methods of candidate prediction motion vector lists. The arrangement of candidate prediction motion vectors in the list is not limited.
- This embodiment of the present application provides another method for constructing a candidate prediction motion vector list. Compared with the candidate prediction motion vector lists of Examples 1 to 7, this embodiment combines candidate predicted motion vectors determined in other embodiments with preset vector differences to form new candidate predicted motion vectors, which overcomes the shortcoming of low prediction accuracy of the predicted motion vector and improves coding efficiency.
- the candidate prediction motion vector list of the image block to be processed includes two sub-lists: a first motion vector set and a vector difference set.
- for the composition of the first motion vector set, reference may be made to the various compositions in the foregoing embodiments of the present invention.
- the vector difference set includes one or more preset vector differences.
- each vector difference in the vector difference set may be added to an original target motion vector determined from the first motion vector set, and the sums of the vector differences and the original target motion vector form a new motion vector set.
- the candidate prediction motion vector list shown in FIG. 14A may include the vector difference set as a subset, where the vector difference set is indicated in the candidate prediction motion vector list by an identification bit (the "MV vector difference set" entry in the figure).
- each vector difference is indicated by an index in the vector difference set
- a candidate prediction motion vector list constructed in this way is shown in FIG. 14B.
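- A minimal sketch of the vector difference sub-list follows, assuming the four preset differences used in the later examples ((1, 0), (0, -1), (-1, 0), (0, 1)); the identification bit entry in the list stands for this whole set, and an index within the set selects one difference. Names and the concrete difference values are illustrative.

```cpp
struct Mv { int x, y; };

// Preset vector difference set (an assumed example set).
const Mv kPresetDiffs[4] = { {1, 0}, {0, -1}, {-1, 0}, {0, 1} };

// Derive a new candidate motion vector from the original target motion vector
// selected out of the first motion vector set and the parsed difference index.
Mv applyVectorDifference(const Mv& base, int diffIdx) {
    return { base.x + kPresetDiffs[diffIdx].x,
             base.y + kPresetDiffs[diffIdx].y };
}
```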
- the manner, provided by the technology of the present application, of indicating a class of candidate motion vectors in the predicted motion vector list can be used in the Merge mode or AMVP mode described above, or in other prediction modes that obtain the predicted motion vector of the image block to be processed; it can be used at the encoding end, and can also be used at the decoding end consistently with the corresponding encoding end, without limitation.
- the number of candidate predicted motion vectors in the candidate prediction motion vector list is preset and is consistent at the encoding and decoding ends; the specific number is not limited.
- the method for decoding the predicted motion information provided in the embodiments of the present application will be described in detail with reference to the accompanying drawings.
- a type of candidate motion information is indicated in the list to control the length of the list, and the decoding method for predicted motion information provided in the embodiments of the present application is developed on this basis.
- the method is executed by a decoding device, which may be the video decoder 200 in the video decoding system 1 shown in FIG. 1, or may be a functional unit in the video decoder 200; this is not specifically limited in the present application.
- FIG. 15 is a schematic flowchart of an embodiment of the present application, which relates to a decoding method for predicting motion information, and specifically may include:
- S1501: the decoding device parses the code stream to obtain a first identifier.
- the code stream is sent by the encoding end after encoding the current image block.
- the first identifier indicates the position of the selected candidate motion information when the encoding end encodes the current image block.
- the first identifier is used by the decoding device to determine the selected candidate motion information and further predict the motion information of the image block to be processed.
- the first identifier may be a specific index of the selected candidate motion information. In this case, the first identifier may uniquely determine one candidate motion information.
- the first identifier may be an identifier of a category to which the selected candidate motion information belongs.
- in this case, the code stream further includes a fourth identifier to indicate the specific position of the selected candidate motion information within its own category.
- the first identifier may adopt a fixed-length encoding method.
- for example, the first identifier may be a 1-bit identifier, so the categories it can indicate are limited.
- the first identifier may adopt a variable length encoding method.
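- As an illustration of the fixed-length versus variable-length choice, the sketch below decodes a 1-bit fixed-length identifier and a truncated-unary identifier from a toy bit source; this is a generic entropy-coding example under assumed names, not the syntax of any particular standard.

```cpp
#include <cstddef>
#include <vector>

struct BitReader {
    std::vector<int> bits;  // toy bit source: 0/1 values
    std::size_t pos = 0;
    int readBit() { return pos < bits.size() ? bits[pos++] : 0; }
};

// Fixed-length: always costs 1 bit, can only distinguish two categories.
int decodeFixedLength1Bit(BitReader& br) { return br.readBit(); }

// Variable-length (truncated unary): small values get short codewords,
// which suits identifiers whose small values occur most often.
int decodeTruncatedUnary(BitReader& br, int maxVal) {
    int v = 0;
    while (v < maxVal && br.readBit() == 1) ++v;
    return v;
}
```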
- S1502: the decoding device determines a target element from the first candidate set according to the first identifier.
- the content of the first candidate set may include the following two possible implementations:
- Elements in the first candidate set include at least one first candidate motion information and at least one second candidate set, and elements in the second candidate set include multiple second candidate motion information.
- Elements in the first candidate set may include at least one first candidate motion information and a plurality of second candidate motion information, where the first candidate motion information includes the first motion information and the second candidate motion information includes a preset motion information offset. New motion information may be generated according to the first motion information and a preset motion information offset.
- the first candidate set may be a constructed candidate motion information list.
- at least one first candidate motion information is directly included, and a plurality of second candidate motion information is included in the first candidate set in the form of a second candidate set.
- the second candidate motion information and the first candidate motion information are different.
- the first candidate motion information and the second candidate motion information included in each second candidate set may be candidate motion information determined by using different MV prediction modes, or may be different types of candidate motion information. This embodiment of the present application does not specifically limit this.
- the first candidate motion information may be motion information acquired in a Merge manner
- the second candidate motion information may be motion information acquired in an Affine Merge manner.
- the first candidate motion information may be original candidate motion information
- the second candidate motion information may be motion information generated according to the original candidate motion information
- an identification bit in the list is used to indicate a candidate motion information set.
- the identification bit can be located at any position in the list, which is not specifically limited in the embodiment of the present application.
- the identification bit may be located at the end of the list as shown in FIG. 16A; or, the identification bit may be located at the middle of the list as shown in FIG. 16B.
- if the first identifier in the code stream indicates the identification bit, it is determined that the target element is the candidate motion information set indicated by the identification bit.
- the candidate motion information set indicated by the identification bit includes a plurality of second candidate motion information. For the candidate motion information set pointed to by the identification bit, one of the candidate motion information is selected as the target motion information according to the further identification (the second identification in S1504), and used to predict the motion information of the image block to be processed.
- an identification bit in the list is used to indicate a candidate motion information set.
- the identification bit can be located at any position in the list, which is not specifically limited in the embodiment of the present application.
- the identification bit may be located at the end of the list as shown in FIG. 16A; or, the identification bit may be located at the middle of the list as shown in FIG. 16B.
- if the first identifier in the code stream indicates the identification bit, it is determined that the target element is the plurality of second candidate motion information indicated by the identification bit.
- the second candidate motion information includes a preset motion information offset.
- one of the candidate motion information is selected according to the further identification (the second identification in S1504), and the target motion information is determined based on the selected second candidate motion information, To predict the motion information of the image block to be processed.
- in the Merge candidate list, more than one identification bit may be added, where each identification bit points to a specific candidate motion information set or to a plurality of second candidate motion information that include preset motion information offsets.
- if the first identifier in the code stream indicates a certain identification bit, it is determined that the target element is the candidate motion information in the candidate motion information set indicated by that identification bit, or the target motion information is determined according to one of the multiple candidate motion information (including preset motion information offsets) indicated by that identification bit.
- Figures 16A, 16B, and 16C introduce the identification (pointer) method into the Merge list to implement the introduction of candidates as a subset.
- in this way, the length of the candidate list is greatly reduced and the complexity of list construction is reduced, which helps simplify the hardware implementation.
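- A hedged sketch of how the first identifier could be resolved against a list that mixes direct candidates with identification bits (FIGS. 16A-16C): `std::variant` models the two element kinds, and whether a second identifier must be parsed (S1504) falls out of the element type. All names are illustrative assumptions.

```cpp
#include <variant>
#include <vector>

struct MotionInfo { int mvx, mvy, refIdx; };
struct SetPointer { int setId; };  // identification bit pointing at a candidate set
using Element = std::variant<MotionInfo, SetPointer>;

// S1502: look up the element at the position given by the first identifier.
// A MotionInfo element is first candidate motion information (go to S1503);
// a SetPointer element means a second identifier must be parsed (S1504).
bool needsSecondIdentifier(const std::vector<Element>& firstCandidateSet,
                           int firstIdentifier) {
    return std::holds_alternative<SetPointer>(firstCandidateSet[firstIdentifier]);
}
```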
- the first candidate motion information may include motion information of spatially adjacent image blocks of the image block to be processed. It should be noted that the definition of the motion information of the adjacent image blocks in the spatial domain has been described in the foregoing, and is not repeated here.
- the second candidate motion information may include motion information of a spatial domain non-adjacent image block of the image block to be processed. It should be noted that the definition of motion information of non-adjacent image blocks in the spatial domain has been described in the foregoing, and is not repeated here.
- the method for acquiring the first motion information may be selected according to actual requirements, which is not specifically limited in the embodiment of the present application.
- the value of the preset motion information offset used to obtain the second motion information may be a fixed value or a value selected from a set.
- neither the content nor the form of the preset motion information offset is specifically limited in this embodiment of the present application.
- optionally, the first candidate motion information includes the first motion information, the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets includes at least one third candidate set and at least one fourth candidate set. The elements of the third candidate set include motion information of a plurality of spatial-domain non-adjacent image blocks of the image block to be processed, and the elements of the fourth candidate set include a plurality of pieces of motion information obtained based on the first motion information and preset motion information offsets.
- optionally, the at least one second candidate set is a plurality of second candidate sets, and the plurality of second candidate sets includes at least one fifth candidate set and at least one sixth candidate set. The elements of the fifth candidate set include motion information of a plurality of spatial-domain non-adjacent image blocks of the image block to be processed, and the elements of the sixth candidate set include a plurality of preset motion information offsets.
- the coding codeword used to identify the first motion information is the shortest.
- the first motion information does not include motion information obtained according to the ATMVP mode.
- the first identifier may be an index in the first candidate set or an identifier classified in the motion information.
- S1502 may be implemented in the following two cases:
- Case 1: the first identifier is an index in the first candidate set.
- the decoding device in S1502 may determine the element at the position indicated by the first identifier in the first candidate set as the target element. Since the first candidate set includes at least one first candidate motion information and at least one second candidate set, the target element determined according to the first identifier may be first candidate motion information or a second candidate set, depending on the content arranged at the position indicated by the first identifier.
- alternatively, the decoding device in S1502 may determine the element at the position indicated by the first identifier in the first candidate set as the target element. Since the first candidate set includes at least one first candidate motion information and a plurality of second candidate motion information, the target element determined according to the first identifier may be first candidate motion information or may be information obtained based on a plurality of second candidate motion information, depending on the content arranged at the position indicated by the first identifier.
- Case 2: the first identifier is an identifier of a candidate motion information category.
- the decoding device in S1502 determines the classification to which the target element belongs according to the first identifier.
- the decoding device parses the bitstream to obtain a fourth identifier, the fourth identifier indicates a specific position of the target element in its classification, and uniquely determines the target element in its classification according to the fourth identifier. Specifically, if the first identifier indicates that the target element belongs to the classification of the first candidate motion information, one first candidate motion information is determined as the target element among the at least one first candidate motion information according to the fourth identifier. If the first identifier indicates that the target element belongs to a certain category of the second candidate motion information, a second candidate set or a second candidate motion information as the target element is determined according to the fourth identifier.
- the first candidate motion information is Merge motion information
- the first candidate set includes two second candidate sets
- the second candidate motion information in one second candidate set is Affine Merge motion information of the first type.
- the second candidate motion information in another second candidate set is Affine Merge motion information of the second type.
- a configuration identifier of 0 indicates Merge motion information
- an identifier of 1 indicates Affine Merge motion information.
- if the first identifier obtained by the decoding device by parsing the code stream in S1501 is 1, the decoding device parses the code stream in S1502 to obtain the fourth identifier, and determines, according to the fourth identifier, one second candidate set among the two second candidate sets as the target element.
- the first candidate motion information is Merge motion information
- the first candidate set includes two second candidate sets
- the second candidate motion information in one second candidate set is a preset motion information offset corresponding to the first type of Affine Merge motion information, and the second candidate motion information in the other second candidate set is a preset motion information offset corresponding to the second type of Affine Merge motion information.
- a configuration identifier of 0 indicates Merge motion information
- an identifier of 1 indicates Affine Merge motion information.
- if the first identifier obtained by the decoding device by parsing the code stream in S1501 is 0, one Merge motion information is determined as the target element among the at least one Merge motion information in the first candidate set. If the first identifier obtained in S1501 is 1, the decoding device parses the code stream in S1502 to obtain the fourth identifier, determines one second candidate set from the two second candidate sets according to the fourth identifier, and determines the target element based on one of the second candidate motion information in the determined second candidate set.
- if in S1502 the decoding device determines that the target element is first candidate motion information, it executes S1503; if in S1502 the decoding device determines that the target element is a second candidate set, or is to be obtained according to a plurality of second candidate motion information, it executes S1504.
- the target motion information is used to predict the motion information of the image block to be processed.
- that the target motion information is used to predict the motion information of the image block to be processed can be specifically implemented as: using the target motion information as the motion information of the image block to be processed; or using the target motion information as the predicted motion information of the image block to be processed.
- a specific implementation of selecting target motion information to predict motion information of an image block to be processed may be selected according to actual requirements, which is not specifically limited here.
- in S1504, the code stream is parsed to obtain a second identifier, and the target motion information is determined based on one of the plurality of second candidate motion information. This may be specifically implemented as: parsing the code stream to obtain the second identifier, and determining the target motion information from the plurality of second candidate motion information according to the second identifier.
- the second identifier may adopt a fixed-length encoding method.
- for example, the second identifier may be a 1-bit identifier, so the categories it can indicate are limited.
- the second identifier may adopt a variable-length encoding method.
- the second identifier may be a multi-bit identifier.
- determining the target motion information may be achieved by one of the following feasible implementations, but is not limited to them.
- when the first candidate motion information includes the first motion information, the second candidate motion information includes the second motion information, where the second motion information is obtained based on the first motion information and a preset motion information offset.
- the second identifier may be the specific position of the target motion information in the second candidate set. In this case, the decoding device in S1504 determining the target motion information from the plurality of second candidate motion information according to the second identifier may be implemented as: determining the second candidate motion information at the position indicated by the second identifier, in the second candidate set serving as the target element, as the target motion information.
- alternatively, the second identifier indicates the specific position of the target offset in the second candidate set. The decoding device in S1504 determining the target motion information from the plurality of second candidate motion information according to the second identifier may be specifically implemented as: determining the target offset from the plurality of preset motion information offsets according to the second identifier, and determining the target motion information based on the first motion information and the target offset.
- the decoding method for predicted motion information may further include: multiplying the plurality of preset motion information offsets by a preset coefficient to obtain a plurality of adjusted motion information offsets. Correspondingly, determining the target offset from the plurality of preset motion information offsets according to the second identifier includes: determining the target offset from the plurality of adjusted motion information offsets according to the second identifier.
- Alternatively, determining the target motion information based on one of the plurality of second candidate motion information may be specifically implemented as: determining a motion information offset from the plurality of preset motion information offsets according to the second identifier and multiplying it by the preset coefficient to obtain the target offset; and determining the target motion information based on the first motion information and the target offset.
- the preset coefficient may be a fixed coefficient configured in the decoding device, or may be a coefficient carried in a code stream, which is not specifically limited in this embodiment of the present application.
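- The two equivalent orders described above (scale all offsets then index, or index then scale) both reduce to base + offset × coefficient; the sketch below takes the second form. The coefficient may come from a device configuration or be carried in the code stream; the names are illustrative.

```cpp
struct Mv { int x, y; };

// S1504 with a preset coefficient: pick the offset indicated by the second
// identifier, scale it, and add it to the first motion information.
Mv deriveTargetMv(const Mv& base, const Mv presetOffsets[], int secondId,
                  int coeff) {
    const Mv& off = presetOffsets[secondId];
    return { base.x + off.x * coeff, base.y + off.y * coeff };
}
```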
- optionally, the decoding method for predicted motion information provided in this application may further include S1505: parsing the code stream to obtain a third identifier, where the third identifier includes the preset coefficient.
- the elements in the first candidate set include the first candidate motion information and at least one second candidate set, or the elements in the first candidate set include the first candidate motion information And multiple second candidate motion information.
- in this way, a type of candidate motion information set can be added as an element to the first candidate set; compared with directly adding the candidate motion information to the first candidate set, the length of the first candidate set is greatly shortened.
- when the first candidate set is a candidate motion information list for inter prediction, even if more candidates are introduced, the length of the candidate motion information list can be well controlled, which facilitates the checking process and the hardware implementation.
- the candidate motion information corresponding to the first index 0-5 includes a motion vector and a reference image
- the first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to the index 0 and a preset motion vector offset.
- the candidate motion information corresponding to the first index 0 is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
- the preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1).
- if the first index value obtained by decoding is 6, it indicates that the motion information used by the current image block is new motion information generated based on the candidate motion information corresponding to the first index 0 and a preset motion vector offset, and decoding then continues to obtain the second index value.
- the candidate motion information corresponding to the first index 0-5 includes a motion vector and a reference image
- the first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to the first index 0 and a preset motion vector offset.
- the motion information of the candidate corresponding to the first index 0 is bidirectional prediction: the forward motion vector is (2, -3) with reference frame POC 2, and the backward motion vector is (-2, -1).
- the preset motion vector offsets are (1,0), (0, -1), (-1, 0), (0, 1).
- if the first index value obtained by decoding is 6, it indicates that the motion information used by the current image block is new motion information generated based on the candidate motion information corresponding to the first index 0 and a preset motion vector offset, and decoding then continues to obtain the second index value.
- the motion information of the current image block is bidirectional prediction, and when the current frame POC is 3, the forward and backward reference frame POCs lie on opposite sides of the current frame POC.
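- The following runnable snippet works through the unidirectional example above (first index 0: forward MV (2, -3), reference POC 2) and prints the four offset candidates the second index chooses among. For the bidirectional case just described, the source does not spell out how the offset is applied to the two lists; mirroring the offset on the backward motion vector when the references lie on opposite sides of the current frame is one plausible scheme, noted here purely as an assumption.

```cpp
#include <cstdio>

int main() {
    const int base[2] = {2, -3};                              // forward MV, POC 2
    const int off[4][2] = {{1, 0}, {0, -1}, {-1, 0}, {0, 1}}; // preset offsets
    for (int i = 0; i < 4; ++i)                               // second index picks one
        std::printf("second index %d -> MV (%d, %d)\n",
                    i, base[0] + off[i][0], base[1] + off[i][1]);
    // Assumed bidirectional handling (current POC 3, refs at POC 2 and 4):
    // forward (2,-3) + (1,0) = (3,-3); backward (-2,-1) - (1,0) = (-3,-1).
    return 0;
}
```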
- the candidate motion information corresponding to the first indexes 0-5 includes a motion vector and a reference image. It is assumed that the candidate motion information indicated by the first index 0 is composed of sub-block motion information, and that the candidate motion information corresponding to the first index 1 is not composed of sub-block motion information: it is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2. The first index 6 corresponds to new motion information generated based on the candidate motion information corresponding to the first index 1 and a preset motion vector offset; the preset motion vector offsets are (1, 0), (0, -1), (-1, 0), (0, 1).
- if the first index value obtained by decoding is 6, it indicates that the motion information used by the current image block is new motion information generated based on the candidate motion information corresponding to the first index 1 and a preset motion vector offset, and decoding then continues.
- let the maximum length of the Merge candidate list be 7
- the first indexes 0-6 indicate the candidate positions in the Merge list.
- the first index 6 indicates that the current block uses the motion information of the non-adjacent spatial candidate as the reference motion information of the current block.
- let the size of the non-adjacent spatial-domain candidate set be 4; the non-adjacent spatial-domain candidate set puts the available non-adjacent spatial-domain candidates into the set in a preset detection order. Let the non-adjacent spatial-domain candidate motion information in the set be as follows:
- Second index 0 Candidate 0: Forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
- Second index 1 Candidate 1: Forward prediction, the motion vector is (1, -3), and the reference frame POC is 4.
- Second index 2 Candidate 2: Backward prediction, the motion vector is (2, -4), and the reference frame POC is 2.
- Second index 3 Candidate 3: Bidirectional prediction, forward motion vector is (2, -3), reference frame POC is 2, backward motion vector is (2, -2), and reference frame POC is 4.
- the first index value obtained by decoding is 6, it indicates that the current block uses the motion information of the non-adjacent spatial candidate as the reference motion information of the current block, and then is further decoded to obtain the second index value.
- the second index value obtained by further decoding is 1, the motion information of candidate 1 in the non-adjacent spatial domain candidate set is used as the motion information of the current block.
- the candidate motion information corresponding to the first index 0 is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
- the first index 6 indicates new motion information generated based on candidate motion information corresponding to the first index 0 or motion information using non-adjacent spatial domain candidates as reference motion information of the current block.
- let the size of the non-adjacent spatial-domain candidate set be 4; the non-adjacent spatial-domain candidate set puts the available non-adjacent spatial-domain candidates into the set in a preset detection order. Let the non-adjacent spatial-domain candidate motion information in the set be as follows:
- Second index 0 candidate 0: forward prediction, the motion vector is (-5, -3), and the reference frame POC is 2.
- Second index 1 Candidate 1: Forward prediction, the motion vector is (1, -3), and the reference frame POC is 4.
- Second index 2 Candidate 2: Backward prediction, the motion vector is (2, -4), and the reference frame POC is 2.
- Second index 3 Candidate 3: Bidirectional prediction, forward motion vector is (2, -3), reference frame POC is 2, backward motion vector is (2, -2), and reference frame POC is 4.
- Second index 4 Candidate 4: Forward prediction, the motion vector is (2, -3) + (1,0), and the reference frame POC is 2.
- Second index 5 Candidate 5: Forward prediction, the motion vector is (2, -3) + (0, -1), and the reference frame POC is 2.
- Second index 6 candidate 6: forward prediction, the motion vector is (2, -3) + (-1,0), and the reference frame POC is 2.
- Second index 7 candidate 7: forward prediction, the motion vector is (2, -3) + (0, 1), and the reference frame POC is 2.
- the first index value obtained by decoding is 6, it indicates that the current block uses new motion information generated based on candidate motion information corresponding to the first index 0 or uses non-adjacent spatial candidate motion information as reference motion information of the current block. Then it is further decoded to obtain a second index value.
- if the second index value obtained by further decoding is 0, the motion information of candidate 0 in the non-adjacent spatial-domain candidate set (forward prediction, motion vector (-5, -3), reference frame POC 2) is used as the motion information of the current block.
- if the second index value obtained by further decoding is 5, the motion information of motion vector offset candidate 5 (forward prediction, motion vector (2, -3) + (0, -1), reference frame POC 2) is used as the motion information of the current block.
- the first indexes 0-6 indicate the candidate positions in the Merge list.
- the motion information of the candidate corresponding to the first index 0 is forward prediction, the motion vector is (2, -3), and the reference frame POC is 2.
- the first index 6 indicates that the motion information adopted by the current block is new motion information generated based on the candidate motion information corresponding to the first index 0, offset according to a preset motion vector: a second index value of 0 indicates candidates with an offset spacing of 1, a second index value of 1 indicates candidates with an offset spacing of 2, and a third index value indicates the candidate index of the motion vector offset.
- the first index value obtained by decoding is 6, it indicates that the motion information used by the current block is new motion information generated based on the candidate motion information corresponding to the first index 0, and then further decoded to obtain a second index value.
- the AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
- Second index 0 AFFINE candidate 0;
- Second index 1 AFFINE candidate 1;
- Second index 2 AFFINE candidate 2
- Second index 3 AFFINE candidate 3;
- the first index value obtained by decoding is 6, it indicates that one of the candidates in the motion information candidate set obtained by AFFINE is the reference motion information, and then further decoded to obtain the second index value.
- the second index value obtained by further decoding is 1, the motion information of the AFFINE candidate 1 is used as the motion information of the current block.
- the neighboring spatial motion information candidate set includes four neighboring spatial motion information candidates:
- Second index 0 neighboring spatial-domain candidate 0;
- Second index 1 neighboring spatial-domain candidate 1;
- Second index 2 neighboring spatial-domain candidate 2;
- Second index 3 neighboring spatial-domain candidate 3;
- the first index value obtained by decoding is 6, it indicates that one of the candidates in the motion information candidate set obtained by using the neighboring space for the current block is the reference motion information, and then further decoded to obtain the second index value.
- the second index value obtained by further decoding is 1, the motion information of the neighboring spatial domain candidate 1 is used as the motion information of the current block.
- the first indexes 0-6 indicate the candidate positions in the Merge list.
- the first index 6 indicates that one candidate in the motion information candidate set obtained by using the neighboring time domain for the current block is reference motion information.
- the adjacent temporal motion information candidate set includes four adjacent temporal motion information candidates:
- Second index 0 adjacent time domain candidate 0;
- Second index 1 adjacent time domain candidate 1;
- Second index 2 Adjacent time domain candidate 2
- Second index 3 adjacent time domain candidate 3;
- the first index value obtained by decoding is 6, it indicates that one of the candidates in the motion information candidate set obtained by the neighboring time domain is used as the reference motion information, and then further decoded to obtain the second index value.
- the second index value obtained by further decoding is 1, the motion information of the neighboring time domain candidate 1 is used as the motion information of the current block.
- the motion information candidate set composed of sub-block motion information includes AFFINE motion information candidates, ATMVP, and STMVP candidates:
- Second index 0 AFFINE candidate
- Second index 1 ATMVP candidate
- Second index 2 STMVP candidates
- the first index value obtained by decoding is 6, it indicates that the current block uses one candidate of the motion information candidate set composed of the sub-block motion information as the reference motion information, and then further decodes to obtain the second index value.
- the second index value obtained by further decoding is 1, the motion information of the ATMVP candidate is used as the motion information of the current block.
- positions 0-5 in the list hold motion information obtained in the Merge manner, and position 6 is a motion information candidate set obtained by AFFINE.
- let a first index of 0 indicate that the current block uses the motion information obtained in the Merge manner as the reference motion information, and a first index of 1 indicate that one of the candidates in the motion information candidate set obtained by AFFINE is the reference motion information.
- The AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
- Second index 0: AFFINE candidate 0;
- Second index 1: AFFINE candidate 1;
- Second index 2: AFFINE candidate 2;
- Second index 3: AFFINE candidate 3.
- When the first index value obtained by decoding is 1, it indicates that one of the candidates of the motion information candidate set obtained by AFFINE is the reference motion information, and the code stream is further decoded to obtain a second identifier value.
- When the second identifier value obtained by further decoding is 1, the motion information of AFFINE candidate 1 is used as the motion information of the current block.
- When the first index value obtained by decoding is 0, it indicates that the current block uses the motion information obtained by Merge as the reference motion information, and the code stream is further decoded to obtain a fourth index.
- When the fourth index value obtained by further decoding is 2, the motion information at position 2 in the Merge candidate list is used as the motion information of the current block. A sketch of this conditional second-level parse follows.
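As a minimal illustration of this branch (reusing the `Bitstream` and `CandidateSet` helpers from the earlier sketch; the function name and index values are assumptions), the first identifier chooses between a direct Merge position and a second identifier inside the AFFINE set:

```python
# Toy sketch: first index 0 -> a fourth index picks a Merge-list position;
# first index 1 -> a second identifier picks a candidate in the AFFINE set.
def decode_flagged(merge_list, affine_set, bs):
    first = bs.read_index()
    if first == 0:
        fourth = bs.read_index()          # position in the Merge list
        return merge_list[fourth]
    second = bs.read_index()              # identifier inside the AFFINE set
    return affine_set.candidates[second]

merge_list = [f"merge_cand_{i}" for i in range(4)]
affine_set = CandidateSet("AFFINE", [f"affine_cand_{j}" for j in range(4)])
print(decode_flagged(merge_list, affine_set, Bitstream([0, 2])))  # merge_cand_2
print(decode_flagged(merge_list, affine_set, Bitstream([1, 1])))  # affine_cand_1
```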
- Positions 0-3 in the list are motion information obtained by using Merge;
- position 4 is the motion information candidate set obtained by using the adjacent time domain;
- position 5 is the motion information candidate set composed of sub-block motion information;
- position 6 is the motion information candidate set obtained by AFFINE.
- The first index 0 indicates that the current block uses the motion information obtained by Merge as the reference motion information;
- the first index 1 indicates that one of the candidates in the motion information candidate set obtained by AFFINE for the current block is the reference motion information;
- the first index 01 indicates that one of the candidates of the motion information candidate set obtained by using the adjacent time domain for the current block is the reference motion information;
- the first index 11 indicates that one candidate of the motion information candidate set composed of the sub-block motion information is used by the current block as the reference motion information.
- The AFFINE motion information candidate set includes 4 AFFINE motion information candidates:
- Second identifier 0: AFFINE candidate 0;
- Second identifier 1: AFFINE candidate 1;
- Second identifier 2: AFFINE candidate 2;
- Second identifier 3: AFFINE candidate 3.
- The adjacent temporal motion information candidate set includes four adjacent temporal motion information candidates:
- Second identifier 0: adjacent time domain candidate 0;
- Second identifier 1: adjacent time domain candidate 1;
- Second identifier 2: adjacent time domain candidate 2;
- Second identifier 3: adjacent time domain candidate 3.
- The motion information candidate set composed of sub-block motion information includes an AFFINE candidate, an ATMVP candidate, and an STMVP candidate:
- Second index 0: AFFINE candidate;
- Second index 1: ATMVP candidate;
- Second index 2: STMVP candidate.
- When the first index value obtained by decoding is 0, it indicates that the current block uses the motion information obtained by Merge as the reference motion information, and the code stream is further decoded to obtain a fourth index.
- When the fourth index value obtained by further decoding is 2, the motion information at position 2 in the Merge candidate list is used as the motion information of the current block.
- When the first index value obtained by decoding is 1, it indicates that one of the candidates of the motion information candidate set obtained by AFFINE is the reference motion information, and the code stream is further decoded to obtain a second identifier value.
- When the second identifier value obtained by further decoding is 1, the motion information of AFFINE candidate 1 is used as the motion information of the current block.
- When the first index value obtained by decoding is 01, it indicates that one of the candidates in the motion information candidate set obtained by using the neighboring time domain is the reference motion information, and the code stream is further decoded to obtain a second identifier value.
- When the second identifier value obtained by further decoding is 2, the motion information of adjacent time domain candidate 2 is used as the motion information of the current block.
- When the first index value obtained by decoding is 11, it indicates that the current block uses one candidate of the motion information candidate set composed of the sub-block motion information as the reference motion information, and the code stream is further decoded to obtain a second index value.
- When the second index value obtained by further decoding is 1, the motion information of the ATMVP candidate is used as the motion information of the current block. A dispatch sketch for these first-index values follows.
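A minimal sketch of this dispatch, assuming the first identifier has already been entropy-decoded into one of the codewords "0", "1", "01", and "11" listed above (the actual binarization is outside the scope of this sketch, and all candidate names are placeholders):

```python
# Toy dispatch over the first-identifier codewords of this example; the
# second-level value is a fourth index for "0" and a second identifier
# for the three candidate sets. Contents are placeholder strings.
SETS = {
    "1":  [f"affine_cand_{j}" for j in range(4)],       # AFFINE set
    "01": [f"temporal_cand_{j}" for j in range(4)],     # adjacent temporal set
    "11": ["affine_cand", "atmvp_cand", "stmvp_cand"],  # sub-block set
}
MERGE_LIST = [f"merge_cand_{i}" for i in range(4)]

def select(first_id, second_level):
    if first_id == "0":                  # direct Merge: a fourth index
        return MERGE_LIST[second_level]
    return SETS[first_id][second_level]

print(select("0", 2))   # -> merge_cand_2
print(select("01", 2))  # -> temporal_cand_2
print(select("11", 1))  # -> atmvp_cand
```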
- An embodiment of the present application provides a decoding device for predicting motion information.
- the device may be a video decoder, a video encoder, or a decoder.
- the decoding apparatus for predicting motion information is configured to perform the steps performed by the decoding apparatus in the decoding method for predicting motion information.
- the decoding apparatus for predicting motion information provided in the embodiment of the present application may include a module corresponding to a corresponding step.
- the functional modules of the prediction motion information decoding device may be divided according to the foregoing method example.
- each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
- the above integrated modules may be implemented in the form of hardware or software functional modules.
- the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
- FIG. 17 illustrates a possible structural diagram of a decoding apparatus for predicting motion information involved in the foregoing embodiment.
- the decoding apparatus 1700 for predicting motion information may include an analysis module 1701, a determination module 1702, and an assignment module 1703.
- the functions of each module are as follows:
- the analysis module 1701 is configured to parse a code stream to obtain a first identifier.
- The determining module 1702 is configured to determine a target element from a first candidate set according to the first identifier, where the elements in the first candidate set include at least one first candidate motion information and at least one second candidate set.
- The second candidate set includes a plurality of second candidate motion information; alternatively, the first candidate motion information includes the first motion information, and the second candidate motion information includes a preset motion information offset.
- The assignment module 1703 is configured to use the first candidate motion information as the target motion information when the target element is the first candidate motion information, where the target motion information is used to predict the motion information of the image block to be processed.
- The analysis module 1701 is further configured to parse the code stream to obtain a second identifier when the target element is the second candidate set, and the determining module 1702 is further configured to determine the target motion information from the plurality of second candidate motion information according to the second identifier. Alternatively, the analysis module 1701 is configured to parse the code stream to obtain a second identifier when the target element is obtained according to the plurality of second candidate motion information, and to determine the target motion information based on one of the plurality of second candidate motion information according to the second identifier. A minimal sketch of this module flow follows.
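The following is a minimal, non-normative sketch of how these three modules could cooperate; all class, function, and field names are illustrative assumptions, and motion information is reduced to 2-D vectors:

```python
# Toy parse -> determine -> assign flow for the apparatus 1700; pre-parsed
# identifiers stand in for code-stream parsing. All names are assumptions.
from dataclasses import dataclass

@dataclass
class SecondCandidateSet:
    offsets: list                # preset motion information offsets (dx, dy)

def decode_target_motion(first_candidates, second_set, ids):
    first_id = ids["first"]                      # analysis module 1701
    elements = list(first_candidates) + [second_set]
    target = elements[first_id]                  # determining module 1702
    if isinstance(target, SecondCandidateSet):   # target is the second set
        second_id = ids["second"]
        dx, dy = target.offsets[second_id]
        base = first_candidates[0]               # the first motion information
        return (base[0] + dx, base[1] + dy)
    return target                                # assignment module 1703

firsts = [(3, -1), (0, 2)]
second = SecondCandidateSet(offsets=[(1, 0), (-1, 0), (0, 1), (0, -1)])
print(decode_target_motion(firsts, second, {"first": 0}))               # (3, -1)
print(decode_target_motion(firsts, second, {"first": 2, "second": 3}))  # (3, -2)
```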
- the analysis module 1701 is configured to support the decoding device 1700 for predicting motion information to perform S1501, S1505, and the like in the above embodiments, and / or other processes used in the technology described herein.
- the determining module 1702 is configured to support the decoding apparatus 1700 for predicting motion information to perform S1502 and the like in the above embodiments, and / or other processes used in the technology described herein.
- the assignment module 1703 is configured to support the decoding device 1700 for predicting motion information to perform S1502 and the like in the above embodiments, and / or other processes used in the technology described herein.
- The analysis module 1701 is further configured to parse the code stream to obtain a third identifier, where the third identifier includes a preset coefficient.
- The decoding apparatus 1700 for predicting motion information may further include a calculation module 1704, configured to multiply the plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
- The determining module 1702 is specifically configured to determine the target offset from the plurality of adjusted motion information offsets according to the second identifier, as sketched below.
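A minimal sketch of this adjustment step under the same illustrative assumptions: the preset coefficient carried by the third identifier scales every preset offset before the second identifier selects the target offset.

```python
# Toy sketch of the calculation module 1704 and the offset selection of
# the determining module 1702; names and values are assumptions.
PRESET_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def adjusted_target_offset(preset_coefficient, second_id):
    adjusted = [(dx * preset_coefficient, dy * preset_coefficient)
                for (dx, dy) in PRESET_OFFSETS]  # calculation module 1704
    return adjusted[second_id]                   # determining module 1702

base_mv = (3, -1)                                # the first motion information
dx, dy = adjusted_target_offset(preset_coefficient=2, second_id=1)
print((base_mv[0] + dx, base_mv[1] + dy))        # -> (1, -1)
```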
- FIG. 18 is a schematic structural block diagram of a decoding device 1800 for predicting motion information in an embodiment of the present application.
- The decoding device 1800 for predicting motion information includes a processor 1801 and a memory 1802 coupled to the processor; the processor 1801 is configured to perform the functions of the embodiment shown in FIG. 17 and its various feasible implementations.
- The processor 1801 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the present disclosure.
- The processor may alternatively be a combination that implements a computing function, for example, a combination including one or more microprocessors, or a combination of a DSP and a microprocessor.
- The storage module may be the memory 1802.
- Both the above-mentioned decoding apparatus 1700 for predicting motion information and the decoding apparatus 1800 for predicting motion information may perform the decoding method for predicting motion information shown in FIG. 15.
- The decoding apparatus 1700 for predicting motion information and the decoding apparatus 1800 for predicting motion information may specifically be video decoding devices or other devices with video encoding and decoding functions.
- the decoding apparatus 1700 for predicting motion information and the decoding apparatus 1800 for predicting motion information may be used to perform image prediction in the decoding process.
- An embodiment of the present application provides an inter prediction device.
- the inter prediction device may be a video decoder, a video encoder, or a decoder.
- the inter prediction apparatus is configured to perform the steps performed by the inter prediction apparatus in the above inter prediction method.
- the inter prediction apparatus provided in the embodiment of the present application may include a module corresponding to a corresponding step.
- each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
- the above integrated modules may be implemented in the form of hardware or software functional modules.
- the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
- the present application also provides a terminal, which includes: one or more processors, a memory, and a communication interface.
- The memory and the communication interface are coupled to the one or more processors; the memory is configured to store computer program code, and the computer program code includes instructions.
- When the one or more processors execute the instructions, the terminal performs the decoding method for predicting motion information in the embodiments of the present application.
- The terminal here may be a video display device, a smartphone, a portable computer, or another device that can process or play videos.
- The present application also provides a video decoder, including a non-volatile storage medium and a central processing unit.
- The non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and the executable program is executed to implement the decoding method for predicting motion information in the embodiments of the present application.
- the present application further provides a decoder, which includes a decoding apparatus for predicting motion information in the embodiment of the present application.
- Another embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium includes one or more pieces of program code, and the one or more pieces of program code include instructions. When a processor in a terminal executes the program code, the terminal performs the decoding method for predicting motion information shown in FIG. 15.
- A computer program product includes computer-executable instructions stored in a computer-readable storage medium; at least one processor of the terminal may read the computer-executable instructions from the computer-readable storage medium.
- The at least one processor executes the computer-executable instructions to cause the terminal to perform the decoding method for predicting motion information shown in FIG. 15.
- all or part of them may be implemented by software, hardware, firmware, or any combination thereof.
- When implemented using a software program, the foregoing may be implemented, in whole or in part, in the form of a computer program product.
- the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions according to the embodiments of the present application are generated.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, over a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or a wireless manner (for example, over infrared, radio, or microwave).
- The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media.
- The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code, and executed by a hardware-based processing unit.
- The computer-readable medium may include a computer-readable storage medium or a communication medium, where the computer-readable storage medium corresponds to a tangible medium such as a data storage medium, and the communication medium includes any medium that facilitates transfer of a computer program from one place to another, for example, according to a communication protocol.
- computer-readable media may illustratively correspond to (1) non-transitory, tangible computer-readable storage media, or (2) a communication medium such as a signal or carrier wave.
- a data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and / or data structures used to implement the techniques described in this application.
- the computer program product may include a computer-readable medium.
- The computer-readable storage medium may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage devices, magnetic disk storage devices or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium.
- For example, if instructions are transmitted from a website, server, or other remote source by using a coaxial cable, a fiber optic cable, a twisted pair, a digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
- computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but are instead directed to non-transitory, tangible storage media.
- Magnetic disks and optical discs include compact discs (CDs), laser discs, optical discs, digital versatile discs (DVDs), floppy disks, and Blu-ray discs, where magnetic disks typically reproduce data magnetically, and optical discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits.
- Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
- functionality described herein may be provided within dedicated hardware and / or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
- The techniques of this application may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset).
- Various components, modules, or units are described in this application to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.
Claims (20)
- 1. A decoding method for predicting motion information, comprising: parsing a code stream to obtain a first identifier; determining a target element from a first candidate set according to the first identifier, wherein the elements in the first candidate set comprise at least one first candidate motion information and a plurality of second candidate motion information, the first candidate motion information comprises first motion information, and the second candidate motion information comprises a preset motion information offset; when the target element is the first candidate motion information, using the first candidate motion information as target motion information, wherein the target motion information is used to predict motion information of an image block to be processed; and when the target element is obtained according to the plurality of second candidate motion information, parsing the code stream to obtain a second identifier, and determining the target motion information based on one of the plurality of second candidate motion information according to the second identifier.
- 2. The method according to claim 1, wherein the first candidate motion information comprises motion information of a spatially neighboring image block of the image block to be processed.
- 3. The method according to claim 1 or 2, wherein the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
- 4. The method according to claim 1 or 2, wherein determining the target motion information based on one of the plurality of second candidate motion information according to the second identifier comprises: determining a target offset from a plurality of preset motion information offsets according to the second identifier; and determining the target motion information based on the first motion information and the target offset.
- 5. The method according to any one of claims 1 to 4, wherein, among the at least one first candidate motion information, the encoding codeword used to identify the first motion information is the shortest.
- 6. The method according to any one of claims 1 to 5, wherein, when the target element is obtained according to the plurality of second candidate motion information, the method further comprises: parsing the code stream to obtain a third identifier, wherein the third identifier comprises a preset coefficient.
- 7. The method according to claim 6, wherein, before determining the target motion information based on one of the plurality of second candidate motion information according to the second identifier, the method further comprises: multiplying a plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
- 8. The method according to any one of claims 1 to 7, wherein using the target motion information to predict the motion information of the image block to be processed comprises: using the target motion information as the motion information of the image block to be processed; or using the target motion information as predicted motion information of the image block to be processed.
- 9. The method according to any one of claims 1 to 8, wherein the second identifier is coded in a fixed-length coding manner.
- 10. The method according to any one of claims 1 to 8, wherein the second identifier is coded in a variable-length coding manner.
- 11. A decoding apparatus for predicting motion information, comprising: a parsing module, configured to parse a code stream to obtain a first identifier; a determining module, configured to determine a target element from a first candidate set according to the first identifier, wherein the elements in the first candidate set comprise at least one first candidate motion information and a plurality of second candidate motion information, the first candidate motion information comprises first motion information, and the second candidate motion information comprises a preset motion information offset; and an assignment module, configured to use the first candidate motion information as target motion information when the target element is the first candidate motion information, wherein the target motion information is used to predict motion information of an image block to be processed; wherein the parsing module is further configured to: when the target element is obtained according to the plurality of second candidate motion information, parse the code stream to obtain a second identifier, and determine the target motion information based on one of the plurality of second candidate motion information according to the second identifier.
- 12. The apparatus according to claim 11, wherein the first candidate motion information comprises motion information of a spatially neighboring image block of the image block to be processed.
- 13. The apparatus according to claim 11 or 12, wherein the second candidate motion information is obtained based on the first motion information and a preset motion information offset.
- 14. The apparatus according to claim 11 or 12, wherein the parsing module is specifically configured to: determine a target offset from a plurality of preset motion information offsets according to the second identifier; and determine the target motion information based on the first motion information and the target offset.
- 15. The apparatus according to any one of claims 11 to 14, wherein, among the at least one first candidate motion information, the encoding codeword used to identify the first motion information is the shortest.
- 16. The apparatus according to any one of claims 11 to 15, wherein, when the target element is obtained according to the plurality of second candidate motion information, the parsing module is further configured to: parse the code stream to obtain a third identifier, wherein the third identifier comprises a preset coefficient.
- 17. The apparatus according to claim 16, further comprising: a calculation module, configured to multiply a plurality of preset motion information offsets by the preset coefficient to obtain a plurality of adjusted motion information offsets.
- 18. The apparatus according to any one of claims 11 to 17, wherein the determining module is specifically configured to: use the target motion information as the motion information of the image block to be processed; or use the target motion information as predicted motion information of the image block to be processed.
- 19. The apparatus according to any one of claims 11 to 18, wherein the second identifier is coded in a fixed-length coding manner.
- 20. The apparatus according to any one of claims 11 to 18, wherein the second identifier is coded in a variable-length coding manner.
Priority Applications (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19860217.9A EP3843404A4 (en) | 2018-09-13 | 2019-09-12 | Decoding method and device for predicted motion information |
SG11202102362UA SG11202102362UA (en) | 2018-09-13 | 2019-09-12 | Decoding method and decoding apparatus for predicting motion information |
BR112021004429-9A BR112021004429A2 (en) | 2018-09-13 | 2019-09-12 | decoding method and decoding apparatus for predicting motion information |
KR1020247028818A KR20240135033A (en) | 2018-09-13 | 2019-09-12 | Decoding method and decoding apparatus for predicting motion information |
KR1020217010321A KR102701208B1 (en) | 2018-09-13 | 2019-09-12 | Decoding method and decoding device for predicting motion information |
JP2021513418A JP7294576B2 (en) | 2018-09-13 | 2019-09-12 | Decoding method and decoding device for predicting motion information |
CA3112289A CA3112289A1 (en) | 2018-09-13 | 2019-09-12 | Decoding method and decoding apparatus for predicting motion information |
US17/198,544 US20210203944A1 (en) | 2018-09-13 | 2021-03-11 | Decoding method and decoding apparatus for predicting motion information |
ZA2021/01890A ZA202101890B (en) | 2018-09-13 | 2021-03-19 | Decoding method and decoding apparatus for predicting motion information |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811068957.4 | 2018-09-13 | ||
CN201811068957 | 2018-09-13 | ||
CN201811264674.7 | 2018-10-26 | ||
CN201811264674.7A CN110896485B (en) | 2018-09-13 | 2018-10-26 | Decoding method and device for predicting motion information |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/198,544 Continuation US20210203944A1 (en) | 2018-09-13 | 2021-03-11 | Decoding method and decoding apparatus for predicting motion information |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020052653A1 (en) | 2020-03-19 |
Family
ID=69778170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/105711 WO2020052653A1 (en) | 2018-09-13 | 2019-09-12 | Decoding method and device for predicted motion information |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2020052653A1 (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103907346A (en) * | 2011-10-11 | 2014-07-02 | 联发科技股份有限公司 | Method and apparatus of motion and disparity vector derivation for 3D video coding and HEVC |
CN104126302A (en) * | 2011-11-07 | 2014-10-29 | 高通股份有限公司 | Generating additional merge candidates |
CN105308967A (en) * | 2013-04-05 | 2016-02-03 | 三星电子株式会社 | Video stream coding method according to prediction structure for multi-view video and device therefor, and video stream decoding method according to prediction structure for multi-view video and device therefor |
EP3062518A1 (en) * | 2013-10-24 | 2016-08-31 | Electronics and Telecommunications Research Institute | Video encoding/decoding method and apparatus |
Non-Patent Citations (2)
Title |
---|
See also references of EP3843404A4 * |
XU CHEN , NA ZHANG , JIANHUA ZHENG : "CE 4: Enhanced Merge Mode (Test 4. 2. 15)", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 11TH MEETING, no. JVET-K0198, 18 July 2018 (2018-07-18), Ljubljana SI, pages 1 - 8, XP030199235 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19860217; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 3112289; Country of ref document: CA |
| | ENP | Entry into the national phase | Ref document number: 2021513418; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112021004429; Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 2019860217; Country of ref document: EP; Effective date: 20210323 |
| | ENP | Entry into the national phase | Ref document number: 20217010321; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 112021004429; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20210309 |