WO2021196238A1 - Video processing method, video processing device, and computer-readable storage medium - Google Patents

Info

Publication number
WO2021196238A1
WO2021196238A1 PCT/CN2020/083376 CN2020083376W WO2021196238A1 WO 2021196238 A1 WO2021196238 A1 WO 2021196238A1 CN 2020083376 W CN2020083376 W CN 2020083376W WO 2021196238 A1 WO2021196238 A1 WO 2021196238A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block
motion information
candidate list
video processing
hmvp
Prior art date
Application number
PCT/CN2020/083376
Other languages
French (fr)
Chinese (zh)
Inventor
郑萧桢
王苏红
马思伟
王苫社
Original Assignee
深圳市大疆创新科技有限公司
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司, 北京大学 filed Critical 深圳市大疆创新科技有限公司
Priority to CN202080005623.1A priority Critical patent/CN112868235A/en
Priority to PCT/CN2020/083376 priority patent/WO2021196238A1/en
Publication of WO2021196238A1 publication Critical patent/WO2021196238A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Definitions

  • This application relates to the field of communication technology, and in particular to a video processing method, video processing device, and computer-readable storage medium.
  • Video data is a continuous image sequence composed of consecutive frames, where each frame is one image.
  • Video data is strongly correlated, which means that it contains a lot of redundant information.
  • The redundant information can be divided into spatial redundant information and temporal redundant information.
  • Image coding removes the redundant information of each frame image in the video data (that is, removes the correlation between the data) to obtain the coded image.
  • Image decoding recovers the original image from the encoded image.
  • Traditional image coding and decoding methods need to use the motion information of the spatial neighboring blocks of an image block to construct a motion information candidate list for that image block, which complicates the construction of the motion information candidate list and reduces the efficiency of image coding and image decoding.
  • the embodiments of the application provide a video processing method, a video processing device, and a computer-readable storage medium.
  • an image block satisfies a preset condition
  • an embodiment of the present application provides a video processing method, and the video processing method includes:
  • the motion information of the neighboring blocks in the spatial domain and the HMVP are used to construct a second motion information candidate list for the first image block.
  • an embodiment of the present application provides another video processing method, and the video processing method includes:
  • a second motion information candidate list is constructed for the first image block by using the motion information of the neighboring blocks in the spatial domain and the HMVP.
  • an embodiment of the present application provides a video processing device, the video processing device includes a memory and a processor, wherein:
  • the memory is used to store a computer program, and the computer program includes program instructions
  • the processor calling program instructions, is used to perform the following steps:
  • the motion information of the neighboring blocks in the spatial domain and the HMVP are used to construct a second motion information candidate list for the first image block.
  • an embodiment of the present application provides a video processing device.
  • the video processing device includes a memory and a processor, wherein:
  • the memory is used to store a computer program, and the computer program includes program instructions
  • the processor calling program instructions, is used to perform the following steps:
  • a second motion information candidate list is constructed for the first image block by using the motion information of the neighboring blocks in the spatial domain and the HMVP.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program.
  • The computer program includes program instructions that, when executed by a processor, cause the processor to execute the video processing method as described in the first aspect.
  • The embodiments of the present application provide another computer-readable storage medium. The computer-readable storage medium stores a computer program, and the computer program includes program instructions that, when executed by a processor, cause the processor to execute the video processing method described in the second aspect.
  • When a certain image block in the current frame meets a preset condition, the HMVP can be used to construct a motion information candidate list for the image block, and there is no need to use the motion information of the spatial neighboring blocks to construct the motion information candidate list for the image block. Therefore, the complexity of the construction process of the motion information candidate list is reduced, and the efficiency of image coding and image decoding is improved.
  • FIG. 1 is a schematic diagram of a codec system framework provided by an embodiment of the present application
  • Fig. 2 is a schematic diagram of an image block provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a framework of an encoder provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the architecture of a video processing system provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a traditional construction of a motion information candidate list provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a video processing device provided by an embodiment of the present application.
  • The video processing method proposed in the embodiments of the present invention can be applied to a video processing device.
  • The video processing device can be set on a smart terminal (such as a mobile phone or a tablet computer) and can be used at an encoding end or a decoding end; it can be an encoder or a decoder.
  • The embodiments of the present invention can be applied to aircraft (such as drones).
  • The embodiments of the present invention can also be applied to other movable platforms (such as unmanned ships, unmanned vehicles, robots, etc.); the embodiments of the present invention do not impose specific limitations.
  • Figure 1 can be used as an example to illustrate the coding and decoding system framework.
  • Figure 1 is an architecture diagram of a coding and decoding system.
  • the system 100 can receive the data 102 to be processed, process the data 102 to be processed, and generate processed data 108.
  • the system 100 may receive the data to be encoded and encode the data to be encoded to generate encoded data, or the system 100 may receive the data to be decoded and decode the data to be decoded to generate decoded data.
  • the components in the system 100 may be implemented by one or more processors.
  • the processor may be a processor in a computing device or a processor in a mobile device (such as a drone).
  • the processor may be any type of processor, which is not limited in the embodiment of the present invention.
  • the processor may include an encoder, a decoder, or a codec, etc.
  • One or more memories may also be included in the system 100.
  • the memory can be used to store instructions and data, for example, computer-executable instructions that implement the technical solutions of the embodiments of the present invention, to-be-processed data 102, processed data 108, and so on.
  • the memory may be any type of memory, which is not limited in the embodiment of the present invention.
  • the data to be encoded may include text, images, graphic objects, animation sequences, audio, video, or any other data that needs to be encoded.
  • The data to be encoded may include sensor data from sensors, which may be vision sensors (for example, cameras, infrared sensors), microphones, near-field sensors (for example, ultrasonic sensors, radars), position sensors, temperature sensors, touch sensors, etc.
  • the data to be encoded may include information from the user, for example, biological information, which may include facial features, fingerprint scans, retinal scans, voice recordings, DNA sampling, and the like.
  • A video is a continuous image sequence composed of consecutive frames, and each frame is an image.
  • The image can be divided into multiple coding tree units (Coding Tree Unit, CTU). All CTUs have the same size, for example 64x64 or 128x128.
  • Each CTU may be further divided into multiple coding units (Coding Unit, CU).
  • The shape of a CU may be a square or a rectangle.
  • In the following description, the CU is taken as an example of an image block, and the image block mentioned below is the CU.
  • The image shown in Figure 2 is composed of 4 CTUs, and each CTU is composed of multiple image blocks.
  • The sizes of the image blocks contained in the image can be completely different or partly the same.
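  • As a minimal sketch of the partitioning described above, the following computes how many equally sized CTUs are needed to cover a picture. The function name `ctu_grid` and the rounding up at the picture edges are assumptions made for illustration; the text only states that all CTUs have the same size.

```python
def ctu_grid(width, height, ctu_size=64):
    """Return (columns, rows) of CTUs needed to cover a width x height picture.

    Assumption: CTUs are laid out on a regular grid and the last column/row
    is rounded up when the picture size is not a multiple of the CTU size.
    """
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows


# A 128x128 picture with 64x64 CTUs is covered by a 2 x 2 grid, i.e. 4 CTUs.
cols, rows = ctu_grid(128, 128, ctu_size=64)
ctu_count = cols * rows
```

  • With these assumed dimensions the result is 4 CTUs, the same count as the image in Figure 2; the actual picture and CTU sizes of Figure 2 are not given in the text.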
  • The framework of the encoder is illustrated in FIG. 3, which is a frame diagram of an encoder. The following exemplarily introduces the process of inter-frame coding with reference to FIG. 3.
  • the process of inter-frame encoding and decoding can be as follows:
  • the current frame image is acquired.
  • a reference frame image is obtained.
  • a reference frame image is used to perform motion estimation to obtain a motion vector (Motion Vector, MV) of each image block of the current frame image.
  • the motion vector obtained by the motion estimation is used to perform motion compensation to obtain the estimated value of the current image block.
  • the estimated value of the current image block is subtracted from the current image block to obtain the residual.
  • the residual is transformed to obtain transform coefficients.
  • the transform coefficient is quantized to obtain the quantized coefficient.
  • the quantized coefficients are subjected to entropy coding, and finally the bit stream obtained by entropy coding and the coding mode information after coding are stored or sent to the decoding end.
  • the quantization result is dequantized.
  • the inverse quantization result is inversely transformed.
  • the reconstructed pixels are obtained by using the inverse transform result and the motion compensation result.
  • the reconstructed pixels are filtered.
  • the filtered reconstructed pixels are output.
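  • The residual, quantization, and reconstruction steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the encoder of FIG. 3: the transform and entropy-coding stages are omitted (treated as identity), a uniform quantization step `qstep` is assumed, and the names `encode_block` and `reconstruct_block` are hypothetical.

```python
def encode_block(current, prediction, qstep):
    """Subtract the prediction from the current block and quantize the residual."""
    residual = [c - p for c, p in zip(current, prediction)]
    # A real encoder transforms the residual (e.g. with a DCT) before
    # quantization; the transform is left out of this sketch.
    return [round(r / qstep) for r in residual]


def reconstruct_block(levels, prediction, qstep):
    """Dequantize the levels and add the motion-compensated prediction back."""
    dequantized = [level * qstep for level in levels]
    return [p + d for p, d in zip(prediction, dequantized)]


current = [100, 104, 98, 97]      # pixels of the current image block
prediction = [101, 100, 99, 90]   # estimated values from motion compensation
levels = encode_block(current, prediction, qstep=4)
recon = reconstruct_block(levels, prediction, qstep=4)
```

  • In this sketch each reconstructed pixel differs from the original by at most half the quantization step, which illustrates why the encoder reconstructs pixels itself: it needs the same (lossy) reference data that the decoding end will have.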
  • the intra-frame encoding and decoding process can be as follows:
  • the current frame image is obtained.
  • intra-frame prediction selection is performed on the current frame image.
  • intra-frame prediction is performed on the current image block in the current frame.
  • the estimated value of the current image block is subtracted from the current image block to obtain the residual.
  • the residual of the image block is transformed to obtain transform coefficients.
  • the transform coefficient is quantized to obtain the quantized coefficient.
  • the quantized coefficients are entropy coded, and finally the bit stream obtained by entropy coding and the coded coding mode information are stored or sent to the decoding end.
  • the quantization result is dequantized.
  • the inverse quantization result is inversely transformed, and in 311, the inverse transform result and the intra-frame prediction result are used to obtain reconstructed pixels.
  • the reconstructed pixels are filtered.
  • the filtered reconstructed pixels are output.
  • In order to remove redundancy, the image can be predicted.
  • Different images in the video can use different prediction methods.
  • According to the prediction method used, the image can be divided into an intra-frame prediction image and an inter-frame prediction image.
  • Inter-frame prediction exploits the correlation of the video in the time domain (that is, temporal correlation): the pixels of the current image are predicted from the pixels of neighboring encoded images, so as to effectively remove the temporal redundant information of the video. Because consecutive frames are highly similar (strong temporal correlation), inter-frame prediction can be used to encode and compress the original video and remove redundancy in the time dimension, which facilitates storage and transmission.
  • IBC (intra block copy) exploits the correlation within the same frame of image, that is, spatial correlation: the pixels of an already coded CU are used to predict the pixels of the current CU to be coded, so as to effectively remove the spatial redundant information of the image.
  • The original image can be encoded and compressed using IBC to remove the redundancy of the spatial dimension.
  • Inter prediction and IBC may include merge (Merge) mode and non-Merge mode (for example, the advanced motion vector prediction mode, Advanced Motion Vector Prediction, AMVP).
  • The feature of the Merge mode is that the MV of the image block is equal to the motion vector prediction (Motion Vector Prediction, MVP), so there is no need to transmit the motion vector difference (MVD) in the code stream; only the MVP index and the reference frame index need to be passed to the decoder.
  • The feature of the non-Merge mode is that the MVD, the MVP index, and the reference frame index all need to be transmitted in the code stream to the decoder.
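  • The difference between the two modes can be illustrated from the decoding side. The sketch below is illustrative only: `decode_mv` is a hypothetical helper, motion vectors are modeled as `(x, y)` tuples, and the non-Merge case uses the usual relation MV = MVP + MVD, which is a standard assumption rather than a formula quoted from this text.

```python
def decode_mv(candidates, mvp_index, mvd=None):
    """Recover the motion vector of a block from the signaled information.

    candidates: MVP candidate list, each entry an (x, y) tuple.
    mvp_index:  index of the chosen MVP, read from the code stream.
    mvd:        motion vector difference; None models Merge mode, where
                no MVD is transmitted.
    """
    mvp = candidates[mvp_index]
    if mvd is None:
        # Merge mode: the MV is equal to the MVP.
        return mvp
    # Non-Merge mode (e.g. AMVP): the MV is the MVP corrected by the MVD.
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


candidates = [(4, -2), (0, 0)]
merge_mv = decode_mv(candidates, 0)           # Merge mode
amvp_mv = decode_mv(candidates, 0, (1, 3))    # non-Merge mode
```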
  • For the Merge mode, the MVP can be determined first and then directly used as the MV.
  • To obtain the MVP, an MVP candidate list (merge candidate list) can be constructed first. The encoder can write the MVP index into the code stream, and the decoding end can then find the MVP corresponding to that index in the MVP candidate list, so as to decode the image block.
  • Step 1: Obtain the MVP candidate list;
  • Step 2: Select an optimal MVP from the MVP candidate list, and at the same time obtain the index of the MVP in the MVP candidate list;
  • Step 3: Use the MVP as the MV of the current block;
  • Step 4: Determine the position of the reference block (also called the prediction block) in the reference frame image according to the MV;
  • Step 5: Subtract the reference block from the current block to obtain residual data;
  • Step 6: Pass the residual data and the index of the MVP to the decoding end.
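  • Steps 1 to 6 can be sketched as follows. This is a hedged illustration rather than the method of the application: frames are plain 2-D lists, the "optimal" MVP is assumed to be the candidate with the lowest sum of absolute differences (the text does not name a criterion), all candidates are assumed to point inside the reference frame, and `fetch_block`, `sad`, and `merge_encode` are hypothetical names.

```python
def fetch_block(frame, x, y, w, h):
    """Return the w x h block whose top-left corner is at (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def merge_encode(cur_frame, ref_frame, x, y, w, h, mvp_candidates):
    """Return (residual, mvp_index) for the block at (x, y)."""
    current = fetch_block(cur_frame, x, y, w, h)

    def cost(index):
        dx, dy = mvp_candidates[index]
        return sad(current, fetch_block(ref_frame, x + dx, y + dy, w, h))

    # Step 2: choose the best candidate (assumed criterion: lowest SAD).
    best_index = min(range(len(mvp_candidates)), key=cost)
    # Step 3: in Merge mode the MVP is used directly as the MV.
    mv = mvp_candidates[best_index]
    # Step 4: locate the reference (prediction) block in the reference frame.
    reference = fetch_block(ref_frame, x + mv[0], y + mv[1], w, h)
    # Step 5: residual = current block minus reference block.
    residual = [[c - r for c, r in zip(cur_row, ref_row)]
                for cur_row, ref_row in zip(current, reference)]
    # Step 6: these two values are what is passed to the decoding end.
    return residual, best_index


# Step 1 is modeled by passing in an already built candidate list.
ref_frame = [[r * 4 + c for c in range(4)] for r in range(4)]
cur_frame = [[0] * 4 for _ in range(4)]
cur_frame[1][1:3] = [6, 7]
cur_frame[2][1:3] = [10, 11]
residual, index = merge_encode(cur_frame, ref_frame, 1, 1, 2, 2,
                               [(0, 0), (1, 0)])
```

  • In this toy input the current block matches the reference frame shifted one pixel to the right, so the second candidate is selected and the residual is all zeros.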
  • Merge mode can also have other implementations.
  • When the MVP candidate list is constructed, the motion information of the spatial neighboring blocks is usually added, and it has the highest priority in the order of addition.
  • Adding the motion information of the spatial neighboring blocks to the MVP candidate list makes the encoding or decoding of the current image block dependent on the spatial neighboring blocks, so multiple image blocks cannot be processed in parallel, which is not conducive to improving encoding or decoding efficiency.
  • In addition, the parallel technology used when constructing the MVP candidate list is not identified in the high-level syntax (for example, sequence header, picture header, slice header, etc.); that is, this operation cannot be switched on or off through a high-level syntax identifier. This is not conducive to adjusting the special requirements that the image block needs to meet or to flexibly adapting encoding or decoding.
  • the embodiment of the present application discloses a video processing method.
  • When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the video processing device can use history-based motion vector prediction (HMVP) to construct a first motion information candidate list for the first image block, and encode or decode the first image block according to the motion information in the first motion information candidate list.
  • When the first image block does not meet the preset condition identified in the high-level syntax, the video processing device can instead construct a second motion information candidate list for the first image block by using the motion information of the neighboring blocks in the spatial domain and the HMVP, and encode or decode the first image block according to the motion information in the second motion information candidate list.
  • the current frame is the frame currently being encoded or decoded.
  • the first image block can be any image block in the current frame.
  • When a certain image block of the current frame meets the preset condition identified in the high-level syntax, HMVP can be used to construct a motion information candidate list for the image block. Compared with the traditional video processing method, which needs to use the motion information of the spatial neighboring blocks to construct the motion information candidate list for the image block, the embodiment of the present application reduces the complexity of the construction process of the motion information candidate list and improves the efficiency of image coding and image decoding.
  • the embodiment of the present application also discloses another video processing method.
  • the video processing device can use the HMVP to construct the first motion information candidate list for the first image block when the first image block of the current frame meets the preset condition.
  • the video processing device may also construct a second motion information candidate list for the first image block by using the motion information of the neighboring blocks in the spatial domain and the HMVP when the first image block does not meet the preset condition.
  • When a certain image block in the current frame meets a preset condition, the HMVP can be used to construct a motion information candidate list for the image block. Compared with the traditional video processing method, which needs to use the motion information of the neighboring blocks in the spatial domain to construct the motion information candidate list for the image block, the embodiment of the present application reduces the complexity of the construction process of the motion information candidate list and improves the efficiency of image coding and image decoding.
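  • The choice between the two candidate lists can be sketched as below. The text fixes only which sources feed each list; everything else here is an assumption made for illustration: the HMVP table is modeled as a list of previously used motion vectors with the most recent entry last, duplicates are skipped, spatial candidates are added before HMVP candidates in the second list, the list length is capped by `max_candidates`, and `build_candidate_list` is a hypothetical name.

```python
def build_candidate_list(meets_condition, hmvp_table, spatial_mvs,
                         max_candidates=6):
    """Build a motion information candidate list for one image block.

    meets_condition: True  -> first list (HMVP only, no spatial dependence)
                     False -> second list (spatial neighbors plus HMVP)
    """
    candidates = []

    def add(mv):
        # Assumed pruning: skip duplicates and respect the size cap.
        if mv not in candidates and len(candidates) < max_candidates:
            candidates.append(mv)

    if not meets_condition:
        for mv in spatial_mvs:
            add(mv)
    # Assumed ordering: most recently used HMVP entries first.
    for mv in reversed(hmvp_table):
        add(mv)
    return candidates


hmvp_table = [(1, 0), (2, 2), (3, -1)]   # oldest ... newest
spatial_mvs = [(0, 1), (2, 2)]           # from spatial neighboring blocks
first_list = build_candidate_list(True, hmvp_table, spatial_mvs)
second_list = build_candidate_list(False, hmvp_table, spatial_mvs)
```

  • Because the first list never reads `spatial_mvs`, it can be built without waiting for the spatial neighboring blocks to finish, which is the dependency the application aims to remove.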
  • the embodiment of the application also discloses another video processing method.
  • The video processing device constructs the first motion information candidate list for the first image block according to the preset rule. The preset rule indicates that the motion information added first to the first motion information candidate list is the motion information of temporal neighboring blocks or HMVP. The first image block is then encoded or decoded according to the motion information in the first motion information candidate list.
  • That is, the motion information added first to the first motion information candidate list is the motion information of the temporal neighboring block or HMVP.
  • Compared with the traditional video processing method, in which the motion information added first to the motion information candidate list is the motion information of the spatial neighboring blocks, this embodiment of the application reduces the complexity of the construction process of the motion information candidate list and improves the efficiency of image coding and image decoding.
  • the embodiment of the application also discloses another video processing method.
  • The video processing device constructs a first motion information candidate list for the first image block according to the preset rule. The preset rule indicates that the motion information added first to the first motion information candidate list is the motion information of the temporal neighboring block or HMVP. The first image block is then encoded or decoded according to the motion information in the first motion information candidate list.
  • When a certain image block in the current frame meets a preset condition, the motion information candidate list is constructed with the motion information of the temporal neighboring block or HMVP as the motion information added first to the first motion information candidate list.
  • Compared with the traditional video processing method, in which the motion information candidate list is constructed with the motion information of the neighboring blocks in the spatial domain as the motion information added first, this reduces the complexity of the construction process and improves the efficiency of image coding and image decoding.
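  • A minimal sketch of such a preset rule follows. The text states only which kind of motion information is added first; what is added afterwards, the duplicate check, the size cap, and the names `build_list_with_rule` and `first_source` are assumptions made for illustration.

```python
def build_list_with_rule(first_source, temporal_mvs, hmvp_table,
                         max_candidates=6):
    """Build a candidate list whose first entry comes from `first_source`.

    first_source: "temporal" or "hmvp", modeling the preset rule.
    """
    if first_source == "temporal":
        sources = [temporal_mvs, hmvp_table]
    elif first_source == "hmvp":
        sources = [hmvp_table, temporal_mvs]
    else:
        raise ValueError("first_source must be 'temporal' or 'hmvp'")

    candidates = []
    for source in sources:
        for mv in source:
            # Assumed pruning: skip duplicates and respect the size cap.
            if mv not in candidates and len(candidates) < max_candidates:
                candidates.append(mv)
    return candidates


temporal_mvs = [(5, 5)]
hmvp_table = [(1, 1), (5, 5)]
temporal_first = build_list_with_rule("temporal", temporal_mvs, hmvp_table)
hmvp_first = build_list_with_rule("hmvp", temporal_mvs, hmvp_table)
```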
  • FIG. 4 is a schematic structural diagram of a video processing system provided by an embodiment of the present application.
  • the video processing system includes an encoding terminal 401 and a decoding terminal 402.
  • the encoding terminal 401 is used to encode original video data to obtain encoded video data, or to encode original image data to obtain encoded image data.
  • the encoding terminal 401 sends the encoded video data to the decoding terminal 402.
  • the decoding terminal 402 is used to decode the encoded video data to obtain original video data, or to decode the encoded image data to obtain original image data.
  • the encoding terminal 401 and the decoding terminal 402 may run in the same video processing device.
  • the original video data may be encoded by the encoding terminal 401 to obtain encoded video data, and then the encoded video data may be stored.
  • the video processing device may decode the encoded video data through the decoding terminal 402 to obtain the original video data, and then play the decoded original video data through the player.
  • the original image data may be encoded by the encoding terminal 401 to obtain encoded image data, and then the encoded image data may be stored.
  • The video processing device may decode the encoded image data through the decoding terminal 402 to obtain the original image data, and then play the decoded original image data through the player.
  • The encoding terminal 401 and the decoding terminal 402 may also run in different video processing devices.
  • For example, the encoding terminal 401 runs in a first video processing device, and the decoding terminal 402 runs in a second video processing device. After the first video processing device collects the original video data, the original video data can be encoded by the encoding terminal 401 to obtain encoded video data, and the first video processing device sends the encoded video data to the second video processing device.
  • the second video processing device may decode the encoded video data through the decoding terminal 402 to obtain the original video data.
  • The video processing method may be as follows: when the first image block of the current frame meets the preset condition identified in the high-level syntax, the encoding terminal 401 uses HMVP to construct a first motion information candidate list for the first image block, and encodes the first image block according to the motion information in the first motion information candidate list.
  • When the preset condition is not met, the motion information of the neighboring blocks in the spatial domain and the HMVP are used to construct a second motion information candidate list for the first image block, and the first image block is encoded according to the motion information in the second motion information candidate list.
  • Likewise, when the preset condition is met, the decoding terminal 402 uses the HMVP to construct the first motion information candidate list for the first image block, and decodes the first image block according to the motion information in the first motion information candidate list.
  • When the preset condition is not met, the motion information of the neighboring blocks in the spatial domain and the HMVP are used to construct a second motion information candidate list for the first image block, and the first image block is decoded according to the motion information in the second motion information candidate list.
  • the video processing system described in the embodiments of the present application is to illustrate the technical solutions of the embodiments of the present application more clearly, and does not constitute a limitation on the technical solutions provided in the embodiments of the present application.
  • Those of ordinary skill in the art will know that, with the evolution of the system architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.
  • FIG. 5 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
  • the video processing method may include the following steps S501 to S503:
  • Step S501: When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, use HMVP to construct a first motion information candidate list for the first image block. That is, when the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the video processing device does not fill the motion information of the spatial neighboring blocks into the first motion information candidate list, but uses HMVP or another method that does not involve spatial dependence to construct the first motion information candidate list for the first image block.
  • The first image block satisfying the preset condition identified in the high-level syntax includes: the size of the first image block is smaller than or equal to the image block size identified in the high-level syntax. If the size of the first image block is less than or equal to the image block size identified in the high-level syntax, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level syntax.
  • For example, if the image block size identified in the high-level syntax is 64x64, the image blocks in the current frame whose size is less than or equal to 64x64 meet the preset condition identified in the high-level syntax, and the motion information candidate lists constructed for these image blocks may all use the HMVP but do not include the motion information of the neighboring blocks in the spatial domain.
  • If the size of the first image block is larger than the image block size identified in the high-level syntax, the video processing device may determine that the first image block does not meet the preset condition identified in the high-level syntax, and further execute step S502.
  • For example, if the image block size identified in the high-level syntax is 64x64, the image blocks in the current frame whose size is larger than 64x64 (for example, 64x128, 128x64, 128x128, etc.) do not meet the preset condition identified in the high-level syntax, and the motion information candidate lists constructed for these image blocks will all use the motion information of the neighboring blocks in the spatial domain.
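  • The size-threshold form of the preset condition can be sketched as follows. The text does not say how two block sizes are compared when only one dimension exceeds the signaled size; this sketch assumes a block is "less than or equal to" the signaled size when both its width and its height are, and `meets_size_threshold` is a hypothetical name.

```python
def meets_size_threshold(block_w, block_h, max_w=64, max_h=64):
    """Return True if the block is no larger than the signaled size.

    Assumption: the comparison is made per dimension, so a 64x128 block
    counts as larger than 64x64.
    """
    return block_w <= max_w and block_h <= max_h


# Blocks up to 64x64 would use the HMVP-only list; larger blocks
# (64x128, 128x64, 128x128) fall back to the list with spatial neighbors.
uses_hmvp_only = {size: meets_size_threshold(*size)
                  for size in [(32, 16), (64, 64), (64, 128), (128, 128)]}
```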
  • the size of the image block identified in the high-level grammar includes the size of the first image block. If the size of the image block identified in the high-level grammar includes the size of the first image block, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level grammar.
  • the size of the image block identified in the high-level syntax may include at least one. For example, the size of the image block identified in the high-level syntax is 4x8, 8x4, and 64x64, then the size of the current frame is 4x8, 8x4, and 64x64. All image blocks satisfy the preset conditions identified in the high-level grammar, and the motion information candidate list constructed for the image blocks that meet the preset conditions identified in the high-level grammar may use HMVP, but does not include the motion information of the spatial neighboring blocks .
  • the video processing device may determine that the first image block does not meet the preset condition identified in the high-level grammar.
  • the size of the image block identified in the high-level syntax may include at least one.
  • if the sizes of the image blocks identified in the high-level syntax are 4x8, 8x4, and 64x64, then none of the image blocks in the current frame whose size is not 4x8, 8x4, or 64x64 meets the preset condition identified in the high-level syntax, and the motion information candidate list constructed for image blocks that do not meet the preset condition uses the motion information of the spatial neighboring blocks.
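  • The size check described above can be sketched as a set-membership test. This is a minimal illustration rather than the implementation of the application: the function name `meets_signaled_size_condition` and the `(width, height)` tuple layout are assumptions introduced here.

```python
from typing import Iterable, Tuple

Size = Tuple[int, int]  # (width, height); this layout is an assumption


def meets_signaled_size_condition(block_size: Size,
                                  signaled_sizes: Iterable[Size]) -> bool:
    """Return True when the block size is one of the signaled sizes."""
    return block_size in set(signaled_sizes)


# Sizes taken from the example in the text: 4x8, 8x4 and 64x64.
signaled = [(4, 8), (8, 4), (64, 64)]

# An 8x4 block meets the condition (HMVP, no spatial neighbors).
print(meets_signaled_size_condition((8, 4), signaled))
# A 16x16 block does not meet it (falls through to step S502).
print(meets_signaled_size_condition((16, 16), signaled))
```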
  • the size of each identified image block is M*N, and both M and N are greater than or equal to 4.
  • M and N can be equal, or M and N can be unequal. Specifically:
  • M is greater than or equal to 4
  • N is greater than or equal to 4
  • M and N may be equal, and M and N may not be equal.
  • a high-level syntax can be used to identify image blocks with sizes of 4x4, 8x4, 16x32, 32x16, 64x128, and 128x128.
  • M is greater than or equal to 4
  • N is greater than or equal to 4
  • M and N may not be equal.
  • a high-level syntax can be used to identify image blocks with a size of 8x4, 16x32, 32x16, and 64x128.
  • one of M and N is greater than or equal to 4, and the other of M and N is greater than 4.
  • a high-level syntax can be used to identify image blocks with a size of 4x8, 8x4, 16x32, 32x16, 64x128, 128x128.
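  • The three size constraints listed above can be compared side by side with a small predicate. This is only a sketch: the variant labels are hypothetical names chosen here, and the second variant reads "M and N may not be equal" as "M and N are unequal", which is an interpretation suggested by the accompanying examples.

```python
def size_allowed(m: int, n: int, variant: str) -> bool:
    """Check an M*N block size against one of the three constraint variants.

    The variant names are hypothetical labels, not terms from the source.
    """
    if variant == "both_at_least_4":
        # M >= 4 and N >= 4; M and N may or may not be equal.
        return m >= 4 and n >= 4
    if variant == "both_at_least_4_unequal":
        # M >= 4 and N >= 4, with M different from N.
        return m >= 4 and n >= 4 and m != n
    if variant == "one_strictly_above_4":
        # One of M, N is >= 4 and the other is > 4 (this excludes 4x4).
        return min(m, n) >= 4 and max(m, n) > 4
    raise ValueError(f"unknown variant: {variant}")


for variant in ("both_at_least_4",
                "both_at_least_4_unequal",
                "one_strictly_above_4"):
    print(variant, size_allowed(4, 4, variant), size_allowed(128, 128, variant))
```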
  • the video processing device may add the identification of the syntax element to the high-level syntax, that is, identify that HMVP is used to construct the motion information candidate list for image blocks of certain sizes.
  • the sizes of the image blocks that can use HMVP, instead of the motion information of the spatial neighboring blocks, to construct the motion information candidate list in the subsequent encoding process are thereby determined.
  • the syntax element set by the video processing device may include an index value of the size of at least one image block.
  • the index value of 4x4 is 0, the index value of 4x8 is 1, and the index value of 4x16 is 2. If the video processing device sets the syntax element to include 0 and 1, then it can be determined that, for image blocks whose size is 4x4 or 4x8, the HMVP can be used to construct the motion information candidate list instead of the motion information of the spatial neighboring blocks.
  • the syntax element set by the video processing device may include the size of at least one image block.
  • HMVP can be used to construct a motion information candidate list instead of the motion information of spatial neighboring blocks.
  • the syntax element set by the video processing device may include an index value of the size of the image block.
  • the index value of 4x4 is 0, the index value of 4x8 is 1, and the index value of 4x16 is 2. If the video processing device sets the syntax element to include 2, then it can be determined that, for image blocks in the image with a size less than or equal to 4x16, HMVP is used to build the motion information candidate list.
  • the syntax element set by the video processing device may include the size of the image block.
  • the motion information candidate list can be constructed using HMVP instead of using the motion information of spatial neighboring blocks.
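  • The two uses of the size index described above (a set of indexes, or a single index acting as an upper bound) can be sketched as follows. The index table reproduces only the three example values given in the text. The names are hypothetical, and comparing sizes component by component for "less than or equal to" is an assumption, since the text does not define the comparison.

```python
from typing import Dict, Iterable, Set, Tuple

Size = Tuple[int, int]

# Example index table from the text: 4x4 -> 0, 4x8 -> 1, 4x16 -> 2.
SIZE_BY_INDEX: Dict[int, Size] = {0: (4, 4), 1: (4, 8), 2: (4, 16)}


def sizes_from_index_set(indexes: Iterable[int]) -> Set[Size]:
    """Interpret the syntax element as an explicit set of size indexes."""
    return {SIZE_BY_INDEX[i] for i in indexes}


def within_threshold(block: Size, threshold_index: int) -> bool:
    """Interpret the syntax element as an upper bound on the block size.

    Assumption: a size is 'less than or equal to' the bound when both of
    its dimensions are less than or equal to those of the bound.
    """
    bound = SIZE_BY_INDEX[threshold_index]
    return block[0] <= bound[0] and block[1] <= bound[1]


print(sizes_from_index_set([0, 1]))   # the sizes 4x4 and 4x8
print(within_threshold((4, 8), 2))    # 4x8 is within the 4x16 bound
print(within_threshold((8, 8), 2))    # 8x8 exceeds the 4x16 bound
```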
  • the preset conditions identified in the high-level syntax in the embodiments of the present application include but are not limited to the foregoing content.
  • the first image block meeting the preset condition identified in the high-level syntax includes: the position of the image block identified in the high-level syntax includes the position of the first image block in the image.
  • the video processing device can determine that the first image block satisfies the preset condition identified in the high-level syntax; if the position of the image block identified in the high-level syntax does not include the position of the first image block in the image, then the video processing device may determine that the first image block does not satisfy the preset condition identified in the high-level syntax. For example, if the position of the image block identified in the high-level syntax is the upper left corner or the lower right corner, the video processing device can obtain the position of the first image block in the image.
  • if the first image block is located at one of the identified positions, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level syntax. If the first image block is located in the upper right corner of the image, the video processing device may determine that the first image block does not satisfy the preset condition identified in the high-level syntax. The position of the image block identified in the high-level syntax includes at least one position.
  • the video processing device may add the identifier of the syntax element to the image header information, sequence header information, or slice header information.
  • each image corresponds to one piece of image header information. If a syntax element identifier is added to the image header information of a certain frame, then it can be determined that, for image blocks in that frame whose size is the size indicated by the syntax element, the motion information candidate list is constructed using HMVP or other methods that do not involve spatial dependence, and the motion information of the spatial neighboring blocks is not used to construct the motion information candidate list.
  • each sequence of video data corresponds to a sequence header information.
  • the motion information candidate list can be constructed by using HMVP instead of using the motion information of the neighboring blocks in the spatial domain.
  • each frame can correspond to at least one piece of slice header information. If a syntax element identifier is added to a certain piece of slice header information in a frame, then it can be determined that, for image blocks in the frame whose size is the size indicated by the syntax element, HMVP can be used to construct the motion information candidate list instead of the motion information of the spatial neighboring blocks.
  • the way the preset condition is identified in the high-level syntax in the embodiments of the present application includes but is not limited to the foregoing content, as long as it can be used to determine for which image blocks of the current frame the motion information candidate list is constructed without using the motion information of the spatial neighboring blocks.
  • the type of motion information with which the video processing device constructs the first motion information candidate list for the first image block is determined according to the prediction mode of the first image block.
  • the prediction mode may include inter prediction or IBC.
  • the motion information with which the video processing device constructs the first motion information candidate list for the first image block may include HMVP.
  • the video processing device may select HMVP from the HMVP list as the motion information in the first motion information candidate list. If the first motion information candidate list is not full after the selected candidate HMVP is filled into it, the zero motion vector (0, 0) is used to pad the first motion information candidate list until it is full.
  • the motion information in the first motion information candidate list includes a zero motion vector. For example, if all the first image blocks meet the preset condition identified in the high-level syntax and, after each first image block is encoded or decoded, the motion information used in its encoding or decoding is not used to update the HMVP list, the HMVP list may be empty. In that case, the motion information in the first motion information candidate list may be a zero motion vector.
  • the motion information of the first motion information candidate list constructed by the video processing device for the first image block may also include motion information of temporal neighboring blocks.
  • the motion information of the first motion information candidate list constructed by the video processing device for the first image block may include motion information of temporal neighboring blocks, HMVP, and the pairwise average candidate MV. If the first motion information candidate list is not full after the motion information of the temporal neighboring blocks, the HMVP, and the pairwise average candidate MV are filled into it, the zero motion vector (0, 0) is used to pad the first motion information candidate list until it is full.
  • the motion information of the first motion information candidate list constructed by the video processing device for the first image block may include the motion information of temporal neighboring blocks and the pairwise average candidate MV. Further, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may also include a zero motion vector.
  • the motion information of the first motion information candidate list constructed by the video processing device for the first image block may include motion information of temporal neighboring blocks and HMVP. If the first motion information candidate list is not full after the motion information of the temporal neighboring blocks and the selected candidate HMVP are filled into it, the zero motion vector (0, 0) is used to pad the first motion information candidate list until it is full.
  • the motion information of the first motion information candidate list constructed by the video processing device for the first image block may include the motion information of temporal neighboring blocks. Further, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may also include a zero motion vector.
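  • The fill-then-pad behaviour of the first motion information candidate list described in the variants above can be sketched as follows. This is a simplified illustration under stated assumptions: motion information is reduced to an `(x, y)` motion vector, the pairwise average candidates are passed in already computed, and the function name and `max_size` parameter are hypothetical. Any of the three sources may be left empty to obtain the other variants.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]
ZERO_MV: MV = (0, 0)


def build_first_candidate_list(temporal: Sequence[MV],
                               hmvp: Sequence[MV],
                               pairwise_average: Sequence[MV],
                               max_size: int) -> List[MV]:
    """Fill from temporal, HMVP and pairwise-average candidates, then pad.

    No spatial-neighbor motion information is consulted at any point.
    """
    candidates: List[MV] = []
    for source in (temporal, hmvp, pairwise_average):
        for mv in source:
            if len(candidates) == max_size:
                return candidates
            candidates.append(mv)
    # Pad with the zero motion vector until the list is full.
    while len(candidates) < max_size:
        candidates.append(ZERO_MV)
    return candidates


print(build_first_candidate_list([(1, 2)], [(3, 4), (5, 6)], [], 5))
# With an empty HMVP list and no other sources, only zero vectors remain.
print(build_first_candidate_list([], [], [], 2))
```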
  • the encoding of the first image block is synchronized with the encoding of the second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block.
  • the second image block is a spatial neighboring block of the first image block.
  • since the video processing device does not need to use the motion information of spatial neighboring blocks to construct the first motion information candidate list for the first image block when the first image block satisfies the preset condition identified in the high-level syntax, it can parallelize the construction of the first motion information candidate list of the first image block and the construction of the motion information candidate list of the second image block, so that the encoding of the first image block is synchronized with the encoding of the second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block.
  • the identification of the syntax element is added for the above operation, that is, image blocks of certain sizes can be processed in parallel with other image blocks, which makes the parallel construction of motion information candidate lists adjustable by block size. Specifically, setting the syntax element determines the sizes of the image blocks whose motion information candidate list construction can be parallelized with that of other image blocks in the subsequent encoding process.
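  • The parallelism described above follows from the fact that neither list depends on the coding result of the neighboring block. A toy sketch using Python's standard `concurrent.futures` module is given below. It assumes, purely for illustration, that both blocks build their lists from the same read-only snapshot of the HMVP list; `build_from_hmvp` is a hypothetical helper, and a real codec would not use Python threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

MV = Tuple[int, int]


def build_from_hmvp(hmvp_snapshot: Sequence[MV], max_size: int) -> List[MV]:
    """Build a candidate list from HMVP only, padded with zero vectors."""
    candidates = list(hmvp_snapshot)[:max_size]
    return candidates + [(0, 0)] * (max_size - len(candidates))


# Read-only snapshot shared by two spatially neighboring blocks.
hmvp_snapshot = ((2, 1), (0, 3))

with ThreadPoolExecutor(max_workers=2) as pool:
    # Neither task waits for the other block to be encoded or decoded.
    first_future = pool.submit(build_from_hmvp, hmvp_snapshot, 4)
    second_future = pool.submit(build_from_hmvp, hmvp_snapshot, 4)
    first_list = first_future.result()
    second_list = second_future.result()

print(first_list, second_list)
```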
  • after the video processing device encodes or decodes the first image block according to the motion information in the first motion information candidate list, it may refrain from using the motion information used in the encoding or decoding of the first image block to update the HMVP list.
  • the video processing device uses HMVP to construct a first motion information candidate list for the first image block when the first image block satisfies the preset condition identified in the high-level syntax, and then encodes or decodes the first image block according to the motion information in the first motion information candidate list.
  • after the video processing device encodes or decodes the first image block according to the motion information in the first motion information candidate list, it does not use the motion information used in the encoding or decoding of the first image block to update the HMVP list. That is, the HMVP list does not include the motion information used in encoding or decoding of the first image block.
  • the HMVP list update is skipped for the motion information used in encoding or decoding of the image block, so the HMVP list is updated less often.
  • the encoding and decoding complexity of the image block is reduced, and the throughput rate of the image block during encoding or decoding is improved.
  • the HMVP list may be operated based on the prediction mode of the first image block.
  • the prediction mode may include inter prediction or IBC.
  • the prediction mode is inter-frame prediction
  • the video processing device can update the HMVP list by using the motion information used in encoding or decoding of the first image block.
  • the prediction mode is IBC
  • the video processing device can keep the HMVP list unchanged.
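  • The prediction-mode-dependent handling of the HMVP list can be sketched as follows. The text only states that the list may be updated for inter prediction and kept unchanged for IBC. The first-in-first-out table with duplicate removal used here is an assumption borrowed from common HMVP designs, and the function name and mode strings are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple

MV = Tuple[int, int]


def update_hmvp_list(hmvp: Deque[MV], used_mv: MV,
                     prediction_mode: str) -> Deque[MV]:
    """Update the HMVP list after a block has been encoded or decoded."""
    if prediction_mode == "inter":
        # Assumed policy: drop an identical entry, then append as newest.
        if used_mv in hmvp:
            hmvp.remove(used_mv)
        hmvp.append(used_mv)  # a bounded deque discards its oldest entry
    elif prediction_mode == "ibc":
        pass  # the HMVP list is kept unchanged
    else:
        raise ValueError(f"unsupported prediction mode: {prediction_mode}")
    return hmvp


table: Deque[MV] = deque([(1, 1), (2, 2)], maxlen=2)
update_hmvp_list(table, (3, 3), "inter")
update_hmvp_list(table, (9, 9), "ibc")
print(list(table))
```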
  • Step S502 When the first image block of the current frame does not meet the preset condition identified in the high-level syntax, construct a second motion information candidate list for the first image block using the motion information of the spatial neighboring blocks and the HMVP.
  • the specific process for the video processing device to construct the second motion information candidate list may be:
  • the spatial neighboring blocks of CU1 include CU2 and CU3.
  • the motion information of the encoded CU2 may be filled into the second motion information candidate list
  • the motion information of the encoded CU3 may be filled into the second motion information candidate list. It can be seen that before using the motion information of the spatial neighboring blocks and HMVP to construct the second motion information candidate list for the image block, it is necessary to ensure that the spatial neighboring blocks of the image block have been encoded or decoded. If the encoding or decoding of the spatial neighboring blocks has not been completed, the image block cannot be encoded or decoded, resulting in a low throughput rate of the encoding and decoding of the image block.
  • the video processing device may select HMVP from the HMVP list as the motion information in the motion information candidate list. If the motion information candidate list is not full after the motion information of the coded neighboring CUs and the selected HMVP are filled into it, the motion information candidate list is padded with the zero motion vector (0, 0) until it is full.
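  • The dependency that step S502 introduces can be made explicit in a short sketch. Here a spatial neighbor whose motion information is `None` stands for a neighbor that has not yet been encoded or decoded. That convention, the function name, and the use of plain `(x, y)` vectors are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

MV = Tuple[int, int]


def build_second_candidate_list(spatial_neighbors: Dict[str, Optional[MV]],
                                hmvp: Sequence[MV],
                                max_size: int) -> List[MV]:
    """Fill from spatial neighbors, then HMVP, then pad with zero vectors."""
    pending = [name for name, mv in spatial_neighbors.items() if mv is None]
    if pending:
        # The current block must wait for these neighbors to be coded.
        raise RuntimeError(f"neighbors not yet coded: {pending}")
    candidates = [mv for mv in spatial_neighbors.values() if mv is not None]
    candidates += list(hmvp)
    candidates = candidates[:max_size]
    return candidates + [(0, 0)] * (max_size - len(candidates))


# CU1 with coded spatial neighbors CU2 and CU3, as in the text.
print(build_second_candidate_list({"CU2": (1, 0), "CU3": (0, 2)}, [(5, 5)], 4))
```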
  • in the process of encoding or decoding, depending on whether the image block meets the preset condition identified in the high-level syntax, either the first motion information candidate list or the second motion information candidate list is constructed.
  • the motion information candidate list may be, for example, an MVP candidate list.
  • the above-mentioned construction method of the MVP candidate list is the construction method of the first motion information candidate list or the second motion information candidate list.
  • the motion information candidate list mentioned in the embodiments of the present application (for example, the first motion information candidate list or the second motion information candidate list) may be a set of candidate motion information of the image block, and the pieces of candidate motion information in the motion information candidate list can be stored in the same buffer or in different buffers; there is no restriction here.
  • the index of the motion information in the motion information candidate list may be the index of the motion information in the set of candidate motion information of the image block.
  • the set of candidate motion information includes 5 candidate motion information, and the indexes of the 5 candidate motion information in the motion information candidate list may be 0, 1, 2, 3, 4, respectively.
  • the motion information mentioned in the embodiments of the present application may include a motion vector, or include a motion vector and reference frame information (for example, a reference frame index), and so on.
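  • As a concrete picture of the data involved, the candidate list can be modelled as an indexed collection of motion information records. The sketch below is one possible representation, not the one used by the application: the class name `MotionInfo` and its field names are hypothetical, and the reference frame index is optional because the text says motion information may consist of a motion vector alone.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MotionInfo:
    mv: Tuple[int, int]            # motion vector (x, y)
    ref_idx: Optional[int] = None  # reference frame index, if present


# A candidate set with 5 entries, indexed 0..4 as in the text.
candidate_list: List[MotionInfo] = [
    MotionInfo((1, 0), ref_idx=0),
    MotionInfo((0, 2), ref_idx=1),
    MotionInfo((3, 3)),
    MotionInfo((0, 0)),
    MotionInfo((0, 0)),
]

indexes = list(range(len(candidate_list)))
selected = candidate_list[2]  # the index of a candidate identifies it
print(indexes, selected)
```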
  • Step S503 encode or decode the first image block according to the motion information in the first motion information candidate list or the second motion information candidate list.
  • the video processing device can use HMVP to construct a first motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the first motion information candidate list.
  • the video processing device can use the motion information of the spatial neighboring blocks and the HMVP to construct a second motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the second motion information candidate list.
  • the video processing device uses HMVP to construct the first motion information candidate list for the first image block, which reduces the complexity of the construction process of the motion information candidate list and improves the efficiency of image encoding and image decoding.
  • image blocks that meet the preset conditions can be set to be adjustable in size through high-level syntax, which can increase the flexibility and adaptability of coding and decoding.
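  • Putting steps S501 to S503 together, the choice between the two lists can be sketched as a single dispatch on the signaled sizes. This is an illustrative outline only: `choose_candidate_list` and its parameters are hypothetical, the preset condition is reduced to a size-membership test, and the actual encoding or decoding of the block is omitted.

```python
from typing import List, Sequence, Set, Tuple

MV = Tuple[int, int]
Size = Tuple[int, int]


def choose_candidate_list(block_size: Size,
                          signaled_sizes: Set[Size],
                          spatial: Sequence[MV],
                          hmvp: Sequence[MV],
                          max_size: int) -> Tuple[str, List[MV]]:
    """Return which list was built ('first' or 'second') and its contents."""
    if block_size in signaled_sizes:
        # S501: condition met, spatial neighbors are not consulted.
        label, source = "first", list(hmvp)
    else:
        # S502: condition not met, spatial neighbors and HMVP are used.
        label, source = "second", list(spatial) + list(hmvp)
    candidates = source[:max_size]
    candidates += [(0, 0)] * (max_size - len(candidates))
    return label, candidates  # S503 would code the block from this list


print(choose_candidate_list((8, 4), {(8, 4)}, [(7, 7)], [(1, 1)], 3))
print(choose_candidate_list((16, 16), {(8, 4)}, [(7, 7)], [(1, 1)], 3))
```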
  • FIG. 7 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • the video processing method may include the following steps S701 to S703:
  • Step S701 When the first image block of the current frame satisfies a preset condition, use the HMVP to construct a first motion information candidate list for the first image block. That is to say, when the first image block of the current frame meets the preset condition, the video processing device does not fill the motion information of the spatial neighboring blocks into the first motion information candidate list, but can use HMVP to construct the first image block The first motion information candidate list.
  • That the first image block satisfies the preset condition includes: the size of the first image block meets the preset size.
  • the preset size may be a size preset by the video processing device, or a default value of the encoder or decoder, or a size agreed on by both the encoder and the decoder.
  • the size of the first image block meeting the preset size includes: the size of the first image block is less than or equal to the preset size; or, the preset size includes the size of the first image block. That is, if the size of the first image block is less than or equal to the preset size, or the preset size includes the size of the first image block, the video processing device may determine that the size of the first image block meets the preset size.
  • the preset size includes at least one, and each preset size is M*N, and both M and N are greater than or equal to 4.
  • M and N may be equal, or M and N may not be equal.
  • the video processing device may determine that the first image block meets the preset condition. For example, if the preset size is 64x64, then all image blocks with a size smaller than or equal to 64x64 (for example, 4x8, 8x4, 16x32, etc.) in the current frame meet the preset condition.
  • M is greater than or equal to 4
  • N is greater than or equal to 4
  • M and N may be equal or not equal.
  • if the preset sizes include 4x4, 8x4, 16x32, 32x16, 64x128, and 128x128, then the image blocks with sizes of 4x4, 8x4, 16x32, 32x16, 64x128, and 128x128 in the current frame all meet the preset condition.
  • M is greater than or equal to 4
  • N is greater than or equal to 4
  • M and N may not be equal.
  • if the preset sizes are 8x4, 16x32, 32x16, and 64x128, then the image blocks with sizes of 8x4, 16x32, 32x16, and 64x128 in the current frame all meet the preset condition.
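  • The two ways of meeting the preset size in step S701 (being no larger than a preset size, or being one of the preset sizes) can be sketched with one helper. The mode labels are hypothetical, and "less than or equal to" is again assumed to mean that both dimensions are less than or equal, which the text does not spell out.

```python
from typing import Sequence, Tuple

Size = Tuple[int, int]


def size_meets_preset(block: Size, preset_sizes: Sequence[Size],
                      mode: str) -> bool:
    """Check the block size against the preset size(s)."""
    if mode == "included":
        # The preset sizes include the block size.
        return block in preset_sizes
    if mode == "at_most":
        # Assumed reading: both dimensions within some preset size.
        return any(block[0] <= p[0] and block[1] <= p[1]
                   for p in preset_sizes)
    raise ValueError(f"unknown mode: {mode}")


print(size_meets_preset((16, 32), [(64, 64)], "at_most"))
print(size_meets_preset((16, 32), [(64, 64)], "included"))
print(size_meets_preset((8, 4), [(8, 4), (16, 32)], "included"))
```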
  • the size of the first image block is K*L, one of K and L is greater than or equal to 4, and the other of K and L is greater than 4.
  • the HMVP information is used to construct a first motion information candidate list for the first image block.
  • the type of motion information for constructing the first motion information candidate list for the first image block is determined according to the prediction mode of the first image block.
  • the prediction mode may include inter prediction or IBC.
  • the motion information for constructing the first motion information candidate list for the first image block may also include motion information of temporal neighboring blocks.
  • the motion information for constructing the first motion information candidate list for the first image block may include HMVP.
  • the encoding of the first image block is synchronized with the encoding of the second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block.
  • the second image block is a spatial neighboring block of the first image block.
  • after the video processing device encodes or decodes the first image block according to the motion information in the first motion information candidate list, it does not use the motion information used in the encoding or decoding of the first image block to update the HMVP list.
  • the difference between step S701 and step S501 is that the preset condition is not necessarily identified in the high-level syntax; that is, in step S701, the preset condition may not be identified in the high-level syntax, and whether the first image block meets the preset condition is instead determined in real time during the encoding or decoding of the first image block. For the parts of step S701 that are the same as step S501, refer to the corresponding description of S501, which is not repeated in this embodiment of the application.
  • Step S702 When the first image block of the current frame does not meet the preset condition, use the motion information of the neighboring blocks in the spatial domain and the HMVP to construct a second motion information candidate list for the first image block.
  • the difference between step S702 and step S502 is that the preset condition is not necessarily identified in the high-level syntax; that is, in step S702, the preset condition may not be identified in the high-level syntax, and whether the first image block meets the preset condition is instead determined in real time during the encoding or decoding of the first image block. For the parts of step S702 that are the same as step S502, refer to the corresponding description of S502, which is not repeated in this embodiment of the application.
  • Step S703 encoding or decoding the first image block according to the motion information in the first motion information candidate list or the second motion information candidate list.
  • the video processing device can use HMVP to construct a first motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the first motion information candidate list.
  • the video processing device can construct a second motion information candidate list for the first image block by using the motion information of the spatial neighboring blocks and HMVP, and then encode or decode the first image block according to the motion information in the second motion information candidate list.
  • the video processing device uses HMVP to construct the first motion information candidate list for the first image block when the first image block of the current frame meets the preset condition, which reduces the complexity of the construction process of the motion information candidate list and improves the efficiency of image encoding and image decoding.
  • FIG. 8 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • the video processing method may include the following steps S801 and S802:
  • Step S801 When the first image block of the current frame meets the preset condition identified in the high-level syntax, construct a first motion information candidate list for the first image block according to the preset rule, where the preset rule is used to indicate that the motion information added to the first motion information candidate list for the first time is the motion information of temporal neighboring blocks or HMVP.
  • the motion information of the motion information candidate list constructed by the video processing device for the image block may include motion information of spatial neighboring blocks, motion information of temporal neighboring blocks, HMVP, and pairwise average candidate MVs.
  • the order in which the video processing device adds the motion information to the motion information candidate list can be: motion information of spatial neighboring blocks → motion information of temporal neighboring blocks → HMVP → pairwise average candidate MV. That is, the motion information of the spatial neighboring blocks is filled into the motion information candidate list first, then the motion information of the temporal neighboring blocks, then the HMVP, and finally the pairwise average candidate MV.
  • if the motion information candidate list is still not full, it is padded with the zero motion vector until it is full.
  • the motion information of the motion information candidate list constructed by the video processing device for the image block may include motion information of spatial neighboring blocks, motion information of temporal neighboring blocks, and HMVP.
  • the order in which the video processing device adds the motion information to the motion information candidate list can be: motion information of spatial neighboring blocks → motion information of temporal neighboring blocks → HMVP. That is, the motion information of the spatial neighboring blocks is filled into the motion information candidate list first, then the motion information of the temporal neighboring blocks, and then the HMVP. If the motion information candidate list is not full after the motion information of the temporal neighboring blocks and the HMVP are filled into it, the motion information candidate list is padded with the zero motion vector until it is full.
  • the preset rule is used to indicate that the first motion information added to the first motion information candidate list is the motion information of temporal neighboring blocks, which means that the motion information of the spatial neighboring blocks is excluded from the first motion information candidate list.
  • the motion information of the motion information candidate list constructed by the video processing device for the image block may include motion information of spatial neighboring blocks and HMVP. If the motion information candidate list is not full after the motion information of the spatial neighboring blocks and the HMVP are filled into it, the motion information candidate list is padded with the zero motion vector until it is full.
  • the order in which the video processing device adds the motion information to the motion information candidate list may be: motion information of spatial neighboring blocks → HMVP → zero motion vector.
  • the preset rule is used to indicate that the first motion information added to the first motion information candidate list is HMVP, which means that the motion information of the spatial neighboring blocks is excluded from the first motion information candidate list.
  • the motion information added to the motion information candidate list for the first time is the motion information of the spatial neighboring blocks, whereas the embodiments of the present application aim to exclude the use of the motion information of spatial neighboring blocks when constructing the first motion information candidate list.
  • the first motion information added to the first motion information candidate list in this embodiment of the application is therefore the motion information of temporal neighboring blocks or HMVP.
  • the difference between step S801 and step S501 is that the construction of the first motion information candidate list is constrained not by the type of motion information but by the order in which motion information is added to the first motion information candidate list.
  • the motion information of the spatial neighboring blocks is not used to construct the first motion information candidate list. For the parts of step S801 that are the same as step S501, refer to the corresponding description of S501, which is not repeated in this embodiment of the application.
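  • The order-based preset rule of step S801 can be sketched by driving the list construction from an explicit source order whose first entry is either the temporal candidates or HMVP. The source labels (`"temporal"`, `"hmvp"`, `"pairwise_average"`, `"spatial"`) and both function names are hypothetical. Each candidate is tagged with its source here only to make the insertion order visible.

```python
from typing import Dict, List, Sequence, Tuple

MV = Tuple[int, int]


def first_added_excludes_spatial(order: Sequence[str]) -> bool:
    """The preset rule: the first source added is temporal or HMVP."""
    return len(order) > 0 and order[0] in ("temporal", "hmvp")


def build_in_order(order: Sequence[str], sources: Dict[str, Sequence[MV]],
                   max_size: int) -> List[Tuple[str, MV]]:
    """Fill the list source by source in the given order, then pad."""
    candidates: List[Tuple[str, MV]] = []
    for kind in order:
        for mv in sources.get(kind, ()):
            if len(candidates) < max_size:
                candidates.append((kind, mv))
    while len(candidates) < max_size:
        candidates.append(("zero", (0, 0)))
    return candidates


available = {"spatial": [(9, 9)], "temporal": [(1, 1)], "hmvp": [(2, 2)]}
rule_order = ("temporal", "hmvp")  # spatial is absent from the order
print(first_added_excludes_spatial(rule_order))
print(build_in_order(rule_order, available, 3))
```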
  • Step S802 encode or decode the first image block according to the motion information in the first motion information candidate list.
  • when the first image block of the current frame does not meet the preset condition identified in the high-level syntax, the video processing device constructs a second motion information candidate list for the first image block according to another preset rule, and the other preset rule is used to indicate that the motion information added to the second motion information candidate list for the first time is the motion information of spatial neighboring blocks. Then, the video processing device encodes or decodes the first image block according to the motion information in the second motion information candidate list.
  • the first motion information candidate list is constructed for the first image block according to the preset rule, and the preset rule is used to indicate that the motion information added to the first motion information candidate list for the first time is the motion information of temporal neighboring blocks or HMVP, rather than the motion information of spatial neighboring blocks. This excludes the use of spatial neighboring blocks in constructing the first motion information candidate list, reduces the complexity of the construction process of the motion information candidate list, and improves the efficiency of image encoding and image decoding.
  • FIG. 9 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • the video processing method may include the following steps S901 and S902:
  • Step S901 When the first image block of the current frame meets a preset condition, construct a first motion information candidate list for the first image block according to the preset rule, where the preset rule is used to indicate that the motion information added to the first motion information candidate list for the first time is the motion information of temporal neighboring blocks or HMVP.
  • the difference between step S901 and step S801 is that the preset condition is not necessarily identified in the high-level syntax; that is, in step S901, the preset condition may not be identified in the high-level syntax, and whether the first image block meets the preset condition is instead determined in real time during the encoding or decoding of the first image block. For the parts of step S901 that are the same as step S801, refer to the corresponding description of S801, which is not repeated in this embodiment of the application.
  • Step S902 encode or decode the first image block according to the motion information in the first motion information candidate list.
  • When the first image block of the current frame does not meet the preset condition, the video processing device constructs a second motion information candidate list for the first image block according to another preset rule, which indicates that the motion information added first to the second motion information candidate list is the motion information of spatial neighboring blocks. The video processing device then encodes or decodes the first image block according to the motion information in the second motion information candidate list.
  • In this embodiment, a first motion information candidate list is constructed for the first image block according to a preset rule, and the preset rule indicates that the motion information added first to the first motion information candidate list is the motion information of temporal neighboring blocks or HMVP, rather than the motion information of spatial neighboring blocks. This reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and decoding.
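The ordering rule of steps S901 and S902 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the function name `build_ordered_candidate_list`, the length limit `max_len`, the duplicate check, and the sources appended after the spatial candidates in the second list are all hypothetical choices made for the example.

```python
def build_ordered_candidate_list(meets_condition, spatial, temporal, hmvp, max_len=6):
    """Return a candidate list whose first entries follow the selected rule.

    spatial, temporal and hmvp are lists of (mvx, mvy) tuples.
    max_len and the duplicate check are illustrative assumptions.
    """
    if meets_condition:
        # Preset rule: temporal-neighbor motion information or HMVP comes first;
        # spatial neighbors are not consulted.
        sources = [temporal, hmvp]
    else:
        # Other preset rule: spatial-neighbor motion information comes first.
        # What follows the spatial candidates is assumed here.
        sources = [spatial, temporal, hmvp]

    candidates = []
    for source in sources:
        for motion in source:
            if len(candidates) == max_len:
                return candidates
            if motion not in candidates:
                candidates.append(motion)
    return candidates
```

With this sketch, a block that meets the condition never reads `spatial`, so its list does not depend on whether neighboring blocks have already been processed.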
  • FIG. 10 is a schematic structural diagram of a video processing device according to an embodiment of the present application.
  • the video processing device described in the embodiment of the present application at least includes: a processor 1001 and a memory 1002, where:
  • the memory 1002 is configured to store a computer program, and the computer program includes program instructions;
  • the processor 1001 calls the program instructions to execute the following steps:
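  • When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, construct a first motion information candidate list for the first image block by using HMVP; when the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, construct a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and the HMVP.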
  • That the first image block satisfies the preset condition identified in the high-level syntax includes: the image block sizes identified in the high-level syntax include the size of the first image block.
  • That the first image block satisfies the preset condition identified in the high-level syntax includes: the size of the first image block is smaller than or equal to the image block size identified in the high-level syntax.
  • At least one image block size is identified in the high-level syntax, each identified size is M*N, and both M and N are greater than or equal to 4.
  • M and N are not equal.
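The two variants of the size-based preset condition can be sketched as small predicates. The function names are hypothetical, and comparing width and height separately in the "smaller than or equal" variant is an assumption, since the text does not say how two-dimensional sizes are compared.

```python
def is_valid_identified_size(m, n):
    """An identified size M*N requires both M and N to be at least 4."""
    return m >= 4 and n >= 4


def meets_by_membership(block_size, identified_sizes):
    """Variant 1: the identified sizes include the size of the first image block."""
    return block_size in identified_sizes


def meets_by_upper_bound(block_size, identified_size):
    """Variant 2: the block is no larger than the identified size.

    Comparing width and height separately is an assumption.
    """
    width, height = block_size
    m, n = identified_size
    return width <= m and height <= n
```

For example, with identified sizes 4*8 and 8*4, a 4*8 block meets the membership variant, while a 16*4 block does not meet the upper-bound variant against 8*4.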
  • The type of motion information used to construct the first motion information candidate list for the first image block is determined based on the prediction mode of the first image block.
  • The prediction mode includes inter prediction or IBC. When the prediction mode is inter prediction, the motion information used to construct the first motion information candidate list for the first image block further includes the motion information of temporal neighboring blocks; when the prediction mode is IBC, the motion information used to construct the first motion information candidate list for the first image block includes the HMVP.
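The dependence of the candidate sources on the prediction mode can be written as a simple lookup. This is only a sketch: the enum `PredMode`, the function name `first_list_sources`, and the string labels are hypothetical names introduced for illustration.

```python
from enum import Enum


class PredMode(Enum):
    INTER = "inter"
    IBC = "ibc"


def first_list_sources(mode):
    """Return the kinds of motion information used for the first candidate list."""
    if mode is PredMode.INTER:
        # Inter prediction: HMVP plus motion information of temporal neighbors.
        return ("hmvp", "temporal")
    if mode is PredMode.IBC:
        # IBC: HMVP only.
        return ("hmvp",)
    raise ValueError(f"unsupported prediction mode: {mode}")
```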
  • The encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block, where the second image block is a spatial neighboring block of the first image block.
  • After encoding or decoding the first image block according to the motion information in the first motion information candidate list, the processor 1001 does not update the HMVP list with the motion information used when encoding or decoding the first image block.
  • the processor 1001 is further configured to perform the following operations after encoding or decoding the first image block according to the motion information in the first motion information candidate list:
  • the prediction mode includes inter prediction or IBC
  • the processor 1001 is specifically configured to perform the following operations when operating the HMVP list based on the prediction mode of the first image block:
  • the prediction mode is the inter-frame prediction
  • the processor 1001 described in the embodiment of the present application may execute the implementation manner described in the video processing method provided in FIG. 5 of the embodiment of the present application, and details are not described herein again.
  • processor 1001 calls the program instructions to perform the following steps:
  • When the first image block of the current frame satisfies a preset condition, construct a first motion information candidate list for the first image block by using HMVP; when the first image block of the current frame does not satisfy the preset condition, construct a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and the HMVP.
  • That the first image block satisfies the preset condition includes: the size of the first image block satisfies a preset size.
  • That the size of the first image block satisfies a preset size includes: the size of the first image block is less than or equal to the preset size; or, the preset sizes include the size of the first image block.
  • There is at least one preset size, each preset size is M*N, and both M and N are greater than or equal to 4.
  • M and N are not equal.
  • The type of motion information used to construct the first motion information candidate list for the first image block is determined based on the prediction mode of the first image block.
  • The prediction mode includes inter prediction or IBC. When the prediction mode is inter prediction, the motion information used to construct the first motion information candidate list for the first image block further includes the motion information of temporal neighboring blocks; when the prediction mode is IBC, the motion information used to construct the first motion information candidate list for the first image block includes the HMVP.
  • The encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block, where the second image block is a spatial neighboring block of the first image block.
  • After encoding or decoding the first image block according to the motion information in the first motion information candidate list, the processor 1001 does not update the HMVP list with the motion information used when encoding or decoding the first image block.
  • the processor 1001 is further configured to perform the following operations after encoding or decoding the first image block according to the motion information in the first motion information candidate list:
  • the prediction mode includes inter prediction or IBC
  • When the processor 1001 operates the HMVP list based on the prediction mode of the first image block, it specifically performs the following operations:
  • the prediction mode is the inter-frame prediction
  • update the HMVP list by using the motion information used during encoding or decoding of the first image block
  • The size of the first image block is K*L, where one of K and L is greater than or equal to 4 and the other of K and L is greater than 4.
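The two behaviors described for the HMVP list (leaving it unchanged, or updating it with the motion information of the block just coded) can be sketched as below. The text does not specify how the update is carried out, so the first-in-first-out policy with duplicate removal, the table size `max_size`, and the names `update_hmvp` and `skip_update` are assumptions made for illustration.

```python
from collections import deque


def update_hmvp(table, motion, skip_update=False, max_size=5):
    """Update an HMVP table after a block has been encoded or decoded.

    table is a deque of (mvx, mvy) tuples, oldest first. skip_update models
    the embodiment in which the list is left unchanged. max_size and the
    first-in-first-out policy with duplicate removal are assumptions.
    """
    if skip_update:
        return table
    if motion in table:
        table.remove(motion)   # move an existing entry to the newest slot
    elif len(table) >= max_size:
        table.popleft()        # drop the oldest entry when the table is full
    table.append(motion)
    return table
```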
  • the processor 1001 described in the embodiment of the present application may execute the implementation manner described in the video processing method provided in FIG. 7 of the embodiment of the present application, and details are not described herein again.
  • An embodiment of the present application also provides a computer-readable storage medium that stores program instructions. When executed, the program instructions may carry out some or all of the steps of the video processing methods in the embodiments corresponding to FIG. 5 and FIG. 7 to FIG. 9.
  • The embodiments of the present application also provide a computer program product. When the computer program product is run by a computer device, it can execute some or all of the steps of the video processing methods in the embodiments corresponding to FIG. 5 and FIG. 7 to FIG. 9.
  • The computer program can be stored in a computer-readable storage medium. When executed, the program may carry out the procedures of the foregoing method embodiments.
  • the computer-readable storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.

Abstract

A video processing method, a video processing device, and a computer-readable storage medium. The video processing method comprises: when a first image block of a current frame satisfies a preset criterion identified in a high-level syntax, utilizing HMVP to construct a first motion information candidate list for the first image block; and, when the first image block of the current frame does not satisfy the preset criterion identified in the high-level syntax, utilizing motion information of a spatially adjacent block and HMVP to construct a second motion information candidate list for the first image block. The employment of the embodiments of the present application reduces the complexity of the process of constructing the motion information candidate lists and increases the efficiency of image coding and image decoding.

Description

Video processing method, video processing device, and computer-readable storage medium
Technical field
This application relates to the field of communication technology, and in particular to a video processing method, a video processing device, and a computer-readable storage medium.
Background
Video data is a continuous image sequence composed of consecutive frames, where each frame is one image. Video data is highly correlated, which means it contains a large amount of redundant information. This redundant information can be divided into spatial redundancy and temporal redundancy. Image encoding removes the redundant information from each frame of the video data (that is, removes the correlation between the data) to obtain an encoded image. Image decoding recovers the original image from the encoded image. However, traditional image encoding and decoding methods need to use the motion information of the spatial neighboring blocks of an image block to construct a motion information candidate list for that image block, which makes the construction of the motion information candidate list relatively complex and reduces the efficiency of image encoding and decoding.
Summary of the invention
The embodiments of this application provide a video processing method, a video processing device, and a computer-readable storage medium. When an image block satisfies a preset condition, the motion information candidate list for that image block can be constructed without using the motion information of spatial neighboring blocks. This reduces the complexity of constructing the motion information candidate list, improves the efficiency of image encoding and decoding, and allows the motion information candidate lists of image blocks that satisfy the preset condition to be constructed in parallel.
In a first aspect, an embodiment of this application provides a video processing method, which includes:
when a first image block of a current frame satisfies a preset condition identified in a high-level syntax, constructing a first motion information candidate list for the first image block by using HMVP;
when the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, constructing a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and HMVP.
In a second aspect, an embodiment of this application provides another video processing method, which includes:
when a first image block of a current frame satisfies a preset condition, constructing a first motion information candidate list for the first image block by using HMVP;
when the first image block of the current frame does not satisfy the preset condition, constructing a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and HMVP.
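The selection between the two candidate lists described in the first and second aspects can be sketched as follows. This is a hedged illustration only: the function name `build_motion_candidate_list`, the length limit `max_len`, the duplicate check, and the order in which spatial and HMVP candidates are appended to the second list are assumptions, not details stated in the text.

```python
def build_motion_candidate_list(meets_condition, hmvp, spatial, max_len=6):
    """Build the first list (HMVP only) or the second list (spatial + HMVP).

    hmvp and spatial are lists of (mvx, mvy) tuples.
    """
    if meets_condition:
        sources = [hmvp]            # first list: no spatial neighbors needed
    else:
        sources = [spatial, hmvp]   # second list: order is an assumption

    candidates = []
    for source in sources:
        for motion in source:
            if motion not in candidates and len(candidates) < max_len:
                candidates.append(motion)
    return candidates
```

Because the first branch reads only the HMVP entries, blocks that satisfy the condition do not have to wait for their spatial neighbors, which is what makes parallel list construction possible.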
In a third aspect, an embodiment of this application provides a video processing device, which includes a memory and a processor, where:
the memory is configured to store a computer program, and the computer program includes program instructions;
the processor calls the program instructions to perform the following steps:
when a first image block of a current frame satisfies a preset condition identified in a high-level syntax, constructing a first motion information candidate list for the first image block by using HMVP;
when the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, constructing a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and HMVP.
In a fourth aspect, an embodiment of this application provides a video processing device, which includes a memory and a processor, where:
the memory is configured to store a computer program, and the computer program includes program instructions;
the processor calls the program instructions to perform the following steps:
when a first image block of a current frame satisfies a preset condition, constructing a first motion information candidate list for the first image block by using HMVP;
when the first image block of the current frame does not satisfy the preset condition, constructing a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and HMVP.
In a fifth aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to execute the video processing method described in the first aspect.
In a sixth aspect, an embodiment of this application provides another computer-readable storage medium that stores a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to execute the video processing method described in the second aspect.
In the embodiments of this application, when an image block of the current frame satisfies a preset condition, HMVP can be used to construct the motion information candidate list for that image block, without using the motion information of spatial neighboring blocks. This reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and decoding.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a codec system framework provided by an embodiment of this application;
FIG. 2 is a schematic diagram of image blocks provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the framework of an encoder provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the architecture of a video processing system provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of a video processing method provided by an embodiment of this application;
FIG. 6 is a schematic diagram of a traditional way of constructing a motion information candidate list provided by an embodiment of this application;
FIG. 7 is a schematic flowchart of another video processing method provided by an embodiment of this application;
FIG. 8 is a schematic flowchart of another video processing method provided by an embodiment of this application;
FIG. 9 is a schematic flowchart of another video processing method provided by an embodiment of this application;
FIG. 10 is a schematic structural diagram of a video processing device provided by an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application. Where there is no conflict, the following embodiments and the features in them can be combined with each other.
The video processing method proposed in the embodiments of the present invention can be applied to a video processing device. The video processing device can be arranged on a smart terminal (such as a mobile phone or a tablet computer) and can be used at the encoding end or the decoding end; specifically, it may be an encoder or a decoder. In some embodiments, the embodiments of the present invention can be applied to an aircraft (such as an unmanned aerial vehicle); in other embodiments, they can also be applied to other movable platforms (such as unmanned ships, unmanned vehicles, or robots), which is not specifically limited in the embodiments of the present invention.
The codec system framework is described below using FIG. 1 as an example. FIG. 1 is an architecture diagram of a codec system. As shown in FIG. 1, the system 100 can receive data to be processed 102, process it, and generate processed data 108. For example, the system 100 may receive data to be encoded and encode it to generate encoded data, or the system 100 may receive data to be decoded and decode it to generate decoded data. In some embodiments, the components of the system 100 may be implemented by one or more processors, which may be processors in a computing device or processors in a mobile device (such as an unmanned aerial vehicle). The processor may be any type of processor, which is not limited in the embodiments of the present invention. In some possible designs, the processor may include an encoder, a decoder, or a codec. The system 100 may also include one or more memories. The memory can be used to store instructions and data, for example, computer-executable instructions that implement the technical solutions of the embodiments of the present invention, the data to be processed 102, and the processed data 108. The memory may be any type of memory, which is also not limited in the embodiments of the present invention.
The data to be encoded may include text, images, graphic objects, animation sequences, audio, video, or any other data that needs to be encoded. In some cases, the data to be encoded may include sensor data from sensors, which may be vision sensors (for example, cameras or infrared sensors), microphones, near-field sensors (for example, ultrasonic sensors or radars), position sensors, temperature sensors, touch sensors, and the like. In some cases, the data to be encoded may include information from a user, for example, biometric information, which may include facial features, fingerprint scans, retinal scans, voice recordings, DNA samples, and the like.
Generally speaking, a video is a continuous image sequence composed of consecutive frames, where each frame is one image. A frame can be divided into multiple coding regions (Coding Tree Units, CTUs). All CTUs have the same size, for example 64x64 or 128x128. Each CTU can be further divided into multiple coding units (Coding Units, CUs); for example, a CU may be square or rectangular. For ease of understanding, the following description takes a CU as the image block, so the image blocks mentioned below are CUs. Taking the schematic diagram of image blocks in FIG. 2 as an example, the image shown in FIG. 2 consists of 4 CTUs, and each CTU consists of multiple image blocks. For example, the sizes of the image blocks contained in the image may all differ or may be partly the same.
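As a small worked example of the CTU partitioning described above, the sketch below counts the CTUs that cover a frame. The function name `ctu_grid` is hypothetical, and rounding up when the frame dimensions are not multiples of the CTU size is an assumption about how frame borders are handled.

```python
def ctu_grid(frame_width, frame_height, ctu_size=128):
    """Return (columns, rows) of CTUs needed to cover the frame.

    Partial CTUs at the right and bottom borders are counted as whole CTUs,
    which is an assumption.
    """
    cols = -(-frame_width // ctu_size)    # ceiling division
    rows = -(-frame_height // ctu_size)
    return cols, rows


cols, rows = ctu_grid(128, 128, ctu_size=64)
print(cols * rows)  # a 128x128 image with 64x64 CTUs has 4 CTUs
```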
In one embodiment, the framework of the encoder may be as illustrated in FIG. 3, which is a framework diagram of an encoder. The inter-frame coding process is described below by way of example with reference to FIG. 3.
As shown in FIG. 3, the inter-frame encoding and decoding process can be as follows:
In 301, the current frame image is acquired. In 302, a reference frame image is acquired. In 303a, motion estimation is performed using the reference frame image to obtain the motion vector (Motion Vector, MV) of each image block of the current frame image. In 304a, motion compensation is performed using the motion vector obtained by motion estimation to obtain the estimated value of the current image block. In 305, the estimated value of the current image block is subtracted from the current image block to obtain the residual. In 306, the residual is transformed to obtain transform coefficients. In 307, the transform coefficients are quantized to obtain quantized coefficients. In 308, the quantized coefficients are entropy coded, and the bitstream obtained by entropy coding and the coding mode information are stored or sent to the decoding end. In 309, the quantization result is dequantized. In 310, the dequantization result is inversely transformed. In 311, the reconstructed pixels are obtained from the inverse transform result and the motion compensation result. In 312, the reconstructed pixels are filtered. In 313, the filtered reconstructed pixels are output.
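The residual and reconstruction path of the flow above (steps 305, 307, 309 and 311) can be illustrated with a toy one-dimensional example. This sketch omits the transform (306, 310), entropy coding (308) and filtering (312), and it assumes a uniform scalar quantizer with step `qstep`; the function names are hypothetical.

```python
def encode_block(current, prediction, qstep):
    """Compute the residual (step 305) and quantize it (step 307)."""
    residual = [c - p for c, p in zip(current, prediction)]
    return [round(r / qstep) for r in residual]


def reconstruct_block(levels, prediction, qstep):
    """Dequantize (step 309) and add the prediction back (step 311)."""
    dequantized = [level * qstep for level in levels]
    return [p + d for p, d in zip(prediction, dequantized)]


current = [101, 104, 98, 91]
prediction = [98, 100, 99, 95]     # output of motion compensation (step 304a)
levels = encode_block(current, prediction, qstep=4)
reconstructed = reconstruct_block(levels, prediction, qstep=4)
```

The reconstructed samples differ slightly from `current` because quantization is lossy; the encoder reconstructs from the quantized data so that its reference matches what the decoder will see.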
As shown in FIG. 3, the intra-frame encoding and decoding process can be as follows:
In 302, the current frame image is acquired. In 303b, intra-frame prediction selection is performed on the current frame image. In 304b, intra-frame prediction is performed on the current image block in the current frame. In 305, the estimated value of the current image block is subtracted from the current image block to obtain the residual. In 306, the residual of the image block is transformed to obtain transform coefficients. In 307, the transform coefficients are quantized to obtain quantized coefficients. In 308, the quantized coefficients are entropy coded, and the bitstream obtained by entropy coding and the coding mode information are stored or sent to the decoding end. In 309, the quantization result is dequantized. In 310, the dequantization result is inversely transformed. In 311, the reconstructed pixels are obtained from the inverse transform result and the intra-frame prediction result. In 312, the reconstructed pixels are filtered. In 313, the filtered reconstructed pixels are output.
As shown in FIG. 3, during encoding, images can be predicted in order to remove redundancy. Different images in a video may use different prediction methods. According to the prediction method used, images can be divided into intra-predicted images and inter-predicted images. Inter prediction exploits the correlation of the video in the time domain, that is, temporal correlation, and uses the pixels of neighboring already-encoded images to predict the pixels of the current image, thereby effectively removing temporal redundancy from the video. Because consecutive frames are highly similar (that is, temporal correlation is strong), inter prediction can be used to encode and compress the original video to remove redundancy in the time dimension, which facilitates storage and transmission.
In addition to the prediction modes described above, the intra block copy (Intra Block Copy, IBC) technique may also be used. IBC exploits the correlation within the spatial domain of a single frame, that is, spatial correlation, and uses the pixels of already-encoded CUs in that frame to predict the pixels of the CU currently being encoded, thereby effectively removing spatial redundancy from the image. When a frame contains repeated textures, it has strong spatial correlation, so IBC can be used to encode and compress the original image to remove redundancy in the spatial dimension, which facilitates storage and transmission.
Inter prediction and IBC may include a merge (Merge) mode and non-Merge modes (for example, the advanced motion vector prediction mode, Advanced Motion Vector Prediction, AMVP). The Merge mode is characterized in that the MV of the image block equals the predicted MV (Motion Vector Prediction, MVP), so no motion vector difference (Motion Vector Difference, MVD) needs to be transmitted in the bitstream; only the MVP index and the reference frame index need to be passed to the decoding end. The non-Merge mode is characterized in that the MVD, the MVP index, and the reference frame index all need to be transmitted in the bitstream to the decoding end.
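The difference between the two signalling modes can be shown from the decoder's point of view: in Merge mode the MV is the selected MVP, while in a non-Merge mode the transmitted MVD is added to it. The function name `decode_mv` and the tuple representation of motion vectors are assumptions made for this sketch.

```python
def decode_mv(mvp_candidates, mvp_index, mvd=None):
    """Recover the MV of a block from the signalled syntax elements.

    mvd is None in Merge mode (no motion vector difference is transmitted).
    """
    mvp = mvp_candidates[mvp_index]
    if mvd is None:
        return mvp                                # Merge: MV = MVP
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])     # non-Merge: MV = MVP + MVD


candidates = [(4, -2), (0, 1)]
merge_mv = decode_mv(candidates, 0)               # (4, -2)
amvp_mv = decode_mv(candidates, 1, mvd=(2, 3))    # (2, 4)
```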
示例性的,对于Merge模式而言,可以先确定运动矢量预测(motion vector prediction,MVP),并直接将MVP确定为MV,其中,为了得到MVP,可以先构建一个MVP候选列表(merge candidate list),在MVP候选列表中,可以包括至少一个候选MVP,每个候选MVP可以对应有一个索引,编码端在从MVP候选列表中选择MVP之后,可以将该MVP索引写入到码流中,则解码端可以按照该索引从MVP候选列表中找到该索引对应的MVP,以实现对图像块的解码。Exemplarily, for the Merge mode, the motion vector prediction (MVP) can be determined first, and the MVP can be directly determined as the MV. Among them, in order to obtain the MVP, an MVP candidate list (merge candidate list) can be constructed first In the MVP candidate list, at least one candidate MVP can be included. Each candidate MVP can have an index. After selecting the MVP from the MVP candidate list, the encoder can write the MVP index into the code stream, and then decode The terminal can find the MVP corresponding to the index from the MVP candidate list according to the index, so as to realize the decoding of the image block.
To make Merge mode easier to understand, the operation flow of encoding in Merge mode is described below.
Step 1: Obtain the MVP candidate list;
Step 2: Select the optimal MVP from the MVP candidate list, and obtain the index of that MVP in the MVP candidate list;
Step 3: Use the MVP as the MV of the current block;
Step 4: Determine, according to the MV, the position of the reference block (also called the prediction block) in the reference frame image;
Step 5: Subtract the reference block from the current block to obtain residual data;
Step 6: Pass the residual data and the index of the MVP to the decoding end.
It should be understood that the above flow is only one specific implementation of Merge mode. Merge mode may also have other implementations.
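The six steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation of any codec: motion vectors are whole-pixel (dx, dy) offsets, the "optimal" MVP is chosen by sum of absolute differences (the application does not specify the selection cost), and the names fetch_block, sad and merge_encode are hypothetical.

```python
from typing import List, Tuple

MV = Tuple[int, int]        # hypothetical (dx, dy) in whole pixels
Frame = List[List[int]]     # rows of luma samples


def fetch_block(frame: Frame, x: int, y: int, w: int, h: int) -> Frame:
    """Copy the w x h block whose top-left sample is at (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences, used here as a stand-in selection cost."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def merge_encode(cur_frame: Frame, ref_frame: Frame, x: int, y: int,
                 w: int, h: int, mvp_candidates: List[MV]):
    """Steps 2 to 6; the candidate list from step 1 is passed in.

    Every candidate is assumed to keep the reference block inside ref_frame.
    """
    cur = fetch_block(cur_frame, x, y, w, h)
    # Step 2: pick the candidate with the lowest cost and remember its index.
    costs = [sad(cur, fetch_block(ref_frame, x + dx, y + dy, w, h))
             for dx, dy in mvp_candidates]
    mvp_index = costs.index(min(costs))
    # Step 3: the chosen MVP is used directly as the MV.
    dx, dy = mvp_candidates[mvp_index]
    # Step 4: the MV locates the reference (prediction) block.
    ref = fetch_block(ref_frame, x + dx, y + dy, w, h)
    # Step 5: residual = current block minus reference block.
    residual = [[c - r for c, r in zip(rc, rr)] for rc, rr in zip(cur, ref)]
    # Step 6: these two values are what is passed to the decoding end.
    return mvp_index, residual


ref_frame = [[r * 4 + c for c in range(4)] for r in range(4)]
cur_frame = [[5, 6, 0, 0], [9, 10, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
mvp_index, residual = merge_encode(cur_frame, ref_frame, 0, 0, 2, 2,
                                   [(0, 0), (1, 1), (2, 0)])
```

In the usage lines, the current 2x2 block matches the reference block displaced by (1, 1), so the second candidate is selected and the residual is all zeros.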
In current codec standards, the motion information of spatial neighboring blocks is usually added when the MVP candidate list is constructed, and it has the highest priority in the order of addition. However, adding the motion information of spatial neighboring blocks to the MVP candidate list makes the encoding or decoding of the current image block depend on those spatial neighboring blocks, so multiple image blocks cannot be processed in parallel, which hinders improvement of encoding or decoding efficiency. In addition, parallel techniques for constructing the MVP candidate list are not identified in the high-level syntax (for example, the sequence header, picture header, or slice header); that is, they cannot be switched on or off through an identifier in the high-level syntax. This makes it difficult to adjust to special requirements that an image block must meet, and thus to adapt encoding or decoding flexibly.
Therefore, an embodiment of the present application discloses a video processing method. When a first image block of a current frame satisfies a preset condition identified in the high-level syntax, the video processing device may use history-based motion vector prediction (HMVP) to construct a first motion information candidate list for the first image block, and encode or decode the first image block according to the motion information in the first motion information candidate list. When the first image block does not satisfy the preset condition identified in the high-level syntax, the video processing device may instead use the motion information of spatial neighboring blocks together with HMVP to construct a second motion information candidate list for the first image block, and encode or decode the first image block according to the motion information in the second motion information candidate list. Here, the current frame is the frame currently being encoded or decoded, and the first image block may be any image block in the current frame.
In this embodiment of the present application, when an image block of the current frame satisfies the preset condition identified in the high-level syntax, HMVP can be used to construct the motion information candidate list for that image block. Compared with a conventional video processing method, which needs the motion information of spatial neighboring blocks to construct the motion information candidate list for an image block, this embodiment reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and image decoding.
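The choice between the first and the second motion information candidate list can be sketched as follows. This is a minimal illustration, assuming that the outcome of the preset-condition check is already available as a boolean, that spatial candidates are placed before HMVP candidates in the second list, and that the list is simply truncated to a maximum length; the function name and parameters are hypothetical.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def build_candidate_list(meets_condition: bool, hmvp_list: List[MV],
                         spatial_mvs: List[MV], list_size: int) -> List[MV]:
    """Return the first list (HMVP only) or the second list (spatial + HMVP)."""
    if meets_condition:
        # First list: no spatial neighbors, so no dependency on adjacent blocks.
        candidates = list(hmvp_list)
    else:
        # Second list: spatial neighbors are used in addition to HMVP.
        candidates = list(spatial_mvs) + list(hmvp_list)
    return candidates[:list_size]


first_list = build_candidate_list(True, [(1, 1)], [(9, 9)], 4)
second_list = build_candidate_list(False, [(1, 1)], [(9, 9)], 4)
```

Because the first list never reads the spatial neighbors, blocks that satisfy the condition could in principle have their lists built without waiting for adjacent blocks, which is the parallelism benefit the embodiment aims at.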
An embodiment of the present application further discloses another video processing method. When the first image block of the current frame satisfies a preset condition, the video processing device may use HMVP to construct the first motion information candidate list for the first image block. When the first image block does not satisfy the preset condition, the video processing device may use the motion information of spatial neighboring blocks together with HMVP to construct the second motion information candidate list for the first image block.
In this embodiment of the present application, when an image block of the current frame satisfies the preset condition, HMVP can be used to construct the motion information candidate list for that image block. Compared with a conventional video processing method, which needs the motion information of spatial neighboring blocks to construct the motion information candidate list for an image block, this embodiment reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and image decoding.
An embodiment of the present application further discloses another video processing method. When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the video processing device constructs the first motion information candidate list for the first image block according to a preset rule, where the preset rule indicates that the motion information added first to the first motion information candidate list is the motion information of a temporal neighboring block or HMVP. The video processing device then encodes or decodes the first image block according to the motion information in the first motion information candidate list.
In this embodiment of the present application, when an image block of the current frame satisfies the preset condition identified in the high-level syntax, the motion information candidate list is constructed such that the motion information added first to the first motion information candidate list is the motion information of a temporal neighboring block or HMVP. Compared with a conventional video processing method, in which the motion information added first is that of a spatial neighboring block, this embodiment reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and image decoding.
An embodiment of the present application further discloses another video processing method. When the first image block of the current frame satisfies a preset condition, the video processing device constructs the first motion information candidate list for the first image block according to a preset rule, where the preset rule indicates that the motion information added first to the first motion information candidate list is the motion information of a temporal neighboring block or HMVP. The video processing device then encodes or decodes the first image block according to the motion information in the first motion information candidate list.
In this embodiment of the present application, when an image block of the current frame satisfies the preset condition, the motion information candidate list is constructed such that the motion information added first to the first motion information candidate list is the motion information of a temporal neighboring block or HMVP. Compared with a conventional video processing method, in which the motion information added first is that of a spatial neighboring block, this embodiment reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and image decoding.
Based on the foregoing description, refer to FIG. 4, which is a schematic architecture diagram of a video processing system provided by an embodiment of the present application. As shown in FIG. 4, the video processing system includes an encoding end 401 and a decoding end 402. The encoding end 401 is configured to encode original video data to obtain encoded video data, or to encode original image data to obtain encoded image data. The encoding end 401 sends the encoded video data to the decoding end 402. The decoding end 402 is configured to decode the encoded video data to obtain the original video data, or to decode the encoded image data to obtain the original image data.
In one example, the encoding end 401 and the decoding end 402 may run in the same video processing device. For example, after collecting original video data, the video processing device may encode it through the encoding end 401 to obtain encoded video data, and then store the encoded video data. Before playing the video data through a player, the video processing device may decode the encoded video data through the decoding end 402 to obtain the original video data, and then play the decoded video data through the player. As another example, after collecting original image data, the video processing device may encode it through the encoding end 401 to obtain encoded image data, and then store the encoded image data. Before displaying the image data on a display screen, the video processing device may decode the encoded image data through the decoding end 402 to obtain the original image data, and then play the decoded image data through the player.
In another example, the encoding end 401 and the decoding end 402 may run in different video processing devices. For example, the encoding end 401 runs in a first video processing device, and the decoding end 402 runs in a second video processing device. After collecting original video data, the first video processing device may encode it through the encoding end 401 to obtain encoded video data, and then send the encoded video data to the second video processing device. The second video processing device may decode the encoded video data through the decoding end 402 to obtain the original video data.
In the video processing system composed of the encoding end 401 and the decoding end 402, the video processing method may be as follows. When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the encoding end 401 uses HMVP to construct the first motion information candidate list for the first image block, and encodes the first image block according to the motion information in the first motion information candidate list. When the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, the encoding end 401 uses the motion information of spatial neighboring blocks and HMVP to construct the second motion information candidate list for the first image block, and encodes the first image block according to the motion information in the second motion information candidate list. Alternatively, when the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the decoding end 402 uses HMVP to construct the first motion information candidate list for the first image block, and decodes the first image block according to the motion information in the first motion information candidate list. When the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, the decoding end 402 uses the motion information of spatial neighboring blocks and HMVP to construct the second motion information candidate list for the first image block, and decodes the first image block according to the motion information in the second motion information candidate list.
It can be understood that the video processing system described in the embodiments of the present application is intended to explain the technical solutions of the embodiments more clearly and does not limit the technical solutions provided by the embodiments. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
Based on the foregoing description, refer to FIG. 5, which is a schematic flowchart of a video processing method provided by an embodiment of the present application. The video processing method may include the following steps S501 to S503:
Step S501: When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, use HMVP to construct the first motion information candidate list for the first image block. That is, when the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the video processing device does not fill the motion information of spatial neighboring blocks into the first motion information candidate list; instead, it uses HMVP or another method that involves no spatial dependency to construct the first motion information candidate list for the first image block.
In one implementation, the first image block satisfying the preset condition identified in the high-level syntax includes: the size of the first image block is smaller than or equal to the image block size identified in the high-level syntax. If the size of the first image block is smaller than or equal to the image block size identified in the high-level syntax, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level syntax. For example, if the image block size identified in the high-level syntax is 64x64, then every image block in the current frame whose size is smaller than or equal to 64x64 (for example, 4x8, 8x4, or 16x32) satisfies the preset condition. The motion information candidate list constructed for an image block that satisfies the preset condition may use HMVP but does not include the motion information of spatial neighboring blocks.
Optionally, if the size of the first image block is larger than the image block size identified in the high-level syntax, the video processing device may determine that the first image block does not satisfy the preset condition identified in the high-level syntax, and proceed to step S502. For example, if the image block size identified in the high-level syntax is 64x64, then no image block in the current frame whose size is larger than 64x64 (for example, 64x128, 128x64, or 128x128) satisfies the preset condition. The motion information candidate list constructed for an image block that does not satisfy the preset condition uses the motion information of spatial neighboring blocks.
In one implementation, the first image block satisfying the preset condition identified in the high-level syntax includes: the image block sizes identified in the high-level syntax include the size of the first image block. If the image block sizes identified in the high-level syntax include the size of the first image block, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level syntax. In a specific implementation, at least one image block size may be identified in the high-level syntax. For example, if the image block sizes identified in the high-level syntax are 4x8, 8x4, and 64x64, then every image block in the current frame whose size is 4x8, 8x4, or 64x64 satisfies the preset condition. The motion information candidate list constructed for an image block that satisfies the preset condition may use HMVP but does not include the motion information of spatial neighboring blocks.
Optionally, if the image block sizes identified in the high-level syntax do not include the size of the first image block, the video processing device may determine that the first image block does not satisfy the preset condition identified in the high-level syntax. In a specific implementation, at least one image block size may be identified in the high-level syntax. For example, if the image block sizes identified in the high-level syntax are 4x8, 8x4, and 64x64, then no image block in the current frame whose size is other than 4x8, 8x4, or 64x64 satisfies the preset condition. The motion information candidate list constructed for an image block that does not satisfy the preset condition uses the motion information of spatial neighboring blocks.
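The two variants of the size condition described above can be sketched as follows. This is only an illustration: the application does not state how "smaller than or equal to" is evaluated for non-square sizes, so the sketch assumes that width and height are compared separately, which is consistent with the 64x64 examples in the text. The type alias and function names are hypothetical.

```python
from typing import Iterable, Tuple

Size = Tuple[int, int]  # (width, height); the ordering is an assumption


def meets_threshold_condition(block: Size, signaled: Size) -> bool:
    """First variant: the block is no larger than the signaled size.

    Assumes the comparison is made on width and height separately;
    the text does not spell out the comparison.
    """
    return block[0] <= signaled[0] and block[1] <= signaled[1]


def meets_set_condition(block: Size, signaled: Iterable[Size]) -> bool:
    """Second variant: the block size is one of the signaled sizes."""
    return block in set(signaled)


small_ok = meets_threshold_condition((16, 32), (64, 64))
large_ok = meets_threshold_condition((64, 128), (64, 64))
listed_ok = meets_set_condition((8, 4), [(4, 8), (8, 4), (64, 64)])
unlisted_ok = meets_set_condition((16, 16), [(4, 8), (8, 4), (64, 64)])
```

A block for which the applicable check returns True would get the first (HMVP-based) candidate list; otherwise the list that uses spatial neighboring blocks is built.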
Each identified image block size is M*N, where M and N are both greater than or equal to 4. M and N may be equal or unequal. Specifically:
1. Taking inter prediction as the prediction mode of the first image block as an example, M is greater than or equal to 4, N is greater than or equal to 4, and M and N may be equal or unequal. For example, image blocks of sizes 4x4, 8x4, 16x32, 32x16, 64x128, and 128x128 may be identified through the high-level syntax.
2. Taking IBC as the prediction mode of the first image block as an example, M is greater than or equal to 4, N is greater than or equal to 4, and M and N may be unequal. For example, image blocks of sizes 8x4, 16x32, 32x16, and 64x128 may be identified through the high-level syntax.
3. Taking IBC as the prediction mode of the first image block as an example, one of M and N is greater than or equal to 4, and the other is greater than 4. For example, image blocks of sizes 4x8, 8x4, 16x32, 32x16, 64x128, and 128x128 may be identified through the high-level syntax.
In one implementation, the video processing device may add an identifier, in the form of a syntax element, to the high-level syntax, indicating that HMVP is used to construct the motion information candidate list for image blocks of certain sizes. Setting this syntax element determines the sizes of the image blocks for which, in the subsequent encoding process, the motion information candidate list can be constructed using HMVP rather than the motion information of spatial neighboring blocks.
Take the scenario in which the image block sizes identified in the high-level syntax include the size of the first image block. In one example, the syntax element set by the video processing device may include an index value for at least one image block size. For example, suppose the index value of 4x4 is 0, the index value of 4x8 is 1, and the index value of 4x16 is 2. If the video processing device sets the syntax element to include 0 and 1, it can be determined that, for image blocks of size 4x4 or 4x8 in the image, the motion information candidate list can be constructed using HMVP rather than the motion information of spatial neighboring blocks. In another example, the syntax element set by the video processing device may include at least one image block size directly. If the video processing device sets the syntax element to include 4x4 and 4x8, it can be determined that, for image blocks of size 4x4 or 4x8 in the image, the motion information candidate list can be constructed using HMVP rather than the motion information of spatial neighboring blocks.
Take the scenario in which the size of the first image block is smaller than or equal to the image block size identified in the high-level syntax. In one example, the syntax element set by the video processing device may include an index value of an image block size. For example, suppose the index value of 4x4 is 0, the index value of 4x8 is 1, and the index value of 4x16 is 2. If the video processing device sets the syntax element to include 2, it can be determined that, for image blocks in the image whose size is smaller than or equal to 4x16, HMVP is used to construct the motion information candidate list. In another example, the syntax element set by the video processing device may include an image block size directly. If the video processing device sets the syntax element to include 4x16, it can be determined that, for image blocks in the image whose size is smaller than or equal to 4x16, the motion information candidate list can be constructed using HMVP rather than the motion information of spatial neighboring blocks.
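The index-based signaling in the two examples above can be sketched as a table lookup. The three table entries are the ones given in the text; a real table would presumably cover more sizes, and the names SIZE_INDEX_TABLE and sizes_from_syntax_element are hypothetical. How the syntax element is actually coded in the bitstream is not described in the application and is not modeled here.

```python
from typing import Dict, Iterable, List, Tuple

Size = Tuple[int, int]

# Index values taken from the example in the text: 4x4 -> 0, 4x8 -> 1, 4x16 -> 2.
SIZE_INDEX_TABLE: Dict[int, Size] = {0: (4, 4), 1: (4, 8), 2: (4, 16)}


def sizes_from_syntax_element(indices: Iterable[int]) -> List[Size]:
    """Map the index values carried by the syntax element to block sizes."""
    return [SIZE_INDEX_TABLE[i] for i in indices]


# Set scenario: indices 0 and 1 identify exactly the sizes 4x4 and 4x8.
listed_sizes = sizes_from_syntax_element([0, 1])
# Threshold scenario: index 2 identifies the upper bound 4x16.
threshold_size = sizes_from_syntax_element([2])[0]
```

Signaling an index rather than the size itself keeps the syntax element small, at the cost of requiring the encoding end and the decoding end to share the same table.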
It can be understood that the preset condition identified in the high-level syntax in the embodiments of the present application includes, but is not limited to, the foregoing. For example, the first image block satisfying the preset condition identified in the high-level syntax may include: the image block positions identified in the high-level syntax include the position of the first image block in the image. If the image block positions identified in the high-level syntax include the position of the first image block in the image, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level syntax; if they do not include that position, the video processing device may determine that the first image block does not satisfy the preset condition. For example, if the image block positions identified in the high-level syntax are the upper left corner and the lower right corner, the video processing device may obtain the position of the first image block in the image. If the first image block is located in the upper left corner of the image, the video processing device may determine that the first image block satisfies the preset condition identified in the high-level syntax. If the first image block is located in the upper right corner of the image, the video processing device may determine that the first image block does not satisfy the preset condition. At least one image block position is identified in the high-level syntax.
In one implementation, the video processing device may add the identifier of the syntax element to picture header information, sequence header information, or slice header information. Each picture corresponds to one piece of picture header information. If the identifier of the syntax element is added to the picture header information of a frame, it can be determined that, for image blocks in that frame whose size is the size indicated by the syntax element, the motion information candidate list can be constructed using HMVP or another method that involves no spatial dependency, rather than the motion information of spatial neighboring blocks. Each sequence of video data corresponds to one piece of sequence header information. If the identifier of the syntax element is added to the sequence header information of a sequence of video data, it can be determined that, for image blocks in every frame of that sequence whose size is the size indicated by the syntax element, the motion information candidate list can be constructed using HMVP rather than the motion information of spatial neighboring blocks. Each frame may correspond to at least one piece of slice header information. If the identifier of the syntax element is added to a piece of slice header information of a frame, it can be determined that, for image blocks in that frame whose size is the size indicated by the syntax element, the motion information candidate list can be constructed using HMVP rather than the motion information of spatial neighboring blocks.
It can be understood that the manner of identifying the preset condition in the high-level syntax in the embodiments of the present application includes, but is not limited to, the foregoing. Any manner is acceptable as long as it can be used to determine which image blocks of the current frame do not need the motion information of spatial neighboring blocks when their motion information candidate lists are constructed.
In one implementation, when the first image block satisfies the preset condition identified in the high-level syntax, the type of motion information that the video processing device uses to construct the first motion information candidate list for the first image block is determined according to the prediction mode of the first image block.
The prediction mode may include inter prediction or IBC.
When the prediction mode is IBC, the motion information that the video processing device uses to construct the first motion information candidate list for the first image block may include HMVP.
In one example, the video processing device may select HMVPs from the HMVP list as the motion information in the first motion information candidate list. If the first motion information candidate list is not full after the selected candidate HMVPs have been filled into it, the zero motion vector (0, 0) is used to fill the first motion information candidate list until it is full.
In another example, when the HMVP list is empty, the motion information in the first motion information candidate list includes the zero motion vector. For example, if all first image blocks satisfy the preset condition identified in the high-level syntax, and the HMVP list is not updated with the motion information used in encoding or decoding each first image block after that block is encoded or decoded, the HMVP list may be empty. In that case, the motion information in the first motion information candidate list may be the zero motion vector.
When the prediction mode is inter prediction, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may further include the motion information of temporal neighboring blocks.
In one example, if the prediction mode is the merge mode of inter prediction, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may include the motion information of temporal neighboring blocks, HMVP, and pairwise average candidate MVs. If the first motion information candidate list is not full after the motion information of temporal neighboring blocks, the HMVP candidates, and the pairwise average candidate MVs have been added, it is padded with the zero motion vector (0, 0) until it is full.
In another example, if the prediction mode is the merge mode of inter prediction and the HMVP list is empty, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may include the motion information of temporal neighboring blocks and pairwise average candidate MVs. Further, it may also include the zero motion vector.
In one example, if the prediction mode is a non-merge mode of inter prediction, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may include the motion information of temporal neighboring blocks and HMVP. If the first motion information candidate list is not full after the motion information of temporal neighboring blocks and the selected HMVP candidates have been added, it is padded with the zero motion vector (0, 0) until it is full.
In another example, if the prediction mode is a non-merge mode of inter prediction and the HMVP list is empty, the motion information with which the video processing device constructs the first motion information candidate list for the first image block may include the motion information of temporal neighboring blocks. Further, it may also include the zero motion vector.
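The per-mode cases above can be summarized in a short sketch. The mode labels, the `SOURCES_BY_MODE` table, and the function signature are hypothetical names chosen for illustration, and the insertion order (temporal, then HMVP, then pairwise average) is an assumption based on the order in which the text lists the sources.

```python
from typing import Dict, List, Sequence, Tuple

MV = Tuple[int, int]

# Candidate sources allowed per prediction mode when the preset condition holds.
# Spatial neighbors are deliberately absent from every entry.
SOURCES_BY_MODE: Dict[str, Tuple[str, ...]] = {
    "ibc": ("hmvp",),
    "inter_merge": ("temporal", "hmvp", "pairwise"),
    "inter_non_merge": ("temporal", "hmvp"),
}


def build_first_list(mode: str,
                     available: Dict[str, Sequence[MV]],
                     max_candidates: int) -> List[MV]:
    """Build the first candidate list from the sources permitted for `mode`."""
    candidates: List[MV] = []
    for source in SOURCES_BY_MODE[mode]:
        # A missing or empty source (for example an empty HMVP list) adds nothing.
        for mv in available.get(source, ()):
            if len(candidates) < max_candidates:
                candidates.append(mv)
    # Pad with the zero motion vector until the list is full.
    while len(candidates) < max_candidates:
        candidates.append((0, 0))
    return candidates
```

Any `"spatial"` entry in `available` is ignored, because no mode lists it as a permitted source.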
In one implementation, the encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block. The second image block is a spatial neighboring block of the first image block.
In the embodiments of this application, when the first image block satisfies the preset condition identified in the high-level syntax, the video processing device does not need the motion information of spatial neighboring blocks to construct the first motion information candidate list for the first image block. The construction of the first motion information candidate list for the first image block and the construction of the motion information candidate list for the second image block can therefore be performed in parallel, so that the encoding of the first image block is synchronized with the encoding of the second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block. In addition, an identifier of a syntax element is added to the high-level syntax for this operation, indicating that image blocks of certain sizes can be processed in parallel with other image blocks in this way, which yields size-adjustable parallel construction of motion information candidate lists. Specifically, setting the syntax element determines the sizes of the image blocks whose motion information candidate lists can be constructed in parallel with those of other image blocks during subsequent encoding.
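The independence that makes this parallelism possible can be illustrated with a small sketch. It assumes a read-only snapshot of the HMVP list shared by all blocks; the function names are hypothetical, and the thread pool only demonstrates that no block waits on a neighbor, not an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

MV = Tuple[int, int]


def build_without_spatial(hmvp_snapshot: Tuple[MV, ...], max_candidates: int) -> List[MV]:
    """Build one candidate list from HMVP and zero padding only."""
    candidates = list(hmvp_snapshot[:max_candidates])
    candidates += [(0, 0)] * (max_candidates - len(candidates))
    return candidates


def build_lists_in_parallel(num_blocks: int,
                            hmvp_snapshot: Tuple[MV, ...],
                            max_candidates: int) -> List[List[MV]]:
    """Construct the lists of several neighboring blocks concurrently."""
    with ThreadPoolExecutor() as pool:
        # Each task reads only the shared snapshot, never another block's result.
        futures = [pool.submit(build_without_spatial, hmvp_snapshot, max_candidates)
                   for _ in range(num_blocks)]
        return [future.result() for future in futures]
```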
In one implementation, after encoding or decoding the first image block according to the motion information in the first motion information candidate list, the video processing device may refrain from updating the HMVP list with the motion information used in encoding or decoding the first image block. In a specific implementation, when the first image block satisfies the preset condition identified in the high-level syntax, the video processing device uses HMVP to construct the first motion information candidate list for the first image block, and then encodes or decodes the first image block according to the motion information in that list. Afterwards, the video processing device does not update the HMVP list with the motion information used in encoding or decoding the first image block; that is, the HMVP list does not include that motion information.
In the embodiments of this application, for an image block that satisfies the preset condition identified in the high-level syntax, the step of updating the HMVP list with the motion information used in encoding or decoding that image block is skipped, which reduces the coding complexity of the image block and increases the encoding or decoding throughput.
In one implementation, after the first image block is encoded or decoded according to the motion information in the first motion information candidate list, the HMVP list may be operated on based on the prediction mode of the first image block.
The prediction mode may include inter prediction or IBC. When the prediction mode is inter prediction, the video processing device may update the HMVP list with the motion information used in encoding or decoding the first image block. When the prediction mode is IBC, the video processing device may keep the HMVP list unchanged.
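The mode-dependent handling of the HMVP list can be sketched as below. The first-in-first-out update with duplicate removal and the default `max_size` of 5 are assumptions borrowed from common HMVP designs; the text here only states that the list is updated for inter prediction and left unchanged for IBC.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]


def update_hmvp(hmvp_list: Sequence[MV], used_mv: MV, mode: str,
                max_size: int = 5) -> List[MV]:
    """Return the HMVP list after coding a block that satisfied the preset condition."""
    if mode == "ibc":
        # IBC: keep the HMVP list unchanged.
        return list(hmvp_list)
    if mode != "inter":
        raise ValueError(f"unknown prediction mode: {mode}")
    # Inter: assumed FIFO update; drop an identical entry, then append as newest.
    updated = [mv for mv in hmvp_list if mv != used_mv]
    updated.append(used_mv)
    # Keep only the most recent max_size entries.
    return updated[-max_size:]
```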
Step S502: When the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, construct a second motion information candidate list for the first image block using the motion information of spatial neighboring blocks and HMVP.
Taking IBC as the prediction mode as an example, the video processing device may construct the second motion information candidate list as follows:
1. Determine the motion information of the spatial neighboring blocks of the first image block, and add it to the second motion information candidate list.
Taking the schematic diagram of image blocks shown in FIG. 6 as an example, the spatial neighboring blocks of CU1 include CU2 and CU3. When the second motion information candidate list is constructed for CU1, the motion information of the already encoded CU2 and the motion information of the already encoded CU3 may be added to the list. It follows that, before the second motion information candidate list can be constructed for an image block using the motion information of spatial neighboring blocks and HMVP, the spatial neighboring blocks of that image block must have been encoded or decoded. If they have not, the image block cannot be encoded or decoded, which lowers the coding throughput.
2. Select HMVP candidates from the HMVP list as motion information in the motion information candidate list. If the motion information candidate list is not full after the motion information of the already encoded neighboring CUs and the selected HMVP candidates have been added, it is padded with the zero motion vector (0, 0) until it is full.
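The two steps above can be sketched as follows, assuming the IBC prediction mode. Representing a neighbor that has not been coded yet as `None`, and raising an error in that case, are hypothetical conventions used only to make the spatial dependency visible.

```python
from typing import List, Optional, Sequence, Tuple

MV = Tuple[int, int]


def build_second_list_ibc(spatial_neighbors: Sequence[Optional[MV]],
                          hmvp_list: Sequence[MV],
                          max_candidates: int) -> List[MV]:
    """Spatial neighbors first, then HMVP, then zero motion vector padding."""
    # The list cannot be built until every spatial neighbor has been coded.
    if any(mv is None for mv in spatial_neighbors):
        raise RuntimeError("spatial neighbor not coded yet; the block must wait")
    # Step 1: motion information of the spatial neighboring blocks.
    candidates: List[MV] = [mv for mv in spatial_neighbors if mv is not None]
    candidates = candidates[:max_candidates]
    # Step 2: HMVP candidates, then padding with (0, 0) until full.
    for mv in hmvp_list:
        if len(candidates) == max_candidates:
            break
        candidates.append(mv)
    candidates += [(0, 0)] * (max_candidates - len(candidates))
    return candidates
```

With the CU1 example, the motion vectors of CU2 and CU3 would be passed as `spatial_neighbors`, so CU1 waits for both.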
It can be understood that, for a given image block, the motion information candidate list constructed during encoding or decoding is either the first motion information candidate list or the second motion information candidate list, depending on whether the image block satisfies the preset condition identified in the high-level syntax.
For ease of understanding, the motion information candidate list may be described taking an MVP candidate list as an example; the construction of the MVP candidate list mentioned above is then the construction of the first or second motion information candidate list. A motion information candidate list mentioned in the embodiments of this application (for example, the first motion information candidate list or the second motion information candidate list) may be a set of candidate motion information for an image block. The candidates in the list may be stored in the same buffer or in different buffers, which is not limited here. The index of a piece of motion information in the motion information candidate list may be its index in the set of candidate motion information of the image block. For example, if the set contains five candidates, their indexes in the motion information candidate list may be 0, 1, 2, 3, and 4, respectively.
The motion information mentioned in the embodiments of this application may include a motion vector, or a motion vector together with reference frame information (for example, a reference frame index).
Step S503: Encode or decode the first image block according to the motion information in the first motion information candidate list or the second motion information candidate list.
In a specific implementation, when the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the video processing device may use HMVP to construct the first motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the first motion information candidate list. When the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, the video processing device may use the motion information of spatial neighboring blocks and HMVP to construct the second motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the second motion information candidate list.
In the embodiments of this application, when the first image block of the current frame satisfies the preset condition identified in the high-level syntax, the video processing device uses HMVP to construct the first motion information candidate list for the first image block, which reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and decoding. In addition, the sizes of the image blocks that satisfy the preset condition can be adjusted through the high-level syntax, which increases the flexibility and adaptability of encoding and decoding.
Referring to FIG. 7, FIG. 7 is a schematic flowchart of another video processing method provided by an embodiment of this application. The video processing method may include the following steps S701 to S703:
Step S701: When the first image block of the current frame satisfies a preset condition, construct a first motion information candidate list for the first image block using HMVP. That is, when the first image block of the current frame satisfies the preset condition, the video processing device does not add the motion information of spatial neighboring blocks to the first motion information candidate list, and may instead construct the first motion information candidate list for the first image block using HMVP.
The first image block satisfying the preset condition includes: the size of the first image block satisfying a preset size. The preset size may be a size preset by the video processing device, a default value at the encoder or the decoder, or a size specified by both the encoder and the decoder.
In one implementation, the size of the first image block satisfying the preset size includes: the size of the first image block being less than or equal to the preset size; or the preset sizes including the size of the first image block. That is, in either case, the video processing device may determine that the size of the first image block satisfies the preset size.
In one implementation, there is at least one preset size, and each preset size is M*N, where M and N are both greater than or equal to 4.
M and N may be equal or unequal.
If the size of the first image block is less than or equal to the preset size, the video processing device may determine that the first image block satisfies the preset condition. For example, if the preset size is 64x64, all image blocks in the current frame whose size is less than or equal to 64x64 (for example, 4x8, 8x4, or 16x32) satisfy the preset condition.
Taking inter prediction as the prediction mode of the first image block as an example, M is greater than or equal to 4, N is greater than or equal to 4, and M and N may be equal or unequal. For example, if the preset sizes include 4x4, 8x4, 16x32, 32x16, 64x128, and 128x128, image blocks in the current frame whose sizes are 4x4, 8x4, 16x32, 32x16, 64x128, or 128x128 all satisfy the preset condition.
Taking IBC as the prediction mode of the first image block as an example, M is greater than or equal to 4, N is greater than or equal to 4, and M and N may be unequal. For example, if the preset sizes are 8x4, 16x32, 32x16, and 64x128, image blocks in the current frame whose sizes are 8x4, 16x32, 32x16, or 64x128 all satisfy the preset condition.
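The two ways of satisfying the preset size can be written as simple predicates. This is a sketch under stated assumptions: sizes are (width, height) pairs, and "less than or equal to" is interpreted as a comparison of both dimensions, which the text does not specify; the function names are hypothetical.

```python
from typing import Iterable, Tuple

Size = Tuple[int, int]  # (width, height)


def meets_threshold(block: Size, preset: Size) -> bool:
    """Variant 1: the block size is less than or equal to the preset size."""
    # Assumption: both dimensions must be within the preset dimensions.
    return block[0] <= preset[0] and block[1] <= preset[1]


def meets_membership(block: Size, presets: Iterable[Size]) -> bool:
    """Variant 2: the preset sizes include the block size."""
    return block in set(presets)
```

For example, with a preset of 64x64 the first predicate accepts 16x32, while with presets 8x4 and 16x32 the second predicate rejects 4x8.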
In one implementation, when the prediction mode of the first image block is IBC, the size of the first image block is K*L, where one of K and L is greater than or equal to 4 and the other is greater than 4.
For example, when the prediction mode of the first image block is IBC, the first motion information candidate list may be constructed for the first image block using HMVP rather than the motion information of spatial neighboring blocks even when the size of the first image block is 4x8, 8x4, 16x32, 32x16, 64x128, or 128x128.
In one implementation, when the first image block satisfies the preset condition, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to the prediction mode of the first image block. The prediction mode may include inter prediction or IBC.
For example, when the prediction mode is inter prediction, the motion information used to construct the first motion information candidate list for the first image block may further include the motion information of temporal neighboring blocks.
As another example, when the prediction mode is IBC, the motion information used to construct the first motion information candidate list for the first image block may include HMVP.
In one implementation, the encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block. The second image block is a spatial neighboring block of the first image block.
In one implementation, after encoding or decoding the first image block according to the motion information in the first motion information candidate list, the video processing device does not update the HMVP list with the motion information used in encoding or decoding the first image block.
It can be understood that step S701 differs from step S501 in that the preset condition is not necessarily identified in the high-level syntax. That is, in step S701 the preset condition may not be identified in the high-level syntax; instead, whether the first image block satisfies the preset condition is determined in real time during the encoding or decoding of the first image block. For the parts of step S701 that are the same as step S501, refer to the corresponding description of S501; they are not repeated here.
Step S702: When the first image block of the current frame does not satisfy the preset condition, construct a second motion information candidate list for the first image block using the motion information of spatial neighboring blocks and HMVP.
It can be understood that step S702 differs from step S502 in that the preset condition is not necessarily identified in the high-level syntax. That is, in step S702 the preset condition may not be identified in the high-level syntax; instead, whether the first image block satisfies the preset condition is determined in real time during the encoding or decoding of the first image block. For the parts of step S702 that are the same as step S502, refer to the corresponding description of S502; they are not repeated here.
Step S703: Encode or decode the first image block according to the motion information in the first motion information candidate list or the second motion information candidate list.
In a specific implementation, when the first image block of the current frame satisfies the preset condition, the video processing device may use HMVP to construct the first motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the first motion information candidate list. When the first image block of the current frame does not satisfy the preset condition, the video processing device may use the motion information of spatial neighboring blocks and HMVP to construct the second motion information candidate list for the first image block, and then encode or decode the first image block according to the motion information in the second motion information candidate list.
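The choice between the two lists in steps S701 to S703 can be sketched as a single function. It assumes the set-membership form of the preset condition and an IBC-style source set (spatial neighbors and HMVP only); the function name and parameters are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

MV = Tuple[int, int]
Size = Tuple[int, int]  # (width, height)


def build_candidate_list(block_size: Size,
                         preset_sizes: FrozenSet[Size],
                         spatial: Sequence[MV],
                         hmvp: Sequence[MV],
                         max_candidates: int) -> List[MV]:
    """Pick the first or second list depending on the preset condition."""
    if block_size in preset_sizes:
        sources = [hmvp]           # first list: no spatial neighbors
    else:
        sources = [spatial, hmvp]  # second list: spatial neighbors, then HMVP
    candidates = [mv for source in sources for mv in source][:max_candidates]
    # Pad with the zero motion vector until the list is full.
    candidates += [(0, 0)] * (max_candidates - len(candidates))
    return candidates
```

Because the condition is evaluated per block at coding time, nothing needs to be signalled in the bitstream for this variant.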
In the embodiments of this application, when the first image block of the current frame satisfies the preset condition, the video processing device uses HMVP to construct the first motion information candidate list for the first image block, which reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and decoding.
Referring to FIG. 8, FIG. 8 is a schematic flowchart of another video processing method provided by an embodiment of this application. The video processing method may include the following steps S801 and S802:
Step S801: When the first image block of the current frame satisfies the preset condition identified in the high-level syntax, construct a first motion information candidate list for the first image block according to a preset rule, where the preset rule indicates that the motion information added first to the first motion information candidate list is the motion information of temporal neighboring blocks or HMVP.
It can be understood that, if the prediction mode is the merge mode of inter prediction, the motion information with which the video processing device constructs a motion information candidate list for an image block may include the motion information of spatial neighboring blocks, the motion information of temporal neighboring blocks, HMVP, and pairwise average candidate MVs. The order in which the video processing device adds motion information to the motion information candidate list may be: motion information of spatial neighboring blocks → motion information of temporal neighboring blocks → HMVP → pairwise average candidate MVs. That is, the motion information of spatial neighboring blocks is added first, then the motion information of temporal neighboring blocks, then HMVP, and finally the pairwise average candidate MVs. If the motion information candidate list is not full after the motion information of temporal neighboring blocks, HMVP, and the pairwise average candidate MVs have been added, it is padded with the zero motion vector until it is full.
If the prediction mode is a non-merge mode of inter prediction, the motion information with which the video processing device constructs a motion information candidate list for an image block may include the motion information of spatial neighboring blocks, the motion information of temporal neighboring blocks, and HMVP. The order in which the video processing device adds motion information to the motion information candidate list may be: motion information of spatial neighboring blocks → motion information of temporal neighboring blocks → HMVP. That is, the motion information of spatial neighboring blocks is added first, then the motion information of temporal neighboring blocks, and then HMVP. If the motion information candidate list is not full after the motion information of temporal neighboring blocks and HMVP have been added, it is padded with the zero motion vector until it is full.
It follows that, if the prediction mode is the merge mode or a non-merge mode of inter prediction, a preset rule indicating that the motion information added first to the first motion information candidate list is the motion information of temporal neighboring blocks means that the motion information of spatial neighboring blocks is excluded from the first motion information candidate list.
Likewise, if the prediction mode is the merge mode or a non-merge mode of IBC, the motion information with which the video processing device constructs a motion information candidate list for an image block may include the motion information of spatial neighboring blocks and HMVP. If the motion information candidate list is not full after the motion information of spatial neighboring blocks and HMVP have been added, it is padded with the zero motion vector until it is full. For example, the order in which the video processing device adds motion information to the motion information candidate list may be: motion information of spatial neighboring blocks → HMVP → zero motion vector.
It follows that, if the prediction mode is the merge mode or a non-merge mode of IBC, a preset rule indicating that the motion information added first to the first motion information candidate list is HMVP means that the motion information of spatial neighboring blocks is excluded from the first motion information candidate list.
In conventional video processing methods, therefore, the motion information added first to the motion information candidate list is the motion information of spatial neighboring blocks. Because the embodiments of this application aim to exclude spatial neighboring blocks from the construction of the first motion information candidate list, the motion information added first to the first motion information candidate list in these embodiments is the motion information of temporal neighboring blocks or HMVP.
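The effect of the preset rule on the insertion order can be sketched as a lookup. The mode labels and the `CONVENTIONAL_ORDER` table are hypothetical names for the orders described above, and zero motion vector padding is omitted because it applies in every case.

```python
from typing import Dict, List

# Conventional insertion order per prediction mode, as described in the text.
CONVENTIONAL_ORDER: Dict[str, List[str]] = {
    "inter_merge": ["spatial", "temporal", "hmvp", "pairwise"],
    "inter_non_merge": ["spatial", "temporal", "hmvp"],
    "ibc": ["spatial", "hmvp"],
}


def insertion_order(mode: str, meets_condition: bool) -> List[str]:
    """Return the source order; the preset rule drops the leading spatial source."""
    order = list(CONVENTIONAL_ORDER[mode])
    if meets_condition:
        # First entry becomes temporal (inter prediction) or HMVP (IBC).
        order.remove("spatial")
    return order
```

Specifying what is added first is therefore equivalent to excluding spatial neighboring blocks, since they are the only source that precedes temporal neighboring blocks and HMVP in the conventional order.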
可以理解,步骤S801与步骤S501的区别在于,第一运动信息候选列表的构建方式不是利用运动信息的类型,而是根据加入至第一运动信息候选列表的运动信息的加入顺序,以此来限定不利用空域相邻块的运动信息来构建第一运动信息候选列表。因此,步骤S801与步骤S501的相同部分具体可以参照S501中的相应描述,本申请实施例不再赘述。It can be understood that the difference between step S801 and step S501 is that the method of constructing the first motion information candidate list is not to use the type of motion information, but to limit it according to the order in which the motion information added to the first motion information candidate list is added. The motion information of the neighboring blocks in the spatial domain is not used to construct the first motion information candidate list. Therefore, for the same parts of step S801 and step S501, reference may be made to the corresponding description in S501 for details, which will not be repeated in this embodiment of the application.
Step S802: Encode or decode the first image block according to the motion information in the first motion information candidate list.
In one implementation, when the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, the video processing device constructs a second motion information candidate list for the first image block according to another preset rule, where the other preset rule indicates that the first motion information added to the second motion information candidate list is the motion information of spatial neighboring blocks. The video processing device then encodes or decodes the first image block according to the motion information in the second motion information candidate list.
In the embodiments of the present application, when the first image block of the current frame satisfies the preset condition identified in the high-level syntax, a first motion information candidate list is constructed for the first image block according to a preset rule. The preset rule indicates that the first motion information added to the first motion information candidate list is the motion information of temporal neighboring blocks or HMVP, rather than the motion information of spatial neighboring blocks. Spatial neighboring blocks are therefore not used to construct the first motion information candidate list, which reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and decoding.
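The choice between the first and the second candidate list can be sketched as follows. This is an illustrative sketch only: the function name, the string mode labels `"inter"` and `"ibc"`, the exact source order `spatial → temporal → HMVP` for the second list under inter prediction, and the zero-vector padding for inter prediction are assumptions made for the example rather than statements of this application.

```python
from typing import List, Sequence, Tuple

MotionInfo = Tuple[int, int]  # hypothetical (dx, dy) representation


def build_candidate_list(
    meets_condition: bool,
    mode: str,
    spatial: Sequence[MotionInfo],
    temporal: Sequence[MotionInfo],
    hmvp: Sequence[MotionInfo],
    max_size: int,
) -> List[MotionInfo]:
    """Build the first list (no spatial candidates) or the second list."""
    if mode == "inter":
        sources = [temporal, hmvp]  # first entry is temporal motion information
    elif mode == "ibc":
        sources = [hmvp]  # first entry is HMVP
    else:
        raise ValueError(f"unsupported prediction mode: {mode}")
    if not meets_condition:
        # Second list: spatial neighbor motion information is added first.
        sources = [spatial] + sources
    candidates = [mi for group in sources for mi in group][:max_size]
    # Assumed padding with zero motion vectors until the list is full.
    candidates += [(0, 0)] * (max_size - len(candidates))
    return candidates
```

When `meets_condition` is true, the spatial candidates are never read, which is the property that lets a block be processed without waiting for its spatial neighbors.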
Referring to FIG. 9, FIG. 9 is a schematic flowchart of another video processing method provided by an embodiment of the present application. The video processing method may include the following steps S901 and S902:
Step S901: When the first image block of the current frame satisfies a preset condition, construct a first motion information candidate list for the first image block according to a preset rule, where the preset rule indicates that the first motion information added to the first motion information candidate list is the motion information of temporal neighboring blocks or HMVP.
It can be understood that step S901 differs from step S801 in that the preset condition is not necessarily identified in the high-level syntax. That is, in step S901 the preset condition need not be identified in the high-level syntax; instead, whether the first image block satisfies the preset condition is determined in real time while the first image block is being encoded or decoded. For the parts of step S901 that are the same as step S801, refer to the corresponding description of S801; they are not repeated here.
Step S902: Encode or decode the first image block according to the motion information in the first motion information candidate list.
In one implementation, when the first image block of the current frame does not satisfy the preset condition, the video processing device constructs a second motion information candidate list for the first image block according to another preset rule, where the other preset rule indicates that the first motion information added to the second motion information candidate list is the motion information of spatial neighboring blocks. The video processing device then encodes or decodes the first image block according to the motion information in the second motion information candidate list.
In the embodiments of the present application, when the first image block of the current frame satisfies the preset condition, a first motion information candidate list is constructed for the first image block according to a preset rule. The preset rule indicates that the first motion information added to the first motion information candidate list is the motion information of temporal neighboring blocks or HMVP, rather than the motion information of spatial neighboring blocks, which reduces the complexity of constructing the motion information candidate list and improves the efficiency of image encoding and decoding.
Referring to FIG. 10, FIG. 10 is a schematic structural diagram of a video processing device provided by an embodiment of the present application. The video processing device described in this embodiment includes at least a processor 1001 and a memory 1002, where:
the memory 1002 is configured to store a computer program, the computer program including program instructions;
the processor 1001 invokes the program instructions to perform the following steps:
when the first image block of the current frame satisfies a preset condition identified in the high-level syntax, constructing a first motion information candidate list for the first image block by using HMVP;
when the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, constructing a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and HMVP.
In one implementation, the first image block satisfying the preset condition identified in the high-level syntax includes: the image block sizes identified in the high-level syntax include the size of the first image block.
In one implementation, the first image block satisfying the preset condition identified in the high-level syntax includes: the size of the first image block is less than or equal to the image block size identified in the high-level syntax.
In one implementation, at least one image block size is identified in the high-level syntax, each identified image block size is M*N, and both M and N are greater than or equal to 4.
In one implementation, M and N are not equal.
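The two forms of the size condition above can be sketched in Python. The function names and the `(M, N)` tuple representation are hypothetical. The text does not say whether "less than or equal to" compares each dimension or the block area; the sketch assumes a per-dimension comparison.

```python
from typing import Iterable, Set, Tuple

Size = Tuple[int, int]  # hypothetical (M, N) representation of a block size


def validate_signalled_sizes(sizes: Iterable[Size]) -> Set[Size]:
    """Check that at least one size is signalled and that M, N >= 4."""
    result = set(sizes)
    if not result:
        raise ValueError("at least one image block size must be identified")
    for m, n in result:
        if m < 4 or n < 4:
            raise ValueError(f"M and N must both be >= 4, got {m}*{n}")
    return result


def size_is_listed(block: Size, signalled: Set[Size]) -> bool:
    """Variant 1: the identified sizes include the size of the block."""
    return block in signalled


def size_within_limit(block: Size, limit: Size) -> bool:
    """Variant 2 (assumed per-dimension): block size <= identified size."""
    return block[0] <= limit[0] and block[1] <= limit[1]
```

With the identified sizes 4*8 and 8*4, a 4*8 block satisfies the first variant while an 8*8 block does not.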
In one implementation, when the first image block satisfies the preset condition identified in the high-level syntax, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to the prediction mode of the first image block.
In one implementation, the prediction mode includes inter prediction or IBC; when the prediction mode is inter prediction, the motion information used to construct the first motion information candidate list for the first image block further includes the motion information of temporal neighboring blocks; when the prediction mode is IBC, the motion information used to construct the first motion information candidate list for the first image block includes the HMVP.
In one implementation, the encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block, where the second image block is a spatial neighboring block of the first image block.
In one implementation, after encoding or decoding the first image block according to the motion information in the first motion information candidate list, the processor 1001 does not update the HMVP list with the motion information used in encoding or decoding the first image block.
In one implementation, after encoding or decoding the first image block according to the motion information in the first motion information candidate list, the processor 1001 is further configured to perform the following operation:
operating on the HMVP list based on the prediction mode of the first image block.
In one implementation, the prediction mode includes inter prediction or IBC;
when operating on the HMVP list based on the prediction mode of the first image block, the processor 1001 is specifically configured to perform the following operations:
when the prediction mode is inter prediction, updating the HMVP list with the motion information used in encoding or decoding the first image block;
when the prediction mode is IBC, keeping the HMVP list unchanged.
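The mode-dependent handling of the HMVP list can be sketched as below. The function name and mode labels are hypothetical. The update rule shown (remove a duplicate, append the newest entry, drop the oldest when full) is the commonly used first-in-first-out HMVP table rule and is assumed here; this passage only states that the list is updated for inter prediction and left unchanged for IBC.

```python
from collections import deque
from typing import Deque, Tuple

MotionInfo = Tuple[int, int]  # hypothetical (dx, dy) representation


def update_hmvp_after_block(
    hmvp: Deque[MotionInfo], mode: str, used: MotionInfo
) -> Deque[MotionInfo]:
    """Update the HMVP list after a block was coded with the first list."""
    if mode == "inter":
        # Assumed FIFO rule: move an existing duplicate to the newest slot.
        if used in hmvp:
            hmvp.remove(used)
        hmvp.append(used)  # a deque with maxlen drops the oldest entry
    elif mode == "ibc":
        pass  # keep the HMVP list unchanged
    else:
        raise ValueError(f"unsupported prediction mode: {mode}")
    return hmvp
```

A `deque` created with `maxlen` keeps the table at a fixed capacity without explicit eviction code.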
In a specific implementation, the processor 1001 described in this embodiment may perform the implementations described for the video processing method of FIG. 5; details are not repeated here.
In another embodiment, the processor 1001 invokes the program instructions to perform the following steps:
when the first image block of the current frame satisfies a preset condition, constructing a first motion information candidate list for the first image block by using HMVP;
when the first image block of the current frame does not satisfy the preset condition, constructing a second motion information candidate list for the first image block by using the motion information of spatial neighboring blocks and HMVP.
In one implementation, the first image block satisfying the preset condition includes: the size of the first image block satisfies a preset size.
In one implementation, the size of the first image block satisfying the preset size includes: the size of the first image block is less than or equal to the preset size; or, the preset sizes include the size of the first image block.
In one implementation, there is at least one preset size, each preset size is M*N, and both M and N are greater than or equal to 4.
In one implementation, M and N are not equal.
In one implementation, when the first image block satisfies the preset condition, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to the prediction mode of the first image block.
In one implementation, the prediction mode includes inter prediction or IBC;
when the prediction mode is inter prediction, the motion information used to construct the first motion information candidate list for the first image block further includes the motion information of temporal neighboring blocks;
when the prediction mode is IBC, the motion information used to construct the first motion information candidate list for the first image block includes the HMVP.
In one implementation, the encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block, where the second image block is a spatial neighboring block of the first image block.
In one implementation, after encoding or decoding the first image block according to the motion information in the first motion information candidate list, the processor 1001 does not update the HMVP list with the motion information used in encoding or decoding the first image block.
In one implementation, after encoding or decoding the first image block according to the motion information in the first motion information candidate list, the processor 1001 is further configured to perform the following operation:
operating on the HMVP list based on the prediction mode of the first image block.
In one implementation, the prediction mode includes inter prediction or IBC;
when operating on the HMVP list based on the prediction mode of the first image block, the processor 1001 specifically performs the following operations:
when the prediction mode is inter prediction, updating the HMVP list with the motion information used in encoding or decoding the first image block;
when the prediction mode is IBC, keeping the HMVP list unchanged.
In one implementation, when the prediction mode of the first image block is IBC, the size of the first image block is K*L, where one of K and L is greater than or equal to 4 and the other of K and L is greater than 4.
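Read literally, the K*L constraint above requires both dimensions to be at least 4 and at least one of them to exceed 4, which rules out a 4*4 block. A minimal Python sketch of that reading follows; the function name is hypothetical.

```python
def ibc_size_allowed(k: int, l: int) -> bool:
    """True if one of K, L is >= 4 and the other is > 4."""
    return (k >= 4 and l > 4) or (l >= 4 and k > 4)
```

Under this reading, 4*8 and 8*4 blocks qualify, while 4*4 and 2*8 blocks do not.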
In a specific implementation, the processor 1001 described in this embodiment may perform the implementations described for the video processing method of FIG. 7; details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium storing program instructions; when executed, the program instructions may perform some or all of the steps of the video processing methods in the embodiments corresponding to FIG. 5 and FIG. 7 to FIG. 9.
An embodiment of the present application further provides a computer program product; when run by a computer device, the computer program product may perform some or all of the steps of the video processing methods in the embodiments corresponding to FIG. 5 and FIG. 7 to FIG. 9.
It can be understood that, for simplicity of description, each of the foregoing method embodiments is presented as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are only some of the embodiments of the present application, and that the actions and modules involved are not necessarily required by the present application.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments can be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. The computer-readable storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The gimbal control method, gimbal, and movable platform provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (48)

  1. A video processing method, characterized in that the method comprises:
    when a first image block of a current frame satisfies a preset condition identified in a high-level syntax, constructing a first motion information candidate list for the first image block by using history-based motion vector prediction (HMVP);
    when the first image block of the current frame does not satisfy the preset condition identified in the high-level syntax, constructing a second motion information candidate list for the first image block by using motion information of spatial neighboring blocks and HMVP.
  2. The method according to claim 1, characterized in that the first image block satisfying the preset condition identified in the high-level syntax comprises:
    the image block sizes identified in the high-level syntax include the size of the first image block.
  3. The method according to claim 1, characterized in that the first image block satisfying the preset condition identified in the high-level syntax comprises:
    the size of the first image block is less than or equal to the image block size identified in the high-level syntax.
  4. The method according to claim 2 or 3, characterized in that at least one image block size is identified in the high-level syntax, each identified image block size is M*N, and both M and N are greater than or equal to 4.
  5. The method according to claim 4, characterized in that M and N are not equal.
  6. The method according to claim 1, characterized in that, when the first image block satisfies the preset condition identified in the high-level syntax, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to a prediction mode of the first image block.
  7. The method according to claim 6, characterized in that the prediction mode comprises inter prediction or intra block copy (IBC);
    when the prediction mode is the inter prediction, the motion information used to construct the first motion information candidate list for the first image block further includes motion information of temporal neighboring blocks;
    when the prediction mode is the IBC, the motion information used to construct the first motion information candidate list for the first image block includes the HMVP.
  8. The method according to claim 1, characterized in that the encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block;
    wherein the second image block is a spatial neighboring block of the first image block.
  9. The method according to claim 1, characterized in that the method further comprises:
    after encoding or decoding the first image block according to the motion information in the first motion information candidate list, not updating an HMVP list with the motion information used in encoding or decoding the first image block.
  10. The method according to claim 1, characterized in that the method further comprises:
    after encoding or decoding the first image block according to the motion information in the first motion information candidate list, operating on an HMVP list based on a prediction mode of the first image block.
  11. The method according to claim 10, characterized in that the prediction mode comprises inter prediction or IBC;
    the operating on the HMVP list based on the prediction mode of the first image block comprises:
    when the prediction mode is the inter prediction, updating the HMVP list with the motion information used in encoding or decoding the first image block;
    when the prediction mode is the IBC, keeping the HMVP list unchanged.
  12. A video processing method, characterized in that the method comprises:
    when a first image block of a current frame satisfies a preset condition, constructing a first motion information candidate list for the first image block by using history-based motion vector prediction (HMVP);
    when the first image block of the current frame does not satisfy the preset condition, constructing a second motion information candidate list for the first image block by using motion information of spatial neighboring blocks and HMVP.
  13. The method according to claim 12, characterized in that the first image block satisfying the preset condition comprises:
    the size of the first image block satisfies a preset size.
  14. The method according to claim 13, characterized in that the size of the first image block satisfying the preset size comprises:
    the size of the first image block is less than or equal to the preset size; or,
    the preset sizes include the size of the first image block.
  15. The method according to claim 14, characterized in that there is at least one preset size, each preset size is M*N, and both M and N are greater than or equal to 4.
  16. The method according to claim 15, characterized in that M and N are not equal.
  17. The method according to claim 12, characterized in that, when the first image block satisfies the preset condition, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to a prediction mode of the first image block.
  18. The method according to claim 17, characterized in that the prediction mode comprises inter prediction or intra block copy (IBC);
    when the prediction mode is the inter prediction, the motion information used to construct the first motion information candidate list for the first image block further includes motion information of temporal neighboring blocks;
    when the prediction mode is the IBC, the motion information used to construct the first motion information candidate list for the first image block includes the HMVP.
  19. The method according to claim 12, characterized in that the encoding of the first image block is synchronized with the encoding of a second image block of the current frame, or the decoding of the first image block is synchronized with the decoding of the second image block;
    wherein the second image block is a spatial neighboring block of the first image block.
  20. The method according to claim 12, characterized in that the method further comprises:
    after encoding or decoding the first image block according to the motion information in the first motion information candidate list, not updating an HMVP list with the motion information used in encoding or decoding the first image block.
  21. The method according to claim 12, characterized in that the method further comprises:
    after encoding or decoding the first image block according to the motion information in the first motion information candidate list, operating on an HMVP list based on a prediction mode of the first image block.
  22. The method according to claim 21, characterized in that the prediction mode comprises inter prediction or IBC;
    the operating on the HMVP list based on the prediction mode of the first image block comprises:
    when the prediction mode is the inter prediction, updating the HMVP list with the motion information used in encoding or decoding the first image block;
    when the prediction mode is the IBC, keeping the HMVP list unchanged.
  23. The method according to claim 12, characterized in that, when the prediction mode of the first image block is IBC, the size of the first image block is K*L, one of K and L is greater than or equal to 4, and the other of K and L is greater than 4.
  24. 一种视频处理设备,其特征在于,所述视频处理设备包括:A video processing device, characterized in that the video processing device includes:
    存储器,用于存储计算机程序,所述计算机程序包括程序指令;The memory is used to store a computer program, the computer program including program instructions;
    处理器,调用所述程序指令,用于执行如下步骤:The processor calls the program instructions to execute the following steps:
    在当前帧的第一图像块满足高层语法中标识的预设条件时,利用基于历史的运动矢量预测HMVP为所述第一图像块构建第一运动信息候选列表;When the first image block of the current frame satisfies the preset condition identified in the high-level grammar, use the history-based motion vector prediction HMVP to construct a first motion information candidate list for the first image block;
    在所述当前帧的第一图像块不满足所述高层语法中标识的预设条件时,利用空域相邻块的运动信息和HMVP为所述第一图像块构建第二运动信息候选列表。When the first image block of the current frame does not meet the preset condition identified in the high-level grammar, the motion information of the neighboring blocks in the spatial domain and the HMVP are used to construct a second motion information candidate list for the first image block.
  25. 根据权利要求24所述的视频处理设备,其特征在于,所述第一图像块满足高层语法中标识的预设条件包括:The video processing device according to claim 24, wherein the first image block meeting a preset condition identified in a high-level grammar comprises:
    所述高层语法中标识的图像块的尺寸大小包括所述第一图像块的尺寸大小。The size of the image block identified in the high-level syntax includes the size of the first image block.
  26. 根据权利要求24所述的视频处理设备,其特征在于,所述第一图像块满足高层语法中标识的预设条件包括:The video processing device according to claim 24, wherein the first image block meeting a preset condition identified in a high-level grammar comprises:
    所述第一图像块的尺寸大小小于或等于所述高层语法中标识的图像块的尺寸大小。The size of the first image block is smaller than or equal to the size of the image block identified in the high-level syntax.
  27. 根据权利要求25或26所述的视频处理设备,其特征在于,所述高层语法中标识的图像块的尺寸大小包括至少一个,每一所述标识的图像块的尺寸大小为M*N,所述M和所述N均大于或等于4。The video processing device according to claim 25 or 26, wherein the size of the image block identified in the high-level syntax includes at least one, and the size of each identified image block is M*N, so Both said M and said N are greater than or equal to 4.
  28. The video processing device according to claim 27, wherein M and N are not equal.
  29. The video processing device according to claim 24, wherein, when the first image block satisfies the preset condition identified in the high-level syntax, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to a prediction mode of the first image block.
  30. The video processing device according to claim 29, wherein the prediction mode comprises inter prediction or intra block copy (IBC);
    when the prediction mode is the inter prediction, the motion information used to construct the first motion information candidate list for the first image block further comprises motion information of the temporal neighboring blocks; and
    when the prediction mode is the IBC, the motion information used to construct the first motion information candidate list for the first image block comprises the HMVP.
  31. The video processing device according to claim 24, wherein encoding of the first image block is synchronized with encoding of a second image block of the current frame, or decoding of the first image block is synchronized with decoding of the second image block;
    wherein the second image block is a spatial neighboring block of the first image block.
  32. The video processing device according to claim 24, wherein, after encoding or decoding the first image block according to motion information in the first motion information candidate list, the processor does not update an HMVP list by using the motion information used in encoding or decoding the first image block.
  33. The video processing device according to claim 24, wherein, after encoding or decoding the first image block according to motion information in the first motion information candidate list, the processor is further configured to perform the following operation:
    operating on an HMVP list based on a prediction mode of the first image block.
  34. The video processing device according to claim 33, wherein the prediction mode comprises inter prediction or IBC;
    when operating on the HMVP list based on the prediction mode of the first image block, the processor is specifically configured to perform the following operations:
    when the prediction mode is the inter prediction, updating the HMVP list by using the motion information used in encoding or decoding the first image block; and
    when the prediction mode is the IBC, keeping the HMVP list unchanged.
  35. A video processing device, wherein the video processing device comprises:
    a memory configured to store a computer program, the computer program comprising program instructions; and
    a processor configured to invoke the program instructions to perform the following steps:
    when a first image block of a current frame satisfies a preset condition, constructing a first motion information candidate list for the first image block by using history-based motion vector prediction (HMVP); and
    when the first image block of the current frame does not satisfy the preset condition, constructing a second motion information candidate list for the first image block by using motion information of spatial neighboring blocks and the HMVP.
  36. The video processing device according to claim 35, wherein the first image block satisfying the preset condition comprises:
    the size of the first image block satisfies a preset size.
  37. The video processing device according to claim 36, wherein the size of the first image block satisfying the preset size comprises:
    the size of the first image block is less than or equal to the preset size; or
    the preset sizes include the size of the first image block.
  38. The video processing device according to claim 37, wherein there is at least one preset size, each preset size is M*N, and M and N are both greater than or equal to 4.
  39. The video processing device according to claim 38, wherein M and N are not equal.
  40. The video processing device according to claim 35, wherein, when the first image block satisfies the preset condition, the type of motion information used to construct the first motion information candidate list for the first image block is determined according to a prediction mode of the first image block.
  41. The video processing device according to claim 40, wherein the prediction mode comprises inter prediction or intra block copy (IBC);
    when the prediction mode is the inter prediction, the motion information used to construct the first motion information candidate list for the first image block further comprises motion information of temporal neighboring blocks; and
    when the prediction mode is the IBC, the motion information used to construct the first motion information candidate list for the first image block comprises the HMVP.
  42. The video processing device according to claim 35, wherein encoding of the first image block is synchronized with encoding of a second image block of the current frame, or decoding of the first image block is synchronized with decoding of the second image block;
    wherein the second image block is a spatial neighboring block of the first image block.
  43. The video processing device according to claim 35, wherein, after encoding or decoding the first image block according to motion information in the first motion information candidate list, the processor does not update an HMVP list by using the motion information used in encoding or decoding the first image block.
  44. The video processing device according to claim 35, wherein, after encoding or decoding the first image block according to motion information in the first motion information candidate list, the processor is further configured to perform the following operation:
    operating on an HMVP list based on a prediction mode of the first image block.
  45. The video processing device according to claim 44, wherein the prediction mode comprises inter prediction or IBC;
    when operating on the HMVP list based on the prediction mode of the first image block, the processor specifically performs the following operations:
    when the prediction mode is the inter prediction, updating the HMVP list by using the motion information used in encoding or decoding the first image block; and
    when the prediction mode is the IBC, keeping the HMVP list unchanged.
  46. The video processing device according to claim 35, wherein, when the prediction mode of the first image block is IBC, the size of the first image block is K*L, one of K and L is greater than or equal to 4, and the other of K and L is greater than 4.
  47. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the video processing method according to any one of claims 1 to 11.
  48. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the video processing method according to any one of claims 12 to 23.
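
For illustration only, the control flow recited in device claims 35 to 45 might be sketched as follows. This is a minimal sketch and not the applicants' implementation: the names Block, meets_condition, build_candidate_list and update_hmvp, the representation of motion information as a (dx, dy) tuple, the candidate ordering, the duplicate pruning, the list length limit and the move-to-newest handling of repeated entries are all assumptions that the claims do not specify. The preset condition is modelled as the variant in which the preset sizes include the size of the block.

```python
from collections import deque
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

MotionInfo = Tuple[int, int]  # assumption: motion information reduced to one (dx, dy) vector


@dataclass
class Block:
    width: int
    height: int
    mode: str  # "inter" or "ibc" (hypothetical labels for the two prediction modes)


def meets_condition(block: Block, preset_sizes: Set[Tuple[int, int]]) -> bool:
    # Variant in which the preset sizes include the size of the block.
    return (block.width, block.height) in preset_sizes


def build_candidate_list(
    block: Block,
    preset_sizes: Set[Tuple[int, int]],
    hmvp: Iterable[MotionInfo],
    spatial: Iterable[MotionInfo],
    temporal: Iterable[MotionInfo],
    max_candidates: int = 6,  # assumption: the claims give no list length
) -> List[MotionInfo]:
    if meets_condition(block, preset_sizes):
        # First list: HMVP only, plus temporal neighbours for inter prediction.
        sources = list(hmvp)
        if block.mode == "inter":
            sources += list(temporal)
    else:
        # Second list: spatial neighbours and HMVP.
        sources = list(spatial) + list(hmvp)
    candidates: List[MotionInfo] = []
    for mv in sources:
        if mv not in candidates:  # assumption: simple duplicate pruning
            candidates.append(mv)
        if len(candidates) == max_candidates:
            break
    return candidates


def update_hmvp(hmvp: deque, block: Block, used: MotionInfo) -> None:
    # Mode-dependent update: inter blocks update the table, IBC blocks leave it unchanged.
    if block.mode == "ibc":
        return
    if used in hmvp:
        hmvp.remove(used)  # assumption: a repeated entry moves to the newest slot
    hmvp.append(used)      # a deque with maxlen drops the oldest entry when full
```

Because the first list never reads the spatial neighbours, a block that meets the condition does not have to wait for its neighbours to finish, which is consistent with the synchronized encoding or decoding recited in claims 31 and 42.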
PCT/CN2020/083376 2020-04-03 2020-04-03 Video processing method, video processing device, and computer-readable storage medium WO2021196238A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080005623.1A CN112868235A (en) 2020-04-03 2020-04-03 Video processing method, video processing device and computer readable storage medium
PCT/CN2020/083376 WO2021196238A1 (en) 2020-04-03 2020-04-03 Video processing method, video processing device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/083376 WO2021196238A1 (en) 2020-04-03 2020-04-03 Video processing method, video processing device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2021196238A1 (en) 2021-10-07

Family

ID=76001816

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/083376 WO2021196238A1 (en) 2020-04-03 2020-04-03 Video processing method, video processing device, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN112868235A (en)
WO (1) WO2021196238A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110213590A (en) * 2019-06-25 2019-09-06 浙江大华技术股份有限公司 Time-domain motion vector acquisition, inter-prediction, Video coding method and apparatus
CN110460859A (en) * 2019-08-21 2019-11-15 浙江大华技术股份有限公司 Application method, codec and the storage device of historical movement vector list
US20200021836A1 (en) * 2018-07-10 2020-01-16 Tencent America LLC Method and apparatus for ordering and selection of affine merge candidates in motion compensation
CN110784723A (en) * 2018-07-30 2020-02-11 腾讯美国有限责任公司 Method and apparatus for generating merge candidate list, non-volatile computer-readable storage medium
CN110933414A (en) * 2018-06-29 2020-03-27 杭州海康威视数字技术股份有限公司 Motion information candidate list construction method and device and readable storage medium
CN110944196A (en) * 2018-09-24 2020-03-31 北京字节跳动网络技术有限公司 Simplified history-based motion vector prediction

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11394992B2 (en) * 2018-03-14 2022-07-19 Hfi Innovation Inc. Methods and apparatuses of generating average candidates in video coding systems
WO2020044196A1 (en) * 2018-08-26 2020-03-05 Beijing Bytedance Network Technology Co., Ltd. Combined history-based motion vector predictor and multi-motion model decoding

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110933414A (en) * 2018-06-29 2020-03-27 杭州海康威视数字技术股份有限公司 Motion information candidate list construction method and device and readable storage medium
US20200021836A1 (en) * 2018-07-10 2020-01-16 Tencent America LLC Method and apparatus for ordering and selection of affine merge candidates in motion compensation
CN110784723A (en) * 2018-07-30 2020-02-11 腾讯美国有限责任公司 Method and apparatus for generating merge candidate list, non-volatile computer-readable storage medium
CN110944196A (en) * 2018-09-24 2020-03-31 北京字节跳动网络技术有限公司 Simplified history-based motion vector prediction
WO2020065520A2 (en) * 2018-09-24 2020-04-02 Beijing Bytedance Network Technology Co., Ltd. Extended merge prediction
CN110213590A (en) * 2019-06-25 2019-09-06 浙江大华技术股份有限公司 Time-domain motion vector acquisition, inter-prediction, Video coding method and apparatus
CN110460859A (en) * 2019-08-21 2019-11-15 浙江大华技术股份有限公司 Application method, codec and the storage device of historical movement vector list

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
S.-T. HSIANG, C.-Y. LAI, C.-Y. CHEN, Y.-W. HUANG, S.-M. LEI (MEDIATEK): "AHG9: Fix on high-level syntax related to coding tree constraints", 17. JVET MEETING; 20200107 - 20200117; BRUSSELS; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 31 December 2019 (2019-12-31), XP030222783 *

Also Published As

Publication number Publication date
CN112868235A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
US10404994B2 (en) Methods and systems for motion vector derivation at a video decoder
CN111133756B (en) Neural network method and apparatus for video coding
WO2016050051A1 (en) Image prediction method and relevant device
JP2021520119A (en) How to get motion vector
JP2024012627A (en) Methods and devices for selectively applying bidirectional optical flow and decoder-side motion vector refinement for video coding
JP2024045200A (en) Decoder, program and method
JP2023052767A (en) Video processing method and encoder
WO2022068716A1 (en) Entropy encoding/decoding method and device
KR20210068102A (en) Method and apparatus for signaling predictor candidate list size
WO2022063265A1 (en) Inter-frame prediction method and apparatus
CN114073083A (en) Global motion for merge mode candidates in inter prediction
JP2023179684A (en) Picture prediction method and device, and computer readable storage medium
KR20150047379A (en) Video encoding devic and driving method thereof
JPWO2015145504A1 (en) Image decoding apparatus, image decoding method, and integrated circuit
US11140410B2 (en) Method and device for processing video signal using affine motion prediction
WO2022116207A1 (en) Coding method, decoding method, coding apparatus, and decoding apparatus
JP2024038295A (en) Motion vector prediction for video coding
WO2020006690A1 (en) Video processing method and device
CN116235496A (en) Encoding method, decoding method, encoder, decoder, and encoding system
TW202145784A (en) Inter-frame prediction method, encoder, decoder, and computer storage medium for enhancing diversity of motion messages in motion message candidate list to improve coding/decoding performance
WO2021196238A1 (en) Video processing method, video processing device, and computer-readable storage medium
WO2021056212A1 (en) Method and apparatus for video encoding and decoding
CN114128291A (en) Adaptive motion vector prediction candidates in frames with global motion
CN114080811A (en) Selective motion vector prediction candidates in frames with global motion
WO2023011420A1 (en) Encoding method and apparatus, and decoding method and apparatus

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20929516; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20929516; Country of ref document: EP; Kind code of ref document: A1)