WO2019192301A1 - Video image processing method and apparatus - Google Patents

Video image processing method and apparatus

Info

Publication number
WO2019192301A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
motion vector
image block
current image
candidate
Prior art date
Application number
PCT/CN2019/078051
Other languages
English (en)
French (fr)
Inventor
郑萧桢
王苏红
王苫社
马思伟
Original Assignee
深圳市大疆创新科技有限公司
北京大学
Priority date
Filing date
Publication date
Priority claimed from PCT/CN2018/081652 external-priority patent/WO2019191890A1/zh
Priority claimed from PCT/CN2018/112805 external-priority patent/WO2019192170A1/zh
Priority to CN202210376602.1A priority Critical patent/CN114938452A/zh
Priority to KR1020247022895A priority patent/KR20240110905A/ko
Priority to JP2020553581A priority patent/JP7533945B2/ja
Priority to EP19781713.3A priority patent/EP3780619A4/en
Application filed by 深圳市大疆创新科技有限公司, 北京大学 filed Critical 深圳市大疆创新科技有限公司
Priority to CN201980002813.5A priority patent/CN110720219B/zh
Priority to KR1020207031587A priority patent/KR102685009B1/ko
Priority to CN202210376345.1A priority patent/CN115037942A/zh
Publication of WO2019192301A1 publication Critical patent/WO2019192301A1/zh
Priority to US17/039,903 priority patent/US11190798B2/en
Priority to US17/456,815 priority patent/US11997312B2/en
Priority to JP2024025977A priority patent/JP2024057014A/ja
Priority to US18/674,297 priority patent/US20240314356A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search

Definitions

  • the present application relates to the field of video coding and decoding, and in particular to a video image processing method and apparatus.
  • the main video coding standard adopts block-based motion compensation technology in the inter-frame prediction part.
  • the main principle is to find a most similar block in the coded image for the current image block. This process is called motion compensation.
  • CTUs Coding Tree Units
  • Each CTU can be further divided into square or rectangular coding units (CUs).
  • Each CU looks for the most similar block in the reference frame (typically the reconstructed frame near the time domain of the current frame) as the prediction block for the current CU.
  • the relative displacement between the current block (ie, the current CU) and the similar block (ie, the prediction block of the current CU) is called a Motion Vector (MV).
  • MV Motion Vector
  • a motion vector candidate list of a current CU is generally constructed in two ways, and the motion vector candidate list is also referred to as a merge candidate list.
  • the candidate motion vector of the spatial domain is included in the motion vector candidate list, and the motion vector (or motion information) of the encoded neighboring block of the current CU is usually filled into the motion vector candidate list.
  • the motion vector candidate list further includes candidate motion vectors in the time domain: Temporal Motion Vector Prediction (TMVP) uses the motion vector (or motion information) of the CU at the corresponding position (i.e., the co-located CU) of the current CU in an adjacent encoded image.
  • TMVP Temporal Motion Vector Prediction
  • the optimal one candidate motion vector is selected from the merge candidate list as the motion vector of the current CU; the predicted block of the current CU is determined according to the motion vector of the current CU.
  • ATMVP Advanced/Alternative temporal motion vector prediction
  • the basic idea of ATMVP technology is to perform motion compensation by acquiring motion information of multiple sub-blocks in the current CU.
  • the ATMVP technology introduces motion information of a plurality of sub-blocks in the current CU as candidates when building a candidate list (for example, a merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list).
  • the implementation of ATMVP technology can be roughly divided into two steps. In the first step, a time domain vector is determined by scanning the candidate motion vector list of the current CU or the motion vectors of neighboring image blocks of the current CU. In the second step, the current CU is divided into sub-blocks (sub-CUs) of size N×N, where N defaults to 4; the corresponding block of each sub-block in the reference frame is determined according to the time domain vector obtained in the first step, and the motion vector of each sub-block is determined according to the motion vector of its corresponding block in the reference frame.
  • the second step of the current ATMVP technology adaptively sets the size of the sub-CU at the frame level, and the default size is 4×4. When a certain preset condition is satisfied, the size of the sub-CU is set to 8×8.
  • the size setting of the sub-CU has the problem that it does not match the current motion information storage granularity (8×8).
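  • The second ATMVP step described above can be illustrated with a minimal sketch. It only enumerates the sub-blocks of a block and shifts each one by the time domain vector to locate its corresponding position in the reference frame. The function names, the raster enumeration order, and the use of coordinates relative to the block origin are assumptions made for illustration; they are not taken from the application.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) relative to the top-left corner of the current CU


def sub_block_origins(width: int, height: int, n: int = 4) -> List[Point]:
    """Top-left corners of the n x n sub-blocks of a width x height block.

    Raster order (left to right, top to bottom) is an assumption.
    """
    return [(x, y) for y in range(0, height, n) for x in range(0, width, n)]


def corresponding_positions(origins: List[Point], temporal_vector: Point) -> List[Point]:
    """Shift every sub-block origin by the time domain vector from step one."""
    tx, ty = temporal_vector
    return [(x + tx, y + ty) for x, y in origins]


origins = sub_block_origins(8, 8)            # default sub-block size 4
shifted = corresponding_positions(origins, (16, -4))
```

  • With the default size of 4, an 8×8 block yields four sub-blocks, whereas a sub-block size of 8 yields one, which is the size that matches the 8×8 motion information storage granularity mentioned above.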
  • ATMVP technology and TMVP technology have redundant operations in some cases, and there is room for improvement in the process of constructing candidate motion vector lists.
  • the present application provides a video image processing method and apparatus, which can reduce the complexity of the ATMVP technology while maintaining the performance gain of the existing ATMVP technology.
  • a video image processing method comprising:
  • each sub-image block in the current image block is in one-to-one correspondence with each sub-image block in the related block;
  • N is less than M
  • the number of scans of the candidate motion vector in the process of acquiring the reference motion vector of the current image block can be reduced. It should be understood that applying the solution provided by the present application to the first step of the existing ATMVP technology can simplify the redundant operations that exist.
  • a video image processing method comprising:
  • each sub-image block in the current image block and the related block corresponds one-to-one;
  • a video image processing apparatus comprising:
  • a building module configured to: sequentially scan N neighboring blocks among the preset M neighboring blocks of the current image block, and determine a target neighboring block according to the scan result, where N is less than M; determine a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block; and divide the current image block and the related block into a plurality of sub-image blocks in the same manner, each sub-image block in the current image block being in one-to-one correspondence with each sub-image block in the related block;
  • a prediction module configured to respectively predict a corresponding sub-image block in the current image block according to a motion vector of each sub-image block in the related block.
  • a video image processing apparatus comprising:
  • a building module configured to: determine M neighboring blocks of the current image block according to M candidates in a motion vector second candidate list of the current image block; sequentially scan N neighboring blocks among the M neighboring blocks, and determine a target neighboring block according to the scan result, where N is less than M; determine a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block; determine a specific candidate in a motion vector first candidate list of the current image block according to the related block of the current image block; and, when it is determined that the specific candidate is adopted, divide the current image block and the related block into a plurality of sub-image blocks in the same manner, each sub-image block in the current image block being in one-to-one correspondence with each sub-image block in the related block;
  • a prediction module configured to respectively predict a corresponding sub-image block in the current image block according to a motion vector of each sub-image block in the related block.
  • a video image processing apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, wherein execution of the instructions stored in the memory causes the processor to perform the method of the first aspect or any possible implementation of the first aspect.
  • a video image processing apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, wherein execution of the instructions stored in the memory causes the processor to perform the method of the second aspect or any possible implementation of the second aspect.
  • a computer storage medium having stored thereon a computer program that, when executed by a computer, causes the computer to implement the method of the first aspect or any possible implementation of the first aspect.
  • a computer storage medium having stored thereon a computer program that, when executed by a computer, causes the computer to implement the method of the second aspect or any possible implementation of the second aspect.
  • a computer program product comprising instructions which, when executed by a computer, cause the computer to implement the method of the first aspect or any possible implementation of the first aspect.
  • a computer program product comprising instructions which, when executed by a computer, cause the computer to implement the method of the second aspect or any possible implementation of the second aspect.
  • a video image processing method comprising:
  • the base motion vector list includes at least one set of bi-predictive base motion vector groups, where the bi-predictive base motion vector group includes a first base motion vector and a second base motion vector;
  • the current image block is predicted according to a motion vector of the current image block.
  • a video image processing method comprising:
  • Determining a base motion vector list the base motion vector list including a base motion vector group
  • a video image processing apparatus comprising:
  • a building module configured to: determine a base motion vector list, where the base motion vector list includes at least one bi-predictive base motion vector group, and the bi-predictive base motion vector group includes a first base motion vector and a second base motion vector; determine two motion vector offsets from a preset offset set, the two motion vector offsets respectively corresponding to the first base motion vector and the second base motion vector; and determine a motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets;
  • a prediction module configured to predict the current image block according to a motion vector of the current image block.
  • In a fourteenth aspect, a video image processing apparatus is provided, the apparatus comprising:
  • a determining module configured to determine a base motion vector list, where the base motion vector list includes a base motion vector group
  • a processing module configured to: when at least one base motion vector in the base motion vector group points to a specific reference image, refrain from determining the motion vector of the current image block according to the base motion vector group and the motion vector offset.
  • a video image processing apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, wherein execution of the instructions stored in the memory causes the processor to perform the method of the eleventh aspect or any possible implementation of the eleventh aspect.
  • a video image processing apparatus comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, wherein execution of the instructions stored in the memory causes the processor to perform the method of the twelfth aspect or any possible implementation of the twelfth aspect.
  • In a seventeenth aspect, a computer storage medium is provided, having stored thereon a computer program that, when executed by a computer, causes the computer to implement the method of the eleventh aspect or any possible implementation of the eleventh aspect.
  • a computer storage medium having stored thereon a computer program that, when executed by a computer, causes the computer to implement the method of the twelfth aspect or any possible implementation of the twelfth aspect.
  • a computer program product comprising instructions which, when executed by a computer, cause the computer to implement the method of the eleventh aspect or any possible implementation of the eleventh aspect.
  • a computer program product comprising instructions which, when executed by a computer, cause the computer to implement the method of the twelfth aspect or any possible implementation of the twelfth aspect.
  • FIG. 1 is a schematic flowchart of a video image processing method provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of acquiring candidate motion vectors of a current block by neighboring blocks of a current image block.
  • FIG. 3 is a schematic diagram of scaling processing of candidate motion vectors.
  • FIG. 4 is another schematic flowchart of a video image processing method provided by an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 6 is another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a video image processing method provided by an embodiment of the present application.
  • FIG. 8 is another schematic flowchart of a video image processing method according to an embodiment of the present application.
  • FIG. 9 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 10 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 11 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a candidate for acquiring a motion vector first candidate list.
  • FIG. 13 is still another schematic diagram of a candidate for constructing a motion vector first candidate list.
  • FIG. 14 and FIG. 15 are schematic flowcharts of a video image processing method according to an embodiment of the present application.
  • FIG. 16 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 17 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 18 is another schematic flowchart of a video image processing method according to an embodiment of the present application.
  • FIG. 19 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • FIG. 20 is still another schematic block diagram of a video image processing apparatus according to an embodiment of the present application.
  • a prediction block refers to a basic unit used for prediction in a frame of image.
  • the prediction block is also called a Prediction Unit (PU).
  • PU Prediction Unit
  • the image is divided into a plurality of image blocks. Further, each of the plurality of image blocks can be divided into a plurality of image blocks again, and so on.
  • under different coding methods, the number of division levels may differ, and the operations carried out at each level may differ as well.
  • the names of image blocks on the same level may be different.
  • each of the image blocks into which a frame of image is first divided is referred to as a Coding Tree Unit (CTU); each coding tree unit may contain one Coding Unit (CU) or may be divided again into a plurality of coding units; one coding unit may be divided into one, two, four, or another number of prediction units according to the prediction manner.
  • the coding tree unit is also referred to as a Largest Coding Unit (LCU).
  • LCU Largest Coding Unit
  • Prediction refers to finding image data similar to the prediction block, also referred to as a reference block of the prediction block.
  • the redundancy information in encoding/compression is reduced by encoding/compressing the difference between the prediction block and the reference block of the prediction block.
  • the difference between the prediction block and the reference block may be a residual obtained by subtracting the corresponding pixel value of the prediction block from the reference block.
  • Prediction includes intra prediction and inter prediction. Intra prediction refers to searching for the reference block of the prediction block within the frame in which the prediction block is located.
  • Inter prediction refers to searching for the reference block of the prediction block in a frame other than the frame in which the prediction block is located.
  • the prediction unit is the smallest unit in the image and is not further divided into multiple image blocks.
  • the "image block" or "current image block" mentioned hereinafter refers to one prediction unit (or one coding unit); one image block can be further divided into a plurality of sub-image blocks, and prediction can be further performed on each sub-image block.
  • a motion vector candidate list is constructed, and the current image block is predicted according to the candidate motion vector selected in the motion vector candidate list.
  • the motion vector candidate list can be used in multiple types of modes. Examples of these types of modes are given below.
  • the encoding of the current image block can be completed by the following steps.
  • the current image block can be decoded by the following steps.
  • a motion vector candidate list is obtained by the method according to an embodiment of the present application.
  • the motion vector candidate list acquired by the decoding end is consistent with the motion vector candidate list acquired by the encoding end.
  • the motion vector MV1 of the current image block is obtained in the motion vector candidate list.
  • the predicted image block of the current image block is acquired, and the residual image is combined to obtain the current image block.
  • the motion vector of the current image block is equal to the prediction MV (Motion vector prediction, MVP).
  • this first type of mode is also known as the Merge mode.
  • MV1 is used as the search starting point for a motion search, and the displacement between the finally searched position and the search starting point is recorded as the motion vector difference (MVD).
  • the predicted image block of the current image block is then determined from the reference image based on the motion vector MV1+MVD of the current image block.
  • the encoder also sends the MVD to the decoder.
  • this second type of mode is also referred to as the AMVP mode (ie, the normal inter prediction mode).
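  • The difference between the two types of modes described above can be sketched as follows: in the first type (Merge), the motion vector of the current image block equals the selected candidate; in the second type (AMVP), the signalled MVD is added to the selected candidate MV1. The function name and the plain `(dx, dy)` tuple representation are hypothetical, chosen only for this illustration.

```python
from typing import Optional, Sequence, Tuple

Displacement = Tuple[int, int]  # (dx, dy); an assumed, simplified representation


def derive_motion_vector(
    candidates: Sequence[Displacement],
    index: int,
    mvd: Optional[Displacement] = None,
) -> Displacement:
    """Return the motion vector of the current block from a candidate list.

    mvd is None   -> first type of mode (Merge): MV = selected candidate.
    mvd is given  -> second type of mode (AMVP): MV = MV1 + MVD.
    """
    mv1 = candidates[index]
    if mvd is None:
        return mv1
    return (mv1[0] + mvd[0], mv1[1] + mvd[1])


candidate_list = [(1, 2), (3, -4)]
merge_mv = derive_motion_vector(candidate_list, 1)
amvp_mv = derive_motion_vector(candidate_list, 1, mvd=(1, 1))
```

  • In both cases the decoder must build the same candidate list as the encoder, so that the signalled index selects the same candidate on both sides.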
  • the motion vector candidate list in different types of modes may be constructed in the same or different manner.
  • a motion vector candidate list constructed in a given manner may be applied to only one type of mode, or to different types of modes, which is not limited herein.
  • the motion vector candidate lists of the two construction manners are hereinafter referred to as a motion vector first candidate list and a motion vector second candidate list.
  • One difference between the two lists is that at least one candidate in the motion vector first candidate list includes a motion vector of a sub-image block, whereas each candidate in the motion vector second candidate list includes a motion vector of an image block.
  • the image block and the current image block are the same type of concept, both referring to a prediction unit (or a coding unit), and a sub-image block refers to one of a plurality of sub-image blocks obtained by dividing the image block.
  • the reference block of the current image block is determined according to the candidate, and then the residual of the image block and the reference block is calculated.
  • when a candidate in the motion vector first candidate list is used for prediction, if the candidate used is the motion vector of a sub-image block, the reference block of each sub-image block in the current image block is determined according to the candidate, the residual between each sub-image block in the current image block and its reference block is then calculated, and the residuals of the sub-image blocks are spliced into the residual of the current image block.
  • one of the candidates may be determined according to the ATMVP technique.
  • a motion vector determined according to the ATMVP technique may be added to the list as the first candidate.
  • candidates are added to the motion vector second candidate list according to the motion vectors of a preset number of spatial neighboring blocks at preset positions of the current image block.
  • the motion vectors determined according to the ATMVP technology are added to the list as candidates.
  • the candidate joining order of the two candidate lists may be other orders, and no limitation is imposed thereon.
  • the following uses the manner of constructing the second candidate list of motion vectors to exemplify how to determine one of the candidates according to the ATMVP technique.
  • the motion vector of an image block can contain two pieces of information: 1) the image to which the motion vector points; 2) the displacement.
  • the motion vector of an image block represents the image block having the displacement in the image pointed to by the motion vector.
  • the meaning of the motion vector of an encoded/decoded image block includes: the reference image of the encoded/decoded image block, and the displacement of the reference block of the encoded/decoded image block relative to the encoded/decoded image block. It should be noted that the reference block of an image block mentioned herein refers to the image block used to calculate the residual of that image block.
  • FIG. 1 is a schematic flowchart of a video image processing method according to an embodiment of the present application. The method includes the following steps.
  • the current image block is an image block to be encoded (or decoded).
  • the image frame in which the current image block is located is referred to as the current frame.
  • the current image block is a coding unit (CU).
  • the motion vector second candidate list of the current image block may be a Merge candidate list or an AMVP candidate list.
  • the motion vector second candidate list may be a regular motion vector candidate list (Normal Merge List) in the Merge candidate list. It should be understood that the motion vector second candidate list may also have another name.
  • the M candidate motion vectors may be determined according to motion vectors of M neighboring blocks within the current frame of the current image block.
  • the neighboring block may be an image block that is adjacent to the position of the current image block or has a certain positional spacing on the current frame. It should be understood that the M neighboring blocks are image blocks that have been encoded (or decoded) within the current frame.
  • the M neighboring blocks of the current image block are the image blocks located at the four positions A1 (left) → B1 (top) → B0 (top right) → A0 (lower left) around the current image block, as shown in FIG. 2.
  • the M (i.e., M equals 4) candidate motion vectors of the current image block are determined based on the motion vectors of the image blocks at these 4 positions.
  • when an unavailable neighboring block, or a neighboring block that adopts an intra coding mode, appears among the M neighboring blocks, the motion vector of that neighboring block is not available. In that case, the unavailable motion vector is not used as a candidate motion vector and is not added to the motion vector second candidate list of the current image block.
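  • The collection of spatial candidates described above can be sketched as follows: the four positions are visited in the listed order, and neighbors that are unavailable or intra coded contribute no candidate. The `NeighborBlock` type, its field names, and the integer reference-frame identifier are hypothetical constructs for this sketch, not structures defined by the application.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Displacement = Tuple[int, int]
Candidate = Tuple[Displacement, int]  # ((dx, dy), reference frame id), assumed layout


@dataclass
class NeighborBlock:
    available: bool                     # already encoded/decoded in the current frame
    intra_coded: bool                   # intra-coded blocks carry no motion vector
    mv: Optional[Displacement] = None
    ref_frame: Optional[int] = None


POSITIONS = ("A1", "B1", "B0", "A0")    # left, top, top right, lower left


def spatial_candidates(neighbors: Dict[str, NeighborBlock]) -> List[Candidate]:
    """Collect candidate motion vectors from the neighboring blocks, in order."""
    result: List[Candidate] = []
    for pos in POSITIONS:
        blk = neighbors.get(pos)
        if blk is None or not blk.available or blk.intra_coded:
            continue                    # motion vector not available: skip it
        if blk.mv is None or blk.ref_frame is None:
            continue
        result.append((blk.mv, blk.ref_frame))
    return result


neighbors = {
    "A1": NeighborBlock(True, False, (1, 0), 0),
    "B1": NeighborBlock(True, True),    # intra coded
    "B0": NeighborBlock(False, False),  # not available
    "A0": NeighborBlock(True, False, (2, -1), 1),
}
candidates = spatial_candidates(neighbors)
```

  • In this example only A1 and A0 contribute, so fewer than M candidates end up in the list.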
  • In step S110, M candidate motion vectors have been added to the motion vector second candidate list.
  • In step S120, the motion vector second candidate list may be directly scanned.
  • S120: sequentially scan N candidate motion vectors of the M candidate motion vectors, and determine a reference motion vector according to the scan result, where N is less than M.
  • M and N are both natural numbers.
  • N candidate motion vectors out of the M candidate motion vectors are sequentially scanned. The N candidate motion vectors to be scanned may be fixed, for example a fixed set of N candidate motion vectors among the M candidate motion vectors that have already been added to the motion vector candidate list.
  • the process of determining the reference motion vector according to the scan result of the N candidate motion vectors may be: sequentially determining the N candidate motion vectors based on the preset condition, and determining the reference motion vector according to the determination result.
  • the preset condition includes that the image block may or may not adopt an intra prediction encoding mode, and the candidate motion vector points to a reference frame that is the same as the reference image of the current image block.
  • the reference image of the current image block is the reference image closest to the image in which the current image block is located; or the reference image of the current image block is a reference image preset at the encoding and decoding ends; or the reference image of the current image block is a reference image specified in the video parameter set, sequence header, sequence parameter set, picture header, picture parameter set, or slice header.
  • the reference image of the current image block is a co-located frame of the current image block.
  • the co-located frame is the frame, set in the slice-level information header, that is used to obtain motion information for prediction.
  • the co-located frame is also referred to as a collocated picture.
  • In step S120, only N of the M candidate motion vectors acquired in step S110 are scanned, so that the number of scans can be reduced.
  • the first N candidate motion vectors of the M candidate motion vectors may be sequentially scanned.
  • In step S120, the last N candidate motion vectors of the M candidate motion vectors may be sequentially scanned; or N candidate motion vectors in the middle of the M candidate motion vectors may be sequentially scanned. This application does not limit this.
  • In step S120, some of the M candidate motion vectors are sequentially scanned.
  • In step S120, some of the candidate motion vectors currently added to the motion vector second candidate list are sequentially scanned.
  • the motion vector second candidate list of the current image block includes the M candidate motion vectors determined in step S110 and the candidate motion vectors determined in step S130.
  • other candidate motion vectors to be added to the motion vector second candidate list may further be determined according to other methods, which is not limited herein.
  • the method may further include: S140, determining a motion vector of the current image block according to the motion vector second candidate list obtained in step S130.
  • the solution provided by the present application can be applied to the ATMVP technology.
  • the time domain vector of the current image block is obtained by scanning all spatial candidate motion vectors currently added to the motion vector second candidate list. For example, since the motion vector second candidate list is usually filled with 4 spatial candidate motion vectors, it may be necessary to scan all four candidate motion vectors to obtain the time domain vector of the current image block.
  • N is less than M
  • the number of scans of the candidate motion vector in the process of acquiring the reference motion vector of the current image block can be reduced. It should be understood that applying the solution provided by the present application to the first step of the existing ATMVP technology can simplify the redundant operations that exist.
  • the solution provided by this application was tested under the RA configuration and the LDB configuration. The test results showed that the performance gain of the ATMVP technology can be maintained even when the number of scans is reduced.
  • the solution provided by the present application can reduce the complexity of the ATMVP technology while maintaining the performance gain of the existing ATMVP technology.
  • the motion vector second candidate list constructed by the scheme provided by the present application can be applied at both the encoding end and the decoding end.
  • the execution body of the method provided by the present application may be an encoding end or a decoding end.
  • the motion vector second candidate list formed by the construction scheme provided by the present application may be applied to the first type mode (for example, Merge mode) described above.
  • In step S110, four candidate motion vectors to be added to the motion vector second candidate list of the current image block are determined according to the motion vectors of four neighboring blocks of the current image block in the current frame; that is, M is equal to 4.
  • In step S120, N candidate motion vectors out of the four candidate motion vectors are scanned, and N is less than 4.
  • N is equal to 1.
  • N is equal to 2 or 3.
  • step S120 it is determined one by one whether the N candidate motion vectors among the M candidate motion vectors satisfy the preset condition, and the reference motion vector is determined according to the determination result.
  • the preset condition is that the reference frame pointed to by the candidate motion vector is the same as the reference image of the current image block.
  • in step S120, the N candidate motion vectors are scanned sequentially; when the first candidate motion vector that meets the preset condition is scanned, that is, the first candidate motion vector whose reference frame is the same as the co-located frame of the current frame, the scanning is stopped, and the reference motion vector is determined according to the scanned candidate motion vector that meets the preset condition.
  • the number of scans may be equal to N or less than N.
  • the scanning is stopped, and the candidate motion vector is used as the reference motion vector of the current image block.
  • in step S120, when no candidate motion vector that meets the preset condition is found among the N candidate motion vectors, that is, when none of the reference frames pointed to by the N candidate motion vectors is the same as the co-located frame of the current image block, a default value is used as the value of the reference motion vector.
  • the default value is (0,0), that is, the reference motion vector is (0,0). It should be understood that the default value may have other definitions depending on the actual situation.
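The scan-and-stop behaviour of step S120 described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `Candidate` type, the function name `find_reference_mv`, and the idea of identifying pictures by their picture order count (POC) are all assumptions made for the example.

```python
from typing import NamedTuple, Sequence, Tuple

class Candidate(NamedTuple):
    """Hypothetical candidate entry: a motion vector and the POC of the frame it points to."""
    mv: Tuple[int, int]
    ref_poc: int

def find_reference_mv(candidates: Sequence[Candidate], n: int, target_ref_poc: int,
                      default: Tuple[int, int] = (0, 0)) -> Tuple[Tuple[int, int], int]:
    """Scan at most the first n of the M candidates.

    Returns the reference motion vector and the number of scans performed.
    Scanning stops at the first candidate whose reference frame equals the
    target reference picture (the preset condition); if none matches, the
    default value is returned.
    """
    scans = 0
    for cand in candidates[:n]:
        scans += 1
        if cand.ref_poc == target_ref_poc:  # preset condition met: stop scanning
            return cand.mv, scans
    return default, scans

# M = 4 candidates; the target reference picture has POC 4 (illustrative values).
merge_list = [Candidate((3, -1), 8), Candidate((5, 2), 4),
              Candidate((0, 7), 4), Candidate((1, 1), 16)]
print(find_reference_mv(merge_list, n=2, target_ref_poc=4))  # ((5, 2), 2)
print(find_reference_mv(merge_list, n=1, target_ref_poc=4))  # ((0, 0), 1)
```

With N = 2 the scan stops at the second candidate; with N = 1 no candidate qualifies and the default (0,0) is returned, so the number of scans never exceeds N.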
  • in step S120, when no candidate motion vector that meets the preset condition is found among the N candidate motion vectors, that is, when none of the reference frames pointed to by the N candidate motion vectors is the same as the co-located frame of the current image block, the specific candidate motion vector in the motion vector second candidate list is scaled, and the reference motion vector is determined according to the scaled specific candidate motion vector.
  • the specific candidate motion vector may be the first motion vector or the last motion vector obtained in the scanning order among the N candidate motion vectors.
  • the specific candidate motion vector may also be a motion vector obtained in other scan orders among the N candidate motion vectors.
  • scaling the specific candidate motion vector in the motion vector second candidate list and determining the reference motion vector according to the scaled specific candidate motion vector includes: scaling the specific candidate motion vector in the motion vector second candidate list so that the reference frame pointed to by the scaled specific candidate motion vector is the same as the reference image of the current image block; and using the scaled specific candidate motion vector as the reference motion vector.
  • curr_pic represents the image of the current image block
  • col_pic represents the collocated picture of the current image block
  • neigh_ref_pic represents the reference frame pointed to by the specific candidate motion vector.
  • the scaling ratio of the specific candidate motion vector is determined according to the time distance between the reference image neigh_ref_pic pointed to by the specific candidate motion vector and the image curr_pic in which the image block corresponding to the specific candidate motion vector is located, and the time distance between the reference image col_pic of the current image block and the image curr_pic in which the current image block is located.
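A minimal sketch of this scaling, assuming that the time distance is measured as a difference of picture order counts (POC) and that exact rational arithmetic with rounding is acceptable for illustration. Real codecs typically use clipped fixed-point arithmetic instead; the function name and the POC arguments are hypothetical.

```python
from fractions import Fraction
from typing import Tuple

def scale_candidate_mv(mv: Tuple[int, int], curr_poc: int,
                       col_poc: int, neigh_ref_poc: int) -> Tuple[int, int]:
    """Scale a candidate MV so that it points to col_pic instead of neigh_ref_pic.

    td_neigh: time distance between neigh_ref_pic and curr_pic.
    td_col:   time distance between col_pic and curr_pic.
    The scaling ratio is td_col / td_neigh (assumed POC-based distances).
    """
    td_neigh = neigh_ref_poc - curr_poc
    td_col = col_poc - curr_poc
    if td_neigh == 0:
        raise ValueError("candidate points to the current picture; ratio undefined")
    ratio = Fraction(td_col, td_neigh)
    return (round(mv[0] * ratio), round(mv[1] * ratio))

# curr_pic at POC 10, col_pic at POC 8, candidate points to POC 6: ratio = 1/2.
print(scale_candidate_mv((8, -4), curr_poc=10, col_poc=8, neigh_ref_poc=6))  # (4, -2)
```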
  • one candidate motion vector of the N candidate motion vectors is scaled so that its reference frame is the same as the co-located frame of the current frame, and this scaled candidate motion vector is then used as the motion vector of the current image block. This can improve the accuracy of the motion vector of the current image block.
  • the specific candidate motion vector in this embodiment may be the candidate motion vector, among the N candidate motion vectors, whose reference frame is closest in time distance to the co-located frame of the current image block.
  • selecting, among the N candidate motion vectors, the candidate motion vector whose reference frame is closest to the co-located frame of the current frame for scaling can reduce the time required for the scaling processing, thereby improving the efficiency of acquiring the motion vector of the current image block.
  • the specific candidate motion vector in this embodiment may also be any one of the N candidate motion vectors.
  • the specific candidate motion vector in this embodiment is the one candidate motion vector scanned.
  • N is equal to 1.
  • a reference motion vector of the current image block is obtained by scanning one candidate motion vector in the motion vector second candidate list.
  • when the reference frame of the scanned candidate motion vector is different from the co-located frame of the current frame in which the current image block is located, the candidate motion vector is scaled so that the reference frame of the scaled candidate motion vector is the same as the co-located frame of the current frame, and the scaled candidate motion vector is used as the reference motion vector of the current image block.
  • when the reference frame of the scanned candidate motion vector is the same as the co-located frame of the current frame, the candidate motion vector is used as the motion vector of the current image block.
  • the candidate motion vector is scaled so that its reference frame is the same as the co-located frame of the current frame, and the scaled candidate motion vector is then used as the motion vector of the current image block, which can improve the accuracy of the motion vector of the current image block. Therefore, compared with the prior art, the solution provided by the embodiment of the present application can simplify the process of determining the motion vector of the current image block while improving the accuracy of the motion vector of the current image block.
  • determining, according to the reference motion vector, the current image block, and the reference image of the current image block, the candidate motion vector to continue adding to the motion vector second candidate list includes: dividing the current image block into multiple sub-image blocks; determining, according to the reference motion vector, a related block of each sub-image block in the reference image of the current image block; and determining, according to the motion vector of the related block, the candidate motion vector to continue adding to the motion vector second candidate list.
  • the motion vector of the related block of each sub-image block in the current image block is added as a candidate to the motion vector second candidate list.
  • the sub-image block is predicted according to the motion vector of the relevant block of each sub-image block in the current image block.
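The division into sub-image blocks and the lookup of each sub-image block's related block can be sketched as below. This is only an illustration under stated assumptions: the reference motion vector is taken in whole pixels, the motion field of the reference image is modelled as a dictionary keyed by grid-aligned block origins, and the displaced position is snapped to that grid. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Pos = Tuple[int, int]
MV = Tuple[int, int]

def split_into_sub_blocks(x0: int, y0: int, width: int, height: int,
                          sub: int = 8) -> List[Pos]:
    """Return the top-left corners of the sub x sub sub-image blocks of a block."""
    return [(x, y)
            for y in range(y0, y0 + height, sub)
            for x in range(x0, x0 + width, sub)]

def locate_related_blocks(sub_blocks: List[Pos], reference_mv: MV,
                          sub: int = 8) -> Dict[Pos, Pos]:
    """Displace each sub-image block by the reference MV and snap to the storage grid."""
    dx, dy = reference_mv
    return {(x, y): (((x + dx) // sub) * sub, ((y + dy) // sub) * sub)
            for (x, y) in sub_blocks}

def sub_block_mvs(related: Dict[Pos, Pos],
                  motion_field: Dict[Pos, MV]) -> Dict[Pos, Optional[MV]]:
    """Fetch the MV stored for each related block (None if no MV is available)."""
    return {pos: motion_field.get(rel) for pos, rel in related.items()}

# A 16x16 block at (32, 16) with reference MV (5, -3), split into 8x8 sub-blocks.
subs = split_into_sub_blocks(32, 16, 16, 16)
related = locate_related_blocks(subs, (5, -3))
field = {(32, 8): (1, 0), (40, 8): (2, 0), (32, 16): (3, 1)}  # illustrative motion field
print(sub_block_mvs(related, field))
```

A `None` entry marks a sub-image block whose related block has no stored motion vector, which is the situation the representative motion vector is meant to cover.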
  • the representative motion vector of the relevant block of the current image block is added as a candidate to the motion vector second candidate list and the candidate is marked as determined according to the ATMVP technique.
  • the related block of the current image block is determined according to the mark and the candidate; the current image block and the related block are divided into a plurality of sub-image blocks in the same manner, so that each sub-image block in the current image block is in one-to-one correspondence with a sub-image block in the related block; and each sub-image block in the current image block is predicted according to the motion vector of the corresponding sub-image block in the related block.
  • when a sub-image block in the related block has no available motion vector, the representative motion vector of the related block is used in place of the unavailable motion vector to predict the corresponding sub-image block in the current image block.
  • adding the candidate determined according to the ATMVP technology to the motion vector second candidate list is abandoned.
  • the representative motion vector of the relevant block of the current image block may refer to a motion vector of a center position of the relevant block, or other motion vector representing the related block, which is not limited herein.
  • a related block may be referred to as a collocated block or a correlation block.
  • the current image block is a CU
  • the sub-image block obtained by dividing it may be referred to as a sub-CU.
  • the size of the sub-image block and/or the size of the associated block of the sub-image block are fixed to be greater than or equal to 64 pixels.
  • the size of the sub-image block and/or the size of the associated block of the sub-image block are each fixed to 8 x 8 pixels.
  • the size of the sub-image block is adaptively set at the frame level, and the size of the sub-image block defaults to 4×4.
  • the size of the sub-image block is set to 8×8.
  • the size of the sub-image block is set to 8 x 8, otherwise the default value of 4 x 4 is used.
  • the size of the sub-image block of the current image block is set to 8×8; on the one hand, this adapts to the storage granularity of motion vectors specified in the video standard VVC, and on the other hand, information on the size of the sub-image block of the previous encoded image block does not need to be stored, which saves storage space.
  • the TMVP technology scales the MV of the co-located CU in the lower right corner or the center position of the current CU to obtain the time domain candidate motion vector of the current CU.
  • TMVP obtains the MVP by traversing the coded blocks at two fixed positions in the reference image, which is TB ⁇ TC, and directly uses the traversed MV as the MVP of TMVP.
  • in the embodiments of the present application, the sub-image block size in the ATMVP technology is set to 8×8, and the MV in the existing merge list is used to locate the related block; the MV of the located related block is then used as the MVP of ATMVP.
  • when the merge list is constructed, the merge candidate list of ATMVP is constructed first, and then the merge candidate list of TMVP is constructed.
  • the TMVP candidate list is built in the merge list build process, and the ATMVP's merge candidate list is built during the affine merge list build process.
  • ATMVP and TMVP construct two different lists, but there is no need to add the same MV or the same set of MVs to the two merge candidate lists.
  • the merge candidate list constructed by TMVP and ATMVP may have some redundancy, that is, the two technologies may export the same set of time domain candidate motion information for the current CU.
  • the TMVP operation is set not to be performed.
  • the TMVP operation is set not to be performed. In this way, adding the same MV or the same group of MVs to the merge candidate lists constructed by ATMVP and TMVP respectively can be avoided, and some redundant operations can be skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is set not to be performed. In other words, in the case where at least one of the width and the height of the sub-image block and/or the related block of the sub-image block is less than 8 pixels, the TMVP operation is set not to be performed. This is because the hardware design of an encoder and/or decoder requires that processing regions of the same size take, as far as possible, the same time to complete encoding or decoding. However, for regions containing many small blocks, the time required for encoding or decoding far exceeds that of other regions.
  • the TMVP operation may also be set not to be performed in the case where the width and the height of the sub-image block and/or the related block of the sub-image block are both less than 8 pixels. Saving the pipeline time of small block encoding or decoding is therefore very meaningful for parallel processing in hardware.
  • current coding technology increasingly exploits time domain correlation and adopts many time domain prediction techniques, such as the ATMVP technology. Therefore, for small blocks, the performance impact of skipping the TMVP operation is negligible, while codec time is effectively saved and coding efficiency is improved.
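The small-block rule above amounts to a simple size test. The sketch below covers both variants mentioned in the text (at least one dimension below 8 pixels, or both dimensions below 8 pixels); the function name and the `require_both` switch are assumptions made for illustration.

```python
def should_skip_tmvp(width: int, height: int, min_side: int = 8,
                     require_both: bool = False) -> bool:
    """Decide whether the TMVP operation is skipped for a block of the given size.

    require_both=False: skip when width or height is below min_side.
    require_both=True:  skip only when width and height are both below min_side.
    """
    if require_both:
        return width < min_side and height < min_side
    return width < min_side or height < min_side

print(should_skip_tmvp(4, 16))                      # True
print(should_skip_tmvp(8, 8))                       # False
print(should_skip_tmvp(4, 16, require_both=True))   # False
print(should_skip_tmvp(4, 4, require_both=True))    # True
```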
  • the size of the sub-image block and/or the size of the related block of the sub-image block may also be other sizes, for example A×B, where A ≤ 64, B ≤ 64, and A and B are both integer multiples of 4.
  • the size of the sub-image block and/or the size of the associated block of the sub-image block are 4 x 16 pixels, or 16 x 4 pixels.
  • determining, according to the reference motion vector, the current image block, and the reference image of the current image block, determining to continue to join the candidate motion vector in the motion vector second candidate list including: according to the reference motion vector, A correlation block of the current image block is determined in the reference image of the current image block; and a candidate motion vector that continues to join the motion vector second candidate list is determined according to the motion vector of the correlation block.
  • an image that has already been encoded/decoded is generally used as the reference image of the image currently to be encoded/decoded.
  • a reference image may also be constructed to increase the similarity of the reference image to the current image to be encoded/decoded.
  • video surveillance belongs to this type of scenario.
  • in a video surveillance scene, the surveillance camera is usually stationary or moves only slowly, so the background can be considered substantially unchanged.
  • objects such as people or cars captured by a video surveillance camera often move or change, so the foreground can be considered to change frequently.
  • a specific reference image can be created that contains only high quality background information.
  • a plurality of image blocks may be included in the particular reference image, and any one of the image blocks is taken from a certain decoded image, and different image blocks in the particular reference image may be taken from different decoded images.
  • the background portion of the current image to be encoded/decoded can refer to the specific reference image, whereby the residual information of the inter prediction can be reduced, thereby improving the encoding/decoding efficiency.
  • the particular reference image has at least one of the following properties: a construction frame (composite reference), a long-term reference image, an image that is not output.
  • an image that is not output refers to an image that is not output for display; in general, such an image exists as a reference image for other images.
  • the specific reference image may be a constructed long-term reference image, or may be a construction frame that is not output, or may be a long-term reference image that is not output, and the like.
  • the construction frame is also referred to as a composite reference frame.
  • the non-specific reference image may be a reference image that does not have at least one of the following: a construction frame, a long-term reference image, an image that is not output.
  • the non-specific reference image may include a reference image other than a construction frame, or include a reference image other than a long-term reference image, or include a reference image other than an image that is not output, or include a reference image other than a constructed long-term reference image, or include a reference image other than a construction frame that is not output, or include a reference image other than a long-term reference image that is not output, and the like.
  • the long-term reference image and the short-term reference image can be distinguished.
  • the short-term reference image is a concept corresponding to a long-term reference image.
  • the short-term reference image exists in the reference image buffer for a period of time; after a number of move-in and move-out operations of the decoded reference images that follow the short-term reference image in the reference image buffer, the short-term reference image is removed from the reference image buffer.
  • the reference picture buffer may also be referred to as a reference picture list buffer, a reference picture list, a reference frame list buffer, or a reference frame list, etc., which are collectively referred to herein as reference picture buffers.
  • the long-term reference image (or a portion of the data in the long-term reference image) may remain in the reference image buffer and is not affected by the move-in and move-out operations of decoded reference images in the reference image buffer; the long-term reference image (or a portion of the data in the long-term reference image) is removed from the reference image buffer only when the decoding end issues an update instruction.
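The different lifetimes of short-term and long-term reference images in the reference image buffer can be illustrated with a toy model. It assumes a fixed-size sliding window for short-term images and an explicit call standing in for the update instruction; the class and method names are hypothetical and pictures are represented only by an integer identifier.

```python
from collections import deque
from typing import Deque, List

class ReferenceBuffer:
    """Toy reference image buffer: short-term entries slide out, long-term entries stay."""

    def __init__(self, short_term_capacity: int) -> None:
        # deque(maxlen=...) drops the oldest entry automatically on overflow.
        self.short_term: Deque[int] = deque(maxlen=short_term_capacity)
        self.long_term: List[int] = []

    def add_short_term(self, pic: int) -> None:
        """Move a decoded image into the buffer; the oldest short-term image may move out."""
        self.short_term.append(pic)

    def add_long_term(self, pic: int) -> None:
        """Keep an image as a long-term reference."""
        self.long_term.append(pic)

    def remove_long_term(self, pic: int) -> None:
        """Stands in for an explicit update instruction; the only way a long-term image leaves."""
        self.long_term.remove(pic)

    def pictures(self) -> List[int]:
        return list(self.short_term) + self.long_term

buf = ReferenceBuffer(short_term_capacity=2)
buf.add_long_term(0)
for pic in (1, 2, 3):
    buf.add_short_term(pic)
print(buf.pictures())  # [2, 3, 0]: image 1 moved out, long-term image 0 stayed
```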
  • short-term reference images and long-term reference images may be named differently in different standards.
  • for example, in the H.264/advanced video coding (AVC) or H.265/HEVC standards, the short-term reference image is called a short-term reference
  • and the long-term reference image is called a long-term reference.
  • AVC audio coding coding
  • IEEE Institute of electrical and electronics engineers
  • the long-term reference image is called a background picture.
  • the long-term reference image is called a golden frame.
  • referring to a long-term reference image as a long-term reference frame does not mean that the technology corresponding to standards such as H.264/AVC or H.265/HEVC must be used.
  • the long-term reference image mentioned above may be constructed from image blocks taken from a plurality of decoded images, or may be obtained by updating an existing reference frame (e.g., a pre-stored reference frame) using a plurality of decoded images.
  • the constructed specific reference image may also be a short-term reference image.
  • the long-term reference image may not be a constructed reference image.
  • the specific reference image may include a long-term reference image
  • the non-specific reference image may include a short-term reference image
  • the type of reference frame may be identified by a special field in the code stream structure.
  • when it is determined that the reference image is a long-term reference image, the reference image is determined to be a specific reference image; or, when it is determined that the reference image is a frame that is not output, the reference image is determined to be a specific reference image; or, when it is determined that the reference image is a construction frame, the reference image is determined to be a specific reference image; or, when the reference image is determined to be a frame that is not output and is further determined to be a construction frame, the reference image is determined to be a specific reference image.
  • each type of reference image may have a corresponding identifier.
  • whether the reference image is a specific reference image may be determined according to the identifier of the reference image.
  • when it is determined that the reference image has the identifier of a long-term reference image, the reference image is determined to be a specific reference image.
  • when it is determined that the reference image has the identifier of an image that is not output, the reference image is determined to be a specific reference image.
  • when it is determined that the reference image has the identifier of a construction frame, the reference image is determined to be a specific reference image.
  • the reference image is determined to be a specific reference image when the reference image has at least two of the following three identifiers: the identifier of a long-term reference image, the identifier of an image that is not output, and the identifier of a construction frame (composite reference frame). For example, when it is determined that the reference image has the identifier of an image that is not output and also has the identifier of a construction frame, the reference image is determined to be a specific reference image.
  • the image may have an identifier indicating whether it is an output frame. When an image is indicated not to be output, the frame is a reference image; it is then further determined whether the frame has the identifier of a construction frame, and if so, the reference image is determined to be a specific reference image. If an image is indicated to be output, it may be determined directly that the frame is not a specific reference image, without determining whether it is a construction frame. Alternatively, if an image is indicated not to be output but has an identifier indicating that it is not a construction frame, it may be determined that the frame is not a specific reference image.
  • the reference image is determined to be a specific reference image when it is determined, from a picture header, a picture parameter set (PPS), or a slice header, that the reference image satisfies one of the following conditions:
  • the reference image is a long-term reference image
  • the reference image is a construction reference image
  • the reference image is an image that is not output
  • the reference image is further determined to be a construction reference image.
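The identifier-based decisions described above can be sketched with three boolean flags per reference image. The flag names and the two helper functions are assumptions for illustration; the text leaves open which bitstream fields carry these identifiers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RefPicFlags:
    """Hypothetical identifiers attached to a reference image."""
    is_long_term: bool = False
    is_constructed: bool = False   # construction frame / composite reference frame
    is_not_output: bool = False

def is_specific_any(flags: RefPicFlags) -> bool:
    """Variant 1: any single identifier makes the image a specific reference image."""
    return flags.is_long_term or flags.is_constructed or flags.is_not_output

def is_specific_not_output_and_constructed(flags: RefPicFlags) -> bool:
    """Variant 2: check the output identifier first, then the construction identifier."""
    if not flags.is_not_output:
        return False               # an output image is not a specific reference image
    return flags.is_constructed

print(is_specific_any(RefPicFlags(is_long_term=True)))                          # True
print(is_specific_not_output_and_constructed(RefPicFlags(is_not_output=True)))  # False
print(is_specific_not_output_and_constructed(
    RefPicFlags(is_not_output=True, is_constructed=True)))                      # True
```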
  • a motion vector of a certain image block on another image is used to determine a motion vector of the image block.
  • the image block is referred to as a first image block
  • a certain image block on other images to be utilized is referred to as a time domain reference block or a related block of the first image block.
  • the first image block and the time domain reference block (or related block) of the first image block are located on different images.
  • the term "related block" is uniformly used in this article.
  • when the ATMVP technology is applied to construct the AMVP candidate list and the motion vector of the relevant block of the current image block is determined according to the ATMVP technology, the motion vector of the relevant block needs to be scaled, and the motion vector of the current image block is then determined according to the scaled motion vector. Generally, the scaling ratio of the motion vector of the relevant block is determined based on the time distance between the reference image pointed to by the motion vector of the relevant block and the image in which the relevant block is located, and the time distance between the reference image of the current image block and the image in which the current image block is located.
  • the motion vector of the correlation block is referred to as MV 2
  • the reference frame index of the reference image pointed to by the motion vector MV 2 is x.
  • the reference frame index value x is the difference between the sequence number (for example, POC) of the reference image pointed to by MV 2 and the sequence number of the image of the relevant block.
  • the reference frame index of the reference image of the first image block is referred to as y.
  • the reference frame index value y is the difference between the sequence number of the reference image of the first image block and the sequence number of the image of the first image block.
  • the scaling of the motion vector MV 2 is y/x.
  • the product of the motion vector MV 2 and y/x may be used as the motion vector of the first image block.
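The scaling described in the preceding lines can be written compactly. The symbol $\mathrm{MV}_1$ for the motion vector of the first image block and the notation $\mathrm{POC}(\cdot)$ for the sequence number of an image are introduced here only for illustration:

$$
x = \mathrm{POC}(\text{reference image pointed to by } \mathrm{MV}_2) - \mathrm{POC}(\text{image of the related block}),
$$

$$
y = \mathrm{POC}(\text{reference image of the first image block}) - \mathrm{POC}(\text{image of the first image block}),
$$

$$
\mathrm{MV}_1 = \frac{y}{x}\,\mathrm{MV}_2 .
$$

As a worked example with assumed values, if $x = -4$, $y = -2$ and $\mathrm{MV}_2 = (8, -4)$, then the scaling ratio is $y/x = 1/2$ and $\mathrm{MV}_1 = (4, -2)$.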
  • the motion vector of the current image block is determined according to the motion vector of the relevant block, specifically: when the motion vector of the relevant block points to a specific reference image, or the reference image of the current image block is a specific reference image, the motion vector of the current image block is determined according to the processed motion vector of the relevant block, where the processed motion vector of the relevant block is the same as the motion vector of the relevant block before processing.
  • the processed motion vector of the relevant block includes: a motion vector obtained by scaling the motion vector of the relevant block with a scaling ratio of 1, or the motion vector of the relevant block for which the scaling step is skipped.
  • the motion vector of the current image block is determined according to the motion vector of the relevant block, specifically: when the motion vector of the relevant block points to the specific reference image, or the reference image of the current image block is a specific reference In the case of an image, the motion vector of the current image block is determined from the motion vector of the relevant block.
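A minimal sketch of the case in which scaling is bypassed for specific reference images, assuming the two "specific reference image" conditions are available as booleans and that the ordinary case uses the ratio y/x of sequence-number differences. The function name and the rounding choice are illustrative assumptions.

```python
from fractions import Fraction
from typing import Tuple

def derive_mv(related_mv: Tuple[int, int], x: int, y: int,
              related_points_to_specific: bool,
              current_ref_is_specific: bool) -> Tuple[int, int]:
    """Derive the current block's MV from the related block's MV.

    If either picture involved is a specific reference image, the MV is used
    unchanged (equivalent to a scaling ratio of 1). Otherwise it is scaled by
    y / x, where x and y are the sequence-number differences defined in the text.
    """
    if related_points_to_specific or current_ref_is_specific:
        return related_mv          # scaling step skipped
    if x == 0:
        raise ValueError("x must be non-zero for scaling")
    ratio = Fraction(y, x)
    return (round(related_mv[0] * ratio), round(related_mv[1] * ratio))

print(derive_mv((8, -4), x=-4, y=-2,
                related_points_to_specific=False, current_ref_is_specific=False))  # (4, -2)
print(derive_mv((8, -4), x=-4, y=-2,
                related_points_to_specific=True, current_ref_is_specific=False))   # (8, -4)
```

Skipping the scaling in this case avoids relying on a time distance that is not well defined when one of the pictures is a constructed or long-term reference.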
  • step S120 includes: when no candidate motion vector that meets the preset condition is found among the N candidate motion vectors, scaling the specific candidate motion vector in the motion vector second candidate list, and determining the reference motion vector according to the scaled specific candidate motion vector.
  • the method further includes: when the specific candidate motion vector points to the specific reference image, or the reference image of the current image block is a specific reference image, determining to continue adding the motion vector according to the processed specific candidate motion vector A candidate motion vector in the second candidate list, wherein the processed specific candidate motion vector is the same as the specific candidate motion vector before processing.
  • the processed motion vector of the relevant block includes: a motion vector obtained by scaling the motion vector of the relevant block with a scaling ratio of 1, or the motion vector of the relevant block for which the scaling step is skipped.
  • step S120 includes: when no candidate motion vector that meets the preset condition is found among the N candidate motion vectors, scaling the specific candidate motion vector in the motion vector second candidate list, and determining the reference motion vector according to the scaled specific candidate motion vector.
  • the method further includes: when the specific candidate motion vector points to a specific reference image, or the reference image of the current image block is a specific reference image, abandoning determining, according to the specific candidate motion vector, the candidate motion vector to continue adding to the motion vector second candidate list.
  • one of the N candidate motion vectors is scaled so that its reference frame is the same as the co-located frame of the current frame, and the scaled candidate motion vector is then used as the motion vector of the current image block, so that the accuracy of the motion vector of the current image block can be improved.
  • the current image block is divided into sub-image blocks of size 8×8; on the one hand, this adapts to the storage granularity of motion vectors specified in the video standard VVC, and on the other hand, information on the size of the sub-image blocks of the previous encoded image block does not need to be stored, which saves storage space.
  • the TMVP operation is not performed, so some redundant operations can be skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is set not to be performed. In other words, in the case where at least one of the width and the height of the sub-image block and/or the related block of the sub-image block is less than 8 pixels, the TMVP operation is set not to be performed. In these cases, the performance impact of skipping the TMVP operation is negligible, while codec time is effectively saved and coding efficiency is improved.
  • the embodiment of the present application further provides a video image processing method, where the method includes the following steps:
  • Step S410 corresponds to step S110 described above, and the specific description refers to the above, and details are not described herein again.
  • S420 Scan at least part of the candidate motion vectors of the M candidate motion vectors, and determine a reference motion vector of the current image block according to the scan result.
  • step S420 may correspond to step S120 described above, as described in detail above.
  • all candidate motion vectors in the M candidate motion vectors are sequentially scanned, and a reference motion vector of the current image block is determined according to the scan result.
  • step S420 the specific manner of determining the reference motion vector of the current image block according to the scan result may refer to the related description in the foregoing embodiment, and details are not described herein again.
  • the current image block is divided into a plurality of sub-image blocks, wherein the size of the sub-image block is fixed to be greater than or equal to 64 pixels.
  • the current image block is a CU
  • the sub-image block obtained by dividing it may be referred to as a sub-CU.
  • the reference image of the current image block may be a co-located frame of the current image block.
  • the size of the sub-image block is adaptively set at the frame level, and the size of the sub-image block defaults to 4×4.
  • the size of the sub-image block is set to 8×8.
  • the size of the sub-image block is set to 8 x 8, otherwise the default value of 4 x 4 is used.
  • the size of the sub-image block of the current image block is fixed to be greater than or equal to 64 pixels, so information on the size of the sub-image block of the previous encoded image block does not need to be stored, which saves storage space.
  • the size of the sub-image block and/or the size of the related block of the sub-image block are fixed to 8×8 pixels.
  • the size of the sub-image block is adaptively set at the frame level, and the size of the sub-image block defaults to 4×4.
  • the size of the sub-image block is set to 8×8.
  • the size of the sub-image block is set to 8 x 8, otherwise the default value of 4 x 4 is used.
  • the size of the sub-image block of the current image block is set to 8 ⁇ 8, on the one hand, it can adapt to the storage granularity of the motion vector specified in the video standard VVC, and on the other hand, it is not necessary to store the last encoded image.
  • the information of the size of the sub-image block of the block therefore, can save storage space.
  • the TMVP operation is not performed, so some redundant operations are skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • the size of the sub-image block and/or the size of the related block of the sub-image block may also be other sizes, for example A×B, where A ≤ 64, B ≤ 64, and A and B are both integer multiples of 4.
  • for example, the size of the sub-image block and/or the size of the related block of the sub-image block is 4×16 pixels or 16×4 pixels.
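The fixed-size division and the small-block TMVP rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the codec's implementation: the names `split_into_sub_blocks` and `tmvp_allowed`, the `(x, y, width, height)` tuple layout, and the requirement that the CU dimensions be multiples of the sub-block size are all hypothetical choices made for the example.

```python
from typing import List, Tuple

# (x, y, width, height) in pixels; this layout is an assumption of the sketch.
Block = Tuple[int, int, int, int]


def split_into_sub_blocks(cu: Block, sub_w: int = 8, sub_h: int = 8) -> List[Block]:
    """Divide a CU into fixed-size sub-CUs (8x8 by default), in raster order."""
    x0, y0, w, h = cu
    if w % sub_w or h % sub_h:
        # Assumption: only evenly divisible CUs are handled here.
        raise ValueError("CU dimensions must be multiples of the sub-block size")
    return [
        (x0 + dx, y0 + dy, sub_w, sub_h)
        for dy in range(0, h, sub_h)
        for dx in range(0, w, sub_w)
    ]


def tmvp_allowed(width: int, height: int, min_side: int = 8) -> bool:
    """TMVP is skipped when the width and/or the height is below 8 pixels."""
    return width >= min_side and height >= min_side
```

Because the sub-block size is a constant rather than a per-frame value, nothing about the previously encoded block's sub-block size has to be kept between blocks, which is the storage saving the text refers to.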
  • in step S420, at least some of the candidate motion vectors are scanned in sequence; when the first candidate motion vector that meets the preset condition is scanned, the scanning stops, and the reference motion vector is determined according to that first scanned candidate motion vector that meets the preset condition.
  • Determining the reference motion vector according to the first candidate motion vector that meets the preset condition that is scanned may include: using the first candidate motion vector that meets the preset condition as the target neighboring block.
  • the preset condition includes: the reference image of the candidate motion vector is the same as the reference image of the current image block.
  • step S450 includes: when the motion vector of the related block points to a specific reference image, or the reference image of the current image block is a specific reference image, determining, according to the processed motion vector of the related block, the candidate motion vector to be further added to the motion vector second candidate list.
  • the processed motion vector of the related block includes: a motion vector obtained by scaling the motion vector of the related block with a scaling value of 1, or the motion vector of the related block with the scaling step skipped.
  • step S450 includes: when the motion vector of the related block points to a specific reference image, or the reference image of the current image block is a specific reference image, giving up determining, according to the motion vector of the related block, the candidate motion vector to be further added to the motion vector second candidate list.
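The two alternative treatments of a specific reference image described above (keep the motion vector unscaled, or give up deriving a candidate) can be sketched as below. The function name, the boolean flags, the `discard_on_specific` switch, and the rounding used in the ordinary scaled case are all assumptions introduced for illustration; the source only states the two behaviours.

```python
from typing import Optional, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) displacement


def candidate_from_related_mv(
    related_mv: MV,
    related_ref_is_specific: bool,
    current_ref_is_specific: bool,
    scale: float,
    discard_on_specific: bool = False,
) -> Optional[MV]:
    """Return the candidate MV derived from a related block, or None if dropped."""
    if related_ref_is_specific or current_ref_is_specific:
        if discard_on_specific:
            return None       # variant 2: give up deriving a candidate
        return related_mv     # variant 1: scaling value 1, i.e. scaling skipped
    # Ordinary case: scale the MV (the rounding rule is an assumption).
    return (round(related_mv[0] * scale), round(related_mv[1] * scale))
```

In either variant, the motion vector is never scaled when a specific reference image is involved.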
  • the current image block is divided into sub-image blocks of size 8×8; on the one hand, this matches the storage granularity of motion vectors specified in the video standard VVC, and on the other hand, information about the sub-block size of the previously encoded image block does not need to be stored, which saves storage space.
  • the TMVP operation is not performed, so some redundant operations are skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • the method of determining, according to the ATMVP technology, the candidate to be added to the motion vector second candidate list is described above.
  • other candidates may also be added to the motion vector second candidate list, which is not limited herein.
  • FIG. 5 is a schematic block diagram of a video image processing apparatus 500 according to an embodiment of the present application.
  • the apparatus 500 is for performing the method embodiment as shown in FIG.
  • the device 500 includes the following units.
  • an obtaining unit 510, configured to obtain M candidate motion vectors to be added to the motion vector second candidate list of a current image block;
  • a determining unit 520 configured to sequentially scan N candidate motion vectors of the M candidate motion vectors, and determine a reference motion vector according to the scan result, where N is less than M;
  • the determining unit 520 is further configured to determine, according to the reference motion vector, the current image block, and the reference image of the current image block, the candidate motion vector that continues to be added to the motion vector second candidate list;
  • the determining unit 520 is further configured to determine a motion vector of the current image block according to the motion vector second candidate list.
  • in the first step of the existing ATMVP technology, the temporal vector of the current image block is obtained by scanning all the candidate motion vectors currently added to the motion vector second candidate list. For example, the motion vector second candidate list is usually filled with 4 candidate motion vectors, so all 4 candidate motion vectors may need to be scanned to obtain the temporal vector of the current image block.
  • because N is less than M, the number of scans of candidate motion vectors in the process of obtaining the reference motion vector of the current image block can be reduced. It should be understood that applying the solution provided by the present application to the first step of the existing ATMVP technology can simplify the redundant operations in that step.
  • the test configurations were the RA configuration and the LDB configuration.
  • the solution provided by this application was tested.
  • the test results showed that the performance gain of the ATMVP technology is maintained after the number of scans is reduced.
  • the solution provided by the present application can reduce the complexity of the ATMVP technology while maintaining the performance gain of the existing ATMVP technology.
  • the obtaining unit 510 is configured to obtain, according to the motion vectors of M neighboring blocks of the current image block in the current frame, the M candidate motion vectors to be added to the motion vector second candidate list of the current image block.
  • the neighboring block is an image block in the current frame that is adjacent to the current image block or separated from it by a certain positional distance.
  • the determining unit 520 is configured to sequentially scan the first N candidate motion vectors of the M candidate motion vectors.
  • M is equal to 4 and N is less than 4.
  • N is equal to 1 or 2.
  • the determining unit 520 is configured to sequentially scan the N candidate motion vectors of the M candidate motion vectors based on the preset condition, and determine the reference motion vector according to the scan result.
  • the preset condition includes: the reference frame pointed to by the candidate motion vector is the same as the reference image of the current image block.
  • the determining unit 520 is configured to sequentially scan the N candidate motion vectors, stop scanning when the first candidate motion vector that meets the preset condition is scanned, and determine the reference motion vector according to that first scanned candidate motion vector that meets the preset condition.
  • the determining unit 520 is configured to: when no candidate motion vector that meets the preset condition is found among the N candidate motion vectors, perform scaling processing on a specific candidate motion vector in the motion vector second candidate list, and determine the reference motion vector according to the scaled specific candidate motion vector.
  • the specific candidate motion vector is the first motion vector or the last motion vector obtained in the scanning order among the N candidate motion vectors.
  • the determining unit 520 is configured to perform scaling processing on the specific candidate motion vector in the motion vector second candidate list, so that the reference frame pointed to by the scaled specific candidate motion vector is the same as the reference image of the current image block, and to use the scaled specific candidate motion vector as the reference motion vector.
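The text does not give the scaling formula. One common way to make a motion vector point to a different reference frame is to rescale it linearly by the ratio of temporal distances; the sketch below assumes that approach, identifies pictures by a picture order count (POC), and uses plain floating-point rounding rather than any codec's fixed-point arithmetic. The function name and all parameter names are hypothetical.

```python
from typing import Tuple

MV = Tuple[int, int]


def scale_mv(mv: MV, cur_poc: int, cand_ref_poc: int, target_ref_poc: int) -> MV:
    """Rescale mv so that it points to target_ref_poc instead of cand_ref_poc."""
    td = cur_poc - cand_ref_poc    # temporal distance spanned by the original vector
    tb = cur_poc - target_ref_poc  # temporal distance to the desired reference image
    if td == 0:
        raise ValueError("candidate reference coincides with the current picture")
    factor = tb / td
    return (round(mv[0] * factor), round(mv[1] * factor))
```

For instance, a vector that spans 4 pictures and is retargeted to a reference 2 pictures away is halved under this assumption.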
  • the determining unit 520 is configured to use a default value as a reference motion vector when a candidate motion vector that meets a preset condition is not scanned in the N candidate motion vectors.
  • the default value is a motion vector (0, 0).
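The reduced scan with early termination and the default-value fallback can be sketched as follows. This is an illustrative sketch only: the `Candidate` class, its `ref_poc` field (used here to identify the reference frame a vector points to), and the returned scan count are assumptions added to make the behaviour observable.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    mv: Tuple[int, int]  # displacement
    ref_poc: int         # hypothetical identifier of the reference frame pointed to


def reference_mv(
    candidates: Sequence[Candidate],
    n: int,
    current_ref_poc: int,
    default: Tuple[int, int] = (0, 0),
) -> Tuple[Tuple[int, int], int]:
    """Scan only the first n candidates; return (reference MV, number scanned)."""
    scanned = 0
    for cand in candidates[:n]:
        scanned += 1
        if cand.ref_poc == current_ref_poc:  # the preset condition
            return cand.mv, scanned          # stop at the first match
    return default, scanned                  # fallback: default value (0, 0)
```

With M = 4 candidates and N = 2, at most two comparisons are made, whereas scanning the whole list could take four.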
  • the determining unit 520 is configured to divide the current image block into multiple sub-image blocks; determine, according to the reference motion vector, a related block of the sub-image block in the reference image of the current image block; and determine, according to the motion vector of the related block, the candidate motion vector to be further added to the motion vector second candidate list.
  • the size of the sub-image block and/or the size of the relevant block of the sub-image block are fixed to be greater than or equal to 64 pixels.
  • the current image block is a coding unit CU.
  • the determining unit 520 is configured to determine, according to the reference motion vector, a related block of the current image block in the reference image of the current image block, and determine, according to the motion vector of the related block, the candidate motion vector to be further added to the motion vector second candidate list.
  • the determining unit 520 is configured to: when the motion vector of the related block points to a specific reference image, or the reference image of the current image block is a specific reference image, determine, according to the processed motion vector of the related block, the candidate motion vector to be further added to the motion vector second candidate list, wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  • the processed motion vector of the related block includes: a motion vector obtained by scaling the motion vector of the related block with a scaling value of 1, or the motion vector of the related block with the scaling step skipped.
  • the determining unit 520 is configured to: when the motion vector of the related block points to a specific reference image, or the reference image of the current image block is a specific reference image, give up determining, according to the motion vector of the related block, the candidate motion vector to be further added to the motion vector second candidate list.
  • the determining unit 520 is configured to: when the specific candidate motion vector points to a specific reference image, or the reference image of the current image block is a specific reference image, determine, according to the processed specific candidate motion vector, the candidate motion vector to be further added to the motion vector second candidate list.
  • the processed motion vector of the related block includes: a motion vector obtained by scaling the motion vector of the related block with a scaling value of 1, or the motion vector of the related block with the scaling step skipped.
  • the determining unit 520 is configured to: when the specific candidate motion vector points to a specific reference image, or the reference image of the current image block is a specific reference image, give up determining, according to the specific candidate motion vector, the candidate motion vector to be further added to the motion vector second candidate list.
  • the motion vector second candidate list is a Merge candidate list.
  • the reference image of the current image block is a co-located frame of the current image block.
  • the size of the sub-image block and/or the size of the related block of the sub-image block is fixed to 8×8 pixels.
  • the TMVP operation is not performed, so some redundant operations are skipped, which saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • both the obtaining unit 510 and the determining unit 520 in this embodiment may be implemented by a processor.
  • the embodiment of the present application further provides a video image processing apparatus 600.
  • the apparatus 600 is for performing the method embodiment as shown in FIG.
  • the device 600 includes the following units.
  • an obtaining unit 610, configured to obtain M candidate motion vectors to be added to the motion vector second candidate list of the current image block;
  • a determining unit 620 configured to sequentially scan at least part of the candidate motion vectors of the M candidate motion vectors, and determine a reference motion vector of the current image block according to the scan result;
  • a dividing unit 630 configured to divide the current image block into a plurality of sub-image blocks, wherein the size of the sub-image block is fixed to be greater than or equal to 64 pixels;
  • the determining unit 620 is further configured to determine, according to the reference motion vector, a related block of the sub-image block in the reference image of the current image block;
  • the determining unit 620 is further configured to determine, according to the motion vector of the relevant block, the candidate motion vector that continues to join the motion vector second candidate list.
  • the size of the sub-image block is frame-adaptive, and defaults to 4×4.
  • the size of the sub-image block is set to 8×8.
  • the size of the sub-image block is set to 8×8; otherwise the default value of 4×4 is used.
  • the size of the sub-image block of the current image block is fixed to be greater than or equal to 64 pixels, so information about the sub-image block size of the previously encoded image block does not need to be stored, which saves storage space.
  • the size of the sub-image block and/or the size of the related block of the sub-image block is fixed to 8×8 pixels.
  • the size of the sub-image block of the current image block is set to 8×8; on the one hand, this matches the storage granularity of motion vectors specified in the video standard VVC, and on the other hand, information about the sub-image block size of the last encoded image block does not need to be stored, which saves storage space.
  • the TMVP operation is not performed, so some redundant operations are skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • the determining unit 620 is configured to sequentially scan at least some of the candidate motion vectors, stop scanning when the first candidate motion vector that meets the preset condition is scanned, and determine the reference motion vector according to that first scanned candidate motion vector that meets the preset condition.
  • the determining unit 620 is configured to use the first candidate motion vector that meets the preset condition as the target neighboring block.
  • the preset condition includes: the reference image of the candidate motion vector is the same as the reference image of the current image block.
  • the obtaining unit 610, the determining unit 620, and the dividing unit 630 in this embodiment may all be implemented by a processor.
  • in the embodiments described above, the motion vector of an image block contains two pieces of information: 1) the image to which the motion vector points; 2) the displacement.
  • under another definition, the motion vector of an image block contains only the "displacement" information.
  • the image block additionally provides index information for indicating a reference image of the image block.
  • under this definition, the motion vector means the displacement, on the reference image, of the reference block of an encoded/decoded image block relative to the image block that is located in the reference image at the same position as the encoded/decoded image block.
  • when determining the reference block of an encoded/decoded image block, the reference block must be determined from both the index information of the reference image of that image block and the motion vector of that image block. In that case, in the video image processing method shown in FIG. 1, step S120 does not scan the candidate motion vectors in the motion vector second candidate list, but directly scans the image blocks corresponding to those candidate motion vectors.
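Under the displacement-only definition, locating a reference block needs two inputs: the reference-image index carried by the block, and the motion vector. A minimal sketch follows, assuming integer-pixel displacements and a reference list represented as a plain Python list of picture labels; the class name, field names, and labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CodedBlock:
    x: int               # top-left position in the block's own picture
    y: int
    mv: Tuple[int, int]  # displacement only; it does not name a picture
    ref_idx: int         # index information identifying the reference image


def locate_reference_block(block: CodedBlock, ref_list: List[str]) -> Tuple[str, int, int]:
    """Return (reference picture, x, y) of the block's reference block."""
    picture = ref_list[block.ref_idx]  # which image: taken from the index
    # Where in that image: the co-located position shifted by the displacement.
    return picture, block.x + block.mv[0], block.y + block.mv[1]
```

Because the vector alone no longer identifies a picture, a scan that has to compare reference images works on the blocks (which carry `ref_idx`) rather than on bare motion vectors.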
  • a video image processing method is provided below for this new definition of the motion vector (i.e., containing the "displacement" information but not the "image pointed to").
  • the methods for determining candidate motion vectors based on the ATMVP technology under these two meanings of "motion vector" are basically the same, and the explanations above also apply to the video image processing method provided below. The main difference lies in constructing the motion vector second candidate list: when determining, according to the ATMVP technology, the candidate to be added to the list, the method described above scans the motion vectors already added to the motion vector second candidate list, whereas the method provided below scans the image blocks corresponding to the motion vectors in the motion vector second candidate list.
  • the embodiment of the present application provides a video image processing method, and the method includes the following steps.
  • the current image block is an image block to be encoded (or decoded).
  • the current image block is a coding unit (CU).
  • the image frame in which the current image block is located is referred to as the current frame.
  • the neighboring block is an image block in the current frame that is adjacent to the current image block or separated from it by a certain positional distance.
  • the M neighboring blocks are image blocks that have been encoded (or decoded) in the current frame.
  • the order of the four adjacent blocks of the current image block is determined in turn.
  • S720: Sequentially scan N of the M neighboring blocks, and determine a target neighboring block according to the scan result, where N is smaller than M.
  • the process of determining the target neighboring block according to the scan result of the N neighboring blocks may be: checking the N neighboring blocks one by one against the preset condition, and determining the target neighboring block according to the check result.
  • the preset condition is defined such that the reference image of the neighboring block is the same as the reference image of the current image block.
  • the reference image of the current image block is the reference image closest in time to the image in which the current image block is located; or the reference image of the current image block is a reference image preset at the encoding and decoding ends; or the reference image of the current image block is a reference image specified in the video parameter set, sequence header, sequence parameter set, picture header, picture parameter set, or slice header.
  • the reference image of the current image block is a co-located frame of the current image block; the co-located frame is the frame, set in the slice-level header information, from which motion information is obtained for prediction.
  • step S720 only N neighboring blocks among the M neighboring blocks acquired in step S710 are scanned, so that the number of scans can be reduced.
  • step S720 the first N neighboring blocks in the M neighboring blocks may be sequentially scanned.
  • the first N neighboring blocks acquired in step S720 refer to the N neighboring blocks first determined in the preset order.
  • step S720 the last N neighboring blocks of the M neighboring blocks may be sequentially scanned; or, the N neighboring blocks in the middle of the M neighboring blocks may be sequentially scanned. This application does not limit this.
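The three ways of choosing which N of the M neighboring blocks to scan (first, last, or middle) can be sketched as below. The function name and the `mode` strings are hypothetical, and how the "middle" window is positioned when it cannot be centred exactly is an assumption, since the text does not specify it.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def blocks_to_scan(neighbors: Sequence[T], n: int, mode: str = "first") -> List[T]:
    """Pick the N neighboring blocks to scan out of the M available ones."""
    m = len(neighbors)
    if not 0 < n < m:
        raise ValueError("N must be positive and smaller than M")
    if mode == "first":
        start = 0
    elif mode == "last":
        start = m - n
    elif mode == "middle":
        start = (m - n) // 2  # assumption: centre the window, rounding toward the front
    else:
        raise ValueError(f"unknown mode: {mode}")
    return list(neighbors[start:start + n])
```

Whichever subset is chosen, the scan touches N blocks instead of M, which is where the reduction in complexity comes from.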
  • step S740 includes: determining a reference block of the current image block according to the motion vector of the relevant block and the reference image.
  • step S740 includes: constructing a candidate block list of a current image block, the candidate block in the candidate block list includes M neighboring blocks and related blocks; encoding and decoding the current image block according to the reference block of the candidate block in the candidate block list .
  • in one example, the candidate block list is a Merge candidate list of the current image block.
  • the candidate block list is an AMVP candidate list of the current image block.
  • the index of the candidate block of the current block is written into the bitstream. After the index is obtained, the candidate block corresponding to the index is found in the candidate block list, and either the reference block of the current image block is determined according to the reference block of the candidate block, or the motion vector of the current image block is determined according to the motion vector of the candidate block.
  • the reference block of the candidate block is directly determined as the reference block of the current image block, or the motion vector of the candidate block is directly determined as the motion vector of the current image block.
  • the encoding end also writes the MVD of the current block into the bitstream. After obtaining the MVD, the decoding end uses the sum of the motion vector of the candidate block and the MVD as the motion vector of the current block, and then determines the reference block of the current block according to that motion vector and the reference image of the current block.
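The two decoder-side derivations just described (reuse the candidate's motion vector directly, or add a signalled MVD to it) can be sketched as follows. The function name is hypothetical, and representing "no MVD signalled" as `None` is a simplification made for the example.

```python
from typing import Optional, Sequence, Tuple

MV = Tuple[int, int]


def decode_motion_vector(
    candidate_mvs: Sequence[MV],
    index: int,
    mvd: Optional[MV] = None,
) -> MV:
    """Recover the current block's MV from a signalled index and an optional MVD."""
    cand = candidate_mvs[index]  # candidate selected by the index read from the bitstream
    if mvd is None:
        return cand              # the candidate's MV is used directly
    return (cand[0] + mvd[0], cand[1] + mvd[1])  # candidate MV plus MVD
```

In both cases only the index (and, in the second case, the difference) needs to be transmitted, not the full motion vector.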
  • because N is smaller than M, the number of scans of candidate neighboring blocks in the process of obtaining the target neighboring block of the current image block is reduced, thereby reducing complexity.
  • in step S710, 4 neighboring blocks of the current image block in the current frame are determined, that is, M is equal to 4.
  • step S720 N neighboring blocks of the 4 neighboring blocks are scanned, and N is less than 4.
  • N is equal to 1.
  • step S720 only the first of the four adjacent blocks is scanned.
  • N is equal to 2 or 3.
  • the manner in which the target neighboring block is determined based on the scan result of the N neighboring blocks in step S720 is described below.
  • in step S720, the N neighboring blocks are scanned in sequence; when the first neighboring block that meets the preset condition is scanned, the scanning stops, and the target neighboring block is determined according to that first scanned neighboring block that meets the preset condition.
  • the preset condition is defined as the reference image of the neighboring block being the same as the reference image of the current image block.
  • the following description takes this definition of the preset condition as an example.
  • the first neighboring block that meets the preset condition is taken as the target neighboring block.
  • the method further includes: performing scaling processing on the motion vector of a specific neighboring block among the M neighboring blocks, and encoding/decoding the current image block according to the scaled motion vector.
  • the reference block of the current image block is determined according to the motion vector after the scaling process and the reference image of the current image block.
  • the specific neighboring block is the first neighboring block or the last neighboring block obtained in the scanning order among the N neighboring blocks.
  • the specific neighboring block may also be a neighboring block obtained in other scanning orders among the N neighboring blocks.
  • encoding/decoding the current image block according to the scaled motion vector includes: performing scaling processing on the motion vector of the specific neighboring block, so that the reference frame pointed to by the scaled motion vector is the same as the reference image of the current image block; and using the image block pointed to by the scaled motion vector in the reference image of the current image block as the reference block of the current image block.
  • step S720 when a neighboring block that meets a preset condition is not scanned in the N neighboring blocks, the default block is used as a candidate reference block of the current image block.
  • the default block is the image block pointed to by the motion vector (0,0).
  • the process of determining the related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block in step S730 is described below.
  • determining the related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block includes: dividing the current image block into multiple sub-image blocks; and determining, according to the motion vector of the target neighboring block, a related block of each sub-image block in the reference image of the current image block, where the related block of the current image block includes the related blocks of the sub-image blocks.
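The per-sub-block lookup described above can be sketched as follows: each sub-image block is shifted by the target neighboring block's motion vector to find its related block in the reference image. The sketch assumes integer-pixel displacements, an 8×8 sub-block size, and no clipping to picture boundaries or alignment to a storage grid; the function and parameter names are hypothetical.

```python
from typing import Dict, Tuple

Pos = Tuple[int, int]


def related_block_positions(
    cu_x: int, cu_y: int, cu_w: int, cu_h: int,
    target_mv: Tuple[int, int],
    sub: int = 8,
) -> Dict[Pos, Pos]:
    """Map each sub-block's top-left corner to its related block's top-left corner."""
    mvx, mvy = target_mv
    mapping: Dict[Pos, Pos] = {}
    for y in range(cu_y, cu_y + cu_h, sub):
        for x in range(cu_x, cu_x + cu_w, sub):
            # Same shift for every sub-block: the target neighboring block's MV.
            mapping[(x, y)] = (x + mvx, y + mvy)
    return mapping
```

The motion vector stored at each related block is then what supplies the candidate for the corresponding sub-image block.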
  • a related block may be referred to as a collocated block or a correlation block.
  • when the current image block is a CU, a sub-image block obtained by dividing it may be referred to as a sub-CU.
  • the size of the sub-image block and/or the size of the associated block of the sub-image block are fixed to be greater than or equal to 64 pixels.
  • the size of the sub-image block and/or the size of the related block of the sub-image block is fixed to 8×8 pixels.
  • the size of the sub-image block is frame-adaptive, and defaults to 4×4.
  • the size of the sub-image block is set to 8×8.
  • the size of the sub-image block is set to 8×8; otherwise the default value of 4×4 is used.
  • the size of the sub-image block of the current image block is set to 8×8; on the one hand, this matches the storage granularity of motion vectors specified in the video standard VVC, and on the other hand, information about the sub-image block size of the last encoded image block does not need to be stored, which saves storage space.
  • the TMVP operation is not performed, so some redundant operations are skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • the size of the sub-image block and/or the size of the related block of the sub-image block may also be other sizes, for example A×B, where A ≤ 64, B ≤ 64, and A and B are both integer multiples of 4.
  • for example, the size of the sub-image block and/or the size of the related block of the sub-image block is 4×16 pixels or 16×4 pixels.
  • determining the related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block includes: determining, according to the motion vector of the target neighboring block, the related block of the current image block in the reference image of the current image block.
  • step S740 includes: when the reference image of the relevant block is a specific reference image, or the reference image of the current image block is a specific reference image, determining the current according to the motion vector of the processed relevant block and the reference image of the current image block. a candidate reference block of the image block; wherein the motion vector of the processed related block is the same as the motion vector of the relevant block before processing.
  • the motion vector of the processed related block includes: a motion vector obtained by scaling a motion vector of the relevant block according to a scaling value of 1 or skipping a motion vector of the relevant block of the scaling step.
  • step S740 includes: when the reference image of the relevant block is a specific reference image, or the reference image of the current block is a specific reference image, discarding the candidate reference block of the current image block according to the motion vector of the relevant block.
  • step S720 includes: when the motion vector of the specific neighboring block points to a specific reference image, or the reference image of the current image block is a specific reference image, determining a reference block of the current image block according to the motion vector of the processed relevant block and the reference image of the current image block; wherein the motion vector of the processed relevant block is the same as the motion vector of the relevant block before processing.
  • the motion vector of the processed relevant block includes: a motion vector obtained by scaling the motion vector of the relevant block with a scaling value of 1, or the motion vector of the relevant block for which the scaling step is skipped.
  • because N is smaller than M, the number of candidate neighboring blocks scanned in the process of acquiring the target neighboring block of the current image block can be reduced, thereby reducing complexity.
  • the motion vector of one of the N neighboring blocks is scaled so that its reference frame is the same as the co-located frame of the current frame, and this scaled motion vector is then used as the motion vector of the current image block, which can improve the accuracy of the motion vector of the current image block.
  • the current image block is divided into sub-image blocks of size 8 × 8. On the one hand, this matches the storage granularity of motion vectors specified in the video standard VVC; on the other hand, information on the size of the sub-blocks of the previously encoded image block need not be stored, which saves storage space.
  • the TMVP operation is not performed, so some redundant operations can be skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or the relevant block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • a motion vector of a certain image block on another image is used to determine the motion vector of an image block.
  • for convenience of description, the image block is referred to as the first image block, and the image block on the other image that is to be used is referred to as the time domain reference block or the relevant block of the first image block. It can be understood that the first image block and its time domain reference block (or relevant block) are located on different images. Therefore, in determining the motion vector of the first image block using the motion vector of the time domain reference block (or relevant block), it may be necessary to scale the motion vector of the time domain reference block (or relevant block). For convenience of description, the term "relevant block" is used uniformly herein.
  • the motion vector of the relevant block needs to be scaled, and the motion vector of the current image block is then determined according to the scaled motion vector.
  • the scaling factor of the motion vector of the relevant block is determined according to the temporal distance between the reference image pointed to by the motion vector of the relevant block and the image in which the relevant block is located, and the temporal distance between the reference image of the current image block and the image in which the current image block is located.
  • the motion vector of the relevant block is referred to as MV2.
  • the reference frame index of the reference image pointed to by the motion vector MV2 is x.
  • the reference frame index value x is the difference between the sequence number (for example, POC) of the reference image pointed to by MV2 and the sequence number of the image in which the relevant block is located.
  • the reference frame index of the reference image of the first image block is referred to as y.
  • the reference frame index value y is the difference between the sequence number of the reference image of the first image block and the sequence number of the image in which the first image block is located.
  • the scaling factor of the motion vector MV2 is y/x.
  • the product of the motion vector MV2 and y/x may be used as the motion vector of the first image block.
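  • The scaling by y/x described above can be illustrated with a minimal sketch. The function name `scale_motion_vector` and the tuple representation of a motion vector are assumptions made for illustration only; exact rational arithmetic is used here for clarity, whereas a practical codec would typically use fixed-point arithmetic with rounding and clipping.

```python
from fractions import Fraction


def scale_motion_vector(mv2, x, y):
    """Scale MV2 (motion vector of the relevant block) by the factor y/x.

    x: sequence-number difference between the reference image pointed to by
       MV2 and the image containing the relevant block.
    y: sequence-number difference between the reference image of the first
       image block and the image containing the first image block.
    """
    if x == 0:
        raise ValueError("x must be non-zero to form the ratio y/x")
    factor = Fraction(y, x)
    # Each component of MV2 is multiplied by the same factor.
    return (mv2[0] * factor, mv2[1] * factor)


# Hypothetical values: the relevant block references an image 2 pictures
# away, the first image block references an image 1 picture away.
scaled = scale_motion_vector((8, -4), x=2, y=1)
```

  • With these hypothetical inputs the scaling factor is 1/2, so both components of MV2 are halved.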
  • the motion vector of the current image block is determined according to the motion vector of the relevant block, specifically: when the motion vector of the relevant block points to a specific reference image, or the reference image of the current image block is a specific reference image, the motion vector of the current image block is determined according to the motion vector of the processed relevant block, wherein the motion vector of the processed relevant block is the same as the motion vector of the relevant block before processing.
  • the motion vector of the processed relevant block includes: a motion vector obtained by scaling the motion vector of the relevant block with a scaling value of 1, or the motion vector of the relevant block for which the scaling step is skipped.
  • the motion vector of the current image block is determined according to the motion vector of the relevant block, specifically: when the motion vector of the relevant block points to a specific reference image, or the reference image of the current image block is a specific reference image, determining the motion vector of the current image block according to the motion vector of the relevant block is abandoned.
  • the embodiment of the present application further provides a video image processing method, where the method includes the following steps.
  • Step S810 may correspond to step S710 in the above embodiment.
  • S820 Scan at least a part of the neighboring blocks of the M neighboring blocks in sequence, and determine the target neighboring block according to the scan result.
  • a part of the neighboring blocks in the M neighboring blocks are sequentially scanned, and the target neighboring block is determined according to the scanning result.
  • all neighboring blocks in the M neighboring blocks are sequentially scanned, and the target neighboring block is determined according to the scanning result.
  • the current image block is divided into a plurality of sub-image blocks, wherein the size of the sub-image block is fixed to be greater than or equal to 64 pixels.
  • S840 Determine, according to the motion vector of the target neighboring block and the sub-image block, the relevant block of the current image block in the reference image of the current image block.
  • the reference image of the current image block is a reference image that is closest to the image in which the current image block is located.
  • the reference image of the current image block is a reference image preset by the codec end.
  • the reference image of the current image block is a reference image specified in a video parameter set, a sequence header, a sequence parameter set, an image header, an image parameter set, or a slice header.
  • S850 encode/decode the current image block according to the motion vector of the relevant block.
  • the size of the sub-image block of the current image block is fixed to be greater than or equal to 64 pixels, so information on the size of the sub-image blocks of the previously encoded image block need not be stored, which saves storage space.
  • the size of the sub-image block and/or the size of the time domain reference block of the sub-image block is fixed to 8 × 8 pixels.
  • the size of the sub-image block of the current image block is set to 8 × 8. On the one hand, this matches the storage granularity of motion vectors specified in the video standard VVC; on the other hand, information on the size of the sub-image blocks of the previously encoded image block need not be stored, which saves storage space.
  • the TMVP operation is not performed, so some redundant operations can be skipped, which effectively saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or the relevant block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • the size of the sub-image block and/or the size of the relevant block of the sub-image block may also be other sizes, for example A × B, where A ≤ 64, B ≤ 64, and A and B are both integer multiples of 4.
  • for example, the size of the sub-image block and/or the size of the relevant block of the sub-image block is 4 × 16 pixels or 16 × 4 pixels.
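  • The fixed sub-image block size and the size-based TMVP skip described above can be sketched as follows. The function names, the default 8 × 8 size, and the representation of sub-image blocks by their top-left offsets are illustrative assumptions, not part of the described method.

```python
def tmvp_enabled(width, height, min_size=8):
    """Return False when the width or the height is below min_size pixels,
    in which case the TMVP operation is skipped."""
    return width >= min_size and height >= min_size


def split_into_sub_blocks(width, height, sub_w=8, sub_h=8):
    """Return the top-left (x, y) offsets of fixed-size sub-image blocks.

    Because the sub-block size is fixed, no per-block size information has
    to be stored for previously coded blocks.
    """
    return [(x, y)
            for y in range(0, height, sub_h)
            for x in range(0, width, sub_w)]


# Hypothetical 16x16 current image block split into 8x8 sub-image blocks.
offsets = split_into_sub_blocks(16, 16)
```

  • For a hypothetical 16 × 16 block this yields four sub-image blocks, while a 4 × 16 block would have the TMVP operation disabled.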
  • step S820 includes: sequentially scanning at least a portion of the neighboring blocks, and when scanning the first neighboring block that meets the preset condition, stopping scanning, and according to the scanned first neighboring block that meets the preset condition Determine the target neighboring block.
  • the first neighboring block that meets the preset condition is taken as the target neighboring block.
  • the preset condition is defined as: the reference image of the neighboring block is the same as the reference image of the current image block.
  • step S840 includes: determining, according to the motion vector of the target neighboring block and the sub-image block, the relevant block of the sub-image block in the reference image of the current image block, wherein the relevant block of the current image block includes the relevant block of the sub-image block.
  • FIG. 9 is a schematic block diagram of a video image processing apparatus 900 according to an embodiment of the present application.
  • the apparatus 900 is for performing the method embodiment as shown in FIG.
  • the device 900 includes the following units.
  • the obtaining unit 910 is configured to acquire M neighboring blocks of the current image block.
  • a determining unit 920 configured to sequentially scan N neighboring blocks in the M neighboring blocks, and determine a target neighboring block according to the scan result, where N is less than M;
  • the determining unit 920 is further configured to determine, according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block, the relevant block of the current image block;
  • the encoding/decoding unit 930 is configured to encode/decode the current image block according to the motion vector of the relevant block.
  • N is smaller than M
  • the number of scans of the candidate neighboring blocks in the process of acquiring the target neighboring block of the current image block is reduced, thereby reducing complexity.
  • M is equal to 4 and N is less than 4.
  • N is equal to 1 or 2.
  • the determining unit 920 is configured to sequentially scan the first N neighboring blocks in the M neighboring blocks.
  • the acquiring unit 910 is configured to sequentially acquire M neighboring blocks of the current image block in a preset order; the first N neighboring blocks refer to the N neighboring blocks that are determined first in the preset order.
  • the determining unit 920 is configured to sequentially scan the N neighboring blocks, stop scanning when the first neighboring block that meets the preset condition is scanned, and determine the target neighboring block according to the scanned first neighboring block that meets the preset condition.
  • the determining unit 920 is configured to use the first neighboring block that meets the preset condition as the target neighboring block.
  • the preset condition includes: the reference image of the neighboring block is the same as the reference image of the current image block.
  • the encoding/decoding unit 930 is configured to determine a reference block of the current image block according to the motion vector of the relevant block and the reference image.
  • the encoding/decoding unit 930 is configured to construct a candidate block list of the current image block, where the candidate blocks in the candidate block list include the M neighboring blocks and the relevant block, and to encode/decode the current image block according to the reference block of a candidate block in the candidate block list.
  • the encoding/decoding unit 930 is further configured to: when no neighboring block that meets the preset condition is found among the N neighboring blocks, scale the motion vector of a specific neighboring block in the M neighboring blocks, and encode/decode the current image block according to the scaled motion vector.
  • the encoding/decoding unit 930 is configured to determine a reference block of the current image block according to the motion vector after the scaling process and the reference image of the current image block.
  • the specific neighboring block is the first neighboring block or the last neighboring block obtained in the scanning order among the N neighboring blocks.
  • the encoding/decoding unit 930 is configured to scale the motion vector of the specific neighboring block so that the reference frame pointed to by the scaled motion vector is the same as the reference image of the current image block, and to use the image block pointed to by the scaled motion vector in the reference image of the current image block as the reference block of the current image block.
  • the determining unit 920 is configured to: when a neighboring block that meets a preset condition is not scanned in the N neighboring blocks, use the default block as a reference block of the current image block.
  • the default block is an image block pointed to by a motion vector (0, 0).
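  • The scan over the first N of the M neighboring blocks, with early termination and the default (0, 0) fallback, can be sketched as follows. The function name, the dictionary keys "mv" and "ref_image", and the use of None for an unavailable neighboring block are assumptions for illustration; only the default-block fallback is shown, not the alternative of scaling the motion vector of a specific neighboring block.

```python
def pick_locating_vector(neighbors, current_ref_image, n):
    """Scan only the first n of the M neighboring blocks (n < M).

    neighbors: list of M entries in the preset order; each entry is None
    (unavailable) or a dict with hypothetical keys "mv" and "ref_image".
    Returns the target neighboring block's motion vector, or (0, 0) when
    no scanned block meets the preset condition.
    """
    if not 0 < n < len(neighbors):
        raise ValueError("n must satisfy 0 < n < M")
    for nb in neighbors[:n]:
        # Preset condition: same reference image as the current image block.
        if nb is not None and nb["ref_image"] == current_ref_image:
            return nb["mv"]  # stop scanning at the first match
    return (0, 0)  # default block: pointed to by motion vector (0, 0)


# Hypothetical M = 4 neighboring blocks; only the first N = 2 are scanned.
neighbors = [
    {"mv": (3, -1), "ref_image": 5},
    {"mv": (2, 2), "ref_image": 7},
    {"mv": (9, 9), "ref_image": 7},
    None,
]
vector = pick_locating_vector(neighbors, current_ref_image=7, n=2)
```

  • Limiting the loop to `neighbors[:n]` bounds the worst-case number of scans at N, which is the complexity reduction the text attributes to N being smaller than M.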
  • the determining unit 920 is configured to:
  • a correlation block of the sub-image block is determined in the reference image of the current image block according to the motion vector of the target neighboring block, and the relevant block of the current image block includes the relevant block of the sub-image block.
  • the size of the sub-image block and/or the size of the relevant block of the sub-image block are fixed to be greater than or equal to 64 pixels.
  • the TMVP operation is set not to be performed, so some redundant operations can be skipped, which saves encoding and decoding time and improves coding efficiency.
  • in the case where the width and/or height of the current CU is less than 8 pixels, the TMVP operation is not performed. In other words, when at least one of the width and the height of the sub-image block and/or the relevant block of the sub-image block is less than 8 pixels, the TMVP operation is not performed. In this situation, the performance impact of skipping the TMVP operation is negligible, so skipping it effectively saves encoding and decoding time and improves coding efficiency.
  • the current image block is a coding unit CU.
  • the determining unit 920 is configured to determine, according to the motion vector of the target neighboring block, a related block of the current image block in the reference image of the current image block.
  • the neighboring block is an image block adjacent to the position of the current image block or having a certain positional spacing on the current image.
  • the encoding/decoding unit 930 is configured to: when the reference image of the relevant block is a specific reference image, or the reference image of the current image block is a specific reference image, according to the motion vector of the processed related block. And a reference image of the current image block determines a reference block of the current image block;
  • the motion vector of the processed related block is the same as the motion vector of the relevant block before processing.
  • the motion vector of the processed relevant block includes: a motion vector obtained by scaling the motion vector of the relevant block with a scaling value of 1, or the motion vector of the relevant block for which the scaling step is skipped.
  • the encoding/decoding unit 930 is configured to: when the reference image of the relevant block is a specific reference image, or the reference image of the current image block is a specific reference image, abandon the determination of the reference block of the current image block according to the motion vector of the relevant block.
  • the determining unit 920 is configured to: when the motion vector of the specific neighboring block points to a specific reference image, or the reference image of the current image block is a specific reference image, determine a reference block of the current image block according to the motion vector of the processed relevant block and the reference image of the current image block; wherein the motion vector of the processed relevant block is the same as the motion vector of the relevant block before processing.
  • the motion vector of the processed relevant block includes: a motion vector obtained by scaling the motion vector of the relevant block with a scaling value of 1, or the motion vector of the relevant block for which the scaling step is skipped.
  • the obtaining unit 910, the determining unit 920, and the encoding/decoding unit 930 in this embodiment may all be implemented by a processor.
  • the embodiment of the present application further provides a video image processing apparatus 1000.
  • the apparatus 1000 is for performing the method embodiment as shown in FIG.
  • the device 1000 includes the following units.
  • the acquiring unit 1010 is configured to acquire M neighboring blocks of the current image block.
  • the determining unit 1020 is configured to sequentially scan at least a part of the neighboring blocks of the M neighboring blocks, and determine the target neighboring block according to the scan result;
  • a dividing unit 1030 configured to divide the current image block into a plurality of sub-image blocks, wherein the size of the sub-image block is fixed to be greater than or equal to 64 pixels;
  • the determining unit 1020 is further configured to determine, according to the motion vector of the target neighboring block and the sub-image block, the relevant block of the current image block in the reference image of the current image block;
  • the encoding/decoding unit 1040 is configured to encode/decode the current image block according to the motion vector of the relevant block.
  • the size of the sub-image block of the current image block is fixed to be greater than or equal to 64 pixels, and information of the size of the sub-image block of the previous encoded image block is not required, and thus, the storage space can be saved.
  • the size of the sub-image block and/or the size of the time domain reference block of the sub-image block is fixed to 8 × 8 pixels.
  • the size of the sub-image block of the current image block is set to 8 × 8. On the one hand, this matches the storage granularity of motion vectors specified in the video standard VVC; on the other hand, information on the size of the sub-image blocks of the previously encoded image block need not be stored, which saves storage space.
  • the TMVP operation is set not to be performed, so some redundant operations can be skipped, which saves encoding and decoding time and improves coding efficiency.
  • in a case where at least one of the width and the height of the current CU is less than 8, the TMVP operation is set not to be performed; the performance impact of skipping the TMVP operation is negligible, so encoding and decoding time is effectively saved and coding efficiency is improved.
  • the size of the sub-image block and/or the size of the relevant block of the sub-image block may also be other sizes, for example A × B, where A ≤ 64, B ≤ 64, and A and B are both integer multiples of 4.
  • for example, the size of the sub-image block and/or the size of the relevant block of the sub-image block is 4 × 16 pixels or 16 × 4 pixels.
  • sequentially scanning at least a part of the M neighboring blocks and determining the target neighboring block according to the scan result includes: sequentially scanning the at least a part of the neighboring blocks, stopping the scan when the first neighboring block that meets the preset condition is scanned, and determining the target neighboring block according to the scanned first neighboring block that meets the preset condition.
  • the determining unit 1020 is configured to use the first neighboring block that meets the preset condition as the target neighboring block.
  • the preset condition includes: the reference image of the neighboring block is the same as the reference image of the current image block.
  • the determining unit 1020 is configured to determine, according to the motion vector of the target neighboring block and the sub-image block, the relevant block of the sub-image block in the reference image of the current image block, wherein the relevant block of the current image block includes the relevant block of the sub-image block.
  • the obtaining unit 1010, the determining unit 1020, the dividing unit 1030, and the encoding/decoding unit 1040 in this embodiment may all be implemented by a processor.
  • the embodiment of the present application further provides a video image processing apparatus 1100.
  • Apparatus 1100 can be used to perform the method embodiments described above.
  • the apparatus 1100 includes a memory 1120 for storing instructions and a processor 1110 for executing the instructions stored in the memory 1120; execution of the instructions stored in the memory 1120 causes the processor 1110 to perform the method according to the above method embodiments.
  • the device 1100 may further include a communication interface 1130 for communicating with an external device.
  • the processor 1110 is configured to control the communication interface 1130 to receive and/or transmit signals.
  • the apparatus 500, 600, 900, 1000, and 1100 provided by the present application can be applied to an encoder as well as to a decoder.
  • the motion vector second candidate list is explained above, and the motion vector first candidate list will be explained below.
  • an affine motion compensation model can be introduced in the codec technology.
  • Affine transformation motion compensation describes the affine motion field of an image block through a set of MVs of control points.
  • the affine transform motion compensation model uses a four-parameter Affine model, and the set of control points includes two control points (such as the upper left corner and the upper right corner of the image block).
  • the affine transform motion compensation model uses a six-parameter Affine model, and the set of control points includes three control points (eg, an upper left corner point, an upper right corner point, and a lower left corner point of the image block).
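  • As a sketch of how a set of control point MVs describes the affine motion field, the following uses the commonly cited four-parameter and six-parameter affine formulations. The function name `affine_mv`, the floating-point arithmetic, and the coordinate convention (origin at the upper left corner of a block of width w and height h) are assumptions for illustration; practical codecs evaluate these expressions per sub-block in fixed point.

```python
def affine_mv(x, y, w, h, cpmvs):
    """Motion vector at position (x, y) inside a w x h block.

    cpmvs: control point MVs in the order upper left, upper right
    (four-parameter model) and optionally lower left (six-parameter model).
    """
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    a = (v1x - v0x) / w  # horizontal gradient of the x component
    b = (v1y - v0y) / w  # horizontal gradient of the y component
    if len(cpmvs) == 2:
        # Four-parameter model: the vertical gradients follow from the
        # horizontal ones (rotation and zoom only).
        c, d = -b, a
    else:
        v2x, v2y = cpmvs[2]
        c = (v2x - v0x) / h  # vertical gradient of the x component
        d = (v2y - v0y) / h  # vertical gradient of the y component
    return (a * x + c * y + v0x, b * x + d * y + v0y)


# Hypothetical control point MVs for a 16x16 block.
mv_upper_right = affine_mv(16, 0, 16, 16, [(1, 2), (3, 2)])
mv_lower_left = affine_mv(0, 16, 16, 16, [(0, 0), (4, 0), (0, 8)])
```

  • Evaluating the field at a control point's own position returns that control point's MV, which is a quick sanity check on the formulation.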
  • the added candidate when constructing the first candidate list of motion vectors, may be an MV of a set of control points, or a Control Point Motion Vector Prediction (CPMVP).
  • the motion vector first candidate list may be used in the Merge mode. Specifically, it may be referred to as an Affine Merge mode.
  • the motion vector first candidate list may be referred to as an affine merge candidate list.
  • the prediction in the motion vector first candidate list is directly used as the CPMV (Control Point Motion Vector) of the current image block, that is, the affine motion estimation process is not required.
  • candidates determined according to the ATMVP technique may be added to the motion vector first candidate list.
  • control point motion vector group of the relevant block of the current image block is added as a candidate to the motion vector first candidate list.
  • when this candidate in the motion vector first candidate list is used for prediction, the current image block is predicted according to the control point motion vector group of the relevant block of the current image block.
  • the representative motion vector of the relevant block of the current image block is added as a candidate to the motion vector first candidate list as described above.
  • the candidate is also marked as determined according to the ATMVP technique.
  • the relevant block of the current image block is determined according to the marker and the candidate; the current image block and the relevant block are divided into multiple sub-image blocks in the same manner, so that each sub-image block in the current image block corresponds one-to-one to a sub-image block in the relevant block; and the motion vector of each sub-image block in the current image block is predicted according to the motion vector of the corresponding sub-image block in the relevant block.
  • when a sub-image block whose motion vector is not available appears in the relevant block, the representative motion vector of the relevant block is used in place of the unavailable motion vector to predict the corresponding sub-image block in the current image block.
  • adding the candidate determined according to the ATMVP technique to the motion vector second candidate list is abandoned.
  • when a sub-image block in the relevant block is not available, or a sub-image block in the relevant block uses an intra coding mode, it is determined that a sub-image block whose motion vector is not available appears in the relevant block.
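  • The one-to-one sub-image block prediction with the representative-motion-vector substitution can be sketched as follows. The function name and the use of None to mark a sub-image block whose motion vector is not available (for example, an intra-coded sub-image block) are assumptions for illustration.

```python
def predict_sub_block_mvs(relevant_sub_mvs, representative_mv):
    """Predict one MV per sub-image block of the current image block.

    relevant_sub_mvs: MVs of the sub-image blocks of the relevant block,
    listed in the same order as the sub-image blocks of the current image
    block; None marks a sub-image block whose MV is not available.
    representative_mv: representative MV of the relevant block, used in
    place of any unavailable MV.
    """
    return [mv if mv is not None else representative_mv
            for mv in relevant_sub_mvs]


# Hypothetical relevant block with four sub-image blocks, one unavailable.
predicted = predict_sub_block_mvs([(1, 0), None, (2, 2), (0, -1)],
                                  representative_mv=(1, 1))
```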
  • each candidate in the motion vector first candidate list includes the motion vectors of a group of control points; when the representative motion vector of the relevant block of the current image block is added to the motion vector first candidate list, to keep the data format consistent, the representative motion vector of the relevant block may be inserted as the motion vector of each control point in the candidate (that is, the motion vector of each control point in the candidate is assigned the representative motion vector of the relevant block).
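  • A minimal sketch of this format-consistency step follows. The dictionary layout, the key names "cpmvs" and "is_atmvp", and the function name are hypothetical; the flag stands in for the marker indicating that the candidate was determined according to the ATMVP technique.

```python
def make_atmvp_candidate(representative_mv, num_control_points=2):
    """Wrap a representative MV as a control-point-style candidate.

    Every control point slot receives the same representative MV so the
    entry has the same shape as the other candidates in the list.
    """
    if num_control_points not in (2, 3):
        raise ValueError("expected 2 or 3 control points")
    return {
        "cpmvs": [representative_mv] * num_control_points,
        "is_atmvp": True,  # marker: candidate determined by ATMVP
    }


candidate = make_atmvp_candidate((5, -3), num_control_points=3)
```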
  • the representative motion vector of the relevant block of the current image block may refer to a motion vector of a center position of the relevant block, or other motion vector representing the related block, which is not limited herein.
  • the method for determining the relevant block of the current image block includes two methods:
  • Method 1 sequentially scan N neighboring blocks in the preset M neighboring blocks of the current image block, and determine a target neighboring block according to the scan result, where N is less than M, and M is less than or equal to 4; according to the motion vector of the target neighboring block And the current image block and the reference image of the current image block, and the relevant block of the current image block is determined.
  • Method 2: determine M neighboring blocks of the current image block according to the M candidates in the motion vector second candidate list of the current image block; sequentially scan N neighboring blocks in the M neighboring blocks, and determine the target neighboring block according to the scan result, where N is smaller than M, and M is less than or equal to 4; and determine the relevant block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block.
  • the M candidates in the motion vector second candidate list may refer to M neighboring blocks of the current image block.
  • the method for determining a candidate to add to the motion vector first candidate list includes: determining, from the neighboring blocks of the current image block in a specific scanning order, the control point motion vector groups of neighboring blocks predicted using the affine transformation mode; and adding the control point motion vector group of each determined neighboring block as a candidate to the motion vector first candidate list.
  • a neighboring block predicted using the affine transformation mode is one whose motion vector is determined according to a candidate in the affine merge candidate list. That is, the candidate comes from the affine motion model of a spatial neighboring block of the current image block that uses the affine mode; in other words, the CPMV of the spatial neighboring block using the affine mode is taken as the CPMVP of the current block.
  • the control point motion vector group may include the motion vectors of two control points of the neighboring block (e.g., the upper left corner point and the upper right corner point of the neighboring block), or the motion vectors of three control points of the neighboring block (e.g., the upper left, upper right, and lower left corner points of the neighboring block), depending on whether the four-parameter Affine model or the six-parameter Affine model is used.
  • determining, in a specific scanning order, the control point motion vector group of a neighboring block predicted using the affine transformation mode includes:
  • FIG. 12 is a schematic diagram of a candidate for acquiring a motion vector first candidate list by a neighboring block of a current image block.
  • On the left side of the current image block, scanning is performed sequentially in the order image block A -> image block D -> image block E, and the control point motion vector group of the first image block satisfying the preset condition is added as a candidate to the motion vector first candidate list. On the upper side of the current image block, scanning is performed sequentially in the order image block B -> image block C, and the control point motion vector group of the first image block satisfying the preset condition is added as a candidate to the motion vector first candidate list.
  • the candidate is determined to be determined in the scanning sequence.
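  • The two scanning orders described above can be sketched as follows. The function name, the dictionary keys "affine" and "cpmvs", and the choice of "predicted using the affine mode" as the preset condition are assumptions for illustration; the block labels A to E follow FIG. 12.

```python
LEFT_ORDER = ("A", "D", "E")  # scanning order on the left side
UPPER_ORDER = ("B", "C")      # scanning order on the upper side


def inherited_affine_candidates(neighbors):
    """Collect at most one candidate per side of the current image block.

    neighbors: dict mapping a block label to None (unavailable) or a dict
    with hypothetical keys "affine" (bool) and "cpmvs" (control point MVs).
    """
    candidates = []
    for order in (LEFT_ORDER, UPPER_ORDER):
        for label in order:
            nb = neighbors.get(label)
            # Assumed preset condition: the block uses the affine mode.
            if nb is not None and nb["affine"]:
                candidates.append(nb["cpmvs"])
                break  # only the first qualifying block on this side
    return candidates


neighbors = {
    "A": {"affine": False, "cpmvs": None},
    "D": {"affine": True, "cpmvs": [(1, 0), (2, 0)]},
    "E": {"affine": True, "cpmvs": [(9, 9), (9, 9)]},
    "B": None,
    "C": {"affine": True, "cpmvs": [(0, 1), (0, 2)]},
}
found = inherited_affine_candidates(neighbors)
```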
  • a method for determining candidates added to a motion vector first candidate list includes:
  • the motion vector first candidate list it is added to the motion vector first candidate list by constructing candidates.
  • a preset value for example, 5
  • a constructed candidate is obtained by combining the motion information of the neighboring blocks of some of the control points of the current image block, and is added as a CPMVP to the motion vector first candidate list.
  • FIG. 13 is a schematic diagram of constructing candidates for the motion vector first candidate list from neighboring blocks of the current image block.
  • the current image block has four control points, which are CP1, CP2, CP3, and CP4.
  • image blocks A2, B2 and B3 are spatial neighboring blocks of CP1;
  • image blocks B0 and B1 are spatial neighboring blocks of CP2;
  • image blocks A0 and A1 are spatial neighboring blocks of CP3, and T is a temporal neighboring block of CP4.
  • the coordinates of the control points CP1, CP2, CP3 and CP4 are (0, 0), (W, 0), (0, H) and (W, H), respectively, where W and H represent the width and height of the current CU, respectively.
  • the acquisition priority of the adjacent block motion information of each control point is:
  • for CP1, the acquisition priority is B2->B3->A2.
  • when B2 is available, the MV of B2 is used as the MV of the control point CP1;
  • when B2 is not available and B3 is available, the MV of B3 is used as the MV of the control point CP1;
  • when B2 and B3 are not available and A2 is available, the MV of A2 is used as the MV of the control point CP1;
  • when B2, B3 and A2 are all not available, the motion information of the control point CP1 is not available.
  • for CP2, the acquisition priority is: B1->B0; for CP3, the acquisition priority is: A1->A0; for CP4, the MV of T is directly used as the MV of control point CP4.
  • two or more of the MVs of the four control points can be combined to obtain one or more candidates, and two of the combinations are selected: {CP1, CP2} and {CP1, CP3}.
  • the combination {CP1, CP3} requires converting the MVs of the two selected control points, according to the four-parameter model, into the MVs of the upper-left and upper-right control points of the current CU (CP1 and CP2).
  • one or more candidates can be obtained by combining three of the MVs of the four control points, and four combinations are selected: {CP1, CP2, CP4}, {CP1, CP2, CP3}, {CP2, CP3, CP4}, {CP1, CP3, CP4}.
  • the combinations {CP1, CP2, CP4}, {CP2, CP3, CP4} and {CP1, CP3, CP4} require converting the MVs of the three selected control points, according to the six-parameter model, into the MVs of the upper-left, upper-right and lower-left control points of the current CU (CP1, CP2 and CP3).
  • the candidate generated by the combined construct is considered to be unavailable.
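  • As a rough illustration of the constructed candidates, the sketch below selects a control-point MV by acquisition priority (CP1: B2->B3->A2, CP2: B1->B0, CP3: A1->A0, CP4: T) and forms the two-point combinations. The function names and the dictionary layout are hypothetical. The conversion of {CP1, CP3} assumes the common four-parameter form mv(x, y) = (a*x - b*y + mv1.x, b*x + a*y + mv1.y); the text names the four-parameter model but does not spell out this formula here, so treat it as an assumption. Checks such as whether the combined MVs point to the same reference frame are omitted.

```python
from typing import Dict, List, Optional, Tuple

MV = Tuple[float, float]

# Acquisition priority of neighboring blocks for each control point.
PRIORITY = {
    "CP1": ["B2", "B3", "A2"],
    "CP2": ["B1", "B0"],
    "CP3": ["A1", "A0"],
    "CP4": ["T"],
}


def control_point_mv(cp: str, neighbor_mvs: Dict[str, Optional[MV]]) -> Optional[MV]:
    """Return the MV of the first available neighbor in priority order, else None."""
    for name in PRIORITY[cp]:
        mv = neighbor_mvs.get(name)
        if mv is not None:
            return mv
    return None


def cp2_from_cp1_cp3(mv1: MV, mv3: MV, w: int, h: int) -> MV:
    """Derive the MV at CP2 = (W, 0) from CP1 = (0, 0) and CP3 = (0, H).

    Assumed model: mv(x, y) = (a*x - b*y + mv1.x, b*x + a*y + mv1.y).
    """
    a = (mv3[1] - mv1[1]) / h
    b = (mv1[0] - mv3[0]) / h
    return (mv1[0] + a * w, mv1[1] + b * w)


def constructed_pairs(neighbor_mvs: Dict[str, Optional[MV]], w: int, h: int) -> List[Tuple[MV, MV]]:
    """Build the {CP1, CP2} and {CP1, CP3} candidates as (upper-left, upper-right) MV pairs."""
    cp = {name: control_point_mv(name, neighbor_mvs) for name in PRIORITY}
    out = []
    if cp["CP1"] is not None and cp["CP2"] is not None:
        out.append((cp["CP1"], cp["CP2"]))
    if cp["CP1"] is not None and cp["CP3"] is not None:
        out.append((cp["CP1"], cp2_from_cp1_cp3(cp["CP1"], cp["CP3"], w, h)))
    return out
```

  • For example, with B2 unavailable, B3 = (0, 0) and A1 = (-2, 4) on a 16 x 8 block, CP1 takes the MV of B3, CP3 takes the MV of A1, and the {CP1, CP3} combination is converted to the pair ((0, 0), (8, 4)).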
  • the method of determining candidates to join the motion vector first candidate list includes populating using a default vector.
  • the default vector can be a zero vector or other vector.
  • when determining the candidates to be added to the motion vector first candidate list, determine whether the number of candidates currently added to the first candidate list has reached a preset value; if not, fill the first candidate list with the default vector until the number of candidates in the first candidate list reaches the preset value.
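  • The filling step can be sketched in a few lines. The function name is hypothetical; the preset value is passed in as a parameter (the text mentions 5 as an example), and the zero vector is used as the default, which is one of the options the text allows.

```python
def pad_candidate_list(candidates, preset_count, default=(0, 0)):
    """Append the default vector until the list holds `preset_count` candidates."""
    padded = list(candidates)          # leave the caller's list untouched
    while len(padded) < preset_count:
        padded.append(default)
    return padded
```

  • A list that already holds the preset number of candidates is returned unchanged.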
  • a motion vector of each sub-image block in the current image block is derived from the adopted candidate according to the affine motion model.
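  • To make the derivation concrete, the sketch below computes one MV per sub-image block from a control point motion vector group. It assumes the usual affine parameterization with control points at (0, 0), (W, 0) and, for the six-parameter case, (0, H). The 4 x 4 sub-block size and the evaluation of the model at the sub-block center are illustrative assumptions, as is the function name.

```python
def affine_subblock_mvs(cpmvs, w, h, sub=4):
    """Return {(x, y): (mvx, mvy)} for each sub-block whose top-left corner is (x, y).

    cpmvs holds 2 MVs (four-parameter model) or 3 MVs (six-parameter model).
    """
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    dxx = (v1x - v0x) / w              # change of mvx per unit x
    dyx = (v1y - v0y) / w              # change of mvy per unit x
    if len(cpmvs) == 3:                # six-parameter: vertical gradient from CP3
        v2x, v2y = cpmvs[2]
        dxy = (v2x - v0x) / h
        dyy = (v2y - v0y) / h
    else:                              # four-parameter: rotation/zoom constraint
        dxy = -dyx
        dyy = dxx
    mvs = {}
    for y in range(0, h, sub):
        for x in range(0, w, sub):
            cx, cy = x + sub / 2, y + sub / 2   # sub-block center
            mvs[(x, y)] = (v0x + dxx * cx + dxy * cy,
                           v0y + dyx * cx + dyy * cy)
    return mvs
```

  • When all control point MVs are equal, every sub-block receives that same MV, so the model degenerates to ordinary translational prediction.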
  • when the adopted candidate is a candidate determined by using the ATMVP technology, the reference block of each sub-image block in the current image block is determined according to the motion vector of the corresponding sub-image block in the related block, as described above; the reference blocks of the sub-image blocks are then spliced into the reference block of the current image block, and the residual of the current image block is calculated according to that reference block.
  • a video image processing method provided by an embodiment of the present application is described below by way of example with reference to FIG. 14 and FIG. As shown in FIG. 14, the method includes the following steps.
  • S1410 sequentially scan N neighboring blocks in the preset M neighboring blocks of the current image block, and determine a target neighboring block according to the scan result, where N is smaller than M.
  • M is less than or equal to 4.
  • S1420 Determine, according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block, a related block of the current image block.
  • S1430 The current image block and the related block are divided into a plurality of sub-image blocks in the same manner, and each sub-image block in the current image block is in one-to-one correspondence with each sub-image block in the related block.
  • S1440 Perform prediction on a corresponding sub-image block in the current image block according to motion vectors of each sub-image block in the relevant block.
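  • Steps S1410 to S1440 can be sketched as below. Everything named here is hypothetical: neighbors are plain dictionaries with an `mv` and a `ref` field, the preset condition is taken to be "same reference image as the current image block" (one of the conditions the description lists), the related block is located by displacing the current block position by the target neighbor's MV, and `mv_field(x, y)` stands in for whatever storage holds the motion vectors of the reference image. The 8 x 8 sub-block size is one of the fixed sizes the description mentions.

```python
def find_target_neighbor(neighbors, n, cur_ref):
    """S1410: scan only the first n of the M neighbors; return the first match or None."""
    for nb in neighbors[:n]:
        if nb is not None and nb["ref"] == cur_ref:
            return nb
    return None


def atmvp_sub_block_mvs(cur_pos, cur_size, neighbors, n, cur_ref, mv_field, sub=8):
    """Return {(dx, dy): mv} for each sub-block of the current block, or None."""
    target = find_target_neighbor(neighbors, n, cur_ref)
    if target is None:
        return None                     # the scaling fallback is not covered here
    # S1420: locate the related block in the reference image.
    rx = cur_pos[0] + target["mv"][0]
    ry = cur_pos[1] + target["mv"][1]
    w, h = cur_size
    result = {}
    # S1430: split both blocks identically; S1440: reuse the related sub-block's MV.
    for dy in range(0, h, sub):
        for dx in range(0, w, sub):
            result[(dx, dy)] = mv_field(rx + dx, ry + dy)
    return result
```

  • Scanning only N < M neighbors bounds the number of condition checks per block, which is the complexity saving this embodiment is aiming for.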
  • the method includes the following steps.
  • S1520 Scan N consecutive neighboring blocks in the M neighboring blocks, and determine a target neighboring block according to the scan result, where N is smaller than M.
  • M is less than or equal to 4.
  • the specific candidate may be the candidate determined according to the ATMVP technology mentioned above.
  • S1560 Perform prediction on the corresponding sub-image block in the current image block according to the motion vector of each sub-image block in the relevant block.
  • FIG. 16 is a schematic block diagram of a video image processing apparatus 1600 according to an embodiment of the present application.
  • the apparatus 1600 is for performing the method embodiment as shown in FIG.
  • the device 1600 includes the following units.
  • the construction module 1610 is configured to: sequentially scan N neighboring blocks among the preset M neighboring blocks of the current image block, and determine a target neighboring block according to the scan result, where N is smaller than M; determine a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block; and divide the current image block and the related block into a plurality of sub-image blocks in the same manner, where each sub-image block in the current image block is in one-to-one correspondence with a sub-image block in the related block;
  • the prediction module 1620 is configured to predict each sub-image block in the current image block according to the motion vector of the corresponding sub-image block in the related block.
  • N is equal to 1 or 2.
  • the prediction module is further configured to: before predicting the corresponding sub-image blocks in the current image block according to the motion vectors of the sub-image blocks in the related block, add a representative motion vector of the related block as a candidate to the motion vector first candidate list;
  • when it is determined that the candidate is adopted, the prediction module predicts the corresponding sub-image blocks in the current image block according to the motion vectors of the sub-image blocks in the related block.
  • the predicting the corresponding sub-image block in the current image block according to the motion vector of each sub-image block in the correlation block includes:
  • the motion vectors of the sub-image blocks in the correlation block are respectively used as motion vectors of the corresponding sub-image blocks in the current image block.
  • the representative motion vector of the correlation block is added as the first candidate to the motion vector first candidate list.
  • the representative motion vector of the correlation block includes a motion vector of a center position of the correlation block.
  • the prediction module is further configured to: when a sub-image block whose motion vector is unavailable appears in the related block, use the representative motion vector of the related block as the motion vector of that sub-image block, and predict the corresponding sub-image block in the current image block.
  • the prediction module is further configured to: when a sub-image block whose motion vector is unavailable appears in the related block and the representative motion vector of the related block is also unavailable, abandon predicting the corresponding sub-image blocks in the current image block according to the motion vectors of the sub-image blocks in the related block.
  • the prediction module is further configured to: when a sub-image block in the related block is not available, or a sub-image block in the related block adopts an intra coding mode, determine that a sub-image block whose motion vector is unavailable appears in the related block.
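  • The three rules above amount to a small fallback procedure. In the sketch below, which uses hypothetical names, `None` marks a sub-image block whose motion vector is unavailable (the sub-image block itself is unavailable or is intra coded), and a return value of `None` means that sub-block-based prediction from the related block is abandoned.

```python
def inherit_sub_block_mvs(related_sub_mvs, representative_mv):
    """Map the related block's sub-block MVs onto the current block's sub-blocks.

    related_sub_mvs: list of MVs, with None where the MV is unavailable.
    representative_mv: e.g. the MV at the center of the related block, or None.
    """
    if any(mv is None for mv in related_sub_mvs):
        if representative_mv is None:
            return None                 # abandon this prediction path
        # Substitute the representative MV wherever an MV is missing.
        return [representative_mv if mv is None else mv for mv in related_sub_mvs]
    return list(related_sub_mvs)
```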
  • the construction module is further configured to: determine other candidates and add the other candidates to the motion vector first candidate list, wherein at least one of the other candidates includes a motion vector of a sub-image block.
  • the construction module is further configured to: when it is determined that one of the other candidates is adopted, determine a motion vector of the sub-image block in the current image block according to the adopted candidate.
  • the at least one candidate includes a set of motion vectors for the control points.
  • the prediction module is also used to:
  • the adopted candidate is subjected to affine transformation according to an affine transformation model
  • each of the at least one candidate includes a motion vector of two control points
  • each of the at least one candidate includes a motion vector of three control points.
  • the constructing module is further configured to: determine, from a neighboring block of the current image block, a control point motion vector group of the neighboring block predicted by using the affine transform mode in a specific scan order;
  • the control point motion vector group of each determined neighboring block is added as one candidate to the motion vector first candidate list.
  • determining, from a neighboring block of the current image block, a control point motion vector group of a neighboring block that is predicted by using an affine transformation mode, in a specific scan order includes:
  • the building module is further configured to: construct a motion vector of the partial control point according to a neighboring block of the partial control point of the current image block;
  • the constructing the motion vector of the partial control point according to the neighboring block of the partial control point of the current image block includes:
  • a specific neighboring block of the control point is sequentially scanned in a third scanning order, and a motion vector of a specific neighboring block that satisfies a preset condition is used as a motion vector of the control point.
  • the construction module is further configured to:
  • abandon adding the motion vectors of the partial control points of the current image block to the motion vector first candidate list.
  • the build module is also used to:
  • the motion vector of the current image block is determined according to the motion vector of the candidate.
  • determining the motion vector of the current image block according to the motion vector of the candidate includes:
  • the candidate confirmed for adoption is used as the motion vector of the current image block, or the candidate confirmed for adoption is scaled and then used as the motion vector of the current image block.
  • the constructing a second candidate list of motion vectors includes:
  • a candidate joining the motion vector second candidate list is determined according to a motion vector of a plurality of neighboring blocks of the current image block on the current image.
  • the plurality of neighboring blocks of the current image block on the current image include the preset M neighboring blocks.
  • the build module is also used to:
  • the N neighboring blocks refer to N neighboring blocks that are first determined in the predetermined order.
  • the build module is also used to:
  • the N neighboring blocks in the M neighboring blocks are sequentially scanned, and the target neighboring blocks are determined according to the scan result, including:
  • the determining the target neighboring block according to the scanned neighboring block that meets the preset condition including:
  • the first neighboring block that meets the preset condition is used as the target neighboring block.
  • the preset conditions include:
  • the reference image of the neighboring block is the same as the reference image of the current image block.
  • the construction module is further configured to: when no neighboring block that meets the preset condition is found among the N neighboring blocks, perform scaling processing on a motion vector of a specific neighboring block among the M neighboring blocks
  • the prediction module is further configured to predict the current image block according to the motion vector after the scaling process.
  • the predicting the current image block according to the motion vector after the scaling process comprises:
  • the specific neighboring block is the first neighboring block or the last neighboring block obtained in the scanning order among the N neighboring blocks.
  • performing a scaling process on a motion vector of a specific one of the M neighboring blocks, and performing prediction on the current image block according to the motion vector after the scaling process including:
  • An image block pointed to by the scaled motion vector in the reference image of the current image block is used as a reference block of the current image block.
  • a default block is used as a reference block of the current image block.
  • the default block is an image block pointed to by a motion vector (0, 0).
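  • The fallback chain described above (matching neighbor, then scaled MV of a specific neighbor, then the default block) might look like the following sketch. The scaling rule is an assumption: the MV is scaled linearly by the ratio of temporal distances, measured here with picture order counts (POC), which is a common convention but is not stated in this passage. The dictionary fields and function names are hypothetical, and the "specific neighboring block" is taken to be the first one in scan order, one of the two options the text gives.

```python
def scale_mv(mv, cur_poc, cur_ref_poc, nb_ref_poc):
    """Scale mv from the neighbor's temporal distance to the current block's (assumed rule)."""
    td = cur_poc - nb_ref_poc          # distance covered by the neighbor's MV
    tb = cur_poc - cur_ref_poc         # distance to the current reference image
    if td == 0:
        return mv
    return (round(mv[0] * tb / td), round(mv[1] * tb / td))


def choose_mv(neighbors, n, cur_poc, cur_ref_poc):
    """Pick the MV used to locate the reference (or related) block."""
    scanned = [nb for nb in neighbors[:n] if nb is not None]
    for nb in scanned:
        if nb["ref_poc"] == cur_ref_poc:     # preset condition: same reference image
            return nb["mv"]
    if scanned:
        first = scanned[0]                   # specific neighbor: first in scan order
        return scale_mv(first["mv"], cur_poc, cur_ref_poc, first["ref_poc"])
    return (0, 0)                            # default block: pointed to by MV (0, 0)
```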
  • the size of the sub-image block and/or the size of the associated block of the sub-image block is fixed to be greater than or equal to 64 pixels.
  • the size of the sub-image block and/or the size of the associated block of the sub-image block are fixed to 8 x 8 pixels, or 16 x 4 pixels or 4 x 16 pixels.
  • by setting the TMVP operation not to be performed, some redundant operations can be skipped, which saves encoding and decoding time and improves coding efficiency.
  • the current image block is a coding unit CU.
  • determining the relevant block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block including:
  • the neighboring block is an image block on the current image that is adjacent to the current image block or is at a certain positional distance from it.
  • the predicting the current image block according to the motion vector of the correlation block includes:
  • when the reference image of the related block is a specific reference image, or the reference image of the current image block is a specific reference image, determining the reference block of the current image block according to the processed motion vector of the related block and the reference image of the current image block;
  • wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  • the processed motion vector of the related block includes:
  • the motion vector of the related block obtained by skipping the scaling step.
  • the predicting the current image block according to the motion vector of the correlation block includes:
  • when the reference image of the related block is a specific reference image, or the reference image of the current block is a specific reference image, abandoning the determination of the reference block of the current image block according to the motion vector of the related block.
  • the build module is also used to:
  • when the motion vector of the specific neighboring block points to a specific reference image, or the reference image of the current image block is a specific reference image, determine the reference block of the current image block according to the processed motion vector of the related block and the reference image of the current image block;
  • wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  • the processed motion vector of the related block includes:
  • the motion vector of the related block obtained by skipping the scaling step.
  • M is less than or equal to four.
  • FIG. 17 is a schematic block diagram of a video image processing apparatus 1700 according to an embodiment of the present application.
  • the apparatus 1700 is for performing a method embodiment as shown in FIG.
  • the device 1700 includes the following units.
  • a construction module 1710, configured to: determine M neighboring blocks of the current image block according to M candidates in the motion vector second candidate list of the current image block; sequentially scan N neighboring blocks among the M neighboring blocks, and determine a target neighboring block according to the scan result, where N is smaller than M; determine a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block; determine a specific candidate in the motion vector first candidate list of the current image block according to the related block of the current image block; and, when it is determined that the specific candidate is adopted, divide the current image block and the related block into a plurality of sub-image blocks in the same manner, where each sub-image block in the current image block is in one-to-one correspondence with a sub-image block in the related block;
  • the prediction module 1720 is configured to respectively predict a corresponding sub-image block in the current image block according to a motion vector of each sub-image block in the correlation block.
  • At least one candidate in the motion vector first candidate list includes a motion vector of a sub-image block, and each candidate in the motion vector second candidate list includes a motion vector of an image block.
  • N is equal to 1 or 2.
  • the M candidates include motion vectors of M neighboring blocks of the current image block on the current image.
  • the N neighboring blocks in the M neighboring blocks are sequentially scanned, and the target neighboring blocks are determined according to the scan result, including:
  • the determining the target neighboring block according to the scanned neighboring block that meets the preset condition including:
  • the first neighboring block that meets the preset condition is used as the target neighboring block.
  • the preset conditions include:
  • the reference image of the neighboring block is the same as the reference image of the current image block.
  • the construction module is further configured to: when no neighboring block that meets the preset condition is found among the N neighboring blocks, perform scaling processing on a motion vector of a specific neighboring block among the M neighboring blocks
  • the prediction module is further configured to predict the current image block according to the motion vector after the scaling process.
  • the predicting the current image block according to the scaled motion vector includes:
  • the specific neighboring block is the first neighboring block or the last neighboring block obtained in the scanning order among the N neighboring blocks.
  • performing a scaling process on a motion vector of a specific one of the M neighboring blocks, and performing prediction on the current image block according to the motion vector after the scaling process including:
  • An image block pointed to by the scaled motion vector in the reference image of the current image block is used as a reference block of the current image block.
  • a default block is used as a reference block of the current image block.
  • the default block is an image block pointed to by a motion vector (0, 0).
  • the size of the sub-image block and/or the size of the associated block of the sub-image block is fixed to be greater than or equal to 64 pixels.
  • the size of the sub-image block and/or the size of the associated block of the sub-image block are fixed to 8 x 8 pixels, or 16 x 4 pixels or 4 x 16 pixels.
  • by setting the TMVP operation not to be performed, some redundant operations can be skipped, which saves encoding and decoding time and improves coding efficiency.
  • the current image block is a coding unit CU.
  • determining the relevant block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block including:
  • the neighboring block is an image block on the current image that is adjacent to the current image block or is at a certain positional distance from it.
  • the predicting the current image block according to the motion vector of the correlation block includes:
  • when the reference image of the related block is a specific reference image, or the reference image of the current image block is a specific reference image, determining the reference block of the current image block according to the processed motion vector of the related block and the reference image of the current image block;
  • wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  • the processed motion vector of the related block includes:
  • the motion vector of the related block obtained by skipping the scaling step.
  • the predicting the current image block according to the motion vector of the correlation block includes:
  • when the reference image of the related block is a specific reference image, or the reference image of the current block is a specific reference image, abandoning the determination of the reference block of the current image block according to the motion vector of the related block.
  • the construction module is further configured to: when the motion vector of the specific neighboring block points to a specific reference image, or the reference image of the current image block is a specific reference image, determine the reference block of the current image block according to the processed motion vector of the related block and the reference image of the current image block;
  • wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  • the processed motion vector of the related block includes:
  • the motion vector of the related block obtained by skipping the scaling step.
  • the predicting the corresponding sub-image block in the current image block according to the motion vector of each sub-image block in the correlation block includes:
  • the motion vectors of the sub-image blocks in the correlation block are respectively used as motion vectors of corresponding sub-image blocks in the current image block.
  • determining, according to the relevant block of the current image block, the specific candidate in the motion vector first candidate list of the current image block including:
  • a representative motion vector of a correlation block of the current image block is added to the motion vector first candidate list as the specific candidate.
  • the representative motion vector of the correlation block is added as the first candidate to the motion vector first candidate list.
  • the representative motion vector of the correlation block includes a motion vector of a center position of the correlation block.
  • the prediction module is further configured to: when a sub-image block whose motion vector is unavailable appears in the related block, use the representative motion vector of the related block as the motion vector of that sub-image block, and predict the corresponding sub-image block in the current image block.
  • the prediction module is further configured to: when a sub-image block whose motion vector is unavailable appears in the related block and the representative motion vector of the related block is also unavailable, abandon predicting the corresponding sub-image blocks in the current image block according to the motion vectors of the sub-image blocks in the related block.
  • the prediction module is further configured to: when a sub-image block in the related block is not available, or a sub-image block in the related block adopts an intra coding mode, determine that a sub-image block whose motion vector is unavailable appears in the related block.
  • the prediction module is further configured to: when it is determined that one of the candidates in the motion vector second candidate list other than the specific candidate is adopted, perform affine transformation on the adopted candidate according to an affine transformation model
  • each candidate includes a motion vector of a group of control points.
  • each of the at least one candidate includes a motion vector of two control points
  • each of the at least one candidate includes a motion vector of three control points.
  • the prediction module is further configured to: determine, from a neighboring block of the current image block, a control point motion vector group of the neighboring block predicted by using the affine transformation mode in a specific scanning order;
  • the control point motion vector group of each determined neighboring block is added as one candidate to the motion vector first candidate list.
  • determining, from a neighboring block of the current image block, a control point motion vector group of a neighboring block that is predicted by using an affine transformation mode, in a specific scan order includes:
  • the building module is further configured to: construct a motion vector of the partial control point according to a neighboring block of the partial control point of the current image block;
  • the constructing the motion vector of the partial control point according to the neighboring block of the partial control point of the current image block includes:
  • a specific neighboring block of the control point is sequentially scanned in a third scanning order, and a motion vector of a specific neighboring block that satisfies a preset condition is used as a motion vector of the control point.
  • the construction module is further configured to: when the motion vectors of the partial control points point to different reference frames, abandon adding the motion vectors of the partial control points of the current image block to the motion vector first candidate list.
  • the building module is further configured to: when the number of candidates in the motion vector first candidate list is greater than a preset value, abandon adding a motion vector of a part of the control points of the current image block to the motion vector First candidate list.
  • the building module is further configured to: construct a motion vector second candidate list, wherein the candidate joining the motion vector second candidate list is a motion vector of one image block;
  • the motion vector of the current image block is determined according to the motion vector of the candidate.
  • determining the motion vector of the current image block according to the motion vector of the candidate includes:
  • the candidate confirmed for adoption is used as the motion vector of the current image block, or the candidate confirmed for adoption is scaled and then used as the motion vector of the current image block.
  • the constructing a second candidate list of motion vectors includes:
  • the construction module is further configured to: add the motion vectors of the preset M neighboring blocks, in a preset order, as M candidates to the motion vector second candidate list;
  • the N neighboring blocks refer to N neighboring blocks that are first determined in the predetermined order.
  • the construction module is further configured to: when the motion vector of one or more of the M neighboring blocks is not available, abandon determining, according to the motion vector of the one or more neighboring blocks, a candidate to be added to the motion vector second candidate list.
  • M is less than or equal to four.
  • the embodiment of the present application further provides a video image processing method 1800, where the method includes the following steps:
  • S1810 Determine a basic motion vector list, where the basic motion vector list includes at least one set of bi-predictive basic motion vector groups, where the bi-predictive basic motion vector group includes a first basic motion vector and a second basic motion vector;
  • S1820 Determine two motion vector offsets from a preset offset set, where the two motion vector offsets respectively correspond to the first base motion vector and the second base motion vector;
  • S1840 predict the current image block according to the motion vector of the current image block.
  • the video image processing method of the embodiment of the present application offsets the basic motion vectors in the bi-predictive basic motion vector group based on the preset offset set, so that a more accurate motion vector of the current image block can be obtained with a limited amount of computation; this makes the prediction residual smaller and thereby improves coding efficiency.
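  • A sketch of the idea for one bi-predictive basic motion vector group follows. It is illustrative only: the step between S1820 and S1840 is not reproduced in this passage, so the search below (try every pair of offsets, keep the pair with the lowest cost) is an assumption, as are the function names, the caller-supplied `cost` function, and the choice to apply each offset to either the x or the y component with either sign. The offset magnitudes are the preset set {2, 4, ..., 256} given later in the description.

```python
from itertools import product

OFFSET_SET = [2, 4, 8, 16, 32, 64, 128, 256]


def offset_vectors(offset_set=OFFSET_SET):
    """Expand each magnitude into four 2-D offsets (assumed: +/- on x or on y)."""
    vecs = []
    for o in offset_set:
        vecs += [(o, 0), (-o, 0), (0, o), (0, -o)]
    return vecs


def refine_bi_pred(group, cost, offset_set=OFFSET_SET):
    """Offset both base MVs of a bi-predictive group and keep the cheapest pair.

    group: (first_base_mv, second_base_mv); cost: hypothetical callable that
    scores a candidate pair, standing in for a rate-distortion measure.
    """
    mv_a, mv_b = group
    best = None
    for off_a, off_b in product(offset_vectors(offset_set), repeat=2):
        cand = ((mv_a[0] + off_a[0], mv_a[1] + off_a[1]),
                (mv_b[0] + off_b[0], mv_b[1] + off_b[1]))
        c = cost(cand)
        if best is None or c < best[0]:
            best = (c, cand)
    return best[1]
```

  • With 8 magnitudes and 4 directions per base MV, the search visits 32 x 32 = 1024 pairs per group under these assumptions, a bounded amount of work per block.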
  • the video image processing method of the embodiment of the present application may be used to improve the Merge with Motion Vector Difference (MMVD) technology, also known as the Ultimate Motion Vector Expression (UMVE) technology.
  • it is applied to the MMVD technology to construct a merge candidate list, also referred to as a motion vector candidate list.
  • the video image processing method 1800 may further include the step of: acquiring a merge candidate list including P groups of merge motion vector candidates, wherein P is an integer greater than or equal to .
  • in S1810, determining the basic motion vector list may include: determining the basic motion vector list according to the merge candidate list. For example, when P is greater than or equal to 2, two groups of merge motion vector candidates in the merge candidate list are taken to form the basic motion vector list.
  • the two groups of merge motion vector candidates may be the first two groups in the merge candidate list, or, when the first two groups do not meet the condition, two other groups.
  • the two groups of merge motion vector candidates may be any two groups in the merge candidate list that meet the condition, which is not limited in this embodiment of the present application.
  • the basic motion vector list is formed by filling with a motion vector (0, 0).
  • the MMVD technology first constructs a base motion vector list (base MVP list) from the merge motion vector candidates in an existing merge candidate list (or in the various motion vector candidate lists obtained in the different modes described above). For example, the merge motion vector candidates in the existing merge candidate list (merge list) are traversed; if the number of merge motion vector candidates in the existing merge list is greater than 2, the first 2 merge motion vector candidates in the merge list are taken to form the base MVP list of MMVD; otherwise, MV (0, 0) is used for filling to form the base MVP list of MMVD.
  • the basic motion vector list may be filled with other default motion vectors, such as (1, 1), (2, 2), etc., which is not limited in this embodiment of the present application.
  • each of the two groups of base MVs in the base MVP list may be a uni-directionally predicted base motion vector or a bi-predictive base motion vector group.
  • the base MVP list may include more or fewer groups of base MVs, which is not limited in this application. Only the case of bi-predictive base motion vector groups is discussed here.
  • the two base motion vectors included in a bi-predictive base motion vector group may be a forward base motion vector and a backward base motion vector, or may be two base motion vectors in the same direction, such as two forward base motion vectors or two backward base motion vectors.
  • the rate distortion loss meets the preset condition, which may be that the rate distortion loss is less than a preset threshold, or the rate distortion loss is the smallest, for example, the prediction residual is the smallest, and the like.
  • the MMVD technology may offset the base motion vector predictor according to a certain rule, and generate a new motion vector predictor candidate as the MMVD motion vector predictor, and put it into the MMVD motion vector candidate list.
  • the motion vector offset (offset) in the offset set may have 8 choices (2^1, 2^2, ..., 2^8), that is, the preset offset set is {2, 4, 8, 16, 32, 64, 128, 256}.
  • the base motion vector (also called the base motion vector candidate) in the base MVP list has 2 sets, and the motion vector offset (offset) has 8 choices (2^1, 2^2, ..., 2^8).
  • the motion vector offset may be added to or subtracted from the base motion vector (2 choices).
  • the MMVD motion vector predictor = base motion vector predictor + motion vector offset (offset).
  • the embodiment of the present application may also select a part of the 64 improved modes, or derive more improved modes from the 64 improved modes, and is not limited to the 64 improved modes.
  • the number of the improvement modes may be 32 or 128, and the like, which is not limited by the embodiment of the present application. From these motion vectors, a motion vector that causes the rate distortion loss to satisfy the preset condition is determined as a motion vector of the current image block for encoding and/or decoding.
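  • as an illustration only, the enumeration of 64 improved modes and the selection of the best one can be sketched as follows. The text above lists 2 base motion vectors, 8 offsets, and 2 signs; the remaining factor of 2 is assumed here to be the choice between the horizontal and the vertical component, and the cost function is a hypothetical stand-in for the rate distortion loss. The names mmvd_candidates and pick_best are not from the application.

```python
from itertools import product

OFFSETS = [2, 4, 8, 16, 32, 64, 128, 256]   # preset offset set from the text
DIRECTIONS = [(1, 0), (0, 1)]               # assumption: offset along x or along y
SIGNS = [1, -1]                             # offset is added or subtracted


def mmvd_candidates(base_mvs):
    """Enumerate base + signed offset for every base MV, offset, axis and sign."""
    candidates = []
    for base, off, (dx, dy), sign in product(base_mvs, OFFSETS, DIRECTIONS, SIGNS):
        candidates.append((base[0] + sign * off * dx, base[1] + sign * off * dy))
    return candidates


def pick_best(candidates, cost):
    """Return the candidate with the smallest cost (stand-in for RD loss)."""
    return min(candidates, key=cost)


if __name__ == "__main__":
    cands = mmvd_candidates([(0, 0), (10, 10)])
    print(len(cands))
    # Hypothetical cost: distance to a "true" motion of (12, 10).
    print(pick_best(cands, lambda mv: abs(mv[0] - 12) + abs(mv[1] - 10)))
```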
  • determining the motion vector that causes the rate distortion loss to satisfy the preset condition is equivalent to determining, from the plurality of sets of motion vector offset combinations, the motion vector offset combination that causes the rate distortion loss to satisfy the preset condition, and then determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and that motion vector offset combination.
  • the two motion vector offsets in the motion vector offset combination may be the same or different.
  • the plurality of sets of motion vector offset combinations, each including two motion vector offsets, may be formed by traversing the motion vector offsets in the preset offset set.
  • alternatively, one motion vector offset in each combination may be a motion vector offset in the offset set calculated by a preset algorithm; with that calculated offset held fixed, the motion vector offsets in the offset set are traversed as the other motion vector offset, to form the plurality of sets of motion vector offset combinations.
  • a simple traversal to find a suitable combination of motion vector offsets can reduce the amount of computation as a whole and improve the encoding/decoding efficiency.
  • alternatively, one motion vector offset in each combination may be a motion vector offset in the offset set calculated by the preset algorithm; the calculated motion vector offset is a fixed value, and the other motion vector offset in each combination is obtained by scaling the fixed value, to form the plurality of sets of motion vector offset combinations.
  • the multiple sets of motion vector offset combinations including the two motion vector offsets may be formed in other manners, which is not limited in this embodiment of the present application.
  • the selected motion vector offset may be scaled to obtain a new motion vector offset, and the first base motion vector and the second base motion vector are then adjusted by the new motion vector offsets. The motion vector of the current image block is determined according to the first base motion vector, the second base motion vector, and the two new motion vector offsets; and the current image block is predicted according to the motion vector of the current image block.
  • when the motion vector predictor of the current image block is determined according to the MMVD technique, if the distances from the current image to the reference images of the two base motion vectors are different, the motion vector offset needs to be scaled, and the scaled motion vector offset is then added to (or subtracted from) the base motion vector predictor to determine the motion vector of the current image block.
  • the two motion vector offsets are used to adjust the first base motion vector and the second base motion vector.
  • the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is equal to the ratio of the motion vector offset used by the first base motion vector to the motion vector offset used by the second base motion vector. In other words, the scaling ratio is determined by the distances between the image containing the current image block (e.g., the first image block) and the reference images of the current image block in the two reference directions, which are the reference images of the current base motion vectors. This implementation can further reduce the number of motion vector offset combinations to be tried, which can further improve the encoding/decoding efficiency.
  • the adjustment may be to add the two motion vector offsets to the first base motion vector and the second base motion vector, respectively; or the adjustment may be to subtract the two motion vector offsets from the first base motion vector and the second base motion vector, respectively.
  • for example, the optional motion vector offsets are the 8 choices {2, 4, 8, 16, 32, 64, 128, 256}. Suppose the motion vector offset selected in the current process is X, the frame number of the current image is P2, and the frame number of the reference image of the first base motion vector and the frame number of the reference image of the second base motion vector are P0 and P1, respectively.
  • the offset X is added to the second base motion vector by a motion vector offset of 2*X.
  • when the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is not a multiple of 2, an appropriate multiple of 2 may be selected so that the ratio of the motion vector offset used by the first base motion vector to the motion vector offset used by the second base motion vector is as close as possible to the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector. For example, a motion vector offset 4*X is added to the first base motion vector, and a motion vector offset X is added to the second base motion vector.
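  • as an illustration only, the distance-based scaling of the two offsets can be sketched as follows. This is a sketch under stated assumptions: frame numbers are used directly as temporal positions, the nearer reference keeps the selected offset X, and the distance ratio is rounded to the nearest power of two, which is only one possible reading of "an appropriate multiple of 2". The names nearest_pow2 and paired_offsets are hypothetical.

```python
import math


def nearest_pow2(ratio: float) -> int:
    """Round a ratio >= 1 to the nearest power of two (on a log scale)."""
    return 2 ** round(math.log2(ratio))


def paired_offsets(x: int, cur: int, ref0: int, ref1: int):
    """Return (offset for first base MV, offset for second base MV).

    x    : selected motion vector offset X
    cur  : frame number of the current image (P2 in the text)
    ref0 : frame number of the first base MV's reference image (P0)
    ref1 : frame number of the second base MV's reference image (P1)
    """
    d0, d1 = abs(cur - ref0), abs(cur - ref1)
    if d0 == 0 or d1 == 0:
        raise ValueError("reference image must differ from the current image")
    if d0 >= d1:
        # The farther reference gets the larger offset.
        return nearest_pow2(d0 / d1) * x, x
    return x, nearest_pow2(d1 / d0) * x


if __name__ == "__main__":
    print(paired_offsets(8, cur=4, ref0=0, ref1=2))   # distances 4 and 2
    print(paired_offsets(8, cur=6, ref0=0, ref1=4))   # distances 6 and 2
```

  • with distances 6 and 2 the ratio 3 is not a power of two, so the sketch falls back to 4, giving offsets 4*X and X as in the example above.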
  • when the reference image of the current image block is a specific reference image, the temporal distance between the specific reference image and the image containing the current image block is not defined, so scaling the motion vector offset is meaningless.
  • when the first base motion vector and/or the second base motion vector point to a specific reference image, at least one of the two motion vector offsets includes: a motion vector offset obtained by scaling the initial motion vector offset according to a scaling value of 1, or a motion vector offset obtained by skipping the scaling operation.
  • the motion vector of the current image block is determined according to the processed motion vector offset and the base motion vector group, wherein the processed motion vector offset is the same as the motion vector offset before processing.
  • at least one of the two motion vector offsets after the processing includes: a motion vector offset obtained by scaling the initial motion vector offset according to a scaling value of 1, or a motion vector offset obtained by skipping the scaling operation.
  • one of the two motion vector offsets (e.g., offset1) may be calculated by a predetermined algorithm. When the first base motion vector points to a specific reference image, or the second base motion vector points to a specific reference image, or both the first base motion vector and the second base motion vector point to specific reference images, the other motion vector offset (e.g., offset2) can be obtained by scaling offset1 by a scale of 1, or by not scaling offset1.
  • alternatively, the other motion vector offset (e.g., offset2) may be obtained by scaling a certain initial offset by a scaling of 1, or without performing a scaling operation.
  • the initial motion vector offset may be offset 1 or other initial offsets, which is not limited in this embodiment of the present application.
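  • as an illustration only, the special handling of specific reference images can be sketched as follows. The function name second_offset and the boolean flags are hypothetical; the sketch simply shows that scaling by 1 and skipping the scaling operation yield the same offset.

```python
def second_offset(offset1: int, scale: int,
                  ref0_is_specific: bool, ref1_is_specific: bool) -> int:
    """Derive offset2 from offset1.

    If either base motion vector points to a specific reference image, the
    temporal distance is undefined, so the offset is left unscaled (which is
    equivalent to scaling by 1). Otherwise the given scale is applied.
    """
    if ref0_is_specific or ref1_is_specific:
        return offset1          # skip scaling
    return offset1 * scale      # normal distance-based scaling


if __name__ == "__main__":
    print(second_offset(8, 4, False, False))
    print(second_offset(8, 4, True, False))
```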
  • the method is performed by an encoding end, and the method further includes: encoding according to the prediction result and transmitting a code stream to the decoding end, where the code stream includes an index indicating the motion vector offset combination that causes the rate distortion loss to satisfy the preset condition.
  • the encoding end informs the decoding end of the index of the determined motion vector offset combination, so that the decoding end can obtain the two motion vector offsets with a small amount of calculation, which can simplify the decoding end.
  • the index may be the ratio of the other motion vector offset (for example, offset2) to offset1.
  • the method is performed by a decoding end, the method further comprising: receiving a code stream sent by the encoding end, where the code stream includes an index for indicating a combination of two motion vector offsets;
  • the determining the two motion vector offsets from the preset offset set includes: determining the two motion vector offsets according to the index.
  • the encoding end informs the decoding end of the index of the determined combination of the motion vector offsets, so that the decoding end can know the two motion vector offsets by a small amount of calculation, which can simplify the decoding end.
  • the index may be the ratio of the other motion vector offset (for example, offset2) to offset1.
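  • as an illustration only, signalling an offset combination by index can be sketched as follows. The table layout (all ordered pairs from the preset offset set), the function names, and the cost function are assumptions for the example; the application does not fix a particular index mapping.

```python
OFFSETS = [2, 4, 8, 16, 32, 64, 128, 256]

# Assumed shared table: encoder and decoder build the same list of pairs.
COMBINATIONS = [(a, b) for a in OFFSETS for b in OFFSETS]


def encode_offset_index(cost) -> int:
    """Encoder side: pick the combination with the lowest cost, return its index."""
    return min(range(len(COMBINATIONS)), key=lambda i: cost(COMBINATIONS[i]))


def decode_offset_index(index: int):
    """Decoder side: recover both offsets with a single table lookup."""
    return COMBINATIONS[index]


if __name__ == "__main__":
    # Hypothetical cost standing in for the rate distortion loss.
    idx = encode_offset_index(lambda c: abs(c[0] - 4) + abs(c[1] - 16))
    print(idx, decode_offset_index(idx))
```

  • the decoder performs no search; it only reads the index from the code stream and looks up the pair, which is the simplification described above.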
  • FIG. 19 is a schematic block diagram of a video image processing apparatus 1900 according to an embodiment of the present application.
  • the apparatus 1900 is for performing the method embodiment as shown in FIG.
  • the device 1900 includes the following modules.
  • the construction module 1910 is configured to: determine a basic motion vector list, where the basic motion vector list includes at least one bi-predictive basic motion vector group, and the bi-predictive basic motion vector group includes a first basic motion vector and a second basic motion vector; determine, from a preset offset set, two motion vector offsets, the two motion vector offsets respectively corresponding to the first base motion vector and the second base motion vector; and determine a motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets;
  • the prediction module 1920 is configured to predict the current image block according to the motion vector of the current image block.
  • the video image processing apparatus of the embodiment of the present application offsets the base motion vectors in the bi-predictive basic motion vector group based on a preset offset set, and obtains a more accurate motion vector of the current image block through a limited number of calculations, so that the prediction residual is smaller and the coding efficiency can be improved.
  • the building module 1910 determining two motion vector offsets from a preset offset set includes: the building module 1910 determining, from the offset set, a plurality of sets of motion vector offset combinations each including two motion vector offsets; the building module 1910 determining a motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets includes: the building module 1910 determining, from the plurality of sets of motion vector offset combinations, a motion vector offset combination that causes the rate distortion loss to satisfy a preset condition, and determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and the motion vector offset combination that causes the rate distortion loss to satisfy the preset condition.
  • the video image processing apparatus 1900 is used in an encoding end, and the video image processing apparatus 1900 further includes a sending module, configured to: encode according to the predicted result, and send a code stream to the decoding end, where the code An index for indicating a motion vector offset combination that causes the rate distortion loss to satisfy a preset condition is included in the stream.
  • the video image processing apparatus 1900 is used at the decoding end, and the video image processing apparatus 1900 further includes a receiving module, configured to: receive a code stream sent by the encoding end, where the code stream includes an index indicating the combination formed by the two motion vector offsets; the building module 1910 determining the two motion vector offsets from the preset offset set includes: determining the two motion vector offsets according to the index.
  • when the first base motion vector and/or the second base motion vector point to a specific reference image, at least one of the two motion vector offsets includes: a motion vector offset obtained by scaling the initial motion vector offset according to a scaling value of 1, or a motion vector offset obtained by skipping the scaling operation.
  • the two motion vector offsets are used to adjust the first base motion vector and the second base motion vector.
  • the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is equal to the ratio of the motion vector offset used by the first base motion vector to the motion vector offset used by the second base motion vector.
  • the building module 1910 is further configured to: obtain a merge candidate list, where the merge candidate list includes P combined motion vector candidates, where P is an integer greater than or equal to 1; the building module 1910 determining the base motion vector list includes: the building module 1910 determining the base motion vector list according to the merge candidate list.
  • the building module 1910 determining the basic motion vector list according to the merge candidate list includes: when P is greater than or equal to 2, the building module 1910 takes the first two sets of combined motion vector candidates in the merge candidate list to form the base motion vector list.
  • the building module 1910 determining the basic motion vector list according to the merge candidate list includes: when P is less than 2, the building module 1910 fills with a motion vector (0, 0) to form the base motion vector list.
  • the preset offset set is {2, 4, 8, 16, 32, 64, 128, 256}.
  • the current image block is one coding unit CU.
  • the current image block is a bi-predicted image block.
  • FIG. 20 is a schematic block diagram of a video image processing apparatus 2000 provided by an embodiment of the present application.
  • the video image processing apparatus 2000 shown in FIG. 20 may include a processor 2010 and a memory 2020 in which computer instructions are stored, and when the processor 2010 executes the computer instructions, the video image processing apparatus 2000 is caused to Performing the following steps: determining a basic motion vector list, where the basic motion vector list includes at least one set of bi-predictive basic motion vector groups, where the bi-predictive basic motion vector group includes a first base motion vector and a second base motion vector; Determining two motion vector offsets from a preset offset set, the two motion vector offsets respectively corresponding to the first base motion vector and the second base motion vector; according to the first The base motion vector, the second base motion vector, and the two motion vector offsets determine a motion vector of the current image block; and predict the current image block according to the motion vector of the current image block.
  • the video image processing apparatus 2000 of the embodiment of the present application may further include a network interface for transmitting a code stream. For example, receiving a code stream transmitted by an encoding device.
  • the processor 2010 determining two motion vector offsets from a preset offset set includes: determining, from the offset set, a plurality of sets of motion vector offset combinations each including two motion vector offsets; the processor 2010 determining a motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets includes: determining, from the plurality of sets of motion vector offset combinations, a motion vector offset combination that causes the rate distortion loss to satisfy a preset condition, and determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and the motion vector offset combination that causes the rate distortion loss to satisfy the preset condition.
  • the video image processing apparatus 2000 is used at the encoding end, and the processor 2010 is further configured to perform encoding according to the prediction result and send a code stream to the decoding end, where the code stream includes an index indicating the motion vector offset combination that causes the rate distortion loss to satisfy the preset condition.
  • the video image processing apparatus 2000 is used at the decoding end, and the processor 2010 is further configured to receive a code stream sent by the encoding end, where the code stream includes an index indicating the combination formed by the two motion vector offsets; the processor 2010 determining the two motion vector offsets from the preset offset set includes: determining the two motion vector offsets according to the index.
  • when the first base motion vector and/or the second base motion vector point to a specific reference image, at least one of the two motion vector offsets includes: a motion vector offset obtained by scaling the initial motion vector offset according to a scaling value of 1, or a motion vector offset obtained by skipping the scaling operation.
  • the two motion vector offsets are used to adjust the first base motion vector and the second base motion vector.
  • the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is equal to the ratio of the motion vector offset used by the first base motion vector to the motion vector offset used by the second base motion vector.
  • the processor 2010 is further configured to: acquire a merge candidate list, where the merge candidate list includes P combined motion vector candidates, where P is an integer greater than or equal to 1; the processor 2010 determining the basic motion vector list includes: determining the basic motion vector list according to the merge candidate list.
  • the processor 2010 determining the basic motion vector list according to the merge candidate list includes: when P is greater than or equal to 2, taking the first two sets of combined motion vector candidates in the merge candidate list to form the base motion vector list.
  • the processor 2010 determining the basic motion vector list according to the merge candidate list includes: when P is less than 2, filling with a motion vector (0, 0) to form the base motion vector list.
  • the preset offset set is {2, 4, 8, 16, 32, 64, 128, 256}.
  • the current image block is one coding unit CU.
  • the current image block is a bi-predictive image block.
  • the video image processing apparatus 2000 shown in FIG. 20 or the video image processing apparatus 1900 shown in FIG. 19 can be used to perform the operations or processes in the above-described method embodiments, and the operations and/or functions of the respective modules and devices in the video image processing apparatus 2000 or the video image processing apparatus 1900 are respectively intended to implement the corresponding processes in the foregoing method embodiments. For brevity, details are not repeated here.
  • the embodiment of the present application further provides a video image processing method, including: determining a basic motion vector list, where the basic motion vector list includes a basic motion vector group; and when at least one basic motion vector in the basic motion vector group points to a specific reference image, abandoning determining the motion vector of the current image block based on the base motion vector group and the motion vector offset. That is, when at least one of the base motion vectors in the base motion vector group points to the specific reference image, the corresponding motion vector is discarded from the MV candidate list.
  • a merge candidate list is obtained, where the merge candidate list includes P combined motion vector candidates, where P is an integer greater than or equal to 1; the determining the base motion vector list includes: determining the base motion vector list according to the merge candidate list.
  • the determining the basic motion vector list according to the merge candidate list includes: when P is greater than or equal to 2, taking the first two sets of combined motion vector candidates in the merge candidate list to form the base motion vector list.
  • the determining the basic motion vector list according to the merge candidate list includes: when P is less than 2, filling with a motion vector (0, 0) to form the base motion vector list.
  • the current image block is a coding unit CU.
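  • as an illustration only, the check that abandons MMVD for a base motion vector group can be sketched as follows. The BaseMV dataclass, its ref_is_specific flag, and the function name mmvd_allowed are hypothetical; how an image is marked as a specific reference image is outside this sketch.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class BaseMV:
    mv: Tuple[int, int]        # motion vector components (x, y)
    ref_is_specific: bool      # True if its reference image is a specific reference image


def mmvd_allowed(group: Sequence[BaseMV]) -> bool:
    """Return False if any base motion vector in the group points to a
    specific reference image, in which case offset-based refinement is skipped."""
    return not any(b.ref_is_specific for b in group)


if __name__ == "__main__":
    normal = [BaseMV((3, 1), False), BaseMV((-2, 0), False)]
    mixed = [BaseMV((3, 1), False), BaseMV((-2, 0), True)]
    print(mmvd_allowed(normal), mmvd_allowed(mixed))
```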
  • the present application provides a video image processing apparatus, including: a determining module, configured to determine a basic motion vector list, where the basic motion vector list includes a basic motion vector group; and a processing module, configured to: when at least one base motion vector in the basic motion vector group points to a specific reference image, abandon determining the motion vector of the current image block according to the base motion vector group and the motion vector offset.
  • the video image processing apparatus further includes a building module, configured to: acquire a merge candidate list, where the merge candidate list includes P combined motion vector candidates, where P is an integer greater than or equal to 1; the determining module is specifically configured to: determine the basic motion vector list according to the merge candidate list.
  • the determining module is specifically configured to: when the P is greater than or equal to 2, take the first two sets of combined motion vector candidates in the merge candidate list to form the basic motion vector list.
  • the determining module is specifically configured to: when the P is less than 2, fill the basic motion vector list with a motion vector (0, 0).
  • the current image block is a coding unit CU.
  • the present application also provides a video image processing apparatus including a processor and a memory, wherein the memory stores computer instructions, and when the processor executes the computer instructions, the video image processing apparatus performs the following steps: determining a base motion vector list, where the base motion vector list includes a base motion vector group; and when at least one base motion vector in the base motion vector group points to a specific reference image, abandoning determining the motion vector of the current image block according to the base motion vector group and the motion vector offset.
  • the devices of the embodiments of the present application may be implemented based on a memory and a processor; each memory is configured to store instructions for executing the methods of the embodiments of the present application, and the processor executes the instructions, so that the device performs the methods of the embodiments of the present application.
  • the processor mentioned in the embodiments of the present application may include a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
  • the processor may further include a hardware chip.
  • the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
  • the memory mentioned in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (programmable ROM, PROM), an erasable programmable read-only memory (erasable PROM, EPROM), an electrically erasable programmable read-only memory (electrically EPROM, EEPROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
  • the volatile memory can be a random access memory (RAM) that acts as an external cache.
  • many forms of RAM may be used, for example static random access memory (SRAM), dynamic random access memory (DRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM).
  • when the processor is a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, the memory (storage module) is integrated in the processor.
  • the memories described herein are intended to include, without being limited to, these and any other suitable types of memory.
  • the embodiment of the present application further provides a computer readable storage medium having stored thereon instructions that, when executed on a computer, cause the computer to perform the steps of the foregoing method embodiments.
  • the embodiment of the present application further provides a computing device, which includes the above computer readable storage medium.
  • the embodiment of the present application further provides a computer program product comprising instructions, wherein when a computer runs the instructions of the computer program product, the computer executes the steps of the foregoing method embodiments.
  • the embodiment of the present application further provides a computer chip, which causes a computer to perform the steps of the foregoing method embodiments.
  • Embodiments of the present application can be applied to the field of aircraft, especially drones.
  • the division of the circuits, sub-circuits, and sub-units in the various embodiments of the present application is merely illustrative. Those of ordinary skill in the art will appreciate that the circuits, sub-circuits, and sub-units of the various examples described in the embodiments disclosed herein can be further separated or combined.
  • the device provided by the embodiment of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with embodiments of the present application are generated in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions can be stored in a computer readable storage medium or transferred from one computer readable storage medium to another computer readable storage medium; for example, the computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave, etc.).
  • the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a high-density digital video disc (DVD)), or a semiconductor medium (for example, an SSD) or the like.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

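A video image processing method and apparatus are provided. The method includes: sequentially scanning N neighboring blocks among M preset neighboring blocks of a current image block, and determining a target neighboring block according to the scanning result, where N is less than M; determining a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and a reference image of the current image block; dividing the current image block and the related block into a plurality of sub-image blocks in the same manner, where the sub-image blocks in the current image block correspond one-to-one to the sub-image blocks in the related block; and predicting the corresponding sub-image blocks in the current image block according to the motion vectors of the respective sub-image blocks in the related block. Complexity can be reduced while encoding and decoding performance is maintained.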

Description

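Video image processing method and apparatus
This application claims priority to PCT applications No. PCT/CN2018/081652, PCT/CN2018/095710, PCT/CN2018/103693, PCT/CN2018/107436, and PCT/CN2018/112805, the entire contents of which are incorporated herein by reference.
Copyright notice
The disclosure of this patent document contains material that is subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner has no objection to the reproduction by anyone of this patent document or this patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical field
This application relates to the field of video encoding and decoding, and in particular to a video image processing method and apparatus.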
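Background
At present, the major video coding standards all use block-based motion compensation for inter prediction. The main idea is to find, in an already coded image, the block most similar to the current image block; this process is called motion compensation. For example, a frame is first divided into equally sized coding regions (Coding Tree Unit, CTU), for example of size 64×64 or 128×128. Each CTU can be further divided into square or rectangular coding units (Coding Unit, CU). For each CU, the most similar block is searched for in a reference frame (generally a reconstructed frame temporally close to the current frame) and used as the prediction block of the current CU. The relative displacement between the current block (the current CU) and the similar block (the prediction block of the current CU) is called the motion vector (Motion Vector, MV). The process of finding the most similar block in the reference frame as the prediction block of the current block is motion compensation.
In current technology, the motion vector candidate list of the current CU, also called the merge candidate list, is usually constructed in two ways. The list includes spatial candidate motion vectors, usually obtained by filling the list with the motion vectors (or motion information) of already coded neighboring blocks of the current CU. The list also includes temporal candidate motion vectors: temporal motion vector prediction (Temporal Motion Vector Prediction, TMVP) uses the motion vector (or motion information) of the CU at the corresponding position (the co-located CU) in a neighboring coded image. The best candidate motion vector is selected from the merge candidate list as the motion vector of the current CU, and the prediction block of the current CU is determined according to that motion vector.
Advanced/alternative temporal motion vector prediction (ATMVP) is a motion vector prediction mechanism. Its basic idea is to perform motion compensation using the motion information of multiple sub-blocks within the current CU. When a candidate list is constructed (for example a merge candidate list or an AMVP (Advanced Motion Vector Prediction) candidate list), ATMVP introduces the motion information of multiple sub-blocks within the current CU as a candidate. ATMVP can be roughly divided into two steps. In the first step, a temporal vector is determined by scanning the candidate motion vector list of the current CU or the motion vectors of the neighboring image blocks of the current CU. In the second step, the current CU is divided into N×N sub-blocks (sub-CUs; N is 4 by default), the corresponding block of each sub-block in the reference frame is determined according to the temporal vector obtained in the first step, and the motion vector of each sub-block is determined according to the motion vector of its corresponding block in the reference frame.
In the first step of the current ATMVP technique, the process of determining a temporal vector by scanning the candidate motion vector list of the current CU or the motion vectors of its neighboring image blocks still has room for improvement. In the second step, the sub-CU size is set adaptively at frame level, with a default of 4×4; when certain preset conditions are met, the sub-CU size is set to 8×8. This sub-CU size setting does not always match the current storage granularity of motion information (8×8). In addition, ATMVP and TMVP perform redundant operations in some cases, so the construction of the candidate motion vector list also has room for improvement.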
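Summary
This application provides a video image processing method and apparatus that can reduce the complexity of the ATMVP technique while preserving the performance gain of the existing ATMVP technique.
In a first aspect, a video image processing method is provided. The method includes:
sequentially scanning N neighboring blocks among M preset neighboring blocks of a current image block, and determining a target neighboring block according to the scan result, where N is less than M;
determining a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and a reference image of the current image block;
dividing the current image block and the related block into multiple sub-image blocks in the same manner, where the sub-image blocks of the current image block correspond one-to-one to the sub-image blocks of the related block; and
predicting each sub-image block of the current image block according to the motion vector of the corresponding sub-image block of the related block.
In the solution provided by this application, when the reference motion vector of the current image block is obtained, only N (N less than M) of the M candidate motion vectors already obtained are scanned sequentially. Compared with the prior art, this reduces the number of candidate motion vectors scanned while obtaining the reference motion vector of the current image block. It should be understood that applying this solution to the first step of the existing ATMVP technique simplifies the redundant operations in that step.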
第二方面,提供一种视频图像处理方法,该方法包括:
根据当前图像块的运动矢量第二候选列表中的M个候选者确定所述当前图像块的M个邻近块;
对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;
根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;
根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者;
当确定采用所述特定候选者时,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
第三方面,提供一种视频图像处理装置,该装置包括:
构建模块,用于对当前图像块的预设的M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
预测模块,用于根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
第四方面,提供一种视频图像处理装置,该装置包括:
构建模块,用于根据当前图像块的运动矢量第二候选列表中的M个候选者确定所述当前图像块的M个邻近块;对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者;当确定采用所述特定候选者时,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
预测模块,用于根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
第五方面,提供一种视频图像处理装置,该装置包括存储器与处理器,所述存储器用于存储指令,所述处理器用于执行所述存储器存储的指令,并且对所述存储器中存储的指令的执行使得,所述处理器用于执行第一方面或第一方面的任一可能的实现方式中的方法。
第六方面,提供一种视频图像处理装置,该装置包括存储器与处理器,所述存储器用于存储指令,所述处理器用于执行所述存储器存储的指令,并且对所述存储器中存储的指令的执行使得,所述处理器用于执行第二方面或第二方面的任一可能的实现方式中的方法。
第七方面,提供一种计算机存储介质,其上存储有计算机程序,所述计算机程序被计算机执行时使得所述计算机实现第一方面或第一方面的任一可能的实现方式中的方法。
第八方面,提供一种计算机存储介质,其上存储有计算机程序,所述计算机程序被计算机执行时使得所述计算机实现第二方面或第二方面的任一可能的实现方式中的方法。
第九方面,提供一种包含指令的计算机程序产品,所述指令被计算机执行时使得所述计算机实现第一方面或第一方面的任一可能的实现方式中的方法。
第十方面,提供一种包含指令的计算机程序产品,所述指令被计算机执行时使得所述计算机实现第二方面或第二方面的任一可能的实现方式中的方法。
第十一方面,提供一种视频图像处理方法,该方法包括:
确定基础运动矢量列表,所述基础运动矢量列表中包括至少一组双预测基础运动矢量组,所述双预测基础运动矢量组中包括第一基础运动矢量和第二基础运动矢量;
从预设的偏移量集中确定两个运动矢量偏移量,所述两个运动矢量偏移量分别对应于所述第一基础运动矢量和所述第二基础运动矢量;
根据所述第一基础运动矢量、所述第二基础运动矢量和所述两个的运动矢量偏移量,确定当前图像块的运动矢量;
根据所述当前图像块的运动矢量对所述当前图像块进行预测。
第十二方面,提供一种视频图像处理方法,该方法包括:
确定基础运动矢量列表,所述基础运动矢量列表中包括基础运动矢量组;
当所述基础运动矢量组中有至少一个基础运动矢量指向特定参考图像时,放弃根据所述基础运动矢量组和运动矢量偏移量确定所述当前图像块的运动矢量。
第十三方面,提供一种视频图像处理装置,该装置包括:
构建模块,用于确定基础运动矢量列表,所述基础运动矢量列表中包括至少一组双预测基础运动矢量组,所述双预测基础运动矢量组中包括第一基础运动矢量和第二基础运动矢量;从预设的偏移量集中确定两个运动矢量偏移量,所述两个运动矢量偏移量分别对应于所述第一基础运动矢量和所述第二基础运动矢量;根据所述第一基础运动矢量、所述第二基础运动矢量和所述两个的运动矢量偏移量,确定当前图像块的运动矢量;
预测模块,用于根据所述当前图像块的运动矢量对所述当前图像块进行预测。
第十四方面,提供一种视频图像处理装置,该装置包括:
确定模块,用于确定基础运动矢量列表,所述基础运动矢量列表中包括基础运动矢量组;
处理模块,用于当所述基础运动矢量组中有至少一个基础运动矢量指向特定参考图像时,放弃根据所述基础运动矢量组和运动矢量偏移量确定所述当前图像块的运动矢量。
第十五方面,提供一种视频图像处理装置,该装置包括存储器与处理器,所述存储器用于存储指令,所述处理器用于执行所述存储器存储的指令,并 且对所述存储器中存储的指令的执行使得,所述处理器用于执行第十一方面或第十一方面的任一可能的实现方式中的方法。
第十六方面,提供一种视频图像处理装置,该装置包括存储器与处理器,所述存储器用于存储指令,所述处理器用于执行所述存储器存储的指令,并且对所述存储器中存储的指令的执行使得,所述处理器用于执行第十二方面或第十二方面的任一可能的实现方式中的方法。
第十七方面,提供一种计算机存储介质,其上存储有计算机程序,所述计算机程序被计算机执行时使得所述计算机实现第十一方面或第十一方面的任一可能的实现方式中的方法。
第十八方面,提供一种计算机存储介质,其上存储有计算机程序,所述计算机程序被计算机执行时使得所述计算机实现第十二方面或第十二方面的任一可能的实现方式中的方法。
第十九方面,提供一种包含指令的计算机程序产品,所述指令被计算机执行时使得所述计算机实现第十一方面或第十一方面的任一可能的实现方式中的方法。
第二十方面,提供一种包含指令的计算机程序产品,所述指令被计算机执行时使得所述计算机实现第十二方面或第十二方面的任一可能的实现方式中的方法。
附图说明
图1是本申请实施例提供的视频图像处理方法的示意性流程图。
图2是通过当前图像块的邻近块获取当前块的候选运动矢量的示意图。
图3是对候选运动矢量进行缩放处理的示意图。
图4是本申请实施例提供的视频图像处理方法的另一示意性流程图。
图5是本申请实施例提供的视频图像处理装置的示意性框图。
图6是本申请实施例提供的视频图像处理装置的另一示意性框图。
图7是本申请实施例提供的视频图像处理方法的示意性流程图。
图8是本申请实施例提供的视频图像处理方法的另一示意性流程图。
图9是本申请实施例提供的视频图像处理装置的再一示意性框图。
图10是本申请实施例提供的视频图像处理装置的再一示意性框图。
图11是本申请实施例提供的视频图像处理装置的再一示意性框图。
图12是获取运动矢量第一候选列表的候选者的示意图。
图13是构造运动矢量第一候选列表的候选者的再一示意图。
图14和图15是本申请实施例提供的视频图像处理方法的示意性流程图。
图16是本申请实施例提供的视频图像处理装置的再一示意性框图。
图17是本申请实施例提供的视频图像处理装置的再一示意性框图。
图18是本申请实施例提供的视频图像处理方法的另一示意性流程图。
图19是本申请实施例提供的视频图像处理装置的再一示意性框图。
图20是本申请实施例提供的视频图像处理装置的再一示意性框图。
具体实施方式
下面将结合附图,对本申请实施例中的技术方案进行描述。
除非另有定义,本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中在本申请的说明书中所使用的术语只是为了描述具体的实施例的目的,不是旨在于限制本申请。
在视频编解码中,预测步骤用于减少图像中的冗余信息。预测块指的是一帧图像中用于预测的基本单元,在一些标准中,该预测块也称为预测单元(Prediction Unit,PU)。在对一帧图像进行编码/压缩之前,图像被分成多个图像块,进一步的,该多个图像块中的每一个图像块可以再次被分成多个图像块,以此类推。不同的编码方法中,分割的层级数量可以不同,所承担的操作方法也不同。不同的编码标准中,对同一层级上的图像块的名称可能不同。例如,在一些视频标准中,一帧图像第一次被分割成的多个图像块中的每个图像块称为编码树单元(Coding Tree Unit,CTU);每个编码树单元可以包含一个编码单元(Coding Unit,CU)或者再次分割成多个编码单元;一个编码单元可以根据预测方式分割成一个、两个、四个或者其他数量的预测单元。在一些视频标准中,该编码树单元也被称为最大编码单元(Largest Coding Unit,LCU)。
预测指的是查找与该预测块相似的图像数据,也称为该预测块的参考块。通过对该预测块和该预测块的参考块之间的差异进行编码/压缩,以减少编码/压缩中的冗余信息。其中,预测块与参考块的差异可以是由该预测块与 该参考块的相应像素值相减得到的残差。预测包括帧内预测和帧间预测。帧内预测指的是在预测块所在帧内查找该预测块的参考块,帧间预测指的是在除预测块所在帧以外的其他帧内查找该预测块的参考块。
在现有的一些视频标准中,预测单元是图像中最小的单元,预测单元不会继续被划分成多个图像块。下文中所提到的“图像块”或“当前图像块”指的是一个预测单元(或一个编码单元),而且一个图像块可以被继续划分成多个子图像块,每个子图像块可以进一步做预测。
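In this solution, before the current image block is predicted, a motion vector candidate list is constructed, and the current image block is predicted according to the candidate motion vector selected from that list. Motion vector candidate lists are used in several types of modes, which are first illustrated below.
In the first type of mode, as a first example, after the motion vector candidate list has been constructed, the encoder can encode the current image block through the following steps.
1) Select the best motion vector (denoted MV1) from the motion vector candidate list, use MV1 as the motion vector of the current image block, and obtain the index of MV1 in the motion vector candidate list.
2) According to the motion vector MV1 of the current image block, determine the predicted image block of the current image block in the reference image (that is, the reference frame). In other words, determine the position of the predicted image block in the reference frame.
3) Obtain the residual between the current image block and the predicted image block.
4) Send to the decoder the index of MV1 in the motion vector candidate list and the residual obtained in step 3).
As an example, the decoder can decode the current image block through the following steps.
1) Receive from the encoder the residual and the index of the current image block's motion vector in the motion vector candidate list.
2) Obtain the motion vector candidate list by the method according to the embodiments of this application. The list obtained at the decoder is identical to the list obtained at the encoder.
3) According to the index, obtain the motion vector MV1 of the current image block from the motion vector candidate list.
4) According to MV1, obtain the predicted image block of the current image block, then combine it with the residual to decode the current image block.
That is, in the first type of mode, the motion vector of the current image block equals the motion vector prediction (MVP). In some standards, this first type of mode is also called Merge mode.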
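The encoder and decoder steps of the first type of mode can be sketched as a small round trip. This is only an illustrative sketch under stated assumptions: the helper names (`fetch`, `sad`, `encode_merge`, `decode_merge`), the integer-pixel motion vectors, the lossless residual, and the use of the sum of absolute differences as the "best candidate" criterion are all hypothetical choices that the text does not specify.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]       # (dx, dy) displacement in whole pixels (assumption)
Block = List[List[int]]    # rows of sample values


def fetch(ref: Block, x: int, y: int, mv: MV, w: int, h: int) -> Block:
    """Return the w x h block of `ref` whose top-left corner is (x + dx, y + dy)."""
    dx, dy = mv
    return [row[x + dx:x + dx + w] for row in ref[y + dy:y + dy + h]]


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences, used here as the selection criterion."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def encode_merge(cur: Block, ref: Block, x: int, y: int, cands: Sequence[MV]):
    """Encoder steps 1)-4): pick MV1, predict, form the residual, emit (index, residual)."""
    h, w = len(cur), len(cur[0])
    index = min(range(len(cands)),
                key=lambda i: sad(cur, fetch(ref, x, y, cands[i], w, h)))
    pred = fetch(ref, x, y, cands[index], w, h)
    residual = [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(cur, pred)]
    return index, residual


def decode_merge(ref: Block, x: int, y: int, cands: Sequence[MV],
                 index: int, residual: Block) -> Block:
    """Decoder steps 1)-4): look up MV1 by index, predict, add the residual."""
    h, w = len(residual), len(residual[0])
    pred = fetch(ref, x, y, cands[index], w, h)
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(pred, residual)]
```

Only the index and the residual cross from encoder to decoder; the motion vector itself is recovered from the identically constructed candidate list, which is why both sides must build the same list.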
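The second type of mode differs from the first in that, after selecting the best motion vector MV1 from the motion vector candidate list, the encoder additionally performs a motion search with MV1 as the starting point; the displacement between the finally found position and the search starting point is recorded as the motion vector difference (MVD). The predicted image block of the current image block is then determined in the reference image according to the motion vector MV1+MVD. The encoder also sends the MVD to the decoder. In some standards, this second type of mode is also called AMVP mode (that is, the normal inter prediction mode).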
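The relationship between the predictor, the MVD, and the final motion vector in the second type of mode can be written in a few lines. The function names are hypothetical, and the motion search that produces the final vector is assumed to have been done elsewhere.

```python
from typing import Tuple

MV = Tuple[int, int]  # (dx, dy)


def amvp_encode(mv1: MV, searched_mv: MV) -> MV:
    """MVD = position found by the motion search minus the search start MV1."""
    return (searched_mv[0] - mv1[0], searched_mv[1] - mv1[1])


def amvp_decode(mv1: MV, mvd: MV) -> MV:
    """The decoder rebuilds the motion vector as MV1 + MVD."""
    return (mv1[0] + mvd[0], mv1[1] + mvd[1])
```

For instance, with MV1 = (4, -2) and a search result of (5, 0), the encoder signals MVD = (1, 2) and the decoder recovers (5, 0).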
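The motion vector candidate lists used in different types of modes may be constructed in the same way or in different ways. A list constructed in a given way may be used in only one type of mode or in several types of modes; no limitation is imposed here.
This solution provides motion vector candidate lists constructed in two ways; for convenience, they are referred to below as the first motion vector candidate list and the second motion vector candidate list. One difference between them is that at least one candidate in the first motion vector candidate list includes motion vectors of sub-image blocks, whereas every candidate in the second motion vector candidate list includes the motion vector of an image block. As stated above, the image block here is the same kind of concept as the current image block, namely a prediction unit (or a coding unit), and sub-image blocks are the multiple blocks obtained by dividing that image block. When a candidate from the first motion vector candidate list is used for prediction, the reference block of the current image block is determined according to that candidate, and the residual between the image block and the reference block is then calculated. When a candidate from the second motion vector candidate list is used for prediction and that candidate consists of sub-image block motion vectors, the reference block of each sub-image block of the current image block is determined according to the candidate, the residual between each sub-image block and its reference block is calculated, and the sub-block residuals are assembled into the residual of the current image block.
When the candidates of the first and/or second motion vector candidate list are determined, one of them may be determined according to the ATMVP technique. In one example, when the first motion vector candidate list is constructed, the motion vector determined according to ATMVP may be added to the list as the first candidate. In one example, when the second motion vector candidate list is constructed, the motion vector determined according to ATMVP may be added as a candidate after candidates have been added from the motion vectors of a preset number of spatial neighboring blocks at preset positions of the current image block. The candidates of either list may of course be added in another order; no limitation is imposed here.
The following first uses the construction of the second motion vector candidate list as an example to explain how one of its candidates is determined according to the ATMVP technique.
To aid understanding of the construction of the second motion vector candidate list, the term motion vector is explained here. The motion vector of an image block may contain two pieces of information: 1) the image that the motion vector points to; 2) a displacement. The motion vector of an image block denotes the image block that, in the image the motion vector points to, has that displacement relative to the image block. For an already encoded/decoded image block, its motion vector covers the reference image of that block and the displacement of its reference block relative to it. Note that the reference block of an image block, as used herein, means the image block used to calculate the residual of that image block.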
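Fig. 1 is a schematic flowchart of a video image processing method provided by an embodiment of this application. The method includes the following steps.
S110: determine M candidate motion vectors to be added to the second motion vector candidate list of the current image block.
The current image block is the image block to be encoded (or decoded). The image frame containing the current image block is called the current frame. For example, the current image block is a coding unit (CU).
For example, the second motion vector candidate list of the current image block may be a Merge candidate list or an AMVP candidate list. For example, it may be the normal motion vector candidate list (Normal Merge List) among the Merge candidate lists. It should be understood that the second motion vector candidate list may also have other names.
The M candidate motion vectors may be determined from the motion vectors of M neighboring blocks of the current image block within the current frame. A neighboring block may be an image block that, in the current frame, is adjacent to the current image block or separated from it by a certain distance. It should be understood that these M neighboring blocks are image blocks in the current frame that have already been encoded (or decoded).
As an example, as shown in Fig. 2, the M neighboring blocks of the current image block are the image blocks at the four positions around the current image block, in the order A1 (left) → B1 (above) → B0 (above right) → A0 (below left). The M candidate motion vectors of the current image block (that is, M equals 4) are determined from the motion vectors of the image blocks at these four positions.
In addition, when one of the M neighboring blocks is unavailable, or uses intra coding mode, the motion vector of that neighboring block is unavailable. Such an unavailable motion vector is not used as a candidate motion vector and is not added to the second motion vector candidate list of the current image block.
As a possible implementation, after step S110 is completed, the M candidate motion vectors have already been added to the second motion vector candidate list. In step S120, the second motion vector candidate list can then be scanned directly.
S120: sequentially scan N of the M candidate motion vectors and determine a reference motion vector according to the scan result, where N is less than M. M and N are both natural numbers.
Whether all M candidate motion vectors are added to the second motion vector candidate list, or only some of them are added because some motion vectors are unavailable, a fixed N of the M candidate motion vectors are scanned sequentially. This may mean scanning those of the N candidate motion vectors that have been added to the candidate motion vector list; or it may mean scanning N candidate motion vectors, among the M, that have been added to the candidate motion vector list.
Determining the reference motion vector according to the scan result of the N candidate motion vectors may consist of checking the N candidate motion vectors one by one against a preset condition and determining the reference motion vector according to the outcome.
As an example, the preset condition includes: the image block is available or does not use intra prediction coding mode, and the reference frame pointed to by the candidate motion vector is the same as the reference image of the current image block.
Here, the reference image of the current image block is the reference image temporally closest to the image containing the current image block; or it is a reference image preset at the encoder and decoder; or it is a reference image specified in the video parameter set, sequence header, sequence parameter set, picture header, picture parameter set, or slice header.
For example, the reference image of the current image block is the co-located frame of the current image block, that is, the frame set in the slice-level header for obtaining motion information for prediction. In some application scenarios, the co-located frame is also called the collocated picture.
It should be understood that, as technology evolves, the preset condition may be given other definitions, and the corresponding solutions also fall within the protection scope of this application.
The process of determining the reference motion vector according to the scan result of the N candidate motion vectors is described in detail below.
In step S120, only N of the M candidate motion vectors obtained in step S110 are scanned, which reduces the number of scans.
Optionally, in step S120, the first N of the M candidate motion vectors may be scanned sequentially.
Optionally, in step S120, the last N of the M candidate motion vectors may be scanned sequentially; or the middle N of the M candidate motion vectors may be scanned sequentially. This application imposes no limitation in this respect.
As one example, in step S120, some of the M candidate motion vectors are scanned sequentially.
As another example, in step S120, some of the candidate motion vectors currently added to the second motion vector candidate list are scanned sequentially.
S130: according to the reference motion vector, the current image block, and the reference image of the current image block, determine a candidate motion vector to be further added to the second motion vector candidate list.
The second motion vector candidate list of the current image block includes the M candidate motion vectors determined in step S110 and the candidate motion vector determined in step S130. In one example, after the (M+1)-th candidate motion vector is determined according to S130, further candidate motion vectors to be added to the second motion vector candidate list are determined by other methods; no limitation is imposed here.
After the second motion vector candidate list has been constructed, as shown in Fig. 1, the method may further include S140: determine the motion vector of the current image block according to the second motion vector candidate list obtained in step S130.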
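It should be understood that the solution provided by this application can be applied to the ATMVP technique. In the first step of the existing ATMVP technique, the temporal vector of the current image block is obtained by scanning all spatial candidate motion vectors currently added to the second motion vector candidate list. For example, the second motion vector candidate list is usually filled with four spatial candidate motion vectors, so it can happen that four candidate motion vectors must be scanned before the temporal vector of the current image block is obtained.
In the embodiments of this application, by contrast, only N (N less than M) of the M candidate motion vectors already obtained are scanned sequentially when the reference motion vector of the current image block is obtained. Compared with the prior art, this reduces the number of candidate motion vectors scanned. It should be understood that applying this solution to the first step of the existing ATMVP technique simplifies the redundant operations in that step.
The applicant tested the solution on VTM-2.0, the latest reference software of Versatile Video Coding, using the official common test sequences as test sequences under the RA and LDB configurations. The test results show that the performance gain of the ATMVP technique is maintained after the number of scans is reduced.
Therefore, the solution provided by this application can reduce the complexity of the ATMVP technique while preserving the performance gain of the existing ATMVP technique.
It should be understood that the second motion vector candidate list formed by the construction scheme of this application can be used at both the encoder and the decoder. In other words, the method provided by this application may be executed by the encoder or by the decoder.
As an example, the second motion vector candidate list formed by the construction scheme of this application can be used in the first type of mode described above (for example, Merge mode).
Optionally, in this embodiment, in step S110, four candidate motion vectors to be added to the second motion vector candidate list of the current image block are determined from the motion vectors of four neighboring blocks of the current image block within the current frame, that is, M equals 4. In step S120, N of the four candidate motion vectors are scanned, where N is less than 4.
For example, N equals 1: in step S120, only the first candidate motion vector in the second motion vector candidate list is scanned. As another example, N equals 2 or 3.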
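The way in which step S120 determines the reference motion vector of the current image block according to the scan result of the N candidate motion vectors is described below.
In step S120, the N candidate motion vectors among the M candidate motion vectors are checked one by one against the preset condition, and the reference motion vector is determined according to the outcome. Herein, the preset condition is taken to be that the reference frame pointed to by the candidate motion vector is the same as the reference image of the current image block.
Optionally, in step S120, the N candidate motion vectors are scanned sequentially. When the first candidate motion vector meeting the preset condition is found, that is, the first candidate motion vector whose reference frame is the same as the co-located frame of the current frame, scanning stops and the reference motion vector is determined according to that candidate motion vector.
It should be understood that when the first candidate motion vector meeting the preset condition is found, the number of scans may equal N or be less than N. For example, if the first scanned candidate motion vector already meets the preset condition, scanning stops and that candidate motion vector is used as the reference motion vector of the current image block.
Optionally, in step S120, when no candidate motion vector meeting the preset condition is found among the N candidate motion vectors, that is, when the reference frames pointed to by the N candidate motion vectors all differ from the co-located frame of the current image block, a default value is used as the value of the reference motion vector.
For example, the default value is (0, 0), that is, the reference motion vector is (0, 0). It should be understood that the default value may be defined differently depending on the actual situation.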
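The early-terminating scan with a default fallback can be sketched as follows. The `Candidate` record, the use of picture order count (POC) values to compare reference frames, and the convention that `None` marks an unavailable or intra-coded neighbor are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

MV = Tuple[int, int]


@dataclass(frozen=True)
class Candidate:
    mv: MV        # displacement (dx, dy)
    ref_poc: int  # POC of the reference frame the motion vector points to


def reference_mv(cands: Sequence[Optional[Candidate]], n: int, col_poc: int,
                 default: MV = (0, 0)) -> Tuple[MV, int]:
    """Scan only the first n candidates; stop at the first one whose reference
    frame equals the co-located frame. Returns (reference MV, scans performed)."""
    scanned = 0
    for cand in cands[:n]:
        scanned += 1
        if cand is not None and cand.ref_poc == col_poc:
            return cand.mv, scanned   # first match: stop scanning
    return default, scanned           # no match among the n scanned candidates
```

With M = 4 candidates and N = 1, at most one comparison is made regardless of how many candidates the list holds, which is the complexity reduction the text describes.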
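Optionally, in step S120, when no candidate motion vector meeting the preset condition is found among the N candidate motion vectors, that is, when the reference frames pointed to by the N candidate motion vectors all differ from the co-located frame of the current image block, a specific candidate motion vector in the second motion vector candidate list is scaled, and the reference motion vector is determined according to the scaled specific candidate motion vector.
The specific candidate motion vector may be the first or the last motion vector, in scan order, among the N candidate motion vectors.
The specific candidate motion vector may also be a motion vector obtained from the N candidate motion vectors in another scan order.
When the preset condition is that the reference frame pointed to by the candidate motion vector is the same as the reference image of the current image block, scaling a specific candidate motion vector in the second motion vector candidate list and determining the reference motion vector from it includes: scaling the specific candidate motion vector so that the reference frame pointed to by the scaled vector is the same as the reference image of the current image block; and using the scaled specific candidate motion vector as the reference motion vector.
As shown in Fig. 3, curr_pic denotes the image containing the current image block, col_pic denotes the co-located frame (collocated picture) of the current image block, and neigh_ref_pic denotes the reference frame pointed to by the specific candidate motion vector. In one implementation, the scaling ratio of the specific motion vector is determined from the temporal distance between the reference image neigh_ref_pic pointed to by the specific candidate motion vector and the image curr_pic containing the image block to which that motion vector belongs, and the temporal distance between the reference image col_pic of the current image block and the image curr_pic containing the current image block.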
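The scaling of Fig. 3 can be illustrated with a short sketch. It assumes that temporal distance is measured as a difference of picture order counts and that the ratio is the distance to col_pic divided by the distance to neigh_ref_pic; the text only states that the ratio is determined from these two distances. Real codecs typically use clipped fixed-point arithmetic instead of the exact fractions used here.

```python
from fractions import Fraction
from typing import Tuple

MV = Tuple[int, int]


def scale_to_col_pic(mv: MV, curr_poc: int, neigh_ref_poc: int, col_poc: int) -> MV:
    """Rescale `mv` (pointing from curr_pic to neigh_ref_pic) so that it points
    to col_pic instead, proportionally to the two temporal distances."""
    td_neigh = neigh_ref_poc - curr_poc   # distance covered by the original MV
    td_col = col_poc - curr_poc           # distance the scaled MV must cover
    if td_neigh == 0:
        raise ValueError("candidate MV has zero temporal distance")
    ratio = Fraction(td_col, td_neigh)
    return (round(mv[0] * ratio), round(mv[1] * ratio))
```

For example, a vector (8, -4) that spans four pictures is halved to (4, -2) when the co-located frame is only two pictures away in the same direction.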
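It should be understood that the amount of motion between image frames varies. When there is strong motion between the current frame and its co-located frame, using the motion vector (0, 0) to locate the corresponding block of the current block ignores inter-frame motion: it simply assumes that the absolute coordinates of the current block are unchanged in the co-located frame. In practice, the coordinates of the current block in the co-located frame very probably differ from its coordinates in the current frame, so a large deviation results.
In the embodiments of this application, when no candidate motion vector whose reference frame is the same as the co-located frame of the current frame is found among the N candidate motion vectors, one of the N candidate motion vectors is scaled so that its reference frame is the same as the co-located frame of the current frame, and the scaled candidate motion vector is used as the motion vector of the current image block. This improves the accuracy of the motion vector of the current image block.
Optionally, when N is an integer less than M and greater than 1, the specific candidate motion vector in this embodiment may be the one among the N candidate motion vectors whose reference frame is temporally closest to the co-located frame of the current image block.
It should be understood that selecting, among the N candidate motion vectors, the one whose reference frame is closest to the co-located frame of the current frame reduces the time spent on scaling and thus makes obtaining the motion vector of the current image block more efficient.
Optionally, when N is an integer less than M and greater than 1, the specific candidate motion vector in this embodiment may also be any one of the N candidate motion vectors.
It should be understood that when N equals 1, the specific candidate motion vector in this embodiment is the single candidate motion vector that is scanned.
Optionally, as one embodiment, N equals 1, and in step S120 the reference motion vector of the current image block is obtained by scanning one candidate motion vector in the second motion vector candidate list. When the reference frame pointed to by the scanned candidate motion vector differs from the co-located frame of the current frame containing the current image block, the candidate motion vector is scaled so that the reference frame of the scaled vector is the same as the co-located frame of the current frame, and the scaled candidate motion vector is used as the reference motion vector of the current image block. When the reference frame of the scanned candidate motion vector is the same as the co-located frame of the current frame, that candidate motion vector is used as the motion vector of the current image block.
In this embodiment, the motion vector of the current image block is obtained by scanning a single candidate motion vector in the candidate motion vector list, which effectively reduces the number of candidate motion vectors scanned. When the reference frame of the scanned candidate motion vector differs from the co-located frame of the current frame, the candidate motion vector is scaled so that its reference frame is the same as the co-located frame, and the scaled vector is used as the motion vector of the current image block, which improves its accuracy. Compared with the prior art, the solution of this embodiment therefore both simplifies the process of determining the motion vector of the current image block and improves the accuracy of that motion vector.
It should be understood that when the definition of the preset condition changes, the scaling of the specific candidate motion vector in the second motion vector candidate list must change accordingly, so that the scaled specific candidate motion vector still meets the preset condition.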
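The process in step S130 of determining, according to the reference motion vector, the current image block, and the reference image of the current image block, the candidate motion vector to be further added to the second motion vector candidate list is described below.
Optionally, as one implementation, this determination includes: dividing the current image block into multiple sub-image blocks; determining, according to the reference motion vector, the related block of each sub-image block in the reference image of the current image block; and determining, according to the motion vectors of the related blocks, the candidate motion vector to be further added to the second motion vector candidate list.
In one example, the motion vectors of the related blocks of all sub-image blocks of the current image block are added to the second motion vector candidate list as one candidate. When this candidate is used for prediction, each sub-image block of the current image block is predicted according to the motion vector of its related block.
In one example, the representative motion vector of the related block of the current image block is added to the second motion vector candidate list as a candidate, and the candidate is marked as determined according to the ATMVP technique. When this candidate is used for prediction, the related block of the current image block is determined according to the mark and the candidate; the current image block and the related block are divided into multiple sub-image blocks in the same manner, with the sub-image blocks of the current image block corresponding one-to-one to those of the related block; and each sub-image block of the current image block is predicted according to the motion vector of the corresponding sub-image block of the related block. Optionally, when the related block contains a sub-image block whose motion vector is unavailable, the representative motion vector of the related block is used in place of the unavailable motion vector to predict the corresponding sub-image block of the current image block. Optionally, when the related block contains a sub-image block whose motion vector is unavailable and the representative motion vector of the related block is also unavailable, the candidate determined according to the ATMVP technique is not added to the second motion vector candidate list. In one example, a sub-image block of the related block is determined to have an unavailable motion vector when that sub-image block is unavailable or uses intra coding mode.
Optionally, the representative motion vector of the related block of the current image block may be the motion vector at the center position of the related block, or another motion vector representing the related block; no limitation is imposed here.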
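The identical partitioning and the representative-vector fallback can be sketched as below. The grid representation, the function names, and the use of `None` for an unavailable motion vector are illustrative assumptions.

```python
from typing import List, Optional, Tuple

MV = Tuple[int, int]
Grid = List[List[Optional[MV]]]


def split(x0: int, y0: int, width: int, height: int, size: int = 8):
    """Top-left corners of the size x size sub-image blocks, row by row.
    Applying the same split to the related block gives the one-to-one mapping."""
    return [[(x0 + c, y0 + r) for c in range(0, width, size)]
            for r in range(0, height, size)]


def atmvp_sub_block_mvs(related_mvs: Grid, representative: Optional[MV]):
    """Per-sub-block MVs for the current block, taken from the related block.
    Unavailable entries (None) fall back to the representative MV; if that is
    also unavailable, return None so the ATMVP candidate is not added."""
    has_gap = any(mv is None for row in related_mvs for mv in row)
    if has_gap and representative is None:
        return None
    return [[mv if mv is not None else representative for mv in row]
            for row in related_mvs]
```

A 16×16 current block split with `size=8` yields a 2×2 grid, and the related block, split the same way, supplies one motion vector per grid cell.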
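In some video coding standards, the related block may be called the collocated block or corresponding block.
For example, the current image block is a CU, and the sub-image blocks obtained by dividing it may be called sub-CUs. Optionally, the size of the sub-image block and/or of the related block of the sub-image block is fixed to be greater than or equal to 64 pixels. Optionally, the size of the sub-image block and/or of the related block of the sub-image block is fixed at 8×8 pixels.
In the current ATMVP technique, the sub-image block size is set adaptively at frame level; it is 4×4 by default and is set to 8×8 when certain conditions are met. For example, at the encoder, when the current image block is encoded, the average size of the sub-image blocks in the CUs coded in ATMVP mode in the previously coded image block of the same temporal layer is calculated; if the average block size exceeds a threshold, the sub-image block size of the current image block is set to 8×8, and otherwise the default 4×4 is used. At present, in the new-generation video coding standard (Versatile Video Coding, VVC), motion vectors are stored at a granularity of 8×8. It should be understood that when the sub-image block size is set to 4×4, the granularity of the sub-image block motion vectors (also 4×4) does not match the motion vector storage granularity of the current standard. In addition, in the current ATMVP technique, information about the sub-image block size of the previously coded image block of the same temporal layer must be stored when the current image block is encoded.
In the embodiments of this application, the sub-image block size of the current image block is set to 8×8. On the one hand, this matches the motion vector storage granularity specified in VVC; on the other hand, the sub-image block size of the previously coded image block need not be stored, which saves storage space.
At present, TMVP scales the MV of the co-located CU, in a neighboring coded image, at the bottom-right or center position of the current CU to obtain the temporal candidate motion vector of the current CU. In other words, TMVP obtains its MVP by traversing coded blocks at two fixed positions in the reference image, in the order TB→TC, and directly uses the first MV found as the MVP of TMVP. In the embodiments of this application, the ATMVP sub-image block size is set to 8×8; ATMVP uses one MV from the existing merge candidate list (merge list) of the embodiments of this application to locate the related block and uses the MV of the located related block as the MVP of ATMVP.
In some versions of VVC, the merge list is built by first constructing the ATMVP merge candidates and then the TMVP merge candidates. In other versions of VVC, the TMVP merge candidates are constructed during merge list construction and the ATMVP merge candidates are constructed during affine merge list construction. ATMVP and TMVP thus build two different lists, yet in principle the same MV or the same set of MVs need not be added to both merge candidate lists. When the width and height of the current CU both equal 8, the merge candidate lists built by TMVP and ATMVP may be redundant, that is, the two techniques may derive the same set of temporal candidate motion information for the current CU.
In some embodiments of this application, the TMVP operation is not performed when the width and height of the current CU both equal 8. In other words, the TMVP operation is not performed when the size of the sub-image block and/or of the related block of the sub-image block is 8×8 pixels. This prevents the merge candidate lists built by ATMVP and TMVP from containing the same MV or the same set of MVs, skips some redundant operations, effectively saves encoding and decoding time, and improves coding efficiency.
In some embodiments of this application, the TMVP operation is not performed when the width and/or height of the current CU is less than 8 pixels. In other words, the TMVP operation is not performed when at least one of the width and height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels. Further, the TMVP operation may be skipped only when both the width and the height are less than 8 pixels. The reason is that encoder and/or decoder hardware designs aim for processing regions of the same size to take the same time to encode or decode, whereas regions containing many small blocks need far more pipeline time than other regions. Saving pipeline time for small blocks is therefore very valuable for parallel processing in hardware. Moreover, current coding techniques exploit temporal correlation more and more, and many temporal prediction techniques, such as ATMVP, have been adopted. For small blocks, the performance impact of skipping TMVP is therefore negligible, so encoding and decoding time is effectively saved and coding efficiency improved.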
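The two TMVP-skipping rules described above amount to a simple size check. The rule labels and the function name below are hypothetical; they merely name the two embodiments.

```python
def tmvp_enabled(width: int, height: int, rule: str) -> bool:
    """Decide whether TMVP is performed for a block of the given size.

    rule == "skip_8x8":     skip TMVP when width and height both equal 8
    rule == "skip_below_8": skip TMVP when width or height is less than 8
    """
    if rule == "skip_8x8":
        return not (width == 8 and height == 8)
    if rule == "skip_below_8":
        return not (width < 8 or height < 8)
    raise ValueError(f"unknown rule: {rule}")
```

The check depends only on block dimensions that encoder and decoder both know, so no extra signaling would be needed to keep the two sides in step.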
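It should be understood that, provided the size of the sub-image block and/or of the related block of the sub-image block is fixed at 64 pixels, other dimensions are also possible: for example, the size may be A×B, where A≤64, B≤64, and A and B are both integer multiples of 4. For example, the size of the sub-image block and/or of the related block of the sub-image block is 4×16 pixels or 16×4 pixels.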
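The admissible A×B sizes can be enumerated directly. This sketch reads the constraint as "A and B are integer multiples of 4 and A×B equals 64 pixels", an interpretation that matches the 4×16 and 16×4 examples given in the text; the function name is hypothetical.

```python
def valid_sub_block_sizes(area: int = 64, step: int = 4):
    """All (A, B) with A * B == area and both A and B multiples of `step`."""
    return [(a, area // a)
            for a in range(step, area + 1, step)
            if area % a == 0 and (area // a) % step == 0]
```

Under this reading the only options are 4×16, 8×8, and 16×4; sizes such as 32×2 are excluded because 2 is not a multiple of 4.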
可选地,作为另一种实现方式,根据参考运动矢量、当前图像块以及当前图像块的参考图像,确定继续加入运动矢量第二候选列表中的候选运动矢量,包括:根据参考运动矢量,在当前图像块的参考图像中确定当前图像块的相关块;根据相关块的运动矢量,确定继续加入运动矢量第二候选列表的候选运动矢量。
在编/解码技术中,一般采用已编/解码的图像来作为当前待编/解码的参考图像。在一些实施例中,还可以构造一个参考图像,来提高参考图像与当前待编/解码图像的相似度。
举例而言,视频内容中存在一类特定的编/解码场景,在该场景中背景基本不发生改变,只有视频中的前景发生改变或者运动。例如,视频监控就属于该类场景。在视频监控场景中通常监控摄像头固定不动或者只发生 缓慢的移动,可以认为背景基本不发生变化。与此相对,在视频监控镜头中所拍摄到的人或车等物体则经常发生移动或改变,可以认为前景是经常变化的。在这类场景中,可以造一个特定的参考图像,该特定的参考图像中只包含高质量的背景信息。该特定参考图像中可以包括多个图像块,任意一个图像块均是从某个已解码图像中取出的,该特定参考图像中的不同图像块可能取自于不同的已解码图像。在进行帧间预测时,当前待编/解码图像的背景部分可通过参考该特定参考图像,由此能够减少帧间预测的残差信息,从而提高编/解码效率。
上述是对特定参考图像的一个具体举例。在一些实现方式中,特定参考图像具有以下至少一种性质:构造帧(composite reference)、长期参考图像、不被输出的图像。其中,该不被输出的图像,指的是不被输出显示的图像;一般来说,该不被输出的图像是作为其他图像的参考图像存在的。例如,该特定参考图像可以是构造的长期参考图像,或者可以是不被输出的构造帧,或者可以是不被输出的长期参考图像等等。在一些实现方式中,构造帧也被称为合成参考帧。
在一些实现方式中,非特定参考图像可以是不具有以下至少一种性质的参考图像:构造帧、长期参考图像、不被输出的图像。例如,该非特定参考图像可以包括除构造帧以外的参考图像,或者包括除长期参考图像以外的参考图像,或者包括除不被输出的图像以外的参考图像,或者包括除构造的长期参考图像以外的参考图像,或者包括除不被输出的构造帧以外的参考图像,或者包括除不被输出的长期参考图像以外的参考图像等等。
在一些实现方式中,视频中的图像可作为参考图像时,可以区分长期参考图像和短期参考图像的。其中,该短期参考图像是与长期参考图像相对应的一个概念。短期参考图像存在于参考图像缓冲区中一段时间,经过该短期参考图像之后的已解码的参考图像在参考图像缓冲区中的若干移入和移出操作之后,短期参考图像会被移出参考图像缓冲区。参考图像缓冲区也可以称为参考图像列表缓存、参考图像列表、参考帧列表缓存或参考帧列表等,本文中将其统称为参考图像缓冲区。
长期参考图像(或长期参考图像中的一部分数据)可以一直存在于参考图像缓冲区中,该长期参考图像(或长期参考图像中的一部分数据)不受已解码的参考图像在参考图像缓冲区中的移入和移出操作的影响,只有在解 码端发出更新指令操作时该长期参考图像(或长期参考图像中的一部分数据)才会被移出参考图像缓冲区。
短期参考图像和长期参考图像在不同的标准中的叫法可能不同,例如,在H.264/高级视频编码(advanced video coding,AVC)或者H.265/HEVC等标准中短期参考图像被称为短期参考帧(short-term reference),长期参考图像被称为长期参考帧(long-term reference)。又如在信源编码标准(audio video coding standard,AVS)1-P2、AVS2-P2、电气和电子工程师协会(institute of electrical and electronics engineers,IEEE)1857.9-P4等标准中,长期参考图像被称为背景帧(background picture)。又如在VP8、VP9等标准中,长期参考图像被称为黄金帧(golden frame)。
应理解,本申请实施例中采用特定的术语,并不代表必须应用到特定场景下,例如,将长期参考图像称为长期参考帧并不代表必须用到H.264/AVC或者H.265/HEVC等标准对应的技术中。
以上提到的长期参考图像可以是从多个已解码图像中取出的图像块构造得到的,或者利用多个已解码图像对已有参考帧(例如,预存的参考帧)进行更新得到。当然,该构造的特定参考图像也可以是短期参考图像。或者,长期参考图像也可以不是构造的参考图像。
在上述的实现方式中,特定参考图像可以包括长期参考图像,非特定参考图像可以包括短期参考图像。
可选地,参考帧的类型可以在码流结构中通过特殊字段标识出来。
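Optionally, the reference image is determined to be a specific reference image when it is determined to be a long-term reference image; or when it is determined to be a frame that is not output; or when it is determined to be a constructed frame; or when it is determined to be a frame that is not output and is further determined to be a constructed frame.
Optionally, each type of reference image may carry a corresponding identifier, in which case the decoder can determine from the identifier carried by a reference image whether it is a specific reference image.
In some implementations, the reference image is determined to be a specific reference image when it is determined to carry the long-term reference image identifier.
In some implementations, the reference image is determined to be a specific reference image when it is determined to carry the not-output identifier.
In some implementations, the reference image is determined to be a specific reference image when it is determined to carry the constructed-frame identifier.
In some implementations, the reference image is determined to be a specific reference image when it is determined to carry at least two of the following three identifiers: the long-term reference image identifier, the not-output identifier, and the constructed-frame (or composite reference frame) identifier. For example, the reference image is determined to be a specific reference image when it is determined to carry both the not-output identifier and the constructed-frame identifier.
Specifically, an image may carry an identifier indicating whether it is an output frame. When an image is indicated as not output, the frame is a reference image; it is then further determined whether the frame carries the constructed-frame identifier, and if so, the reference image is determined to be a specific reference image. If an image is indicated as output, the constructed-frame check can be skipped and the frame is directly determined not to be a specific reference image. Alternatively, if an image is indicated as not output but carries an identifier indicating that it is not a constructed frame, the frame can be determined not to be a specific reference image.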
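The identifier-based checks can be sketched with a small record of flags. The `RefPicFlags` structure and its field names are hypothetical; actual bitstream syntax is not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefPicFlags:
    long_term: bool = False   # long-term reference image identifier
    output: bool = True       # False means the image is marked "not output"
    constructed: bool = False # constructed (composite reference) frame identifier


def is_specific_two_of_three(p: RefPicFlags) -> bool:
    """Variant: specific if at least two of the three identifiers are present."""
    return sum([p.long_term, not p.output, p.constructed]) >= 2


def is_specific_not_output_then_constructed(p: RefPicFlags) -> bool:
    """Variant from the last paragraph: an output image is never specific;
    a not-output image is specific only if it is also a constructed frame."""
    if p.output:
        return False          # skip the constructed-frame check entirely
    return p.constructed
```

The second variant checks the output flag first, so the constructed-frame identifier only needs to be examined for images that are not output.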
可选地,从图像头(picture header)、图像参数集(PPS,picture parameter set)、条带头(slice header)中解析参数确定所述参考图像满足以下条件之一时,确定所述参考图像为特定参考图像:
所述参考图像为长期参考图像;
所述参考图像为构造参考图像;
所述参考图像为不被输出图像;
所述参考图像为不被输出图像时,进一步判断所述参考图像为构造参考图像。
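In some implementations of the embodiments of this application, determining the motion vector of an image block involves using the motion vector of an image block on another image. For convenience, the former is called the first image block, and the image block on the other image is called the temporal reference block or related block of the first image block. It can be understood that the first image block and its temporal reference block (or related block) are located on different images. Consequently, when the motion vector of the temporal reference block (or related block) is used to determine the motion vector of the first image block, that motion vector may need to be scaled. For convenience, the term "related block" is used uniformly herein.
For example, when the ATMVP technique is applied in constructing the AMVP candidate list and the motion vector of the related block of the current image block has been determined according to ATMVP, that motion vector needs to be scaled, and the motion vector of the current image block is then determined according to the scaled motion vector. In general, the scaling ratio of the related block's motion vector is determined from the temporal distance between the reference image pointed to by that motion vector and the image containing the related block, and the temporal distance between the reference image of the current image block and the image containing the current image block.
In one example, the motion vector of the related block is called MV2, and the reference frame index value of the reference image pointed to by MV2 is x, where x is the difference between the sequence number (for example, the POC) of the reference image pointed to by MV2 and the sequence number of the image containing the related block. The reference frame index value of the reference image of the first image block is called y, where y is the difference between the sequence number of the reference image of the first image block and the sequence number of the image containing the first image block. The scaling ratio applied to MV2 is then y/x. Optionally, the product of MV2 and y/x may be used as the motion vector of the first image block.
However, when the motion vector MV2 of the related block points to a specific reference image, or when the reference image of the first image block is a specific reference image, scaling MV2 is meaningless, because the temporal distance between a specific reference image and the image containing the first image block is not well defined.
Optionally, in this embodiment, when the motion vector of the current image block is determined according to the motion vector of the related block: when the motion vector of the related block points to a specific reference image, or the reference image of the current image block is a specific reference image, the motion vector of the current image block is determined according to the processed motion vector of the related block, where the processed motion vector is identical to the motion vector before processing.
For example, the processed motion vector of the related block includes: the motion vector obtained by scaling the motion vector of the related block with a scaling ratio of 1, or the motion vector of the related block with the scaling step skipped.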
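The y/x scaling and its bypass for specific reference images can be combined in one sketch. The function name and boolean parameters are hypothetical, and exact fractions stand in for the fixed-point arithmetic a real codec would use.

```python
from fractions import Fraction
from typing import Tuple

MV = Tuple[int, int]


def derive_mv(mv2: MV, x: int, y: int,
              related_points_to_specific: bool,
              current_ref_is_specific: bool) -> MV:
    """Derive the first image block's MV from the related block's MV2.

    x: sequence-number difference between MV2's reference image and the image
       containing the related block
    y: sequence-number difference between the first image block's reference
       image and the image containing the first image block
    """
    if related_points_to_specific or current_ref_is_specific:
        return mv2                      # scaling ratio 1, i.e. scaling skipped
    if x == 0:
        raise ValueError("x must be non-zero to form the ratio y/x")
    ratio = Fraction(y, x)
    return (round(mv2[0] * ratio), round(mv2[1] * ratio))
```

With MV2 = (6, -3), x = 3 and y = 1, ordinary scaling gives (2, -1); if either reference is a specific reference image, MV2 is passed through unchanged.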
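Optionally, in this embodiment, when the motion vector of the current image block is determined according to the motion vector of the related block: when the motion vector of the related block points to a specific reference image, or the reference image of the current image block is a specific reference image, determining the motion vector of the current image block according to the motion vector of the related block is abandoned.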
在一些实施例中,步骤S120包括:在N个候选运动矢量中未扫描到符合预设条件的候选运动矢量,对运动矢量第二候选列表中的特定候选运动矢量进行缩放处理,根据缩放处理后的特定候选运动矢量确定参考运动矢量。这种情形下,可选地,该方法还包括:当特定候选运动矢量指向特定参考图 像,或者当前图像块的参考图像为特定参考图像时,根据处理后的特定候选运动矢量确定继续加入运动矢量第二候选列表中的候选运动矢量,其中,处理后的特定候选运动矢量和处理前的特定候选运动矢量相同。
其中,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
在一些实施例中,步骤S120包括:在N个候选运动矢量中未扫描到符合预设条件的候选运动矢量,对运动矢量第二候选列表中的特定候选运动矢量进行缩放处理,根据缩放处理后的特定候选运动矢量确定参考运动矢量。这种情形下,可选地,该方法还包括:当特定候选运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,放弃根据特定候选运动矢量确定继续加入运动矢量第二候选列表的候选运动矢量。
上述可知,本申请实施例,在获取当前图像块的参考运动矢量过程中,仅对已经获取的M个候选运动矢量中的N(N小于M)个候选运动矢量依次扫描,相对于现有技术,可以减少在获取当前图像块的参考运动矢量过程中对候选运动矢量的扫描次数。应理解,将本申请提供的方案应用于现有ATMVP技术中,可以对其存在的冗余操作进行简化。
在N个候选运动矢量中未扫描到参考帧与当前帧的同位帧相同的候选运动矢量的情况下,对N个候选运动矢量中的一个候选运动矢量进行缩放处理,使其参考帧与当前帧的同位帧相同,然后将这个经过缩放处理的候选运动矢量作为当前图像块的运动矢量,这样可以提高当前图像块的运动矢量的准确度。
将当前图像块划分为大小为8×8的子图像块,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于 上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
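As shown in Fig. 4, an embodiment of this application further provides a video image processing method, which includes the following steps.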
S410,获取用于加入当前图像块的运动矢量第二候选列表的M个候选运动矢量。
步骤S410对应于上文描述的步骤S110,具体描述参考上文,这里不再赘述。
S420,对M个候选运动矢量中的至少部分候选运动矢量依次扫描,根据扫描结果确定当前图像块的参考运动矢量。
作为一种可选实现方式,对M个候选运动矢量中的部分候选运动矢量依次扫描,根据扫描结果确定当前图像块的参考运动矢量。在这种实现方式下,步骤S420可对应于上文描述的步骤S120,具体描述参见上文。
作为另一种可选实现方式,对M个候选运动矢量中的全部候选运动矢量依次扫描,根据扫描结果确定当前图像块的参考运动矢量。
应理解,在步骤S420中,根据扫描结果确定当前图像块的参考运动矢量的具体方式可以参考上文实施例中的相关描述,这里不再赘述。
S430,将当前图像块划分成多个子图像块,其中,子图像块的大小固定为大于或等于64个像素。
例如,当前图像块为一个CU,将其划分之后得到的子图像块可称为sub-CU。
S440,根据参考运动矢量在当前图像块的参考图像中确定子图像块的相关块。
当前图像块的参考图像可以是当前图像块的同位帧。
S450,根据相关块的运动矢量确定继续加入运动矢量第二候选列表的候选运动矢量。
在目前ATMVP技术中,对子图像块的大小进行帧级自适应的设置,子图像块的大小默认为4×4,当满足一定条件时,子图像块的大小被设置为8×8。例如,在编码端,在编码当前图像块时,计算同一时域层的上一个编码图像块进行ATMVP模式编码时CU中的各个子图像块的平均块大小,当平均块大小大于阈值,当前图像块的子图像块的尺寸被设置为8×8,否则使 用默认值4×4。换言之,在现有技术中,在编码当前图像块时,还需要存储同一时域层的上一个已编码图像块的子图像块的大小的信息。
在本申请实施例中,将当前图像块的子图像块的大小固定为大于或等于64个像素,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
可选地,在本实施例中,子图像块的大小和/或子图像块的相关块的大小均固定为8×8个像素。
在目前ATMVP技术中,对子图像块的大小进行帧级自适应的设置,子图像块的大小默认为4×4,当满足一定条件时,子图像块的大小被设置为8×8。例如,在编码端,在编码当前图像块时,计算同一时域层的上一个编码图像块进行ATMVP模式编码时CU中的各个子图像块的平均块大小,当平均块大小大于阈值,当前图像块的子图像块的尺寸被设置为8×8,否则使用默认值4×4。目前,在新一代视频编码标准(Versatile Video Coding,VVC)中,是以8×8的大小对运动矢量进行存储。应理解,当将子图像块的大小设置为4×4,该子图像块的运动矢量的大小(也为4×4)不符合当前标准中运动矢量的存储粒度。此外,目前ATMVP技术中,编码当前图像块时,还需要存储同一时域层的上一个已编码图像块的子图像块的大小的信息。
在本申请实施例中,将当前图像块的子图像块的大小设置为8×8,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
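It should be understood that, provided the size of the sub-image block and/or of the related block of the sub-image block is fixed at 64 pixels, other dimensions are also possible: for example, the size may be A×B, where A≤64, B≤64, and A and B are both integer multiples of 4. For example, the size of the sub-image block and/or of the related block of the sub-image block is 4×16 pixels or 16×4 pixels.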
可选地,在步骤S420中,对至少部分候选运动矢量依次进行扫描,当扫描到第一个符合预设条件的候选运动矢量时,停止扫描,且根据扫描到的第一个符合预设条件的候选运动矢量确定参考运动矢量。
根据扫描到的第一个符合预设条件的候选运动矢量确定参考运动矢量,可以包括:将第一个符合预设条件的候选运动矢量作为目标邻近块。
可选地,预设条件包括:候选运动矢量的参考图像与当前图像块的参考图像相同。
可选地,步骤S450包括:当相关块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量确定继续加入运动矢量第二候选列表中的候选运动矢量,其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
例如,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
可选地,步骤S450包括:当相关块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,放弃根据相关块的运动矢量确定继续加入运动矢量第二候选列表的候选运动矢量。
因此,图4所示的实施例,将当前图像块划分为大小为8×8的子图像块,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节 省编解码的时间,提高编码效率。
上面描述了如何根据ATMVP技术确定加入运动矢量第二候选列表的候选者的方法。一些实现方式中,还可以往运动矢量第二候选列表加入其它的候选者,在此不做限制。
上文结合图1和图4描述了本申请的方法实施例,下文将描述上文方法实施例对应的装置实施例。应理解,装置实施例的描述与方法实施例的描述相互对应,因此,未详细描述的内容可以参见前面方法实施例,为了简洁,这里不再赘述。
图5为本申请实施例提供的视频图像处理装置500的示意性框图。该装置500用于执行如图1所示的方法实施例。该装置500包括如下单元。
获取单元510,用于获取用于加入当前图像块的运动矢量第二候选列表的M个候选运动矢量;
确定单元520,用于对M个候选运动矢量中的N个候选运动矢量依次扫描,根据扫描结果确定参考运动矢量,N小于M;
确定单元520还用于,根据参考运动矢量、当前图像块以及当前图像块的参考图像,确定继续加入运动矢量第二候选列表中的候选运动矢量;
确定单元520还用于,根据运动矢量第二候选列表,确定当前图像块的运动矢量。
在现有ATMVP技术的第一步中,通过对运动矢量第二候选列表中当前已加入的所有候选运动矢量进行扫描,来获取当前图像块的时域矢量。例如,运动矢量第二候选列表通常会被填充4个候选运动矢量,则可能会出现如下情形:需要扫描4个候选运动矢量,才能获得当前图像块的时域矢量。
而在本申请实施例中,在获取当前图像块的参考运动矢量过程中,仅对已经获取的M个候选运动矢量中的N(N小于M)个候选运动矢量依次扫描,相对于现有技术,可以减少在获取当前图像块的参考运动矢量过程中对候选运动矢量的扫描次数。应理解,将本申请提供的方案应用于现有ATMVP技术的第一步中,可以对其存在的冗余操作进行简化。
申请人在通用视频编码(Versatile Video Coding)最新的参考软件VTM-2.0上,选取官方通测序列作为测试序列,测试配置为RA配置及LDB配置,对本申请提供的方案进行测试,测试结果显示,减少扫描次数后,还可以保持ATMVP技术的性能增益。
因此,本申请提供的方案在保持现有ATMVP技术的性能增益的前提下,可以降低ATMVP技术的复杂度。
可选地,作为一个实施例,获取单元510用于,根据当前图像块在当前帧内的M个邻近块的运动矢量,获取用于加入当前图像块的运动矢量第二候选列表的M个候选运动矢量。
可选地,作为一个实施例,邻近块为在当前帧上与当前图像块的位置相邻或具有一定位置间距的图像块。
可选地,作为一个实施例,确定单元520用于,对M个候选运动矢量中的前N个候选运动矢量依次扫描。
可选地,作为一个实施例,M等于4,N小于4。
可选地,作为一个实施例,N等于1或2。
可选地,作为一个实施例,确定单元520用于,基于预设条件,对M个候选运动矢量中的N个候选运动矢量依次扫描,根据扫描结果确定参考运动矢量。
可选地,作为一个实施例,预设条件包括:指向的参考帧与当前图像块的参考图像相同的候选运动矢量。
可选地,作为一个实施例,确定单元520用于,对N个候选运动矢量依次进行扫描,当扫描到第一个符合预设条件的候选运动矢量时,停止扫描,且根据扫描到的第一个符合预设条件的候选运动矢量确定参考运动矢量。
可选地,作为一个实施例,确定单元520用于,当在N个候选运动矢量中未扫描到符合预设条件的候选运动矢量时,对运动矢量第二候选列表中的特定候选运动矢量进行缩放处理,根据缩放处理后的特定候选运动矢量确定参考运动矢量。
可选地,作为一个实施例,特定候选运动矢量为N个候选运动矢量中,按扫描顺序得到的第一个运动矢量或者最后一个运动矢量。
可选地,作为一个实施例,确定单元520用于,对运动矢量第二候选列表中的特定候选运动矢量进行缩放处理,使得经过缩放处理的特定候选运动矢量指向的参考帧与当前图像块的参考图像相同;将经过缩放处理后的特定候选运动矢量作为参考运动矢量。
可选地,作为一个实施例,确定单元520用于,当在N个候选运动矢量中未扫描到符合预设条件的候选运动矢量时,将默认值作为参考运动矢量。
可选地,作为一个实施例,默认值为运动矢量(0,0)。
可选地,作为一个实施例,确定单元520用于,将当前图像块划分成多个子图像块;根据参考运动矢量,在当前图像块的参考图像中确定子图像块的相关块;根据相关块的运动矢量,确定继续加入运动矢量第二候选列表的候选运动矢量。
可选地,作为一个实施例,子图像块的大小和/或子图像块的相关块的大小固定为大于或等于64个像素。
可选地,作为一个实施例,当前图像块为一个编码单元CU。
可选地,作为一个实施例,确定单元520用于,根据参考运动矢量,在当前图像块的参考图像中确定当前图像块的相关块;根据相关块的运动矢量,确定继续加入运动矢量第二候选列表的候选运动矢量。
可选地,作为一个实施例,确定单元520用于,当相关块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量确定继续加入运动矢量第二候选列表中的候选运动矢量,其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
可选地,作为一个实施例,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
可选地,作为一个实施例,确定单元520用于,当相关块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,放弃根据相关块的运动矢量确定继续加入运动矢量第二候选列表的候选运动矢量。
可选地,作为一个实施例,确定单元520用于,当特定候选运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的特定候选运动矢量确定继续加入运动矢量第二候选列表中的候选运动矢量,其中,处理后的特定候选运动矢量和处理前的特定候选运动矢量相同。
可选地,作为一个实施例,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
可选地,作为一个实施例,确定单元520用于,当特定候选运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,放弃根据特定候选运动矢量确定继续加入运动矢量第二候选列表的候选运动矢量。
可选地,作为一个实施例,运动矢量第二候选列表为Merge候选列表。
可选地,作为一个实施例,当前图像块的参考图像为当前图像块的同位帧。
可选地,作为一个实施例,子图像块的大小和/或子图像块的相关块的大小均固定为8×8个像素。
可选地,在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
可选地,在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
应理解,本实施例中的获取单元510和确定单元520均可以由处理器实现。
如图6所示,本申请实施例还提供一种视频图像处理装置600。该装置600用于执行如图4所示的方法实施例。该装置600包括如下单元。
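an acquisition unit 610, configured to obtain M candidate motion vectors to be added to the second motion vector candidate list of the current image block;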
确定单元620,用于对M个候选运动矢量中的至少部分候选运动矢量依次扫描,根据扫描结果确定当前图像块的参考运动矢量;
划分单元630,用于将当前图像块划分成多个子图像块,其中,子图像块的大小固定为大于或等于64个像素;
确定单元620还用于,根据参考运动矢量在当前图像块的参考图像中确定子图像块的相关块;
确定单元620还用于,根据相关块的运动矢量确定继续加入运动矢量第二候选列表的候选运动矢量。
在目前ATMVP技术中,对子图像块的大小进行帧级自适应的设置,子图像块的大小默认为4×4,当满足一定条件时,子图像块的大小被设置为8×8。例如,在编码端,在编码当前图像块时,计算同一时域层的上一个编码图像块进行ATMVP模式编码时CU中的各个子图像块的平均块大小,当 平均块大小大于阈值,当前图像块的子图像块的尺寸被设置为8×8,否则使用默认值4×4。换言之,在现有技术中,在编码当前图像块时,还需要存储同一时域层的上一个已编码图像块的子图像块的大小的信息。
在本申请实施例中,将当前图像块的子图像块的大小固定为大于或等于64个像素,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
可选地,作为一个实施例,子图像块的大小和/或子图像块的相关块的大小均固定为8×8个像素。
目前,在新一代视频编码标准(Versatile Video Coding,VVC)中,是以8×8的大小对运动矢量进行存储。在本申请实施例中,将当前图像块的子图像块的大小设置为8×8,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
可选地,作为一个实施例,确定单元620用于,对至少部分候选运动矢量依次进行扫描,当扫描到第一个符合预设条件的候选运动矢量时,停止扫描,且根据扫描到的第一个符合预设条件的候选运动矢量确定参考运动矢量。
可选地,作为一个实施例,确定单元620用于,将第一个符合预设条件的候选运动矢量作为目标邻近块。
可选地,作为一个实施例,预设条件包括:候选运动矢量的参考图像与当前图像块的参考图像相同。
应理解,本实施例中的获取单元610、确定单元620和划分单元630均可以由处理器实现。
在上文描述中,一个图像块的运动矢量包含两个信息:1)该运动矢量所指向的图像;2)位移。在一些应用场景中,一个图像块的运动矢量仅仅包含“位移”这个信息。该图像块另外提供了用于指示该图像块的参考图像的索引信息。对于已编/解码的图像块,其运动矢量所包含的含义包括:该已编/解码的图像块的参考块在参考图像上相对与该已编/解码的图像块位置相同且位于该参考图像的图像块的位移。在确定该已编/解码的图像块的参考块时,需要通过该已编/解码的图像块的参考图像的索引信息以及该已编/解码的图像块的运动矢量确定该已编/解码的图像块的参考块。那么,在图1所示的视频图像处理方法中,步骤S120不是对运动矢量第二候选列表中的候选运动矢量进行扫描,而是直接对该候选运动矢量所对应的图像块进行扫描。下文中针对该运动矢量新的定义(也即包含“位移”信息但不包含“所指向的图像”),提供视频图像处理方法。需要注意的是,针对“运动矢量”的这两种不同含义所分别提供的基于ATMVP技术确定候选运动矢量的方法基本是一致的,上文中的解释也适用于下文提供的视频图像处理方法,区别主要在构建运动矢量第二候选列表中,根据ATMVP技术确定加入运动矢量第二候选列表中的候选者时,在上文已描述的视频图像处理方法中是对已加入运动矢量第二候选列表中的运动矢量进行扫描,而下文中所提供的视频图像处理方法中是对该已加入运动矢量第二候选列表中的运动矢量所对应的图像块进行扫描。
如图7所示,本申请实施例提供一种视频图像处理方法,该方法包括如下步骤。
S710,确定当前图像块的M个邻近块。
当前图像块为待进行编码(或解码)的图像块。例如,当前图像块为一个编码单元(CU)。
当前图像块所在的图像帧称为当前帧。
邻近块为在当前图像上与当前图像块的位置相邻或具有一定位置间距的图像块。
M个邻近块为当前帧内已完成编码(或解码)的图像块。
作为示例,如图2所示,按图2中所示的位于当前图像块周围4个位置A 1(左)→B 1(上)→B 0(右上)→A 0(左下)的图像块的顺序,依次确定当前图像块的4个邻近块。
S720,对M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M。
根据N个邻近块的扫描结果确定目标邻近块的过程可以是,基于预设条件对N个邻近块依次进行判断,根据判断结果确定目标邻近块。
作为示例,预设条件的定义为,邻近块的参考图像与当前图像块的参考图像相同。
其中,当前图像块的参考图像为与当前图像块所在的图像时间距离最近的参考图像;或,当前图像块的参考图像为编解码端预设的参考图像;或,当前图像块的参考图像为在视频参数集、序列头、序列参数集、图像头、图像参数集、条带头中指定的参考图像。
例如,该当前图像块的参考图像为当前图像块的同位帧,同位帧即为在条带级信息头中设定的用于获取运动信息进行预测的帧。
应理解,根据未来技术的演进,该预设条件可能会被赋予其它不同的定义,相应的方案也落入本申请保护范围。
下文将对根据N个邻近块的扫描结果确定目标邻近块的过程进行详细描述。
在步骤S720中,仅对步骤S710中获取的M个邻近块中的N个邻近块进行扫描,这样可以减少扫描次数。
可选地,在步骤S720中,可以对M个邻近块中的前N个邻近块依次扫描。
当在步骤S710中,按预设顺序依次确定当前图像块的M个邻近块时,步骤S720中获取的前N个邻近块,指的是按该预设顺序首先确定的N个邻近块。
可选地,在步骤S720中,可以对M个邻近块中的最后N个邻近块依次扫描;或者,可以对M个邻近块中的中间的N个邻近块依次扫描。本申请对此不作限定。
S730,根据目标邻近块的运动矢量、当前图像块以及当前图像块的参考图像,确定当前图像块的相关块。
S740,根据相关块的运动矢量对当前图像块进行编/解码。
可选地,步骤S740包括:根据相关块的运动矢量和参考图像确定当前图像块的参考块。
例如,步骤S740包括:构建当前图像块的候选块列表,该候选块列表中的候选块包括M个邻近块和相关块;根据候选块列表中的候选块的参考块对当前图像块进行编解码。
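In one example, the candidate block list is the merge candidate list of the current image block. In another example, the candidate block list is the AMVP candidate list of the current image block.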
在编码端,将当前块的候选块的索引(index)写入码流。在解码端,获取到索引后,从候选块列表中找到该索引对应的候选块,根据该候选块的参考块确定当前图像块的参考块,或者,根据该候选块的运动矢量确定当前图像块的运动矢量。
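For example, the reference block of the candidate block is directly used as the reference block of the current image block, or the motion vector of the candidate block is directly used as the motion vector of the current image block. As another example, the encoder additionally writes the MVD of the current block into the bitstream. After obtaining the MVD, the decoder uses the motion vector of the candidate block plus the MVD as the motion vector of the current block, and then determines the reference block of the current block according to that motion vector and the reference image of the current block.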
在本申请实施例中,在获取当前图像块的目标邻近块的过程中,仅对已经获取的M个邻近块中的N(N小于M)个邻近块依次扫描,相对于现有技术,可以减少在获取当前图像块的目标邻近块的过程中对候选邻近块的扫描次数,从而降低复杂度。
可选地,在本实施例中,在步骤S710中,确定当前图像块在当前帧内的4个邻近块,即M等于4。在步骤S720中,对4个邻近块中的N个邻近块进行扫描,N小于4。
例如,N等于1。例如,在步骤S720中,仅对4个邻近块中的第一个邻近块进行扫描。
再例如,N等于2或3。
下文将对步骤S720中,根据N个邻近块的扫描结果确定目标邻近块的方式进行描述。
可选地,在步骤S720中,对N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据扫描到的第一个符合预设条件的邻近块确定目标邻近块。
例如,预设条件的定义为邻近块的参考图像与当前图像块的参考图像相同。
应理解,在将来演进的技术中,预设条件还可能被赋予其他定义。
下文中,以预设条件的定义为邻近块的参考图像与当前图像块的参考图像相同为例进行描述。
例如,将第一个符合预设条件的邻近块作为目标邻近块。
可选地,当在步骤S720中,在N个邻近块中未扫描到符合预设条件的邻近块时,该方法还包括:对M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据缩放处理后的运动矢量对当前图像块进行编/解码。
例如,根据缩放处理后的运动矢量和当前图像块的参考图像确定当前图像块的参考块。
可选地,特定邻近块为N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
该特定邻近块还可以为N个邻近块中,按其它扫描顺序得到的邻近块。
可选地,根据缩放处理后的运动矢量对当前图像块进行编/解码,包括:对特定邻近块的运动矢量进行缩放处理,使得经过缩放处理后的运动矢量指向的参考帧与当前图像块的参考图像相同;将经过缩放处理后的运动矢量在当前图像块的参考图像中指向的图像块作为当前图像块的参考块。
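Optionally, in step S720, when no neighboring block meeting the preset condition is found among the N neighboring blocks, a default block is used as the candidate reference block of the current image block.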
例如,默认块为运动矢量(0,0)指向的图像块。
下文将对步骤S730中,根据目标邻近块的运动矢量、当前图像块以及当前图像块的参考图像,确定当前图像块的相关块的过程进行描述。
可选地,作为一种实现方式,根据目标邻近块的运动矢量、当前图像块以及当前图像块的参考图像,确定当前图像块的相关块,包括:将当前图像块划分为多个子图像块;根据目标邻近块的运动矢量,在当前图像块的参考图像中确定子图像块的相关块,当前图像块的相关块包括子图像块的相关块。
相关块可以被称为collocated block或者corresponding block。
例如,当前图像块为一个CU,将其划分之后得到的子图像块可称为sub-CU。
可选地,子图像块的大小和/或子图像块的相关块的大小固定为大于或等于64个像素。
可选地,子图像块的大小和/或子图像块的相关块的大小均固定为8×8个像素。
在目前ATMVP技术中,对子图像块的大小进行帧级自适应的设置,子图像块的大小默认为4×4,当满足一定条件时,子图像块的大小被设置为8×8。例如,在编码端,在编码当前图像块时,计算同一时域层的上一个编码图像块进行ATMVP模式编码时CU中的各个子图像块的平均块大小,当平均块大小大于阈值,当前图像块的子图像块的尺寸被设置为8×8,否则使用默认值4×4。目前,在新一代视频编码标准(Versatile Video Coding,VVC)中,是以8×8的大小对运动矢量进行存储。应理解,当将子图像块的大小设置为4×4,该子图像块的运动矢量的大小(也为4×4)不符合当前标准中运动矢量的存储粒度。此外,目前ATMVP技术中,编码当前图像块时,还需要存储同一时域层的上一个已编码图像块的子图像块的大小的信息。
在本申请实施例中,将当前图像块的子图像块的大小设置为8×8,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
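It should be understood that, provided the size of the sub-image block and/or of the related block of the sub-image block is fixed at 64 pixels, other dimensions are also possible: for example, the size may be A×B, where A≤64, B≤64, and A and B are both integer multiples of 4. For example, the size of the sub-image block and/or of the related block of the sub-image block is 4×16 pixels or 16×4 pixels.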
可选地,作为另一种实现方式,根据目标邻近块的运动矢量、当前图像块以及当前图像块的参考图像,确定当前图像块的相关块,包括:根据目标 邻近块的运动矢量,在当前图像块的参考图像中确定当前图像块的相关块。
可选地,步骤S740包括:当相关块的参考图像为特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量和当前图像块的参考图像确定当前图像块的候选参考块;其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
例如,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
可选地,步骤S740包括:当相关块的参考图像为特定参考图像,或者当前块的参考图像为特定参考图像时,放弃根据相关块的运动矢量确定当前图像块的候选参考块。
在一些实施例中,步骤S720包括:当特定邻近块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量和当前图像块的参考图像确定当前图像块的参考块;其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
其中,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
上述可知,本申请实施例,在获取当前图像块的目标邻近块的过程中,仅对已经获取的M个邻近块中的N(N小于M)个邻近块依次扫描,相对于现有技术,可以减少在获取当前图像块的目标邻近块的过程中对候选邻近块的扫描次数,从而降低复杂度。
在N个邻近块中未扫描到参考帧与当前帧的同位帧相同的邻近块的情况下,对N个邻近块中的一个邻近块的运动矢量进行缩放处理,使其参考帧与当前帧的同位帧相同,然后将这个经过缩放处理的运动矢量作为当前图像块的运动矢量,这样可以提高当前图像块的运动矢量的准确度。
将当前图像块划分为大小为8×8的子图像块,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子块的大小的信息,因此,可以节省存储空间。
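In some embodiments of this application, when the size of the sub-image block and/or of the related block of the sub-image block is 8×8 pixels, the TMVP operation is set not to be performed. This skips some redundant operations, effectively saves encoding and decoding time, and improves coding efficiency.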
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
在本申请实施例的一些实现方式中,在确定当前图像块的运动矢量的过程中,涉及到要利用其它图像上的某个图像块的运动矢量来确定该图像块的运动矢量。为描述方便,称该图像块为第一图像块,称所要利用的其他图像上的某个图像块为该第一图像块的时域参考块或相关块。可以理解的是,第一图像块和该第一图像块的时域参考块(或相关块)位于不同的图像上。那么,在利用该时域参考块(或相关块)的运动矢量来确定第一图像块的运动矢量的过程中,可能需要对该时域参考块(或相关块)的运动矢量进行缩放。为描述方便,本文中统一采用“相关块”这个术语。
例如,ATMVP技术应用在构建AMVP候选列表中时,在根据ATMVP技术确定当前图像块的相关块后,根据该相关块的运动矢量确定当前图像块的运动矢量时,需要对该相关块的运动矢量进行缩放,然后根据缩放后的运动矢量确定当前图像块的运动矢量。一般来说,基于相关块的运动矢量指向的参考图像与该相关块所在图像之间的时间距离,以及当前图像块的参考图像与当前图像块所在图像之间的时间距离,确定该相关块的运动矢量的缩放比例。
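In one example, the motion vector of the related block is denoted MV₂, and the reference frame index value of the reference image that MV₂ points to is x, where x is the difference between the sequence number (for example, the POC) of the reference image that MV₂ points to and the sequence number of the image containing the related block. The reference frame index value of the reference image of the first image block is denoted y, where y is the difference between the sequence number of the reference image of the first image block and the sequence number of the image containing the first image block. The scaling ratio applied to MV₂ is then y/x. Optionally, the product of MV₂ and y/x may be used as the motion vector of the first image block.
However, when the motion vector MV₂ of the related block points to a specific reference image, or when the reference image of the first image block is a specific reference image, the temporal distance between the specific reference image and the image containing the first image block is not clearly defined, so scaling MV₂ is meaningless.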
可选地,在本实施例中,当根据相关块的运动矢量确定当前图像块的运动矢量时,具体地:当相关块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量确定当前图像块的运动矢量,其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
例如,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
可选地,在本实施例中,当根据相关块的运动矢量确定当前图像块的运动矢量时,具体地:当相关块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,放弃根据相关块的运动矢量确定当前图像块的运动矢量。
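The y/x scaling and the two optional treatments of a specific reference image can be sketched as below. The function name, the boolean flags, and the use of plain rounding are assumptions for illustration; a real codec would use fixed-point arithmetic with clipping.

```python
from typing import Optional, Tuple

MV = Tuple[int, int]


def derive_mv_from_related_block(mv2: MV, x: int, y: int,
                                 specific_ref: bool,
                                 drop_on_specific: bool = False) -> Optional[MV]:
    """Derive the current block's MV from the related block's MV (MV2).

    x: POC difference between MV2's reference picture and the related block's picture.
    y: POC difference between the current block's reference picture and its picture.
    specific_ref: True if either reference picture is a specific reference image.
    drop_on_specific: choose the alternative embodiment that abandons the candidate.
    """
    if specific_ref:
        # Temporal distance is ill-defined: skip scaling (ratio 1) or give up.
        return None if drop_on_specific else mv2
    # Ordinary case: scale MV2 by y / x (x is assumed to be non-zero).
    return (round(mv2[0] * y / x), round(mv2[1] * y / x))
```

Returning the unchanged vector is equivalent to scaling by a ratio of 1, which is how the text phrases the first alternative.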
如图8所示,本申请实施例还提供一种视频图像处理方法,该方法包括如下步骤。
S810,确定当前图像块的M个邻近块。
步骤S810可以对应于上文实施例中的步骤S710。
S820,对M个邻近块中的至少部分邻近块依次扫描,根据扫描结果确定目标邻近块。
可选地,对M个邻近块中的部分邻近块依次扫描,根据扫描结果确定目标邻近块。
可选地,对M个邻近块中的全部邻近块依次扫描,根据扫描结果确定目标邻近块。
S830,将当前图像块划分成多个子图像块,其中,子图像块的大小固定为大于或等于64个像素。
S840,根据目标邻近块的运动矢量及子图像块,在当前图像块的参考图像中确定当前图像块的相关块。
可选地,当前图像块的参考图像为与当前图像块所在的图像时间距离最近的参考图像。
可选地,当前图像块的参考图像为编解码端预设的参考图像。
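Optionally, the reference image of the current image block is a reference image specified in the video parameter set, sequence header, sequence parameter set, picture header, picture parameter set, or slice header.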
S850,根据相关块的运动矢量对当前图像块进行编/解码。
在本申请实施例中,将当前图像块的子图像块的大小固定为大于或等于64个像素,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
可选地,在本实施例中,子图像块的大小和/或子图像块的时域参考块的大小均固定为8×8个像素。
目前,在新一代视频编码标准(Versatile Video Coding,VVC)中,是以8×8的大小对运动矢量进行存储。在本申请实施例中,将当前图像块的子图像块的大小设置为8×8,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
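It should be understood that, provided the size of the sub-image block and/or of the related block of the sub-image block is fixed at 64 pixels, other dimensions are also possible, for example A×B with A≤64, B≤64, and A and B both integer multiples of 4. For example, the size of the sub-image block and/or of the related block of the sub-image block may be 4×16 pixels or 16×4 pixels.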
可选地,步骤S820包括:对至少部分邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据扫描到的第一个符合预设条件的邻近块确定目标邻近块。
例如,将第一个符合预设条件的邻近块作为目标邻近块。
例如,预设条件的定义为:邻近块的参考图像与当前图像块的参考图像相同。
可选地,步骤S840包括:根据目标邻近块的运动矢量及子图像块,在当前图像块的参考图像中确定子图像块的相关块,其中,当前图像块的相关块,包括子图像块的相关块。
上文结合图7和图8描述了本申请的方法实施例,下文将描述图7和图8所示的方法实施例对应的装置实施例。应理解,装置实施例的描述与方法实施例的描述相互对应,因此,未详细描述的内容可以参见前面方法实施例,为了简洁,这里不再赘述。
图9为本申请实施例提供的视频图像处理装置900的示意性框图。该装置900用于执行如图7所示的方法实施例。该装置900包括如下单元。
获取单元910,用于获取当前图像块的M个邻近块;
确定单元920,用于对M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;
确定单元920还用于,根据目标邻近块的运动矢量、当前图像块以及当前图像块的参考图像,确定当前图像块的相关块;
编/解码单元930,用于根据相关块的运动矢量对当前图像块进行编/解码。
在本申请实施例中,在获取当前图像块的目标邻近块的过程中,仅对已经获取的M个邻近块中的N(N小于M)个邻近块依次扫描,相对于现有技术,可以减少在获取当前图像块的目标邻近块的过程中对候选邻近块的扫描次数,从而降低复杂度。
可选地,作为一个实施例,M等于4,N小于4。
可选地,作为一个实施例,N等于1或2。
可选地,作为一个实施例,确定单元920用于,对M个邻近块中的前N个邻近块依次扫描。
可选地,作为一个实施例,获取单元910用于,按预设顺序依次获取当前图像块的M个邻近块;前N个邻近块,指的是按预设顺序首先确定的N个邻近块。
可选地,作为一个实施例,确定单元920用于,对N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据扫描到的第一个符合预设条件的邻近块确定目标邻近块。
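Optionally, as an embodiment, the determining unit 920 is configured to use the first neighboring block that meets the preset condition as the target neighboring block.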
可选地,作为一个实施例,预设条件包括:邻近块的参考图像与当前图像块的参考图像相同。
可选地,作为一个实施例,编/解码单元930用于,根据相关块的运动矢量和参考图像确定当前图像块的参考块。
可选地,作为一个实施例,编/解码单元930用于,构建当前图像块的候选块列表,候选块列表中的候选块包括M个邻近块和相关块;根据候选块列表中的候选块的参考块对当前图像块进行编解码。
可选地,作为一个实施例,编/解码单元930还用于,当在N个邻近块中未扫描到符合预设条件的邻近块时,对M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据缩放处理后的运动矢量对当前图像块进行编/解码。
可选地,作为一个实施例,编/解码单元930用于,根据缩放处理后的运动矢量和当前图像块的参考图像确定当前图像块的参考块。
可选地,作为一个实施例,特定邻近块为N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
可选地,作为一个实施例,编/解码单元930用于,对特定邻近块的运动矢量进行缩放处理,使得经过缩放处理后的运动矢量指向的参考帧与当前图像块的参考图像相同;将经过缩放处理后的运动矢量在当前图像块的参考图像中指向的图像块作为当前图像块的参考块。
可选地,作为一个实施例,确定单元920用于,当N个邻近块中未扫描到符合预设条件的邻近块时,将默认块作为当前图像块的参考块。
可选地,作为一个实施例,默认块为运动矢量(0,0)指向的图像块。
可选地,作为一个实施例,确定单元920用于:
将当前图像块划分为多个子图像块;
根据目标邻近块的运动矢量,在当前图像块的参考图像中确定子图像块的相关块,当前图像块的相关块包括子图像块的相关块。
可选地,作为一个实施例,子图像块的大小和/或子图像块的相关块的大小固定为大于或等于64个像素。
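Optionally, in some embodiments of this application, when the size of the sub-image block and/or of the related block of the sub-image block is 8×8 pixels, the TMVP operation is set not to be performed. This skips some redundant operations, effectively saves encoding and decoding time, and improves coding efficiency.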
可选地,在本申请一些实施例中,在当前CU的宽和/或高小于8像素的情况下,设置不进行TMVP操作。换句话说,在子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行TMVP操作。对于上述情况,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
可选地,作为一个实施例,当前图像块为一个编码单元CU。
可选地,作为一个实施例,确定单元920用于,根据目标邻近块的运动矢量,在当前图像块的参考图像中确定当前图像块的相关块。
可选地,作为一个实施例,邻近块为在当前图像上与当前图像块的位置相邻或具有一定位置间距的图像块。
可选地,作为一个实施例,编/解码单元930用于,当相关块的参考图像为特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量和当前图像块的参考图像确定当前图像块的参考块;
其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
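Optionally, as an embodiment, the processed motion vector of the related block includes: a motion vector obtained by scaling the motion vector of the related block with a scaling ratio of 1, or the motion vector of the related block with the scaling step skipped.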
可选地,作为一个实施例,编/解码单元930用于,当相关块的参考图像为特定参考图像,或者当前块的参考图像为特定参考图像时,放弃根据相关块的运动矢量确定当前图像块的参考块。
可选地,作为一个实施例,确定单元920用于:当特定邻近块的运动矢量指向特定参考图像,或者当前图像块的参考图像为特定参考图像时,根据处理后的相关块的运动矢量和当前图像块的参考图像确定当前图像块的参考块;其中,处理后的相关块的运动矢量和处理前的相关块的运动矢量相同。
可选地,作为一个实施例,处理后的相关块的运动矢量,包括:根据数值为1的缩放比例对相关块的运动矢量进行缩放后得到的运动矢量,或者,跳过缩放步骤的相关块的运动矢量。
应理解,本实施例中的获取单元910、确定单元920和编/解码单元930均可以由处理器实现。
如图10所示,本申请实施例还提供一种视频图像处理装置1000。该装置1000用于执行如图8所示的方法实施例。该装置1000包括如下单元。
获取单元1010,用于获取当前图像块的M个邻近块;
确定单元1020,用于对M个邻近块中的至少部分邻近块依次扫描,根据扫描结果确定目标邻近块;
划分单元1030,用于将当前图像块划分成多个子图像块,其中,子图像块的大小固定为大于或等于64个像素;
确定单元1020还用于,根据目标邻近块的运动矢量及子图像块,在当前图像块的参考图像中确定当前图像块的相关块;
编/解码单元1040,用于根据相关块的运动矢量对当前图像块进行编/解码。
在本申请实施例中,将当前图像块的子图像块的大小固定为大于或等于64个像素,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
可选地,作为一个实施例,子图像块的大小和/或子图像块的时域参考块的大小均固定为8×8个像素。
目前,在新一代视频编码标准(Versatile Video Coding,VVC)中,是以8×8的大小对运动矢量进行存储。在本申请实施例中,将当前图像块的子图像块的大小设置为8×8,一方面可以适应视频标准VVC中规定的运动矢量的存储粒度,另一方面,无需存储上一个已编码图像块的子图像块的大小的信息,因此,可以节省存储空间。
在本申请一些实施例中,在ATMVP技术中的子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
在本申请一些实施例中,在当前CU的宽和高中的至少一个小于8的情况下,设置不进行TMVP操作,跳过TMVP操作带来的性能影响可以忽略不计,从而可以有效节省编解码的时间,提高编码效率。
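It should be understood that, provided the size of the sub-image block and/or of the related block of the sub-image block is fixed at 64 pixels, other dimensions are also possible, for example A×B with A≤64, B≤64, and A and B both integer multiples of 4. For example, the size of the sub-image block and/or of the related block of the sub-image block may be 4×16 pixels or 16×4 pixels.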
可选地,作为一个实施例,对M个邻近块中的至少部分邻近块依次扫描,根据扫描结果确定目标邻近块,包括:对至少部分邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据扫描到的第一个符合预设条件的邻近块确定目标邻近块。
可选地,作为一个实施例,确定单元1020用于,将第一个符合预设条件的邻近块作为目标邻近块。
可选地,作为一个实施例,预设条件包括:邻近块的参考图像与当前图像块的参考图像相同。
可选地,作为一个实施例,确定单元1020用于,根据目标邻近块的运动矢量及子图像块,在当前图像块的参考图像中确定子图像块的相关块,其中,当前图像块的相关块,包括子图像块的相关块。
应理解,本实施例中的获取单元1010、确定单元1020、划分单元1030和编/解码单元1040均可以由处理器实现。
如图11所示,本申请实施例还提供一种视频图像处理装置1100。装置1100可以用于执行上文描述的方法实施例。装置1100包括处理器1110、存储器1120,存储器1120用于存储指令,处理器1110用于执行存储器1120存储的指令,并且对存储器1120中存储的指令的执行使得处理器1110用于执行根据上文方法实施例的方法。
可选地,如图11所示,该装置1100还可以包括通信接口1130,用于与外部设备进行通信。例如,处理器1110用于控制通信接口1130接收和/或发送信号。
本申请提供的装置500、600、900、1000和1100可以应用于编码器,也可以应用于解码器。
上文中对运动矢量第二候选列表进行了解释,下面将对运动矢量第一候选列表进行解释。
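In the motion-compensated prediction stage, mainstream video coding standards have so far applied only a translational motion model. In the real world, however, there are many kinds of motion, such as zooming in/out, rotation, perspective motion, and other irregular motion. To improve the efficiency of inter prediction, an affine transform (affine) motion compensation model can be introduced into the codec. Affine motion compensation describes the affine motion field of an image block through the MVs of a set of control points. In one example, the affine motion compensation model is a four-parameter affine model, and the set of control points consists of two control points (for example, the top-left and top-right corners of the image block). In another example, the model is a six-parameter affine model, and the set consists of three control points (for example, the top-left, top-right, and bottom-left corners of the image block).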
一种实现方式中,在构建运动矢量第一候选列表时,加入的候选者可以是一组控制点的MV,或者称为控制点预测运动矢量(CPMVP,Control point motion vector prediction)。可选的,运动矢量第一候选列表可用于Merge模式中,具体的,可以称为Affine Merge模式;相对应的,该运动矢量第一候选列表可以称为affine merge candidate list。在Affine Merge模式中,直接使用运动矢量第一候选列表中的预测作为当前图像块的CPMV(Control point motion vector),也即不需要进行affine运动估计过程。
一种实现方式中,可将根据ATMVP技术确定的候选者加入到运动矢量第一候选列表中。
其中,一个示例中,将当前图像块的相关块的控制点运动矢量组作为候选者加入到运动矢量第一候选列表中。在采用运动矢量第一列表中该候选者进行预测时,根据当前图像块的相关块的控制点运动矢量组对该当前图像块进行预测。
其中,一个示例中,如上文所描述的,将当前图像块的相关块的代表运动矢量作为候选者加入到运动矢量第一候选列表中。进一步,可选的,还标记该候选者为根据ATMVP技术确定的。当采用运动矢量第一候选列表中该候选者进行预测时,根据该标记和候选者确定当前图像块的相关块,将当前图像块和该相关块采用相同的方式划分成多个子图像块,当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;根据该相关块中各子图像块的运动矢量分别对当前图像块中对应的子图像块的运动矢量进行预测。
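Optionally, when the related block contains a sub-image block whose motion vector is unavailable, the representative motion vector of the related block is used in place of the unavailable motion vector to predict the corresponding sub-image block of the current image block. Optionally, when the representative motion vector of the related block is unavailable as well, the candidate determined by the ATMVP technique is not added to the first motion vector candidate list. In one example, when a sub-image block of the related block is unavailable, or a sub-image block of the related block uses an intra coding mode, it is determined that the related block contains a sub-image block whose motion vector is unavailable.
Optionally, each candidate in the first motion vector candidate list includes the motion vectors of a set of control points. When the representative motion vector of the related block of the current image block is added to the first motion vector candidate list, it may, for consistency of data format, be inserted as the motion vector of every control point of the candidate (that is, the motion vector of each control point in that candidate is set to the representative motion vector of the related block).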
其中,可选的,当前图像块的相关块的代表运动矢量可以指的是该相关块的中心位置的运动矢量,或者其他代表该相关块的运动矢量,在此不做限制。
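A small sketch of how the ATMVP candidate might be stored and later expanded, under the assumptions that a candidate is a list of control-point MVs and that an unavailable motion vector is represented by `None`. Both helper names are hypothetical.

```python
from typing import List, Optional, Tuple

MV = Tuple[int, int]


def atmvp_candidate(rep_mv: MV, num_control_points: int) -> List[MV]:
    """Store the representative MV in every control-point slot of the candidate."""
    return [rep_mv] * num_control_points


def subblock_mvs(related_sub_mvs: List[Optional[MV]],
                 rep_mv: Optional[MV]) -> Optional[List[MV]]:
    """Copy per-sub-block MVs from the related block, patching gaps with rep_mv.

    Returns None when a gap exists and the representative MV is also unavailable,
    meaning the sub-block prediction is abandoned.
    """
    if rep_mv is None and any(mv is None for mv in related_sub_mvs):
        return None
    return [mv if mv is not None else rep_mv for mv in related_sub_mvs]
```

Filling every control-point slot with the same vector keeps the list entries uniform, so the candidate needs only a marker to tell the decoder it came from ATMVP.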
根据上文对运动矢量第二候选列表的描述可知,在根据ATMVP技术确定候选者时,需要确定当前图像块的相关块。本方案中,在根据ATMVP技术确定加入运动矢量第一候选列表的候选者时,确定当前图像块的相关块的方法包括两种:
方法一、对当前图像块的预设的M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M,M小于等于4;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块。
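Method 2: determine M neighboring blocks of the current image block according to M candidates in the second motion vector candidate list of the current image block; scan N of the M neighboring blocks in sequence and determine the target neighboring block according to the scan result, where N is less than M and M is less than or equal to 4; determine the related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block. Here, the M candidates in the second motion vector candidate list may refer to the M neighboring blocks of the current image block.
For the two steps in Method 1 and Method 2, "determine the target neighboring block according to the scan result" and "determine the related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block", refer to the explanations above; they are not repeated here.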
一种实现方式中,确定加入运动矢量第一候选列表中的候选者的方法包括:从当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组;将每一个确定的邻近块的控制点运动矢量组作为一个候选者加入运动矢量第一候选列表。
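In one example, a neighboring block predicted in affine transform mode is one whose motion vector was determined from a candidate in the affine merge candidate list. That is, the candidate comes from the affine motion model of a spatial neighboring block of the current image block that uses affine mode: the CPMV of the spatial neighboring block that uses affine mode is used as the CPMVP of the current block.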
其中,一个示例中,控制点运动矢量组可以包括该邻近块的2个控制点的运动矢量(例如该邻近块的左上角点和右上角点),或者包括该邻近块的3个控制点的运动矢量(例如图像块的左上角点、右上角点和左下角点),这取决于采用的是四参Affine模型还是六参Affine模型。
其中,一个示例中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组,包括:
在所述当前图像块的左侧邻近块中按第一扫描顺序确定第一邻近块的控制点运动矢量组;
在所述当前图像块的上侧邻近块中按第二扫描顺序确定第二邻近块的控制点运动矢量组;
将所述第一邻近块的控制点运动矢量组和所述第二邻近块的控制点运动矢量组加入所述运动矢量第一候选列表。
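For example, FIG. 12 is a schematic diagram of obtaining candidates for the first motion vector candidate list from neighboring blocks of the current image block. On the left side of the current image block, the blocks are scanned in the order image block A -> image block D -> image block E, and the control point motion vector group of the first image block that meets the preset condition is added to the first motion vector candidate list as a candidate. On the upper side of the current image block, the blocks are scanned in the order image block B -> image block C, and the control point motion vector group of the first image block that meets the preset condition is added to the first motion vector candidate list as a candidate. Optionally, if no image block meeting the preset condition is found in a scan order, no candidate is determined from that scan order.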
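The two scan orders can be sketched as follows. The dictionary layout (block name mapped to its control point motion vector group, or `None` when the block is unavailable or not affine-coded) is an assumption for the example.

```python
from typing import Dict, List, Optional, Tuple

MV = Tuple[int, int]
CPMVGroup = Tuple[MV, ...]

LEFT_ORDER = ("A", "D", "E")   # scan order on the left side
ABOVE_ORDER = ("B", "C")       # scan order on the upper side


def inherited_affine_candidates(
        blocks: Dict[str, Optional[CPMVGroup]]) -> List[CPMVGroup]:
    """Take at most one candidate from each side: the first affine-coded neighbor."""
    candidates: List[CPMVGroup] = []
    for order in (LEFT_ORDER, ABOVE_ORDER):
        for name in order:
            cpmvs = blocks.get(name)
            if cpmvs is not None:      # preset condition met
                candidates.append(cpmvs)
                break                  # stop scanning this side
    return candidates
```

If neither side yields a match the list stays empty and the later construction or padding steps supply the candidates.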
一种实现方式中,确定加入运动矢量第一候选列表中的候选者的方法包括:
根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量;
将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
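That is, in this implementation, candidates are constructed and then added to the first motion vector candidate list. In one example, before a constructed candidate is added, it is first checked whether the number of candidates in the first motion vector candidate list has reached a preset value (for example, 5); if it has not, candidates are constructed and added to the first motion vector candidate list.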
一个示例中,构造的候选者是将当前图像块部分控制点的邻近块的运动信息组合后作为CPMVP加入到运动矢量第一候选列表中。
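FIG. 13 is a schematic diagram of constructing candidates for the first motion vector candidate list from neighboring blocks of the current image block. The current image block has four control points: CP1, CP2, CP3, and CP4. Image blocks A2, B2, and B3 are spatial neighboring blocks of CP1; image blocks B0 and B1 are spatial neighboring blocks of CP2; image blocks A0 and A1 are spatial neighboring blocks of CP3; and T is the temporal neighboring block of CP4. The coordinates of CP1, CP2, CP3, and CP4 are (0,0), (W,0), (0,H), and (W,H) respectively, where W and H are the width and height of the current CU. The priority for obtaining the neighboring-block motion information of each control point is as follows:
For CP1, the priority is B2->B3->A2. When B2 is available, the MV of B2 is used as the MV of control point CP1; when B2 is unavailable, the MV of B3 is used as the MV of CP1; if neither B2 nor B3 is available, the MV of A2 is used as the MV of CP1; if B2, B3, and A2 are all unavailable, the motion information of CP1 cannot be obtained.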
同理,对于CP2,获取优先级为:B1->B0;对于CP3,获取优先级为:A1->A0;对于CP4,直接使用T的MV作为控制点CP4的MV。
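The per-control-point priority lookup can be sketched like this. The dictionary of neighbor MVs (with `None` or a missing key meaning "unavailable") and the function name are assumptions for illustration; the priority table itself follows the orders stated above.

```python
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]

# Acquisition priority for each control point, as listed in the text.
PRIORITY = {
    "CP1": ("B2", "B3", "A2"),
    "CP2": ("B1", "B0"),
    "CP3": ("A1", "A0"),
    "CP4": ("T",),
}


def derive_control_point_mvs(
        neighbor_mvs: Dict[str, Optional[MV]]) -> Dict[str, Optional[MV]]:
    """For each control point, take the MV of the first available neighbor."""
    result: Dict[str, Optional[MV]] = {}
    for cp, order in PRIORITY.items():
        result[cp] = next(
            (neighbor_mvs[name] for name in order
             if neighbor_mvs.get(name) is not None),
            None,  # motion information of this control point is not available
        )
    return result
```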
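A constructed MV is inserted only when the MVs of the required control points of the current CU (six-parameter model: CP1, CP2, and CP3; four-parameter model: CP1 and CP2) are all available; otherwise the process skips directly to the next step. After the MVs of all control points (if any) have been obtained, the control point MVs are combined in different ways to obtain multiple affine candidates. The combinations are as follows: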
若使用四参数affine模型,则组合四个控制点的MV中的两个即可得到一个或多个候选者,选择其中的两种组合方式:{CP1,CP2},{CP1,CP3}。其中,组合方式{CP1,CP3}需要根据四参数模型将选中的两个控制点的MV转化成当前CU的左上角和右上角控制点的MV(CP1和CP2)。
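If the six-parameter affine model is used, combining three of the four control point MVs yields one or more candidates, and four combinations are selected: {CP1,CP2,CP4}, {CP1,CP2,CP3}, {CP2,CP3,CP4}, {CP1,CP3,CP4}. Of these, the combinations {CP1,CP2,CP4}, {CP2,CP3,CP4}, and {CP1,CP3,CP4} must be converted, according to the six-parameter model, from the MVs of the three selected control points into the MVs of the top-left, top-right, and bottom-left control points of the current CU (CP1, CP2, and CP3).
In one example, if the MVs (two or three) in a combination use different reference frames, the candidate constructed from that combination is considered unavailable.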
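The combination step and the reference-frame check can be sketched as follows. Each control point is assumed to be given as an `(mv, ref_idx)` pair or `None`; the conversion of combinations to the corner control points is deliberately left out of this sketch.

```python
from typing import Dict, List, Optional, Tuple

MV = Tuple[int, int]
CP = Optional[Tuple[MV, int]]  # (motion vector, reference frame index) or None

COMBOS_4PARAM = [("CP1", "CP2"), ("CP1", "CP3")]
COMBOS_6PARAM = [("CP1", "CP2", "CP4"), ("CP1", "CP2", "CP3"),
                 ("CP2", "CP3", "CP4"), ("CP1", "CP3", "CP4")]


def constructed_candidates(cps: Dict[str, CP],
                           six_param: bool) -> List[Tuple[Tuple[str, ...], List[MV]]]:
    """Build constructed candidates from the combinations listed in the text."""
    out = []
    for combo in (COMBOS_6PARAM if six_param else COMBOS_4PARAM):
        picked = [cps.get(name) for name in combo]
        if any(p is None for p in picked):
            continue  # a needed control point has no motion information
        if len({ref for _, ref in picked}) != 1:
            continue  # MVs use different reference frames: candidate unavailable
        out.append((combo, [mv for mv, _ in picked]))
    return out
```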
一种实现方式中,确定加入运动矢量第一候选列表中的候选者的方法包括:使用默认向量进行填充。可选的,该默认向量可以是零向量或其他向量。可选的,可以是在采用其他方法确定加入运动矢量第一候选列表的候选者之后,判断当前已加入该第一候选列表的候选者的数量是否已达到预置数值;若未达到,则使用默认向量填充到第一候选列表中,直至第一候选列表中的候选者的数量达到预置数值。
当采用运动矢量第一候选列表中的候选者来对当前图像块进行预测时,若采用的候选者为除利用ATMVP技术所确定的候选者以外的至少一个候选者时,通过仿射运动模型根据该候选者导出当前图像块中的子图像块的运动矢量。当采用的候选者为利用ATMVP技术所确定的候选者时,则根据上文所描述的,根据相关块中的每个子图像块的运动矢量确定当前图像块中各子图像块的参考块,将各子图像块的参考块拼接成当前图像块的参考块,并根据该参考块计算当前图像块的残差。
下面结合图14和图15对本申请实施例提供的一种视频图像处理方法进行举例描述。如图14所示,该方法包括如下步骤。
S1410,对当前图像块的预设的M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M。可选的,M小于等于4。
S1420,根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块。
S1430,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应。
S1440,根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
对图14所示的视频图像处理方法的解释可参考上文,在此不再赘述。
如图15所示,该方法包括如下步骤。
S1510,根据当前图像块的运动矢量第二候选列表中的M个候选者确定所述当前图像块的M个邻近块。
S1520,对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M。可选的,M小于等于4。
S1530,根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块。
S1540,根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者。其中,该特定候选者可以是上文所提到的根据ATMVP技术所确定的候选者。
S1550,当确定采用所述特定候选者时,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应。
S1560,根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
对图15所示的视频图像处理方法的解释可参考上文,在此不再赘述。
图16为本申请实施例提供的视频图像处理装置1600的示意性框图。该装置1600用于执行如图14所示的方法实施例。该装置1600包括如下单元。
构建模块1610,对当前图像块的预设的M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
预测模块1620,根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
一个示例中,N等于1或2。
一个示例中,预测模块还用于:在所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测之前将所述相关块的代表运动矢量作为候选者加入运动矢量第一候选列表;
当确定采用所述候选者时,所述预测模块根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
一个示例中,所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测,包括:
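using the motion vector of each sub-image block in the related block as the motion vector of the corresponding sub-image block in the current image block.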
一个示例中,将所述相关块的代表运动矢量作为第一个候选者加入运动矢量第一候选列表。
一个示例中,所述相关块的代表运动矢量包括所述相关块的中心位置的运动矢量。
一个示例中,预测模块还用于:当所述相关块中出现不可获得运动矢量的子图像块时,将所述相关块的代表运动矢量作为所述不可获得运动矢量的子图像块的运动矢量,对所述当前图像块中对应的子图像块进行预测。
一个示例中,预测模块还用于:当所述相关块中出现不可获得运动矢量的子图像块,且所述相关块的代表运动矢量不可获得时,放弃根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
一个示例中,预测模块还用于:当所述相关块中的子图像块不可获得,或者所述相关块中的子图像块采用帧内编码模式时,确定所述相关块中出现不可获得运动矢量的子图像块。
一个示例中,构建模块还用于:确定其他候选者,将所述其他候选者加入所述运动矢量第一候选列表,其中,所述其他候选者中的至少一个候选者包括子图像块的运动矢量。
一个示例中,构建模块还用于:当确定采用所述其他候选者中的其中一个候选者时,根据所述采用的候选者确定所述当前图像块中的子图像块的运动矢量。
一个示例中,所述至少一个候选者包括一组控制点的运动矢量。
一个示例中,预测模块还用于:
当确定采用所述至少一个候选者中的候选者时,根据仿射变换模型对所述采用的候选者进行仿射变换;
根据所述仿射变换后的候选者对所述当前图像块中的子图像块进行预测。
一个示例中,当所述仿射变换模型包括四参仿射变换模型时,所述至少一个候选者中,每个候选者包括2个控制点的运动矢量;
当所述仿射变换模型包括六参仿射变换模型时,所述至少一个候选者中,每个候选者包括3个控制点的运动矢量。
一个示例中,构建模块还用于:从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组;
将每一个确定的邻近块的控制点运动矢量组作为一个候选者加入所述运动矢量第一候选列表。
一个示例中,所述从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组,包括:
在所述当前图像块的左侧邻近块中按第一扫描顺序确定第一邻近块的控制点运动矢量组;
在所述当前图像块的上侧邻近块中按第二扫描顺序确定第二邻近块的控制点运动矢量组;
将所述第一邻近块的控制点运动矢量组和所述第二邻近块的控制点运动矢量组加入所述运动矢量第一候选列表。
一个示例中,构建模块还用于:根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量;
将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
一个示例中,所述根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量,包括:
对所述部分控制点中的每个控制点,按第三扫描顺序对所述控制点的特定邻近块依次扫描,将满足预设条件的特定邻近块的运动矢量作为所述控制点的运动矢量。
一个示例中,构建模块还用于:
当所述部分控制点的运动矢量分别指向不同的参考帧时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
一个示例中,当所述运动矢量第一候选列表中的候选者的数量大于预设数值时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
一个示例中,构建模块还用于:
构建运动矢量第二候选列表,其中,加入所述运动矢量第二候选列表的候选者为一个图像块的运动矢量;
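when it is determined that a candidate in the second motion vector candidate list is adopted, determining the motion vector of the current image block according to the motion vector of that candidate.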
一个示例中,所述根据所述候选者的运动矢量确定所述当前图像块的运动矢量,包括:
将所述确认采用的候选者作为所述当前图像块的运动矢量,或者对所述确认采用的候选者进行缩放后作为所述当前图像块的运动矢量。
一个示例中,所述构建运动矢量第二候选列表,包括:
根据所述当前图像块在当前图像上的若干个邻近块的运动矢量确定加入所述运动矢量第二候选列表的候选者。
一个示例中,所述当前图像块在当前图像上的若干个邻近块包括所述预设的M个邻近块。
一个示例中,构建模块还用于:
按预设顺序依次将所述预设的M个邻近块的运动矢量分别作为M个候选者,加入所述运动矢量第二候选列表;
所述N个邻近块,指的是按所述预设顺序首先确定的N个邻近块。
一个示例中,构建模块还用于:
当所述M个邻近块中的一个或多个邻近块的运动矢量不可获得时,放弃根据所述一个或多个邻近块的运动矢量确定加入所述运动矢量第二候选列表的候选者。
一个示例中,所述对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,包括:
对所述N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块。
一个示例中,所述根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块,包括:
将所述第一个符合预设条件的邻近块作为所述目标邻近块。
一个示例中,所述预设条件包括:
邻近块的参考图像与所述当前图像块的参考图像相同。
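In one example, the construction module is further configured to, when no neighboring block meeting the preset condition is found among the N neighboring blocks, scale the motion vector of a specific neighboring block among the M neighboring blocks, and the prediction module is further configured to predict the current image block according to the scaled motion vector.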
一个示例中,所述根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
根据所述缩放处理后的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块。
一个示例中,所述特定邻近块为所述N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
一个示例中,所述对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
对所述特定邻近块的运动矢量进行缩放处理,使得经过所述缩放处理后的运动矢量指向的参考帧与所述当前图像块的参考图像相同;
将经过所述缩放处理后的运动矢量在所述当前图像块的参考图像中指向的图像块作为所述当前图像块的参考块。
一个示例中,当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,将默认块作为所述当前图像块的参考块。
一个示例中,所述默认块为运动矢量(0,0)指向的图像块。
一个示例中,所述子图像块的大小和/或所述子图像块的相关块的大小固定为大于或等于64个像素。
一个示例中,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,或16×4个像素或4×16个像素。
一个示例中,在ATMVP技术中的子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素或者其宽和高中的至少一个小于8像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
一个示例中,所述当前图像块为一个编码单元CU。
一个示例中,所述根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块,包括:
根据所述目标邻近块的运动矢量,在所述当前图像块的参考图像中确定所述当前图像块的相关块。
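In one example, the neighboring block is an image block in the current image that is adjacent to the current image block or separated from it by a certain positional distance.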
一个示例中,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
当所述相关块的参考图像为特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
一个示例中,所述处理后的所述相关块的运动矢量,包括:
根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
跳过缩放步骤的所述相关块的运动矢量。
一个示例中,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
当所述相关块的参考图像为特定参考图像,或者所述当前块的参考图像为特定参考图像时,放弃根据所述相关块的运动矢量确定所述当前图像块的参考块。
一个示例中,构建模块还用于:
当所述特定邻近块的运动矢量指向特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
一个示例中,所述处理后的所述相关块的运动矢量,包括:
根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
跳过缩放步骤的所述相关块的运动矢量。
一个示例中,M小于等于4。
图17为本申请实施例提供的视频图像处理装置1700的示意性框图。该装置1700用于执行如图15所示的方法实施例。该装置1700包括如下单元。
构建模块1710,用于根据当前图像块的运动矢量第二候选列表中的M个候选者确定所述当前图像块的M个邻近块;对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者;当确定采用所述特定候选者时,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
预测模块1720,用于根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
一个示例中,所述运动矢量第一候选列表中的至少一个候选者包括子图像块的运动矢量,所述运动矢量第二候选列表中的每个候选者包括图像块的运动矢量。
一个示例中,N等于1或2。
一个示例中,所述M个候选者包括所述当前图像块在当前图像上的M个邻近块的运动矢量。
一个示例中,所述对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,包括:
对所述N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块。
一个示例中,所述根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块,包括:
将所述第一个符合预设条件的邻近块作为所述目标邻近块。
一个示例中,所述预设条件包括:
邻近块的参考图像与所述当前图像块的参考图像相同。
一个示例中,构建模块还用于当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,预测模块还用于根据所述缩放处理后的运动矢量对所述当前图像块进行预测。
In one example, predicting the current image block according to the scaled motion vector includes:
根据所述缩放处理后的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块。
一个示例中,所述特定邻近块为所述N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
一个示例中,所述对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
对所述特定邻近块的运动矢量进行缩放处理,使得经过所述缩放处理后的运动矢量指向的参考帧与所述当前图像块的参考图像相同;
将经过所述缩放处理后的运动矢量在所述当前图像块的参考图像中指向的图像块作为所述当前图像块的参考块。
一个示例中,当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,将默认块作为所述当前图像块的参考块。
一个示例中,所述默认块为运动矢量(0,0)指向的图像块。
一个示例中,所述子图像块的大小和/或所述子图像块的相关块的大小固定为大于或等于64个像素。
一个示例中,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,或16×4个像素或4×16个像素。
一个示例中,在ATMVP技术中的子图像块的大小和/或所述子图像块的相关块的大小为8×8个像素或者其宽和高中的至少一个小于8像素的情况下,设置不进行TMVP操作,可以跳过部分冗余操作,有效节省编解码的时间,提高编码效率。
一个示例中,所述当前图像块为一个编码单元CU。
一个示例中,所述根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块,包括:
根据所述目标邻近块的运动矢量,在所述当前图像块的参考图像中确定所述当前图像块的相关块。
一个示例中,所述邻近块为在所述当前图像上与所述当前图像块的位置相邻或具有一定位置间距的图像块。
In one example, predicting the current image block according to the motion vector of the related block includes:
当所述相关块的参考图像为特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
一个示例中,所述处理后的所述相关块的运动矢量,包括:
根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
跳过缩放步骤的所述相关块的运动矢量。
一个示例中,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
当所述相关块的参考图像为特定参考图像,或者所述当前块的参考图像为特定参考图像时,放弃根据所述相关块的运动矢量确定所述当前图像块的参考块。
一个示例中,构建模块还用于:当所述特定邻近块的运动矢量指向特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
一个示例中,所述处理后的所述相关块的运动矢量,包括:
根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
跳过缩放步骤的所述相关块的运动矢量。
一个示例中,所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测,包括:
将所述相关块中各子图像块的运动矢量,分别作为所述当前图像块中对应的子图像块的运动矢量。
一个示例中,所述根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者,包括:
将所述当前图像块的相关块的代表运动矢量作为所述特定候选者加入所述运动矢量第一候选列表。
一个示例中,将所述相关块的代表运动矢量作为第一个候选者加入运动矢量第一候选列表。
一个示例中,所述相关块的代表运动矢量包括所述相关块的中心位置的运动矢量。
一个示例中,预测模块还用于:当所述相关块中出现不可获得运动矢量的子图像块时,将所述相关块的代表运动矢量作为所述不可获得运动矢量的子图像块的运动矢量,对所述当前图像块中对应的子图像块进行预测。
一个示例中,预测模块还用于:当所述相关块中出现不可获得运动矢量的子图像块,且所述相关块的代表运动矢量不可获得时,放弃根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
一个示例中,预测模块还用于:当所述相关块中的子图像块不可获得,或者所述相关块中的子图像块采用帧内编码模式时,确定所述相关块中出现不可获得运动矢量的子图像块。
一个示例中,预测模块还用于:当确定采用所述运动矢量第二候选列表中除所述特定候选者以外的其中一个候选者时,根据仿射变换模型对所述采用的候选者进行仿射变换;
根据所述仿射变换后的候选者对所述当前图像块中的子图像块进行预测。
一个示例中,所述运动矢量第二候选列表中除所述特定候选者以外至少一个候选者中,每个候选者包括一组控制点的运动矢量。
一个示例中,当所述仿射变换模型包括四参仿射变换模型时,所述至少一个候选者中,每个候选者包括2个控制点的运动矢量;
当所述仿射变换模型包括六参仿射变换模型时,所述至少一个候选者中,每个候选者包括3个控制点的运动矢量。
一个示例中,预测模块还用于:从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组;
将每一个确定的邻近块的控制点运动矢量组作为一个候选者加入所述运动矢量第一候选列表。
一个示例中,所述从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组,包括:
在所述当前图像块的左侧邻近块中按第一扫描顺序确定第一邻近块的控制点运动矢量组;
在所述当前图像块的上侧邻近块中按第二扫描顺序确定第二邻近块的控制点运动矢量组;
将所述第一邻近块的控制点运动矢量组和所述第二邻近块的控制点运动矢量组加入所述运动矢量第一候选列表。
一个示例中,构建模块还用于:根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量;
将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
一个示例中,所述根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量,包括:
对所述部分控制点中的每个控制点,按第三扫描顺序对所述控制点的特定邻近块依次扫描,将满足预设条件的特定邻近块的运动矢量作为所述控制点的运动矢量。
一个示例中,构建模块还用于:当所述部分控制点的运动矢量分别指向不同的参考帧时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
一个示例中,构建模块还用于:当所述运动矢量第一候选列表中的候选者的数量大于预设数值时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
一个示例中,构建模块还用于:构建运动矢量第二候选列表,其中,加入所述运动矢量第二候选列表的候选者为一个图像块的运动矢量;
当确认采用所述运动矢量第二候选列表中的候选者时,根据所述候选者的运动矢量确定所述当前图像块的运动矢量。
一个示例中,所述根据所述候选者的运动矢量确定所述当前图像块的运动矢量,包括:
将所述确认采用的候选者作为所述当前图像块的运动矢量,或者对所述确认采用的候选者进行缩放后作为所述当前图像块的运动矢量。
一个示例中,所述构建运动矢量第二候选列表,包括:
根据所述当前图像块在当前图像上的M个邻近块的运动矢量确定加入所述运动矢量第二候选列表的所述M个候选者。
一个示例中,构建模块还用于:按预设顺序依次将所述预设的M个邻近块的运动矢量分别作为M个候选者,加入所述运动矢量第二候选列表;
所述N个邻近块,指的是按所述预设顺序首先确定的N个邻近块。
一个示例中,构建模块还用于:当所述M个邻近块中的一个或多个邻近块的运动矢量不可获得时,放弃根据所述一个或多个邻近块的运动矢量确定加入所述运动矢量第二候选列表的候选者。
一个示例中,M小于等于4。
如图18所示,本申请实施例还提供一种视频图像处理方法1800,该方法包括如下步骤,
S1810,确定基础运动矢量列表,该基础运动矢量列表中包括至少一组双预测基础运动矢量组,该双预测基础运动矢量组中包括第一基础运动矢量和第二基础运动矢量;
S1820,从预设的偏移量集中确定两个运动矢量偏移量,该两个运动矢量偏移量分别对应于该第一基础运动矢量和该第二基础运动矢量;
S1830,根据该第一基础运动矢量、该第二基础运动矢量和该两个的运动矢量偏移量,确定当前图像块的运动矢量;
S1840,根据该当前图像块的运动矢量对该当前图像块进行预测。
本申请实施例的视频图像处理方法,以预设的偏移量集为基础对双预测基础运动矢量组中的基础运动矢量进行偏移,通过有限次计算便可以得到当前图像块的更精确的运动矢量,使得预测得到的残差更小,从而能够提高编码效率。
在一些实施例中,本申请实施例的视频图像处理方法可以用于改进合并运动矢量差异(Merge with Motion Vector Difference,MMVD)技术,也称为终极运动矢量表达(Ultimate motion vector expression,UMVE)技术。尤其是应用在MMVD技术构建合并(merge)候选列表(merge list),也称为运动矢量候选列表中。
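In some embodiments, before S1810, the video image processing method 1800 may further include: obtaining a merge candidate list that includes P groups of merge motion vector candidates, where P is an integer greater than or equal to 1. Determining the base motion vector list in S1810 may include: determining the base motion vector list according to the merge candidate list. For example, when P is greater than or equal to 2, two groups of merge motion vector candidates are taken from the merge candidate list to form the base motion vector list. Optionally, the two groups may be the first two groups in the merge candidate list, or two other groups when the first two do not meet the condition. Alternatively, the two groups may be any two groups in the merge candidate list that meet the condition, which is not limited in the embodiments of this application. As another example, when P is less than 2, the base motion vector list is filled with the motion vector (0,0).
Specifically, the MMVD technique may first build the base motion vector list (base MVP list) from the merge motion vector candidates in an existing merge candidate list (or in any of the motion vector candidate lists obtained under the different modes described above). For example, the merge motion vector candidates in the existing merge candidate list (merge list) are traversed; if the number of groups of merge motion vector candidates in the merge list is greater than or equal to 2, the first 2 groups in the merge list form the MMVD base MVP list; otherwise, MV (0,0) is used to fill the MMVD base MVP list. The base motion vector list may also be filled with other default motion vectors, such as (1,1) or (2,2), which is not limited in the embodiments of this application.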
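A minimal sketch of building the base list, assuming each entry is a single `(x, y)` vector and that a short merge list is padded (rather than replaced) with the default vector. The text leaves that detail open, so the padding behavior here is an assumption.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def build_base_mvp_list(merge_list: List[MV], size: int = 2,
                        default: MV = (0, 0)) -> List[MV]:
    """Take the first `size` merge candidates; pad with the default MV if short."""
    base = list(merge_list[:size])
    while len(base) < size:
        base.append(default)
    return base
```

Passing `default=(1, 1)` would model the alternative default vectors the text mentions.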
应理解,base MVP list中的两组基础MV中的每组基础MV可以为单向预测的基础运动矢量,也可以为双预测基础运动矢量组。当然,base MVP list中可以包括更多组或者更少组基础MV,本申请对此不作限定。本文仅对双预测基础运动矢量组的情况展开讨论,双预测基础运动矢量组中所包括的两个基础运动矢量可以是前向基础运动矢量和后向基础运动矢量;也可以是方向相同的两个基础运动矢量,例如两个前向基础运动矢量或两个后向基础运动矢量。
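Optionally, in some embodiments, determining two motion vector offsets from the preset offset set in S1820 may include: determining, from the offset set, multiple motion vector offset combinations each containing two motion vector offsets. Determining the motion vector of the current image block in S1830 according to the first base motion vector, the second base motion vector, and the two motion vector offsets may include: determining, from the multiple combinations, the motion vector offset combination whose rate-distortion cost meets a preset condition, and determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and that combination. Optionally, the rate-distortion cost meeting the preset condition may mean that the cost is below a preset threshold or that the cost is minimal, for example that the prediction residual is minimal, which is not limited in the embodiments of this application.
In the embodiments of this application, the MMVD technique can offset the base motion vector predictor according to certain rules to generate new motion vector prediction candidates, which are placed in the MMVD motion vector candidate list as MMVD motion vector predictors. Optionally, in some embodiments, there are 8 choices for the motion vector offset in the offset set (2¹, 2², …, 2⁸), that is, the preset offset set is {2,4,8,16,32,64,128,256}. For example, suppose the base MVP list contains 2 groups of base motion vectors (also called base motion vector candidates) and the motion vector offset has the 8 choices above. For either of the 2 components MV_x and MV_y of a group of base MVs in the MMVD base MVP list, the offset can be added or subtracted (2 choices). Specifically, for example, MMVD motion vector predictor = base motion vector predictor + motion vector offset. The motion vector of the current image block under MMVD therefore has 2×8×2×2=64 refine modes in total. The embodiments may also select a subset of the 64 refine modes, or derive more refine modes from them, and are not limited to 64; for example, the number of refine modes may be 32 or 128, which is not limited in the embodiments of this application. From these motion vectors, the one whose rate-distortion cost meets the preset condition is determined as the motion vector of the current image block for encoding and/or decoding.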
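The count of 64 refine modes can be reproduced with a short enumeration. This sketch treats each base entry as a single `(x, y)` vector and applies the offset to exactly one component with one sign; the function name and that data layout are assumptions.

```python
from itertools import product
from typing import List, Tuple

MV = Tuple[int, int]

OFFSETS = [2 ** k for k in range(1, 9)]  # {2, 4, 8, 16, 32, 64, 128, 256}


def mmvd_candidates(base_list: List[MV]) -> List[MV]:
    """Enumerate base x offset x component x sign refinements."""
    out: List[MV] = []
    for (bx, by), off, axis, sign in product(base_list, OFFSETS, ("x", "y"), (1, -1)):
        delta = sign * off
        out.append((bx + delta, by) if axis == "x" else (bx, by + delta))
    return out
```

An encoder would evaluate a rate-distortion cost for each enumerated vector and keep the one that meets its preset condition; that cost function is outside the scope of this sketch.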
从另一个角度理解,确定出使得率失真损失满足预设条件的运动矢量也可以等价认为是从所述多组运动矢量偏移量组合中确定出使得率失真损失满足预设条件的运动矢量偏移量组合,根据所述第一基础运动矢量、所述第二基础运动矢量和所述使得率失真损失满足预设条件的运动矢量偏移量组合,确定当前图像块的运动矢量。
应理解,运动矢量偏移量组合中的两个运动矢量偏移量可以相同也可以不同。
可选地,包括两个运动矢量偏移量的多组运动矢量偏移量组合可以是遍历预设的偏移量集中的运动矢量偏移量,以形成多组运动矢量偏移量组合。
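Alternatively, one motion vector offset in each of the multiple combinations may be a motion vector offset from the offset set that is calculated by a preset algorithm. This calculated offset is treated as a fixed value, and the motion vector offsets in the offset set are traversed to serve as the other offset in each combination, thereby forming the multiple motion vector offset combinations. Because no scaling operation is performed (equivalently, a scaling operation with a ratio of 1 is performed) and a simple traversal is used instead to find a suitable combination, the overall amount of computation can be reduced and encoding/decoding efficiency improved.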
或者,多组运动矢量偏移量组合中的其中一个运动矢量偏移量,可以是通过预设算法计算得到的偏移量集中的一个运动矢量偏移量,将该计算得到的运动矢量偏移量作为一个固定的值,通过对该固定的值进行缩放操作得到多组运动矢量偏移量组合中的另一个运动矢量偏移量,以形成多组运动矢量偏移量组合。
或者,包括两个运动矢量偏移量的多组运动矢量偏移量组合还可以以其他方式形成,本申请实施例对此不作限定。
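Optionally, in some embodiments, when the first base motion vector and the second base motion vector both point to non-specific reference images, the selected motion vector offset may be scaled to obtain new motion vector offsets, and the first and second base motion vectors are then adjusted by the new offsets. The motion vector of the current image block is determined according to the first base motion vector, the second base motion vector, and the two new motion vector offsets, that is, according to the adjusted base motion vectors; the current image block is then predicted according to its motion vector. In other words, in bi-prediction, when the motion vector predictor of the current image block is determined with the MMVD technique, if the current image is at different distances from the reference images of the two base motion vectors, the motion vector offset needs to be scaled, and the scaled offset is then added to (or subtracted from) the base motion vector predictor to determine the motion vector of the current image block.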
当所述第一基础运动矢量和第二基础运动矢量均指向非特定参考图像时,所述两个运动矢量偏移量用于:根据所述两个运动矢量偏移量对所述第一基础运动矢量和第二基础运动矢量进行调整。可选地,所述当前图像到所述第一基础运动矢量的参考图像的距离与所述当前图像到所述第二基础运动矢量的参考图像的距离之比,等于所述第一基础运动矢量所使用的运动矢量偏移量与所述第二基础运动矢量所使用的运动矢量偏移量之比。换而言之,缩放比例由当前图像块(例如第一图像块)所在图像与当前图像块的在两个参考方向上的参考图像之间的距离决定,这两个参考图像即为当前基础运动矢量的参考图像。该实现方式可以进一步减少需尝试的运动矢量偏移量组合的数量,可以进一步提高编/解码效率。
应理解,调整可以是将两个运动矢量偏移量分别与第一基础运动矢量和第二基础运动矢量相加;或者调整可以是将两个运动矢量偏移量分别与第一基础运动矢量和第二基础运动矢量相减。
例如,在编码或解码过程中,可选的运动矢量偏移量有{2,4,8,16,32,64,128,256}这8种选择。记当前过程选择的运动矢量偏移量为X。记当前图像的帧号为P2,第一基础运动矢量的参考图像的帧号、第二基础运动矢量的参考图像的帧号分别为P0、P1。如果P2-P0=P2-P1,则对两个基础运动矢量加相同大小的运动矢量偏移量X;如果P2-P0=2*(P2-P1),则在第一基础运动矢量上加运动矢量偏移量2*X,在第二基础运动矢量上加运动矢量偏移量X;同理,如果P2-P1=2*(P2-P0),则在第一基础运动矢量上加运动矢量偏移量X,在第二基础运动矢量上加运动矢量偏移量2*X。
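In addition, if the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is not a power of 2, a suitable ratio (a power of 2) is selected so that the ratio of the offset used for the first base motion vector to the offset used for the second base motion vector is as close as possible to the ratio of the two distances. For example, if P2-P0=3*(P2-P1), the ratio 2 or 4 is selected: the offset 2*X is added to the first base motion vector and the offset X to the second, or the offset 4*X is added to the first base motion vector and the offset X to the second. As another example, if P2-P0=5*(P2-P1), the ratio selected is 4, the power of 2 closest to 5: the offset 4*X is added to the first base motion vector and the offset X to the second.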
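The distance-proportional offsets, including the snap to a nearby power of two, can be sketched as below. The use of absolute POC distances, the tie-breaking toward the smaller power (so a ratio of 3 gives 2), and the function names are assumptions; the text allows either 2 or 4 for a ratio of 3.

```python
from fractions import Fraction
from typing import Tuple


def nearest_power_of_two(ratio: Fraction) -> int:
    """Return the power of two closest to ratio (ratio >= 1); ties go to the smaller."""
    k = 1
    while k * 2 <= ratio:
        k *= 2
    # Now k <= ratio < 2k: pick whichever bound is closer.
    return k if (ratio - k) <= (2 * k - ratio) else 2 * k


def offset_pair(x: int, p2: int, p0: int, p1: int) -> Tuple[int, int]:
    """Offsets for the first and second base MVs given POCs P2 (current), P0, P1.

    Both reference pictures are assumed to differ from the current picture.
    """
    d0, d1 = abs(p2 - p0), abs(p2 - p1)
    if d0 >= d1:
        return nearest_power_of_two(Fraction(d0, d1)) * x, x
    return x, nearest_power_of_two(Fraction(d1, d0)) * x
```

For instance, with X = 4, P2 = 8, P0 = 4, and P1 = 6 the first distance is twice the second, so the first base motion vector receives 2*X.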
当所述第一基础运动矢量和/或第二基础运动矢量指向特定参考图像(即当前图像块的参考图像为特定参考图像)时,由于特定参考图像与当前图像块所在图像的时间距离定义不明确,对运动矢量偏移量的缩放便没有意义。
可选地,在一些实施例中,当所述第一基础运动矢量和/或第二基础运动矢量指向特定参考图像时,所述两个运动矢量偏移量中的至少一个运动矢量偏移量,包括:根据数值为1的缩放比例对初始运动矢量偏移量进行缩放后得到的运动矢量偏移量,或者,跳过缩放操作得到的运动矢量偏移量。
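In other words, when the first base motion vector and/or the second base motion vector points to a specific reference image, the motion vector of the current image block is determined according to the processed motion vector offsets and the base motion vector group, where each processed motion vector offset is the same as the offset before processing. For example, at least one of the two processed motion vector offsets includes: a motion vector offset obtained by scaling an initial motion vector offset with a scaling ratio of 1, or a motion vector offset obtained by skipping the scaling operation.
In a specific embodiment, one of the two motion vector offsets (for example, offset1) may be calculated by a preset algorithm. When the first base motion vector points to a specific reference image, or the second base motion vector points to a specific reference image, or both point to specific reference images, the other motion vector offset (for example, offset2) may be obtained by scaling offset1 with a scaling ratio of 1, or by not scaling offset1. Alternatively, the other motion vector offset (for example, offset2) may be obtained by scaling some initial offset with a scaling ratio of 1 or by not scaling it. In other words, the initial motion vector offset may be offset1 or some other initial offset, which is not limited in the embodiments of this application.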
在本申请的一些实施例中,方法由编码端执行,该方法还包括:根据预测的结果进行编码并向解码端发送码流,所述码流中包括用于指示使得率失真损失满足预设条件的运动矢量偏移量组合的索引。在该实施例中,编码端将确定好的运动矢量偏移量组合的索引告知解码端,这样解码端可以通过少量的计算便可以得知两个运动矢量偏移量,可以简化解码端。可选地,当两个运动矢量偏移量中的某一个运动矢量偏移量(例如为offset1)可以是通过预设算法计算得到的时,该索引可以是另一个运动矢量偏移量(例如为offset2)与offset1的比值。
在本申请的一些实施例中,方法由解码端执行,该方法还包括:接收编码端发送的码流,所述码流中包括用于指示两个运动矢量偏移量的形成组合的索引;所述从预设的偏移量集中确定两个运动矢量偏移量,包括:根据所述索引确定所述两个运动矢量偏移量。在该实施例中,编码端将确定好的运动矢量偏移量组合的索引告知解码端,这样解码端可以通过少量的计算便可以得知两个运动矢量偏移量,可以简化解码端。可选地,当两个运动矢量偏移量中的某一个运动矢量偏移量(例如为offset1)可以是通过预设算法计算得到的时,该索引可以是另一个运动矢量偏移量(例如为offset2)与offset1的比值。
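The method embodiments of this application have been described above with reference to FIG. 18; the apparatus embodiments corresponding to those method embodiments are described below. It should be understood that the descriptions of the apparatus embodiments correspond to those of the method embodiments, so for content not described in detail, refer to the preceding method embodiments; for brevity, it is not repeated here.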
图19为本申请实施例提供的视频图像处理装置1900的示意性框图。该装置1900用于执行如图18所示的方法实施例。该装置1900包括如下模块。
构建模块1910,用于确定基础运动矢量列表,所述基础运动矢量列表中包括至少一组双预测基础运动矢量组,所述双预测基础运动矢量组中包括第一基础运动矢量和第二基础运动矢量;从预设的偏移量集中确定两个运动矢量偏移量,所述两个运动矢量偏移量分别对应于所述第一基础运动矢量和所述第二基础运动矢量;根据所述第一基础运动矢量、所述第二基础运动矢量和所述两个的运动矢量偏移量,确定当前图像块的运动矢量;
预测模块1920,用于根据所述当前图像块的运动矢量对所述当前图像块进行预测。
本申请实施例的视频图像处理装置,以预设的偏移量集为基础对双预测基础运动矢量组中的基础运动矢量进行偏移,通过有限次计算便可以得到当前图像块的更精确的运动矢量,使得预测得到的残差更小,从而能够提高编码效率。
可选地,在一些实施例中,所述构建模块1910从预设的偏移量集中确定两个运动矢量偏移量,包括:所述构建模块从所述偏移量集中确定包括两个运动矢量偏移量的多组运动矢量偏移量组合;所述构建模块1910根据所述第一基础运动矢量、所述第二基础运动矢量和所述两个的运动矢量偏移量,确定当前图像块的运动矢量,包括:所述构建模块1910从所述多组运动矢量偏移量组合中确定出使得率失真损失满足预设条件的运动矢量偏移量组合,根据所述第一基础运动矢量、所述第二基础运动矢量和所述使得率失真损失满足预设条件的运动矢量偏移量组合,确定当前图像块的运动矢量。
可选地,在一些实施例中,视频图像处理装置1900用于编码端,视频图像处理装置1900还包括发送模块,用于:根据预测的结果进行编码并向解码端发送码流,所述码流中包括用于指示使得率失真损失满足预设条件的运动矢量偏移量组合的索引。
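Optionally, in some embodiments, the video image processing apparatus 1900 is used at the decoding end and further includes a receiving module configured to receive a bitstream sent by the encoding end, the bitstream including an index indicating the combination formed by the two motion vector offsets. The construction module 1910 determining two motion vector offsets from the preset offset set includes: determining the two motion vector offsets according to the index.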
可选地,在一些实施例中,当所述第一基础运动矢量和/或第二基础运动矢量指向特定参考图像时,所述两个运动矢量偏移量中的至少一个运动矢量偏移量,包括:根据数值为1的缩放比例对初始运动矢量偏移量进行缩放后得到的运动矢量偏移量,或者,跳过缩放操作得到的运动矢量偏移量。
可选地,在一些实施例中,当所述第一基础运动矢量和第二基础运动矢量均指向非特定参考图像时,所述两个运动矢量偏移量用于:根据所述两个运动矢量偏移量对所述第一基础运动矢量和第二基础运动矢量进行调整。
可选地,在一些实施例中,所述当前图像到所述第一基础运动矢量的参考图像的距离与所述当前图像到所述第二基础运动矢量的参考图像的距离之比,等于所述第一基础运动矢量所使用的运动矢量偏移量与所述第二基础运动矢量所使用的运动矢量偏移量之比。
可选地,在一些实施例中,所述构建模块1910还用于:获取合并候选列表,所述合并候选列表中包括P组合并运动矢量候选,其中,P为大于或等于1的整数;所述构建模块1910确定基础运动矢量列表,包括:所述构建模块1910根据所述合并候选列表,确定所述基础运动矢量列表。
可选地,在一些实施例中,所述构建模块1910根据所述合并候选列表,确定所述基础运动矢量列表,包括:在P大于或等于2时,所述构建模块1910取所述合并候选列表中前两组合并运动矢量候选形成所述基础运动矢量列表。
可选地,在一些实施例中,所述构建模块1910根据所述合并候选列表,确定所述基础运动矢量列表,包括:在P小于2时,所述构建模块1910以运动矢量(0,0)填充形成所述基础运动矢量列表。
可选地,在一些实施例中,所述预设的偏移量集为{2,4,8,16,32,64,128,256}。
可选地,在一些实施例中,所述当前图像块为一个编码单元CU。
可选地,在一些实施例中,所述当前图像块为双预测图像块。
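FIG. 20 is a schematic block diagram of a video image processing apparatus 2000 provided by an embodiment of this application. As shown in FIG. 20, the video image processing apparatus 2000 may include a processor 2010 and a memory 2020. The memory 2020 stores computer instructions, and when the processor 2010 executes the computer instructions, the video image processing apparatus 2000 performs the following steps: determining a base motion vector list that includes at least one bi-prediction base motion vector group, the bi-prediction base motion vector group including a first base motion vector and a second base motion vector; determining two motion vector offsets from a preset offset set, the two motion vector offsets corresponding to the first base motion vector and the second base motion vector respectively; determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets; and predicting the current image block according to the motion vector of the current image block.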
应理解,本申请实施例的视频图像处理装置2000还可以包括网络接口,以用于传输码流。例如接收编码设备发送的码流。
在本申请的一些实施例中,处理器2010从预设的偏移量集中确定两个运动矢量偏移量,包括:从所述偏移量集中确定包括两个运动矢量偏移量的多组运动矢量偏移量组合;处理器2010根据所述第一基础运动矢量、所述第二基础运动矢量和所述两个的运动矢量偏移量,确定当前图像块的运动矢量,包括:从所述多组运动矢量偏移量组合中确定出使得率失真损失满足预设条件的运动矢量偏移量组合,根据所述第一基础运动矢量、所述第二基础运动矢量和所述使得率失真损失满足预设条件的运动矢量偏移量组合,确定当前图像块的运动矢量。
在本申请的一些实施例中,视频图像处理装置2000用于编码端,处理器2010还用于根据预测的结果进行编码并向解码端发送码流,所述码流中包括用于指示使得率失真损失满足预设条件的运动矢量偏移量组合的索引。
在本申请的一些实施例中,视频图像处理装置2000用于解码端,处理器2010还用于接收编码端发送的码流,所述码流中包括用于指示两个运动矢量偏移量的形成组合的索引;处理器2010从预设的偏移量集中确定两个运动矢量偏移量,包括:根据所述索引确定所述两个运动矢量偏移量。
在本申请的一些实施例中,当所述第一基础运动矢量和/或第二基础运动矢量指向特定参考图像时,所述两个运动矢量偏移量中的至少一个运动矢量偏移量,包括:根据数值为1的缩放比例对初始运动矢量偏移量进行缩放后得到的运动矢量偏移量,或者,跳过缩放操作得到的运动矢量偏移量。
在本申请的一些实施例中,当所述第一基础运动矢量和第二基础运动矢量均指向非特定参考图像时,所述两个运动矢量偏移量用于:根据所述两个运动矢量偏移量对所述第一基础运动矢量和第二基础运动矢量进行调整。
在本申请的一些实施例中,所述当前图像到所述第一基础运动矢量的参考图像的距离与所述当前图像到所述第二基础运动矢量的参考图像的距离之比,等于所述第一基础运动矢量所使用的运动矢量偏移量与所述第二基础运动矢量所使用的运动矢量偏移量之比。
在本申请的一些实施例中,处理器2010还用于:获取合并候选列表,所述合并候选列表中包括P组合并运动矢量候选,其中,P为大于或等于1的整数;处理器2010确定基础运动矢量列表,包括:根据所述合并候选列表,确定所述基础运动矢量列表。
在本申请的一些实施例中,处理器2010根据所述合并候选列表,确定所述基础运动矢量列表,包括:在P大于或等于2时,取所述合并候选列表中前两组合并运动矢量候选形成所述基础运动矢量列表。
在本申请的一些实施例中,处理器2010根据所述合并候选列表,确定所述基础运动矢量列表,包括:在P小于2时,以运动矢量(0,0)填充形成所述基础运动矢量列表。
在本申请的一些实施例中,预设的偏移量集为{2,4,8,16,32,64,128,256}。
在本申请的一些实施例中,当前图像块为一个编码单元CU。
在本申请的一些实施例中,当前图像块为双预测图像块。
应理解,图20所示的视频图像处理装置2000或图19所示的视频图像处理装置1900,可用于执行上述方法实施例中的操作或流程,并且视频图像处理装置2000或视频图像处理装置1900中的各个模块和器件的操作和/或功能分别为了实现上述方法实施例中的相应流程,为了简洁,在此不再赘述。
本申请实施例还提供一种视频图像处理方法,包括:确定基础运动矢量列表,所述基础运动矢量列表中包括基础运动矢量组;当所述基础运动矢量组中有至少一个基础运动矢量指向特定参考图像时,放弃根据所述基础运动矢量组和运动矢量偏移量确定所述当前图像块的运动矢量。即,当基础运动矢量组中有至少一个基础运动矢量指向特定参考图像时,放弃将对应的运动矢量放入MV候选列表中。
在本申请的一些实施例中,获取合并候选列表,所述合并候选列表中包括P组合并运动矢量候选,其中,P为大于或等于1的整数;所述确定基础运动矢量列表,包括:根据所述合并候选列表,确定所述基础运动矢量列表。
在本申请的一些实施例中,所述根据所述合并候选列表,确定所述基础运动矢量列表,包括:在P大于或等于2时,取所述合并候选列表中前两组合并运动矢量候选形成所述基础运动矢量列表。
在本申请的一些实施例中,所述根据所述合并候选列表,确定所述基础运动矢量列表,包括:在P小于2时,以运动矢量(0,0)填充形成所述基础运动矢量列表。
在本申请的一些实施例中,所述当前图像块为一个编码单元CU。
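Correspondingly, this application provides a video image processing apparatus, including: a determining module configured to determine a base motion vector list, the base motion vector list including a base motion vector group; and a processing module configured to, when at least one base motion vector in the base motion vector group points to a specific reference image, abandon determining the motion vector of the current image block according to the base motion vector group and the motion vector offset.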
在本申请的一些实施例中,所述视频图像处理装置还包括构建模块,用于:获取合并候选列表,所述合并候选列表中包括P组合并运动矢量候选,其中,P为大于或等于1的整数;所述确定模块具体用于:根据所述合并候选列表,确定所述基础运动矢量列表。
在本申请的一些实施例中,所述确定模块具体用于:在P大于或等于2时,取所述合并候选列表中前两组合并运动矢量候选形成所述基础运动矢量列表。
在本申请的一些实施例中,所述确定模块具体用于:在P小于2时,以运动矢量(0,0)填充形成所述基础运动矢量列表。
在本申请的一些实施例中,所述当前图像块为一个编码单元CU。
本申请还提供一种视频图像处理装置包括处理器和存储器,所述存储器中存储有计算机指令,所述处理器执行所述计算机指令时,使得所述视频图像处理装置执行以下步骤:确定基础运动矢量列表,所述基础运动矢量列表中包括基础运动矢量组;当所述基础运动矢量组中有至少一个基础运动矢量指向特定参考图像时,放弃根据所述基础运动矢量组和运动矢量偏移量确定所述当前图像块的运动矢量。
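It should be understood that the devices of the embodiments of this application may be implemented based on a memory and a processor. Each memory stores instructions for performing the methods of the embodiments of this application, and the processor executes those instructions so that the device performs the methods of the embodiments of this application.
It should be understood that the processor mentioned in the embodiments of this application may include a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP. The processor may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.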
还应理解,本申请实施例中提及的存储器可以是易失性存储器(volatile memory)或非易失性存储器(non-volatile memory),或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)、快闪存储器(flash memory)、硬盘(hard disk drive,HDD)或固态硬盘(solid-state drive,SSD)。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
需要说明的是,当处理器为通用处理器、DSP、ASIC、FPGA或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件时,存储器(存储模块)集成在处理器中。
应注意,本文描述的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
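The embodiments of the present application further provide a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to perform the steps of the above method embodiments.
The embodiments of the present application further provide a computing device including the above computer-readable storage medium.
The embodiments of the present application further provide a computer program product including instructions, wherein, when a computer runs the instructions of the computer program product, the computer performs the steps of the above method embodiments.
The embodiments of the present application further provide a computer chip that causes a computer to perform the steps of the above method embodiments.
The embodiments of the present application may be applied to aircraft, in particular to unmanned aerial vehicles.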
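It should be understood that the division into circuits, sub-circuits and sub-units in the embodiments of the present application is merely illustrative. A person of ordinary skill in the art will appreciate that the circuits, sub-circuits and sub-units of the examples described in the embodiments disclosed herein can be further split or combined.
The devices provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (for example coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example a floppy disk, hard disk, or magnetic tape), an optical medium (for example a high-density digital video disc (DVD)), or a semiconductor medium (for example an SSD).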
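It should be understood that "first", "second" and the various numerals used herein are distinctions made only for convenience of description and are not intended to limit the scope of the present application.
It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate three cases: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
It should be understood that, in the embodiments of the present application, the magnitude of the sequence numbers of the above processes does not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.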
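In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit.
The above is merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.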

Claims (223)

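  1. A video image processing method, comprising:
    scanning, in sequence, N neighboring blocks among M preset neighboring blocks of a current image block, and determining a target neighboring block according to the scan result, where N is less than M;
    determining a related block of the current image block according to the motion vector of the target neighboring block, the current image block, and a reference image of the current image block;
    dividing the current image block and the related block into a plurality of sub-image blocks in the same manner, the sub-image blocks in the current image block being in one-to-one correspondence with the sub-image blocks in the related block; and
    predicting each sub-image block in the current image block according to the motion vector of the corresponding sub-image block in the related block.
  2. The method according to claim 1, wherein N is equal to 1 or 2.
  3. The method according to claim 1 or 2, wherein, before predicting each sub-image block in the current image block according to the motion vector of the corresponding sub-image block in the related block, the method further comprises:
    adding a representative motion vector of the related block to a first motion vector candidate list as a candidate; and
    when it is determined that the candidate is adopted, predicting each sub-image block in the current image block according to the motion vector of the corresponding sub-image block in the related block.
  4. The method according to claim 3, wherein predicting each sub-image block in the current image block according to the motion vector of the corresponding sub-image block in the related block comprises:
    using the motion vector of each sub-image block in the related block as the motion vector of the corresponding sub-image block in the current image block.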
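As an illustration of the partition and one-to-one mapping in claims 1 to 4, the sketch below splits a block into equal sub-blocks and copies to each the motion vector stored at the same offset in the related block. The function names, the dictionary keyed by sub-block offset, and the default 8×8 sub-block size are assumptions made for the example, not part of the claimed method.

```python
from typing import Dict, List, Tuple

MV = Tuple[int, int]
Pos = Tuple[int, int]  # (x, y) offset of a sub-block from the block's top-left corner


def split_into_subblocks(width: int, height: int, sub: int = 8) -> List[Pos]:
    """Return the top-left offsets of sub x sub sub-blocks in raster order."""
    return [(x, y) for y in range(0, height, sub) for x in range(0, width, sub)]


def inherit_subblock_mvs(
    width: int, height: int, related_mvs: Dict[Pos, MV], sub: int = 8
) -> Dict[Pos, MV]:
    """Give each sub-block of the current block the MV found at the same
    offset in the related block (both blocks are split in the same way)."""
    return {pos: related_mvs[pos] for pos in split_into_subblocks(width, height, sub)}
```

Because both blocks are divided identically, the offset of a sub-block is enough to identify its counterpart in the related block.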
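  5. The method according to claim 3, wherein the representative motion vector of the related block is added to the first motion vector candidate list as the first candidate.
  6. The method according to claim 3, wherein the representative motion vector of the related block includes the motion vector at the center position of the related block.
  7. The method according to claim 3, further comprising:
    when a sub-image block whose motion vector is unavailable appears in the related block, using the representative motion vector of the related block as the motion vector of that sub-image block to predict the corresponding sub-image block in the current image block.
  8. The method according to claim 7, further comprising:
    when a sub-image block whose motion vector is unavailable appears in the related block and the representative motion vector of the related block is unavailable, abandoning predicting each sub-image block in the current image block according to the motion vector of the corresponding sub-image block in the related block.
  9. The method according to claim 8, wherein, when a sub-image block in the related block is unavailable, or a sub-image block in the related block uses an intra coding mode, it is determined that a sub-image block whose motion vector is unavailable appears in the related block.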
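The fallback behaviour of claims 7 to 9 can be sketched as below. This is a hedged illustration: `subblock_mvs_with_fallback` is a hypothetical helper, and marking an unavailable motion vector (unavailable or intra-coded sub-block) with `None` is an assumption of the example.

```python
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]
Pos = Tuple[int, int]


def subblock_mvs_with_fallback(
    related_mvs: Dict[Pos, Optional[MV]],
    representative_mv: Optional[MV],
) -> Optional[Dict[Pos, MV]]:
    """Return per-sub-block MVs, or None when sub-block prediction is abandoned.

    A value of None in `related_mvs` marks a sub-block whose MV is unavailable
    (the sub-block itself is unavailable, or it is intra coded).
    """
    result: Dict[Pos, MV] = {}
    for pos, mv in related_mvs.items():
        if mv is None:
            if representative_mv is None:
                return None  # nothing to fall back on: abandon this mode
            mv = representative_mv  # substitute the representative MV
        result[pos] = mv
    return result
```

Returning `None` for the whole block mirrors claim 8, where the sub-block prediction is abandoned rather than partially applied.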
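  10. The method according to claim 3, further comprising:
    determining other candidates and adding the other candidates to the first motion vector candidate list, wherein at least one of the other candidates includes motion vectors of sub-image blocks.
  11. The method according to claim 10, further comprising:
    when it is determined that one of the other candidates is adopted, determining the motion vectors of the sub-image blocks in the current image block according to the adopted candidate.
  12. The method according to claim 10 or 11, wherein the at least one candidate includes the motion vectors of a group of control points.
  13. The method according to any one of claims 10 to 12, further comprising:
    when it is determined that a candidate among the at least one candidate is adopted, performing an affine transformation on the adopted candidate according to an affine transformation model; and
    predicting the sub-image blocks in the current image block according to the affine-transformed candidate.
  14. The method according to claim 13, wherein:
    when the affine transformation model includes a four-parameter affine transformation model, each candidate among the at least one candidate includes the motion vectors of 2 control points; and
    when the affine transformation model includes a six-parameter affine transformation model, each candidate among the at least one candidate includes the motion vectors of 3 control points.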
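Claims 12 to 14 tie the number of control points to the affine model (two for four-parameter, three for six-parameter). The sketch below shows one commonly used formulation of these two models for deriving a motion vector at a position inside the block. The formulas are not given in this section, so they are an assumption, as are the function name `affine_mv` and the placement of the control points at the top-left, top-right and bottom-left corners.

```python
from typing import Optional, Tuple

MVf = Tuple[float, float]


def affine_mv(
    x: float, y: float, w: float, h: float,
    cp0: MVf, cp1: MVf, cp2: Optional[MVf] = None,
) -> MVf:
    """Motion vector at position (x, y) inside a w x h block.

    cp0, cp1, cp2 are control-point MVs at the top-left, top-right and
    bottom-left corners (assumed placement). With cp2 omitted the
    four-parameter model is used, otherwise the six-parameter model.
    """
    ax = (cp1[0] - cp0[0]) / w  # horizontal gradient of the x component
    ay = (cp1[1] - cp0[1]) / w  # horizontal gradient of the y component
    if cp2 is None:
        # Four-parameter model: rotation and zoom only, so the vertical
        # gradients follow from the horizontal ones.
        bx, by = -ay, ax
    else:
        # Six-parameter model: vertical gradients come from the third control point.
        bx = (cp2[0] - cp0[0]) / h
        by = (cp2[1] - cp0[1]) / h
    return (ax * x + bx * y + cp0[0], ay * x + by * y + cp0[1])
```

Evaluating the function at a control-point position returns that control point's own motion vector, which is a quick sanity check on either model.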
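  15. The method according to any one of claims 3 to 14, further comprising:
    determining, from the neighboring blocks of the current image block and in a specific scan order, the control point motion vector groups of neighboring blocks that are predicted using an affine transformation mode; and
    adding the control point motion vector group of each determined neighboring block to the first motion vector candidate list as a candidate.
  16. The method according to claim 15, wherein determining, from the neighboring blocks of the current image block and in a specific scan order, the control point motion vector groups of neighboring blocks that are predicted using an affine transformation mode comprises:
    determining the control point motion vector group of a first neighboring block among the left neighboring blocks of the current image block in a first scan order;
    determining the control point motion vector group of a second neighboring block among the upper neighboring blocks of the current image block in a second scan order; and
    adding the control point motion vector group of the first neighboring block and the control point motion vector group of the second neighboring block to the first motion vector candidate list.
  17. The method according to any one of claims 3 to 14, further comprising:
    constructing the motion vectors of some of the control points of the current image block according to the neighboring blocks of those control points; and
    adding the motion vectors of those control points of the current image block to the first motion vector candidate list.
  18. The method according to claim 17, wherein constructing the motion vectors of some of the control points of the current image block according to the neighboring blocks of those control points comprises:
    for each of those control points, scanning the specific neighboring blocks of the control point in sequence in a third scan order, and using the motion vector of a specific neighboring block that satisfies a preset condition as the motion vector of the control point.
  19. The method according to claim 17 or 18, further comprising:
    when the motion vectors of those control points point to different reference frames, abandoning adding the motion vectors of those control points of the current image block to the first motion vector candidate list.
  20. The method according to claim 17 or 18, further comprising:
    when the number of candidates in the first motion vector candidate list is greater than a preset value, abandoning adding the motion vectors of those control points of the current image block to the first motion vector candidate list.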
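The two rejection rules of claims 19 and 20 for a constructed control-point candidate can be sketched as follows. The helper name, the `(motion vector, reference frame index)` tuple layout, and the extra rule that a control point with no qualifying neighbor (`None`) also blocks the candidate are assumptions made for the example.

```python
from typing import List, Optional, Tuple

MV = Tuple[int, int]
# (motion vector, reference frame index); None if no neighbor qualified (assumption)
ControlPoint = Optional[Tuple[MV, int]]


def try_add_constructed_candidate(
    candidates: List[list],
    control_points: List[ControlPoint],
    preset_count: int,
) -> bool:
    """Append the constructed control-point MVs unless a rejection rule applies."""
    if len(candidates) > preset_count:
        return False  # claim 20: the list already holds more than the preset number
    if any(cp is None for cp in control_points):
        return False  # assumption: a control point without an MV cannot be used
    if len({ref for _, ref in control_points}) > 1:
        return False  # claim 19: control points point to different reference frames
    candidates.append([mv for mv, _ in control_points])
    return True
```

The function mutates `candidates` only when it returns `True`, so a caller can use the return value to decide whether to continue building the list.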
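  21. The method according to any one of claims 3 to 15, further comprising:
    constructing a second motion vector candidate list, wherein a candidate added to the second motion vector candidate list is the motion vector of one image block; and
    when it is confirmed that a candidate in the second motion vector candidate list is adopted, determining the motion vector of the current image block according to the motion vector of the candidate.
  22. The method according to claim 21, wherein determining the motion vector of the current image block according to the motion vector of the candidate comprises:
    using the adopted candidate as the motion vector of the current image block, or scaling the adopted candidate and using the result as the motion vector of the current image block.
  23. The method according to claim 21, wherein constructing the second motion vector candidate list comprises:
    determining the candidates to be added to the second motion vector candidate list according to the motion vectors of several neighboring blocks of the current image block on the current image.
  24. The method according to claim 23, wherein the several neighboring blocks of the current image block on the current image include the M preset neighboring blocks.
  25. The method according to claim 24, wherein the motion vectors of the M preset neighboring blocks are added, in a preset order, to the second motion vector candidate list as M candidates respectively;
    the N neighboring blocks being the first N neighboring blocks determined in the preset order.
  26. The method according to claim 24, further comprising:
    when the motion vectors of one or more of the M neighboring blocks are unavailable, abandoning determining candidates to be added to the second motion vector candidate list according to the motion vectors of the one or more neighboring blocks.
  27. The method according to claim 1 or 26, wherein scanning, in sequence, the N neighboring blocks among the M neighboring blocks and determining the target neighboring block according to the scan result comprises:
    scanning the N neighboring blocks in sequence, stopping the scan when the first neighboring block that satisfies a preset condition is found, and determining the target neighboring block according to that first neighboring block.
  28. The method according to claim 19, wherein determining the target neighboring block according to the first neighboring block found that satisfies the preset condition comprises:
    using the first neighboring block that satisfies the preset condition as the target neighboring block.
  29. The method according to claim 19 or 28, wherein the preset condition includes:
    the reference image of the neighboring block being the same as the reference image of the current image block.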
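The early-terminating scan of claims 27 to 29 can be sketched as below. The `Neighbor` record and the use of an integer to identify a reference picture are assumptions made for the example. Only the first N of the M neighbors are examined, and the first one whose reference image matches that of the current block becomes the target.

```python
from typing import List, NamedTuple, Optional, Tuple


class Neighbor(NamedTuple):
    mv: Tuple[int, int]
    ref_picture: int  # identifier of the picture the neighbor's MV points to (assumed)


def find_target_neighbor(
    neighbors: List[Optional[Neighbor]], n: int, current_ref_picture: int
) -> Optional[Neighbor]:
    """Scan only the first n of the M neighbors and stop at the first match."""
    for nb in neighbors[:n]:
        # None marks a neighbor with no usable motion information (assumption).
        if nb is not None and nb.ref_picture == current_ref_picture:
            return nb  # first neighbor meeting the preset condition: stop scanning
    return None  # no match among the first n: the caller falls back (claims 30 to 35)
```

With N equal to 1 or 2, as in claim 2, the scan touches at most two neighbors even when M is 4.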
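  30. The method according to any one of claims 27 to 29, further comprising:
    when no neighboring block satisfying the preset condition is found among the N neighboring blocks, scaling the motion vector of a specific neighboring block among the M neighboring blocks, and predicting the current image block according to the scaled motion vector.
  31. The method according to claim 30, wherein predicting the current image block according to the scaled motion vector comprises:
    determining a reference block of the current image block according to the scaled motion vector and the reference image of the current image block.
  32. The method according to claim 30, wherein the specific neighboring block is the first or the last neighboring block obtained in scan order among the N neighboring blocks.
  33. The method according to claim 30, wherein scaling the motion vector of the specific neighboring block among the M neighboring blocks and predicting the current image block according to the scaled motion vector comprises:
    scaling the motion vector of the specific neighboring block so that the reference frame pointed to by the scaled motion vector is the same as the reference image of the current image block; and
    using the image block that the scaled motion vector points to in the reference image of the current image block as the reference block of the current image block.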
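Claim 33 requires the scaled motion vector to point to the reference image of the current image block, but this section does not state the scaling formula. The sketch below assumes the usual linear temporal scaling by picture order count (POC) distance. The function name, the POC parameters, and the simplified rounding are all assumptions.

```python
from typing import Tuple

MV = Tuple[int, int]


def scale_mv(mv: MV, cur_poc: int, neighbor_ref_poc: int, target_ref_poc: int) -> MV:
    """Rescale `mv` from the neighbor's reference picture to the target one.

    Assumes motion is linear in time, so the vector is multiplied by the
    ratio of the two temporal distances measured in POC.
    """
    td = cur_poc - neighbor_ref_poc  # distance covered by the original MV
    tb = cur_poc - target_ref_poc    # distance the scaled MV must cover
    if td == 0:
        raise ValueError("neighbor reference picture equals the current picture")
    # Simplified rounding; a real codec uses fixed-point arithmetic with clipping.
    return (round(mv[0] * tb / td), round(mv[1] * tb / td))
```

When the neighbor already points to the target reference picture, `tb` equals `td` and the vector is returned unchanged.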
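  34. The method according to any one of claims 27 to 30, further comprising:
    when no neighboring block satisfying the preset condition is found among the N neighboring blocks, using a default block as the reference block of the current image block.
  35. The method according to claim 34, wherein the default block is the image block pointed to by the motion vector (0, 0).
  36. The method according to any one of claims 1 to 35, wherein the size of the sub-image block and/or the size of the related block of the sub-image block is fixed to be greater than or equal to 64 pixels.
  37. The method according to claim 36, wherein the size of the sub-image block and/or the size of the related block of the sub-image block is fixed at 8×8 pixels, 16×4 pixels, or 4×16 pixels.
  38. The method according to claim 37, wherein the size of the sub-image block and/or the size of the related block of the sub-image block is fixed at 8×8 pixels, and the method further comprises:
    setting the temporal motion vector prediction (TMVP) operation not to be performed.
  39. The method according to any one of claims 1 to 36, further comprising:
    when at least one of the width and the height of the sub-image block and/or of the related block of the sub-image block is less than 8 pixels, setting the temporal motion vector prediction (TMVP) operation not to be performed.
  40. The method according to any one of claims 1 to 39, wherein the current image block is a coding unit (CU).
  41. The method according to any one of claims 1 to 40, wherein determining the related block of the current image block according to the motion vector of the target neighboring block, the current image block, and the reference image of the current image block comprises:
    determining the related block of the current image block in the reference image of the current image block according to the motion vector of the target neighboring block.
  42. The method according to any one of claims 1 to 41, wherein a neighboring block is an image block that, on the current image, is adjacent to the current image block or separated from it by a certain positional distance.
  43. The method according to any one of claims 1 to 42, wherein predicting the current image block according to the motion vector of the related block comprises:
    when the reference image of the related block is a specific reference image, or the reference image of the current image block is a specific reference image, determining the reference block of the current image block according to the processed motion vector of the related block and the reference image of the current image block;
    wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  44. The video image processing method according to claim 43, wherein the processed motion vector of the related block comprises:
    a motion vector obtained by scaling the motion vector of the related block with a scaling factor of 1; or
    the motion vector of the related block with the scaling step skipped.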
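The special case in claims 43 and 44 amounts to bypassing the scaling whenever a specific reference image is involved. A minimal sketch follows: the two boolean flags and the `scale` callback are hypothetical, and how a picture is classified as a "specific reference image" is defined elsewhere in the application, not in this section.

```python
from typing import Callable, Tuple

MV = Tuple[int, int]


def process_related_mv(
    mv: MV,
    related_ref_is_specific: bool,
    current_ref_is_specific: bool,
    scale: Callable[[MV], MV],
) -> MV:
    """Return the related block's MV unchanged when either reference image is
    a specific reference image; otherwise apply the normal scaling."""
    if related_ref_is_specific or current_ref_is_specific:
        return mv  # scaling skipped, which is equivalent to a scaling factor of 1
    return scale(mv)
```

Skipping the step and scaling by a factor of 1 produce the same vector, which is why claim 44 lists them as alternatives.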
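  45. The method according to any one of claims 1 to 42, wherein predicting the current image block according to the motion vector of the related block comprises:
    when the reference image of the related block is a specific reference image, or the reference image of the current block is a specific reference image, abandoning determining the reference block of the current image block according to the motion vector of the related block.
  46. The method according to any one of claims 1 to 45, further comprising:
    when the motion vector of the specific neighboring block points to a specific reference image, or the reference image of the current image block is a specific reference image, determining the reference block of the current image block according to the processed motion vector of the related block and the reference image of the current image block;
    wherein the processed motion vector of the related block is the same as the motion vector of the related block before processing.
  47. The method according to claim 46, wherein the processed motion vector of the related block comprises:
    a motion vector obtained by scaling the motion vector of the related block with a scaling factor of 1; or
    the motion vector of the related block with the scaling step skipped.
  48. The method according to any one of claims 1 to 47, wherein M is less than or equal to 4.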
  49. 一种视频图像处理方法,其特征在于,包括:
    根据当前图像块的运动矢量第二候选列表中的M个候选者确定所述当前图像块的M个邻近块;
    对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;
    根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;
    根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者;
    当确定采用所述特定候选者时,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
    根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  50. 根据权利要求49所述的方法,其特征在于,所述运动矢量第一候选列表中的至少一个候选者包括子图像块的运动矢量,所述运动矢量第二候选列表中的每个候选者包括图像块的运动矢量。
  51. 根据权利要求49或50所述的方法,其特征在于,N等于1或2。
  52. 根据权利要求49至51任一项所述的方法,其特征在于,所述M个候选者包括所述当前图像块在当前图像上的M个邻近块的运动矢量。
  53. 根据权利要求52所述的方法,其特征在于,所述对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,包括:
    对所述N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块。
  54. 根据权利要求52所述的方法,其特征在于,所述根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块,包括:
    将所述第一个符合预设条件的邻近块作为所述目标邻近块。
  55. 根据权利要求53或54所述的方法,其特征在于,所述预设条件包括:
    邻近块的参考图像与所述当前图像块的参考图像相同。
  56. 根据权利要求53至55任一项所述的方法,其特征在于,所述方法还包括:
    当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据所述缩放处理后的运动矢量对所述当前图像块进行预测。
  57. 根据权利要求56所述的方法,其特征在于,所述根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
    根据所述缩放处理后的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块。
  58. 根据权利要求56所述的方法,其特征在于,所述特定邻近块为所述N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
  59. 根据权利要求56所述的方法,其特征在于,所述对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
    对所述特定邻近块的运动矢量进行缩放处理,使得经过所述缩放处理后的运动矢量指向的参考帧与所述当前图像块的参考图像相同;
    将经过所述缩放处理后的运动矢量在所述当前图像块的参考图像中指向的图像块作为所述当前图像块的参考块。
  60. 根据权利要求53至55任一项所述的方法,其特征在于,所述方法还包括:
    当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,将默认块作为所述当前图像块的参考块。
  61. 根据权利要求60所述的方法,其特征在于,所述默认块为运动矢量(0,0)指向的图像块。
  62. 根据权利要求49至61任一项所述的方法,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为大于或等于64个像素。
  63. 根据权利要求62所述的方法,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,或16×4个像素或4×16个像素。
  64. 根据权利要求63所述的方法,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,所述方法还包括:
    设置不进行时域运动矢量预测TMVP操作。
  65. 根据权利要求49至62任一项所述的方法,其特征在于,所述方法还包括:
    在所述子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行时域运动矢量预测TMVP操作。
  66. 根据权利要求49至65任一项所述的方法,其特征在于,所述当前图像块为一个编码单元CU。
  67. 根据权利要求49至66任一项所述的方法,其特征在于,所述根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块,包括:
    根据所述目标邻近块的运动矢量,在所述当前图像块的参考图像中确定所述当前图像块的相关块。
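  68. The method according to any one of claims 49 to 67, wherein a neighboring block is an image block that, on the current image, is adjacent to the current image block or separated from it by a certain positional distance.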
  69. 根据权利要求49至68任一项所述的方法,其特征在于,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
    当所述相关块的参考图像为特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
    其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
  70. 根据权利要求69所述的方法,其特征在于,所述处理后的所述相关块的运动矢量,包括:
    根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
    跳过缩放步骤的所述相关块的运动矢量。
  71. 根据权利要求49至68中任一项所述的方法,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
    当所述相关块的参考图像为特定参考图像,或者所述当前块的参考图像为特定参考图像时,放弃根据所述相关块的运动矢量确定所述当前图像块的参考块。
  72. 根据权利要求49至71任一项所述的方法,其特征在于,所述方法还包括:
    当所述特定邻近块的运动矢量指向特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
    其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
  73. 根据权利要求72所述的方法,其特征在于,所述处理后的所述相关块的运动矢量,包括:
    根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
    跳过缩放步骤的所述相关块的运动矢量。
  74. 根据权利要求49至73任一项所述的方法,其特征在于,所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测,包括:
    将所述相关块中各子图像块的运动矢量,分别作为所述当前图像块中对应的子图像块的运动矢量。
  75. 根据权利要求49至73任一项所述的方法,其特征在于,所述根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者,包括:
    将所述当前图像块的相关块的代表运动矢量作为所述特定候选者加入所述运动矢量第一候选列表。
  76. 根据权利要求75所述的方法,其特征在于,将所述相关块的代表运动矢量作为第一个候选者加入运动矢量第一候选列表。
  77. 根据权利要求75所述的方法,其特征在于,所述相关块的代表运动矢量包括所述相关块的中心位置的运动矢量。
  78. 根据权利要求75所述的方法,其特征在于,所述方法还包括:
    当所述相关块中出现不可获得运动矢量的子图像块时,将所述相关块的代表运动矢量作为所述不可获得运动矢量的子图像块的运动矢量,对所述当前图像块中对应的子图像块进行预测。
  79. 根据权利要求78所述的方法,其特征在于,所述方法还包括:
    当所述相关块中出现不可获得运动矢量的子图像块,且所述相关块的代表运动矢量不可获得时,放弃根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  80. 根据权利要求78所述的方法,其特征在于,当所述相关块中的子图像块不可获得,或者所述相关块中的子图像块采用帧内编码模式时,确定所述相关块中出现不可获得运动矢量的子图像块。
  81. 根据权利要求49至80任一项所述的方法,其特征在于,所述方法还包括:
    当确定采用所述运动矢量第二候选列表中除所述特定候选者以外的其中一个候选者时,根据仿射变换模型对所述采用的候选者进行仿射变换;
    根据所述仿射变换后的候选者对所述当前图像块中的子图像块进行预测。
  82. 根据权利要求81所述的方法,其特征在于,所述运动矢量第二候选列表中除所述特定候选者以外至少一个候选者中,每个候选者包括一组控制点的运动矢量。
  83. 根据权利要求82所述的方法,其特征在于,
    当所述仿射变换模型包括四参仿射变换模型时,所述至少一个候选者中,每个候选者包括2个控制点的运动矢量;
    当所述仿射变换模型包括六参仿射变换模型时,所述至少一个候选者中,每个候选者包括3个控制点的运动矢量。
  84. 根据权利要求49至83任一项所述的方法,其特征在于,所述方法还包括:
    从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组;
    将每一个确定的邻近块的控制点运动矢量组作为一个候选者加入所述运动矢量第一候选列表。
  85. 根据权利要求84所述的方法,其特征在于,所述从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组,包括:
    在所述当前图像块的左侧邻近块中按第一扫描顺序确定第一邻近块的控制点运动矢量组;
    在所述当前图像块的上侧邻近块中按第二扫描顺序确定第二邻近块的控制点运动矢量组;
    将所述第一邻近块的控制点运动矢量组和所述第二邻近块的控制点运动矢量组加入所述运动矢量第一候选列表。
  86. 根据权利要求49至85任一项所述的方法,其特征在于,所述方法还包括:
    根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量;
    将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  87. 根据权利要求86所述的方法,其特征在于,所述根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量,包括:
    对所述部分控制点中的每个控制点,按第三扫描顺序对所述控制点的特定邻近块依次扫描,将满足预设条件的特定邻近块的运动矢量作为所述控制点的运动矢量。
  88. 根据权利要求86或87所述的方法,其特征在于,所述方法还包括:
    当所述部分控制点的运动矢量分别指向不同的参考帧时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  89. 根据权利要求86或87所述的方法,其特征在于,所述方法还包括:
    当所述运动矢量第一候选列表中的候选者的数量大于预设数值时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  90. 根据权利要求51至63任一项所述的方法,其特征在于,所述方法还包括:
    构建运动矢量第二候选列表,其中,加入所述运动矢量第二候选列表的候选者为一个图像块的运动矢量;
    当确认采用所述运动矢量第二候选列表中的候选者时,根据所述候选者的运动矢量确定所述当前图像块的运动矢量。
  91. 根据权利要求90所述的方法,其特征在于,所述根据所述候选者的运动矢量确定所述当前图像块的运动矢量,包括:
    将所述确认采用的候选者作为所述当前图像块的运动矢量,或者对所述确认采用的候选者进行缩放后作为所述当前图像块的运动矢量。
  92. 根据权利要求90所述的方法,其特征在于,所述构建运动矢量第二候选列表,包括:
    根据所述当前图像块在当前图像上的M个邻近块的运动矢量确定加入所述运动矢量第二候选列表的所述M个候选者。
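  93. The method according to claim 92, wherein the motion vectors of the M preset neighboring blocks are added, in a preset order, to the second motion vector candidate list as M candidates respectively, the N neighboring blocks being the first N neighboring blocks determined in the preset order;
    and/or
    when the motion vectors of one or more of the M neighboring blocks are unavailable, determining candidates to be added to the second motion vector candidate list according to the motion vectors of the one or more neighboring blocks is abandoned.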
  94. 根据权利要求49至93任一项所述的方法,其特征在于,M小于等于4。
  95. 一种视频图像处理装置,其特征在于,包括:
    构建模块,用于对当前图像块的预设的M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
    预测模块,根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  96. 根据权利要求95所述的视频图像处理装置,其特征在于,N等于1或2。
  97. 根据权利要求95或96所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    在所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测之前,将所述相关块的代表运动矢量作为候选者加入运动矢量第一候选列表;
    当确定采用所述候选者时,所述预测模块根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  98. 根据权利要求97所述的视频图像处理装置,其特征在于,所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测,包括:
    将所述相关块中各子图像块的运动矢量,分别作为所述当前图像块中对应的子图像块的运动矢量。
  99. 根据权利要求97所述的视频图像处理装置,其特征在于,将所述相关块的代表运动矢量作为第一个候选者加入运动矢量第一候选列表。
  100. 根据权利要求97所述的视频图像处理装置,其特征在于,所述相关块的代表运动矢量包括所述相关块的中心位置的运动矢量。
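  101. The video image processing apparatus according to claim 97, wherein the prediction module is further configured to:
    when a sub-image block whose motion vector is unavailable appears in the related block, use the representative motion vector of the related block as the motion vector of that sub-image block to predict the corresponding sub-image block in the current image block.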
  102. 根据权利要求101所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当所述相关块中出现不可获得运动矢量的子图像块,且所述相关块的代表运动矢量不可获得时,放弃根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  103. 根据权利要求102所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当所述相关块中的子图像块不可获得,或者所述相关块中的子图像块采用帧内编码模式时,确定所述相关块中出现不可获得运动矢量的子图像块。
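  104. The video image processing apparatus according to claim 97, wherein the construction module is further configured to:
    determine other candidates and add the other candidates to the first motion vector candidate list, wherein at least one of the other candidates includes motion vectors of sub-image blocks.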
  105. 根据权利要求104所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当确定采用所述其他候选者中的其中一个候选者时,根据所述采用的候选者确定所述当前图像块中的子图像块的运动矢量。
  106. 根据权利要求104或105所述的视频图像处理装置,其特征在于,所述至少一个候选者包括一组控制点的运动矢量。
  107. 根据权利要求104至106任一项所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当确定采用所述至少一个候选者中的候选者时,根据仿射变换模型对所述采用的候选者进行仿射变换;
    根据所述仿射变换后的候选者对所述当前图像块中的子图像块进行预测。
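  108. The video image processing apparatus according to claim 107, wherein:
    when the affine transformation model includes a four-parameter affine transformation model, each candidate among the at least one candidate includes the motion vectors of 2 control points; and
    when the affine transformation model includes a six-parameter affine transformation model, each candidate among the at least one candidate includes the motion vectors of 3 control points.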
  109. 根据权利要求97至108任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组;
    将每一个确定的邻近块的控制点运动矢量组作为一个候选者加入所述运动矢量第一候选列表。
  110. 根据权利要求109所述的视频图像处理装置,其特征在于,所述从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组,包括:
    在所述当前图像块的左侧邻近块中按第一扫描顺序确定第一邻近块的控制点运动矢量组;
    在所述当前图像块的上侧邻近块中按第二扫描顺序确定第二邻近块的控制点运动矢量组;
    将所述第一邻近块的控制点运动矢量组和所述第二邻近块的控制点运动矢量组加入所述运动矢量第一候选列表。
  111. 根据权利要求97至108任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量;
    将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  112. 根据权利要求111所述的视频图像处理装置,其特征在于,所述根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量,包括:
    对所述部分控制点中的每个控制点,按第三扫描顺序对所述控制点的特定邻近块依次扫描,将满足预设条件的特定邻近块的运动矢量作为所述控制点的运动矢量。
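  113. The video image processing apparatus according to claim 111 or 112, wherein the construction module is further configured to:
    when the motion vectors of those control points point to different reference frames, abandon adding the motion vectors of those control points of the current image block to the first motion vector candidate list.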
  114. 根据权利要求111或112所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当所述运动矢量第一候选列表中的候选者的数量大于预设数值时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  115. 根据权利要求97至109任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    构建运动矢量第二候选列表,其中,加入所述运动矢量第二候选列表的候选者为一个图像块的运动矢量;
    当确认采用所述运动矢量第二候选列表中的候选者时,根据所述候选者的运动矢量确定所述当前图像块的运动矢量。
  116. 根据权利要求115所述的视频图像处理装置,其特征在于,所述根据所述候选者的运动矢量确定所述当前图像块的运动矢量,包括:
    将所述确认采用的候选者作为所述当前图像块的运动矢量,或者对所述确认采用的候选者进行缩放后作为所述当前图像块的运动矢量。
  117. 根据权利要求115所述的视频图像处理装置,其特征在于,所述构建运动矢量第二候选列表,包括:
    根据所述当前图像块在当前图像上的若干个邻近块的运动矢量确定加入所述运动矢量第二候选列表的候选者。
  118. 根据权利要求117所述的视频图像处理装置,其特征在于,所述当前图像块在当前图像上的若干个邻近块包括所述预设的M个邻近块。
  119. 根据权利要求118所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    按预设顺序依次将所述预设的M个邻近块的运动矢量分别作为M个候选者,加入所述运动矢量第二候选列表;
    所述N个邻近块,指的是按所述预设顺序首先确定的N个邻近块。
  120. 根据权利要求118所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当所述M个邻近块中的一个或多个邻近块的运动矢量不可获得时,放弃根据所述一个或多个邻近块的运动矢量确定加入所述运动矢量第二候选列表的候选者。
  121. 根据权利要求95或120所述的视频图像处理装置,其特征在于,所述对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,包括:
    对所述N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块。
  122. 根据权利要求113所述的视频图像处理装置,其特征在于,所述根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块,包括:
    将所述第一个符合预设条件的邻近块作为所述目标邻近块。
  123. 根据权利要求113或122所述的视频图像处理装置,其特征在于,所述预设条件包括:
    邻近块的参考图像与所述当前图像块的参考图像相同。
  124. 根据权利要求121至123任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,
    所述预测模块还用于根据所述缩放处理后的运动矢量对所述当前图像块进行预测。
  125. 根据权利要求124所述的视频图像处理装置,其特征在于,所述根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
    根据所述缩放处理后的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块。
  126. 根据权利要求124所述的视频图像处理装置,其特征在于,所述特定邻近块为所述N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
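  127. The video image processing apparatus according to claim 124, wherein scaling the motion vector of the specific neighboring block among the M neighboring blocks and predicting the current image block according to the scaled motion vector comprises:
    scaling the motion vector of the specific neighboring block so that the reference frame pointed to by the scaled motion vector is the same as the reference image of the current image block; and
    using the image block that the scaled motion vector points to in the reference image of the current image block as the reference block of the current image block.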
  128. 根据权利要求121至124任一项所述的视频图像处理装置,其特征在于,当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,将默认块作为所述当前图像块的参考块。
  129. 根据权利要求128所述的视频图像处理装置,其特征在于,所述默认块为运动矢量(0,0)指向的图像块。
  130. 根据权利要求95至129任一项所述的视频图像处理装置,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为大于或等于64个像素。
  131. 根据权利要求130所述的视频图像处理装置,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,或16×4个像素或4×16个像素。
  132. 根据权利要求131所述的视频图像处理装置,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,所述视频图像处理装置还包括处理模块,用于:
    设置不进行时域运动矢量预测TMVP操作。
  133. 根据权利要求95至130任一项所述的视频图像处理装置,其特征在于,所述视频图像处理装置还包括处理模块,用于:
    在所述子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行时域运动矢量预测TMVP操作。
  134. 根据权利要求95至133任一项所述的视频图像处理装置,其特征在于,所述当前图像块为一个编码单元CU。
  135. 根据权利要求95至134任一项所述的视频图像处理装置,其特征在于,所述根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块,包括:
    根据所述目标邻近块的运动矢量,在所述当前图像块的参考图像中确定所述当前图像块的相关块。
  136. 根据权利要求85至135任一项所述的视频图像处理装置,其特征在于,所述邻近块为在所述当前图像上与所述当前图像块的位置相邻或具有一定位置间距的图像块。
  137. 根据权利要求95至136任一项所述的视频图像处理装置,其特征在于,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
    当所述相关块的参考图像为特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
    其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
  138. 根据权利要求137所述的视频图像处理装置,其特征在于,所述处理后的所述相关块的运动矢量,包括:
    根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
    跳过缩放步骤的所述相关块的运动矢量。
  139. 根据权利要求95至136中任一项所述的视频图像处理装置,其特征在于,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
    当所述相关块的参考图像为特定参考图像,或者所述当前块的参考图像为特定参考图像时,放弃根据所述相关块的运动矢量确定所述当前图像块的参考块。
  140. 根据权利要求95至139任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当所述特定邻近块的运动矢量指向特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
    其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
  141. 根据权利要求140所述的视频图像处理装置,其特征在于,所述处理后的所述相关块的运动矢量,包括:
    根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
    跳过缩放步骤的所述相关块的运动矢量。
  142. 根据权利要求95至141任一项所述的视频图像处理装置,其特征在于,M小于等于4。
  143. 一种视频图像处理装置,其特征在于,包括:
    构建模块,用于根据当前图像块的运动矢量第二候选列表中的M个候选者确定所述当前图像块的M个邻近块;对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,N小于M;根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块;根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者;当确定采用所述特定候选者时,将所述当前图像块和所述相关块采用相同的方式划分成多个子图像块,所述当前图像块中的各子图像块与所述相关块中的各子图像块一一对应;
    预测模块,用于根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  144. 根据权利要求143所述的视频图像处理装置,其特征在于,所述运动矢量第一候选列表中的至少一个候选者包括子图像块的运动矢量,所述运动矢量第二候选列表中的每个候选者包括图像块的运动矢量。
  145. 根据权利要求143或144所述的视频图像处理装置,其特征在于,N等于1或2。
  146. 根据权利要求143至145任一项所述的视频图像处理装置,其特征在于,所述M个候选者包括所述当前图像块在当前图像上的M个邻近块的运动矢量。
  147. 根据权利要求146所述的视频图像处理装置,其特征在于,所述对所述M个邻近块中的N个邻近块依次扫描,根据扫描结果确定目标邻近块,包括:
    对所述N个邻近块依次进行扫描,当扫描到第一个符合预设条件的邻近块时,停止扫描,且根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块。
  148. 根据权利要求146所述的视频图像处理装置,其特征在于,所述根据所述扫描到的所述第一个符合预设条件的邻近块确定目标邻近块,包括:
    将所述第一个符合预设条件的邻近块作为所述目标邻近块。
  149. 根据权利要求147或148所述的视频图像处理装置,其特征在于,所述预设条件包括:
    邻近块的参考图像与所述当前图像块的参考图像相同。
  150. 根据权利要求147至149任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,
    所述预测模块还用于根据所述缩放处理后的运动矢量对所述当前图像块进行预测。
  151. 根据权利要求150所述的视频图像处理装置,其特征在于,所述根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
    根据所述缩放处理后的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块。
  152. 根据权利要求150所述的视频图像处理装置,其特征在于,所述特定邻近块为所述N个邻近块中,按扫描顺序得到的第一个邻近块或者最后一个邻近块。
  153. 根据权利要求150所述的视频图像处理装置,其特征在于,所述对所述M个邻近块中的特定邻近块的运动矢量进行缩放处理,根据所述缩放处理后的运动矢量对所述当前图像块进行预测,包括:
    对所述特定邻近块的运动矢量进行缩放处理,使得经过所述缩放处理后的运动矢量指向的参考帧与所述当前图像块的参考图像相同;
    将经过所述缩放处理后的运动矢量在所述当前图像块的参考图像中指向的图像块作为所述当前图像块的参考块。
  154. 根据权利要求147至149任一项所述的视频图像处理装置,其特征在于,当在所述N个邻近块中未扫描到符合所述预设条件的邻近块时,将默认块作为所述当前图像块的参考块。
  155. 根据权利要求154所述的视频图像处理装置,其特征在于,所述默认块为运动矢量(0,0)指向的图像块。
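  156. The video image processing apparatus according to any one of claims 143 to 155, wherein the size of the sub-image block and/or the size of the related block of the sub-image block is fixed to be greater than or equal to 64 pixels.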
  157. 根据权利要求156所述的视频图像处理装置,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,或16×4个像素或4×16个像素。
  158. 根据权利要求157所述的视频图像处理装置,其特征在于,所述子图像块的大小和/或所述子图像块的相关块的大小固定为8×8个像素,所述视频图像处理装置还包括处理模块,用于:
    设置不进行时域运动矢量预测TMVP操作。
  159. 根据权利要求143至155任一项所述的视频图像处理装置,其特征在于,所述视频图像处理装置还包括处理模块,用于:
    在所述子图像块和/或所述子图像块的相关块的宽和高中至少一个小于8像素的情况下,设置不进行时域运动矢量预测TMVP操作。
  160. 根据权利要求143至159任一项所述的视频图像处理装置,其特征在于,所述当前图像块为一个编码单元CU。
  161. 根据权利要求143至160任一项所述的视频图像处理装置,其特征在于,所述根据所述目标邻近块的运动矢量、所述当前图像块以及所述当前图像块的参考图像,确定所述当前图像块的相关块,包括:
    根据所述目标邻近块的运动矢量,在所述当前图像块的参考图像中确定所述当前图像块的相关块。
  162. 根据权利要求143至161任一项所述的视频图像处理装置,其特征在于,所述邻近块为在所述当前图像上与所述当前图像块的位置相邻或具有一定位置间距的图像块。
  163. 根据权利要求143至162任一项所述的视频图像处理装置,其特征在于,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
    当所述相关块的参考图像为特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
    其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
  164. 根据权利要求163所述的视频图像处理装置,其特征在于,所述处理后的所述相关块的运动矢量,包括:
    根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
    跳过缩放步骤的所述相关块的运动矢量。
  165. 根据权利要求143至162中任一项所述的视频图像处理装置,其特征在于,所述根据所述相关块的运动矢量对所述当前图像块进行预测,包括:
    当所述相关块的参考图像为特定参考图像,或者所述当前块的参考图像为特定参考图像时,放弃根据所述相关块的运动矢量确定所述当前图像块的参考块。
  166. 根据权利要求143至165任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当所述特定邻近块的运动矢量指向特定参考图像,或者所述当前图像块的参考图像为特定参考图像时,根据处理后的所述相关块的运动矢量和所述当前图像块的参考图像确定所述当前图像块的参考块;
    其中,所述处理后的所述相关块的运动矢量和处理前的相关块的运动矢量相同。
  167. 根据权利要求166所述的视频图像处理装置,其特征在于,所述处理后的所述相关块的运动矢量,包括:
    根据数值为1的缩放比例对所述相关块的运动矢量进行缩放后得到的运动矢量,或者,
    跳过缩放步骤的所述相关块的运动矢量。
  168. 根据权利要求143至167任一项所述的视频图像处理装置,其特征在于,所述根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测,包括:
    将所述相关块中各子图像块的运动矢量,分别作为所述当前图像块中对应的子图像块的运动矢量。
  169. 根据权利要求143至167任一项所述的视频图像处理装置,其特征在于,所述根据所述当前图像块的相关块确定所述当前图像块的运动矢量第一候选列表中的特定候选者,包括:
    将所述当前图像块的相关块的代表运动矢量作为所述特定候选者加入所述运动矢量第一候选列表。
  170. 根据权利要求169所述的视频图像处理装置,其特征在于,将所述相关块的代表运动矢量作为第一个候选者加入运动矢量第一候选列表。
  171. 根据权利要求169所述的视频图像处理装置,其特征在于,所述相关块的代表运动矢量包括所述相关块的中心位置的运动矢量。
  172. 根据权利要求169所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当所述相关块中出现不可获得运动矢量的子图像块时,将所述相关块的代表运动矢量作为所述不可获得运动矢量的子图像块的运动矢量,对所述当前图像块中对应的子图像块进行预测。
  173. 根据权利要求172所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当所述相关块中出现不可获得运动矢量的子图像块,且所述相关块的代表运动矢量不可获得时,放弃根据所述相关块中各子图像块的运动矢量分别对所述当前图像块中对应的子图像块进行预测。
  174. 根据权利要求172所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当所述相关块中的子图像块不可获得,或者所述相关块中的子图像块采用帧内编码模式时,确定所述相关块中出现不可获得运动矢量的子图像块。
  175. 根据权利要求143至174任一项所述的视频图像处理装置,其特征在于,所述预测模块还用于:
    当确定采用所述运动矢量第二候选列表中除所述特定候选者以外的其中一个候选者时,根据仿射变换模型对所述采用的候选者进行仿射变换;
    根据所述仿射变换后的候选者对所述当前图像块中的子图像块进行预测。
  176. 根据权利要求175所述的视频图像处理装置,其特征在于,所述运动矢量第二候选列表中除所述特定候选者以外至少一个候选者中,每个候选者包括一组控制点的运动矢量。
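  177. The video image processing apparatus according to claim 176, wherein:
    when the affine transformation model includes a four-parameter affine transformation model, each candidate among the at least one candidate includes the motion vectors of 2 control points; and
    when the affine transformation model includes a six-parameter affine transformation model, each candidate among the at least one candidate includes the motion vectors of 3 control points.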
  178. 根据权利要求143至177任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组;
    将每一个确定的邻近块的控制点运动矢量组作为一个候选者加入所述运动矢量第一候选列表。
  179. 根据权利要求178所述的视频图像处理装置,其特征在于,所述从所述当前图像块的邻近块中,按特定扫描顺序确定采用仿射变换模式进行预测的邻近块的控制点运动矢量组,包括:
    在所述当前图像块的左侧邻近块中按第一扫描顺序确定第一邻近块的控制点运动矢量组;
    在所述当前图像块的上侧邻近块中按第二扫描顺序确定第二邻近块的控制点运动矢量组;
    将所述第一邻近块的控制点运动矢量组和所述第二邻近块的控制点运动矢量组加入所述运动矢量第一候选列表。
  180. 根据权利要求143至179任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量;
    将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  181. 根据权利要求180所述的视频图像处理装置,其特征在于,所述根据所述当前图像块的部分控制点的邻近块构造所述部分控制点的运动矢量,包括:
    对所述部分控制点中的每个控制点,按第三扫描顺序对所述控制点的特定邻近块依次扫描,将满足预设条件的特定邻近块的运动矢量作为所述控制点的运动矢量。
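  182. The video image processing apparatus according to claim 180 or 181, wherein the construction module is further configured to:
    when the motion vectors of those control points point to different reference frames, abandon adding the motion vectors of those control points of the current image block to the first motion vector candidate list.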
  183. 根据权利要求180或181所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    当所述运动矢量第一候选列表中的候选者的数量大于预设数值时,放弃将所述当前图像块的部分控制点的运动矢量加入所述运动矢量第一候选列表。
  184. 根据权利要求143至157任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    构建运动矢量第二候选列表,其中,加入所述运动矢量第二候选列表的候选者为一个图像块的运动矢量;
    当确认采用所述运动矢量第二候选列表中的候选者时,根据所述候选者的运动矢量确定所述当前图像块的运动矢量。
  185. 根据权利要求184所述的视频图像处理装置,其特征在于,所述根据所述候选者的运动矢量确定所述当前图像块的运动矢量,包括:
    将所述确认采用的候选者作为所述当前图像块的运动矢量,或者对所述确认采用的候选者进行缩放后作为所述当前图像块的运动矢量。
  186. 根据权利要求184所述的视频图像处理装置,其特征在于,所述构建运动矢量第二候选列表,包括:
    根据所述当前图像块在当前图像上的M个邻近块的运动矢量确定加入所述运动矢量第二候选列表的所述M个候选者。
  187. 根据权利要求186所述的视频图像处理装置,其特征在于,所述构建模块还用于按预设顺序依次将所述预设的M个邻近块的运动矢量分别作为M个候选者,加入所述运动矢量第二候选列表,所述N个邻近块,指的是按所述预设顺序首先确定的N个邻近块;
    和/或
    所述构建模块还用于当所述M个邻近块中的一个或多个邻近块的运动矢量不可获得时,放弃根据所述一个或多个邻近块的运动矢量确定加入所述运动矢量第二候选列表的候选者。
  188. 根据权利要求143至187任一项所述的视频图像处理装置,其特征在于,M小于等于4。
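  189. A video image processing apparatus, comprising a memory and a processor, the memory being configured to store instructions and the processor being configured to execute the instructions stored in the memory, wherein execution of the instructions stored in the memory causes the processor to perform the method according to any one of claims 1 to 48.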
  190. 一种视频图像处理装置,其特征在于,包括:存储器与处理器,所述存储器用于存储指令,所述处理器用于执行所述存储器存储的指令,并且对所述存储器中存储的指令的执行使得,所述处理器用于执行如权利要求49至94中任一项所述的方法。
  191. 一种计算机存储介质,其特征在于,其上存储有计算机程序,所述计算机程序被计算机执行时使得,所述计算机执行如权利要求1至48中任一项所述的方法。
  192. 一种计算机存储介质,其特征在于,其上存储有计算机程序,所述计算机程序被计算机执行时使得,所述计算机执行如权利要求49至94中任一项所述的方法。
  193. 一种包含指令的计算机程序产品,其特征在于,所述指令被计算机执行时使得计算机执行如权利要求1至48中任一项所述的方法。
  194. 一种包含指令的计算机程序产品,其特征在于,所述指令被计算机执行时使得计算机执行如权利要求49至94中任一项所述的方法。
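  195. A video image processing method, comprising:
    determining a base motion vector list, the base motion vector list including at least one bi-prediction base motion vector group, the bi-prediction base motion vector group including a first base motion vector and a second base motion vector;
    determining two motion vector offsets from a preset offset set, the two motion vector offsets corresponding respectively to the first base motion vector and the second base motion vector;
    determining a motion vector of a current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets; and
    predicting the current image block according to the motion vector of the current image block.
  196. The method according to claim 195, wherein determining two motion vector offsets from the preset offset set comprises:
    determining, from the offset set, a plurality of motion vector offset combinations each including two motion vector offsets;
    and determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets comprises:
    determining, from the plurality of motion vector offset combinations, the combination for which the rate-distortion loss satisfies a preset condition, and determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and that combination.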
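The selection in claim 196, and the index signalling of claims 197 and 198, can be sketched as follows. Several points are assumptions of the example: the "preset condition" is taken to be the minimum rate-distortion cost, the combinations are taken to be all ordered pairs from the offset set of claim 205, the index is the position in that enumeration, and `rd_cost` is a hypothetical caller-supplied cost function.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

OFFSET_SET = (2, 4, 8, 16, 32, 64, 128, 256)  # preset offset set of claim 205

Pair = Tuple[int, int]


def best_offset_pair(
    rd_cost: Callable[[Pair], float], offsets: Sequence[int] = OFFSET_SET
) -> Tuple[int, Pair]:
    """Encoder side: return (index, pair) with the lowest rate-distortion cost.

    Assumes every ordered pair of offsets is a valid combination.
    """
    combos = list(product(offsets, repeat=2))
    index = min(range(len(combos)), key=lambda i: rd_cost(combos[i]))
    return index, combos[index]


def offset_pair_from_index(index: int, offsets: Sequence[int] = OFFSET_SET) -> Pair:
    """Decoder side: recover the pair from the index carried in the bitstream."""
    return tuple(product(offsets, repeat=2))[index]
```

Because both sides enumerate the combinations in the same order, the decoder needs only the index to recover the pair the encoder chose.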
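  197. The method according to claim 195 or 196, wherein the method is performed by an encoder and further comprises:
    encoding according to the prediction result and sending a bitstream to a decoder, the bitstream including an index indicating the motion vector offset combination for which the rate-distortion loss satisfies the preset condition.
  198. The method according to claim 195, wherein the method is performed by a decoder and further comprises:
    receiving a bitstream sent by an encoder, the bitstream including an index indicating the combination formed by the two motion vector offsets;
    and determining two motion vector offsets from the preset offset set comprises:
    determining the two motion vector offsets according to the index.
  199. The method according to any one of claims 195 to 198, wherein, when the first base motion vector and/or the second base motion vector points to a specific reference image, at least one of the two motion vector offsets comprises: a motion vector offset obtained by scaling an initial motion vector offset with a scaling factor of 1, or a motion vector offset obtained with the scaling operation skipped.
  200. The method according to any one of claims 195 to 198, wherein, when the first base motion vector and the second base motion vector both point to non-specific reference images, the two motion vector offsets are used for:
    adjusting the first base motion vector and the second base motion vector according to the two motion vector offsets.
  201. The method according to claim 200, wherein the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is equal to the ratio of the motion vector offset used by the first base motion vector to the motion vector offset used by the second base motion vector.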
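The proportionality in claim 201 can be checked with a small worked example. Writing d1 and d2 for the distances from the current image to the two reference images, the condition d1 / d2 = offset1 / offset2 gives offset2 = offset1 × d2 / d1. The sketch below assumes the distance is the absolute picture order count (POC) difference and uses exact fractions to avoid rounding. The function name and parameters are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def paired_offsets(
    offset1: int, cur_poc: int, ref1_poc: int, ref2_poc: int
) -> Tuple[Fraction, Fraction]:
    """Return (offset1, offset2) such that offset1 / offset2 == d1 / d2,
    where d1 and d2 are the POC distances to the two reference images."""
    d1 = abs(cur_poc - ref1_poc)
    d2 = abs(cur_poc - ref2_poc)
    if d1 == 0:
        raise ValueError("first reference image coincides with the current image")
    return Fraction(offset1), Fraction(offset1) * Fraction(d2, d1)
```

For a current image at POC 10 with references at POC 8 and POC 14, the distances are 2 and 4, so an offset of 8 on the first base motion vector pairs with 16 on the second.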
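  202. The method according to any one of claims 195 to 201, further comprising:
    obtaining a merge candidate list, the merge candidate list including P groups of merge motion vector candidates, where P is an integer greater than or equal to 1;
    wherein determining the base motion vector list comprises:
    determining the base motion vector list according to the merge candidate list.
  203. The method according to claim 202, wherein determining the base motion vector list according to the merge candidate list comprises:
    when P is greater than or equal to 2, taking two groups of merge motion vector candidates in the merge candidate list to form the base motion vector list.
  204. The method according to claim 202, wherein determining the base motion vector list according to the merge candidate list comprises:
    when P is less than 2, forming the base motion vector list by padding with the motion vector (0, 0).
  205. The method according to any one of claims 195 to 204, wherein the preset offset set is {2, 4, 8, 16, 32, 64, 128, 256}.
  206. The method according to any one of claims 195 to 205, wherein the current image block is a coding unit (CU).
  207. The method according to any one of claims 195 to 206, wherein the current image block is a bi-predicted image block.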
  208. 一种视频图像处理装置,其特征在于,包括:
    构建模块,用于确定基础运动矢量列表,所述基础运动矢量列表中包括至少一组双预测基础运动矢量组,所述双预测基础运动矢量组中包括第一基础运动矢量和第二基础运动矢量;从预设的偏移量集中确定两个运动矢量偏移量,所述两个运动矢量偏移量分别对应于所述第一基础运动矢量和所述第二基础运动矢量;根据所述第一基础运动矢量、所述第二基础运动矢量和所述两个的运动矢量偏移量,确定当前图像块的运动矢量;
    预测模块,用于根据所述当前图像块的运动矢量对所述当前图像块进行预测。
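  209. The video image processing apparatus according to claim 208, wherein the construction module determining two motion vector offsets from the preset offset set comprises:
    the construction module determining, from the offset set, a plurality of motion vector offset combinations each including two motion vector offsets;
    and the construction module determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and the two motion vector offsets comprises:
    the construction module determining, from the plurality of motion vector offset combinations, the combination for which the rate-distortion loss satisfies a preset condition, and determining the motion vector of the current image block according to the first base motion vector, the second base motion vector, and that combination.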
  210. 根据权利要求208或209所述的视频图像处理装置,其特征在于,所述视频图像处理装置用于编码端,所述视频图像处理装置还包括发送模块,用于:
    根据预测的结果进行编码并向解码端发送码流,所述码流中包括用于指示使得率失真损失满足预设条件的运动矢量偏移量组合的索引。
  211. 根据权利要求208所述的视频图像处理装置,其特征在于,所述视频图像处理装置用于解码端,所述视频图像处理装置还包括接收模块,用于:
    接收编码端发送的码流,所述码流中包括用于指示两个运动矢量偏移量的形成组合的索引;
    所述构建模块从预设的偏移量集中确定两个运动矢量偏移量,包括:
    根据所述索引确定所述两个运动矢量偏移量。
  212. 根据权利要求208至211中任一项所述的视频图像处理装置,其特征在于,当所述第一基础运动矢量和/或第二基础运动矢量指向特定参考图像时,所述两个运动矢量偏移量中的至少一个运动矢量偏移量,包括:根据数值为1的缩放比例对初始运动矢量偏移量进行缩放后得到的运动矢量偏移量,或者,跳过缩放操作得到的运动矢量偏移量。
  213. 根据权利要求208至211中任一项所述的视频图像处理装置,其特征在于,当所述第一基础运动矢量和第二基础运动矢量均指向非特定参考图像时,所述两个运动矢量偏移量用于:
    根据所述两个运动矢量偏移量对所述第一基础运动矢量和第二基础运动矢量进行调整。
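  214. The video image processing apparatus according to claim 213, wherein the ratio of the distance from the current image to the reference image of the first base motion vector to the distance from the current image to the reference image of the second base motion vector is equal to the ratio of the motion vector offset used by the first base motion vector to the motion vector offset used by the second base motion vector.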
  215. 根据权利要求208至214中任一项所述的视频图像处理装置,其特征在于,所述构建模块还用于:
    获取合并候选列表,所述合并候选列表中包括P组合并运动矢量候选,其中,P为大于或等于1的整数;
    所述构建模块确定基础运动矢量列表,包括:
    所述构建模块根据所述合并候选列表,确定所述基础运动矢量列表。
  216. 根据权利要求215所述的视频图像处理装置,其特征在于,所述构建模块根据所述合并候选列表,确定所述基础运动矢量列表,包括:
    在P大于或等于2时,所述构建模块取所述合并候选列表中的两组合并运动矢量候选形成所述基础运动矢量列表。
  217. 根据权利要求215所述的视频图像处理装置,其特征在于,所述构建模块根据所述合并候选列表,确定所述基础运动矢量列表,包括:
    在P小于2时,所述构建模块以运动矢量(0,0)填充形成所述基础运动矢量列表。
  218. 根据权利要求208至217任一项所述的视频图像处理装置,其特征在于,所述预设的偏移量集为{2,4,8,16,32,64,128,256}。
  219. 根据权利要求208至218任一项所述的视频图像处理装置,其特征在于,所述当前图像块为一个编码单元CU。
  220. 根据权利要求208至219任一项所述的视频图像处理装置,其特征在于,所述当前图像块为双预测图像块。
  221. 一种视频图像处理装置,其特征在于,包括:存储器与处理器,所述存储器用于存储指令,所述处理器用于执行所述存储器存储的指令,并且对所述存储器中存储的指令的执行使得,所述处理器用于执行如权利要求195至207中任一项所述的方法。
  222. 一种计算机存储介质,其特征在于,其上存储有计算机程序,所述计算机程序被计算机执行时使得,所述计算机执行如权利要求195至207中任一项所述的方法。
  223. 一种包含指令的计算机程序产品,其特征在于,所述指令被计算机执行时使得计算机执行如权利要求195至207中任一项所述的方法。
PCT/CN2019/078051 2018-04-02 2019-03-13 视频图像处理方法与装置 WO2019192301A1 (zh)

Priority Applications (11)

Application Number Priority Date Filing Date Title
CN202210376345.1A CN115037942A (zh) 2018-04-02 2019-03-13 视频图像处理方法与装置
KR1020207031587A KR102685009B1 (ko) 2018-04-02 2019-03-13 비디오 이미지 처리 방법 및 장치
KR1020247022895A KR20240110905A (ko) 2018-04-02 2019-03-13 비디오 이미지 처리 방법 및 장치
JP2020553581A JP7533945B2 (ja) 2018-04-02 2019-03-13 動画像処理方法、及び動画像処理装置
EP19781713.3A EP3780619A4 (en) 2018-04-02 2019-03-13 METHOD AND APPARATUS FOR VIDEO IMAGE PROCESSING
CN202210376602.1A CN114938452A (zh) 2018-04-02 2019-03-13 视频图像处理方法与装置
CN201980002813.5A CN110720219B (zh) 2018-04-02 2019-03-13 视频图像处理方法与装置
US17/039,903 US11190798B2 (en) 2018-04-02 2020-09-30 Method and device for video image processing
US17/456,815 US11997312B2 (en) 2018-04-02 2021-11-29 Method and device for video image processing
JP2024025977A JP2024057014A (ja) 2018-04-02 2024-02-22 動画像符号化方法、動画像符号化装置、及びビットストリーム生成方法
US18/674,297 US20240314356A1 (en) 2018-04-02 2024-05-24 Method and device for video image processing

Applications Claiming Priority (10)

Application Number Priority Date Filing Date Title
PCT/CN2018/081652 WO2019191890A1 (zh) 2018-04-02 2018-04-02 用于图像处理的方法和图像处理装置
CNPCT/CN2018/081652 2018-04-02
PCT/CN2018/095710 WO2019192102A1 (zh) 2018-04-02 2018-07-13 用于图像运动补偿的方法和装置
CNPCT/CN2018/095710 2018-07-13
PCT/CN2018/103693 WO2019192141A1 (zh) 2018-04-02 2018-08-31 用于图像运动补偿的方法和装置
CNPCT/CN2018/103693 2018-08-31
CNPCT/CN2018/107436 2018-09-25
PCT/CN2018/107436 WO2019192152A1 (zh) 2018-04-02 2018-09-25 获取视频图像运动矢量的方法与装置
PCT/CN2018/112805 WO2019192170A1 (zh) 2018-04-02 2018-10-30 视频图像处理方法与装置
CNPCT/CN2018/112805 2018-10-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/039,903 Continuation US11190798B2 (en) 2018-04-02 2020-09-30 Method and device for video image processing

Publications (1)

Publication Number Publication Date
WO2019192301A1 true WO2019192301A1 (zh) 2019-10-10

Family

ID=68099823

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/078051 WO2019192301A1 (zh) 2018-04-02 2019-03-13 视频图像处理方法与装置

Country Status (1)

Country Link
WO (1) WO2019192301A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115086678A (zh) * 2022-08-22 2022-09-20 北京达佳互联信息技术有限公司 视频编码方法和装置、视频解码方法和装置
WO2022257674A1 (zh) * 2021-06-07 2022-12-15 腾讯科技(深圳)有限公司 帧间预测的编码方法、装置、设备及可读存储介质
CN116405699A (zh) * 2020-03-31 2023-07-07 北京达佳互联信息技术有限公司 用于在视频编解码中用信号通知语法元素的方法和装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101573985A (zh) * 2006-11-03 2009-11-04 三星电子株式会社 用于视频预测编码的方法和装置以及用于视频预测解码的方法和装置
CN101873500A (zh) * 2009-04-24 2010-10-27 华为技术有限公司 帧间预测编码方法、帧间预测解码方法及设备
US20120128071A1 (en) * 2010-11-24 2012-05-24 Stmicroelectronics S.R.L. Apparatus and method for performing error concealment of inter-coded video frames
CN106375770A (zh) * 2011-01-21 2017-02-01 Sk电信有限公司 视频解码方法

Similar Documents

Publication Publication Date Title
CN110720219B (zh) 视频图像处理方法与装置
WO2019192301A1 (zh) 视频图像处理方法与装置
WO2019192170A1 (zh) 视频图像处理方法与装置
JP2024147810A (ja) 動画像処理方法、動画処理装置、及びビットストリーム生成方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19781713; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2020553581; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 20207031587; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2019781713; Country of ref document: EP; Effective date: 20201102)