WO2022257674A1 - Inter-frame prediction encoding method, apparatus, device and readable storage medium - Google Patents
Inter-frame prediction encoding method, apparatus, device and readable storage medium
- Publication number
- WO2022257674A1 (PCT/CN2022/091617)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- mvp
- motion vector
- target
- mode
- candidate motion
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
- H04N19/109—Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24—Classification techniques
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/139—Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
- H04N19/176—the coding unit being an image region, e.g. a block such as a macroblock
- H04N19/184—the coding unit being bits, e.g. of the compressed video stream
- H04N19/513—Processing of motion vectors
- H04N19/52—Processing of motion vectors by predictive encoding
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
- H04N19/567—Motion estimation based on rate distortion criteria
Definitions
- The embodiments of the present application relate to the field of video processing, and in particular to an inter-frame prediction encoding method, apparatus, device and readable storage medium.
- Each coding unit corresponds to multiple prediction modes and transform units.
- Each coding unit may correspond to an intra prediction mode and an inter prediction mode, and the inter prediction modes include four single-reference-frame modes: NEARESTMV, NEARMV, GLOBALMV and NEWMV.
- MVP: Motion Vector Prediction.
- Embodiments of the present application provide an encoding method, device, device, and readable storage medium for inter-frame prediction, which can improve the encoding efficiency of inter-frame prediction in NEWMV mode.
- the technical solution includes the following contents.
- an inter-frame prediction encoding method comprising:
- The motion vector group includes a target MVP determined from the MVPs and a target motion vector determined from the candidate motion vectors.
- The traversal of the interpolation modes and the traversal of the motion modes are performed on the coding unit based on the motion vector group, to obtain a target interpolation mode and a target motion mode corresponding to the coding unit.
- Performing motion estimation traversal on the motion vector prediction (MVP) in the specified inter-frame prediction mode to obtain candidate motion vectors includes:
- in response to i being within the range of the number of MVPs, performing motion estimation on the i-th MVP to obtain the i-th candidate motion vector, where i is an integer;
- obtaining n candidate motion vectors for n MVPs, wherein the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of MVPs.
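The per-MVP motion-estimation traversal described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: `motion_estimate` is a hypothetical stand-in for the encoder's actual search (e.g. a diamond or TZ search).

```python
def motion_estimation_traversal(mvps, motion_estimate):
    """For each of the n MVPs, run motion estimation once and collect
    the resulting candidate motion vector (i-th MVP -> i-th candidate MV)."""
    candidates = []
    i = 0
    while i < len(mvps):          # "i within the range of the number of MVPs"
        candidates.append(motion_estimate(mvps[i]))
        i += 1
    return candidates             # n candidate MVs for n MVPs

# Toy usage: the "search" merely refines each MVP by a fixed offset.
mvps = [(0, 0), (4, -2), (-1, 3)]
cands = motion_estimation_traversal(mvps, lambda mvp: (mvp[0] + 1, mvp[1]))
# cands == [(1, 0), (5, -2), (0, 3)]
```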
- Determining the motion vector group from the MVPs and the candidate motion vectors includes:
- obtaining rate-distortion costs corresponding to the m combination relationships respectively, where the rate-distortion costs are used to represent the pixel error under each combination relationship;
- determining the motion vector group from the m combination relationships based on the rate-distortion costs.
- Determining the motion vector group from the m combination relationships based on the rate-distortion costs includes:
- determining the motion vector group including the target MVP and the target motion vector in the target combination relationship.
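The combination-relationship selection can be sketched as follows, under the assumption (for illustration only) that every MVP may be paired with every candidate MV and that the rate-distortion cost is a supplied function; `toy_rd_cost` is a hypothetical cost counting only MVD magnitude, not the patent's actual cost model.

```python
from itertools import product

def select_motion_vector_group(mvps, candidate_mvs, rd_cost):
    """Enumerate the m MVP / candidate-MV combination relationships,
    evaluate a rate-distortion cost for each, and return the target
    (MVP, MV) pair with the smallest cost."""
    combos = list(product(mvps, candidate_mvs))       # the m combinations
    costs = [rd_cost(mvp, mv) for mvp, mv in combos]
    best = min(range(len(combos)), key=lambda k: costs[k])
    return combos[best]                               # target MVP + target MV

# Toy cost: magnitude of the MVD that would have to be signalled.
def toy_rd_cost(mvp, mv):
    return abs(mv[0] - mvp[0]) + abs(mv[1] - mvp[1])  # cheaper when MV ~ MVP

group = select_motion_vector_group([(0, 0), (4, 4)], [(5, 3), (1, 0)], toy_rd_cost)
# group == ((0, 0), (1, 0)) -- the pairing with the smallest toy cost
```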
- an encoding device for inter-frame prediction includes:
- An acquisition module configured to acquire an image frame to be encoded, where the image frame is divided into encoding units
- a prediction module configured to, in response to predicting the coding unit through a specified inter-frame prediction mode, perform motion estimation traversal on the motion vector prediction MVP in the specified inter-frame prediction mode, to obtain candidate motion vectors;
- a determining module configured to determine a motion vector group from the MVPs and the candidate motion vectors, where the motion vector group includes the target MVP determined from the MVPs and the target motion vector determined from the candidate motion vectors;
- the prediction module is further configured to traverse the interpolation mode and motion mode of the coding unit based on the motion vector group to obtain a target interpolation mode and a target motion mode corresponding to the coding unit.
- In another aspect, a computer device includes a processor and a memory; a computer program is stored in the memory, and the computer program is loaded and executed by the processor to implement the above-mentioned inter-frame prediction encoding method.
- In a computer-readable storage medium, a computer program is stored, and the computer program is loaded and executed by a processor to implement the above-mentioned inter-frame prediction encoding method.
- a computer program product or computer program comprising computer instructions stored in a computer readable storage medium.
- the processor of the computer device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the computer device executes the above-mentioned encoding method for inter-frame prediction.
- When the image frame is encoded using the specified inter-frame prediction mode (for example, the NEWMV mode), the optimal motion mode can be determined without performing interpolation-mode traversal and motion-mode traversal for all MVPs and motion vectors, thereby reducing the amount of rate-distortion cost calculation and the computational complexity, and improving the coding efficiency of the specified inter-frame prediction mode.
- FIG. 1 is a schematic diagram of a standard coding framework provided by an exemplary embodiment of the present application
- FIG. 2 is a schematic diagram of CU partition types provided by an exemplary embodiment of the present application.
- FIG. 3 is a schematic diagram of the position of the MVP in the single reference frame mode corresponding to the inter-frame prediction provided by an exemplary embodiment of the present application;
- Fig. 4 is a schematic diagram of the optimal result selection process in the NEWMV mode provided by related technologies
- FIG. 5 is a schematic diagram of the overall process of an inter-frame prediction encoding method provided by an exemplary embodiment of the present application
- FIG. 6 is a flowchart of an encoding method for inter-frame prediction provided by an exemplary embodiment of the present application
- Fig. 7 is a schematic diagram of a TZ search template provided based on the embodiment shown in Fig. 6;
- Fig. 8 is a schematic diagram of the supplementary search points provided based on the embodiment shown in Fig. 6;
- FIG. 9 is a partial schematic diagram of searching for position points in a raster scanning manner based on the embodiment shown in FIG. 6;
- FIG. 10 is a flowchart of an encoding method for inter-frame prediction provided by another exemplary embodiment of the present application.
- FIG. 11 is a flowchart of an encoding method for inter-frame prediction provided by another exemplary embodiment of the present application.
- Fig. 12 is a structural block diagram of an encoding device for inter-frame prediction provided by an exemplary embodiment of the present application.
- Fig. 13 is a structural block diagram of an encoding device for inter-frame prediction provided by another exemplary embodiment of the present application.
- Fig. 14 is a structural block diagram of a computer device provided by an exemplary embodiment of the present application.
- the encoding method for inter-frame prediction provided by the embodiment of the present application can be applied to a terminal or a server.
- the terminal when the encoding method of inter-frame prediction is applied to a terminal, the terminal includes but is not limited to a mobile phone, a computer, an intelligent voice interaction device, a smart home appliance, a vehicle-mounted terminal, and the like.
- The method provided in the embodiment of the present application can be applied to a vehicle-mounted scene; that is, the inter-frame prediction encoding of video is performed on the vehicle-mounted terminal as part of an Intelligent Traffic System (ITS).
- An intelligent traffic system is the effective and comprehensive application of advanced science and technology (information technology, computer technology, data communication technology, sensor technology, electronic control technology, automatic control theory, operations research, artificial intelligence, etc.) to transportation, service control and vehicle manufacturing, strengthening the connection among vehicles, roads and users, so as to form an integrated transportation system that guarantees safety, improves efficiency, improves the environment and saves energy.
- the video can be encoded through the server, and the encoded video stream can be sent to the terminal or other servers through the server.
- The above-mentioned server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery network (CDN), and big data and artificial intelligence platforms.
- the above server can also be implemented as a node in the blockchain system.
- Blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
- Blockchain, essentially a decentralized database, is a series of data blocks associated using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity (anti-counterfeiting) of its information and to generate the next block.
- the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
- Fig. 1 is a schematic diagram of a standard coding framework provided by an exemplary embodiment of the present application.
- The encoder first divides the image frame 110 into coding tree units (Coding Tree Units, CTUs), and then recursively divides each coding tree unit to obtain coding units (CUs).
- CTU: Coding Tree Unit.
- Each CU can correspond to multiple prediction modes and transformation units (Transform Unit, TU).
- the encoder predicts each CU by using a prediction mode to obtain a prediction value (ie, MVP) corresponding to each CU.
- the prediction for each CU may include inter-frame prediction and intra-frame prediction.
- ME: Motion Estimation.
- MC: Motion Compensation.
- The predicted value is subtracted from the input data (that is, the actual value of the MV (Motion Vector)) to obtain the residual (Motion Vector Difference, MVD), and the residual is then transformed and quantized to obtain the residual coefficients.
- the residual coefficient is sent to the output code stream of the entropy coding module.
- the residual coefficient is inversely quantized and transformed, the residual value of the reconstructed image is obtained.
- the residual value is added to the predicted value, the reconstructed image is obtained.
- the reconstructed image is filtered, it enters into the reference frame queue, which is used as the reference frame of the next image frame, so as to be sequentially encoded backwards.
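The residual path of the Figure 1 framework (predict, form the residual, quantize, then dequantize and add the prediction back to reconstruct) can be sketched per sample. This is a simplified illustration only: the transform stage is omitted, and the uniform quantizer step `qstep` is an assumed placeholder, not the codec's actual quantization.

```python
def encode_block(actual, predicted, qstep=4):
    """Per-sample sketch of the Fig. 1 residual path: residual = actual - predicted,
    quantize to coefficients, then dequantize and add the prediction back to
    reconstruct. (The transform stage is omitted for brevity.)"""
    residual = [a - p for a, p in zip(actual, predicted)]
    coeffs = [round(r / qstep) for r in residual]     # quantization
    recon_res = [c * qstep for c in coeffs]           # inverse quantization
    reconstructed = [p + r for p, r in zip(predicted, recon_res)]
    return coeffs, reconstructed

coeffs, recon = encode_block([101, 107, 93], [98, 98, 98])
# coeffs == [1, 2, -1]; recon == [102, 106, 94] -- close to, but not exactly,
# the input: quantization makes the loop lossy, which is why the encoder
# reconstructs from the SAME data the decoder will see.
```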
- the intra-frame prediction selection is first performed on the basis of the image frame 110, and the intra-frame prediction is performed based on the reconstructed image and the current frame to obtain an intra-frame prediction result.
- FIG. 2 is a schematic diagram of the CU partition type provided by an exemplary embodiment of the present application.
- The CU partition type 200 includes: NONE type 210; SPLIT type 220; HORZ type 230; VERT type 240; HORZ_4 type 250; HORZ_A type 260; HORZ_B type 270; VERT_A type 280; VERT_B type 290 and VERT_4 type 201.
- CU can correspond to 22 block sizes, which are 4 ⁇ 4, 4 ⁇ 8, 8 ⁇ 4, 8 ⁇ 8, 8 ⁇ 16, 16 ⁇ 8, 16 ⁇ 16, 16 ⁇ 32, 32 ⁇ 16, 32 ⁇ 32, 32 ⁇ 64, 64 ⁇ 32, 64 ⁇ 64, 64 ⁇ 128, 128 ⁇ 64, 128 ⁇ 128, 4 ⁇ 16, 16 ⁇ 4, 8 ⁇ 32, 32 ⁇ 8, 16 ⁇ 64 and 64 ⁇ 16.
- the prediction mode of the CU corresponds to an intra prediction mode and an inter prediction mode.
- Once the prediction type is determined, different prediction modes within the same prediction type are first compared to find the optimal prediction mode under that type. For example, within the intra-frame prediction type, different intra-frame prediction modes are compared to determine the optimal intra-frame prediction mode.
- The comparison among different intra-frame prediction modes can be performed by calculating the rate-distortion cost and selecting the intra-frame prediction mode with the smallest rate-distortion cost; within the inter-frame prediction type, different inter-frame prediction modes are compared to determine the optimal inter-frame prediction mode.
- Different inter-frame prediction modes can likewise be compared by calculating the rate-distortion cost, selecting the inter-frame prediction mode with the smallest rate-distortion cost. The intra prediction mode and the inter prediction mode are then compared to find the optimal prediction mode for the current CU.
- For example, compare the rate-distortion cost of the optimal intra-frame prediction mode with that of the optimal inter-frame prediction mode, and use the mode with the lower rate-distortion cost as the optimal prediction mode for the current CU. At the same time, TU transformation is performed on the CU; each CU corresponds to a variety of transformation types, from which the optimal transformation type is found. Then different CU partition types are compared, and the optimal CU partition type is found according to the rate-distortion cost. Finally, the image frame is divided into CUs.
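The two-level mode decision described above (best mode within each prediction type, then best across types) can be sketched with the classic rate-distortion cost J = D + λ·R. The mode names, distortion/rate numbers, and λ below are hypothetical placeholders for illustration, not values from the patent.

```python
def rd_cost(distortion, rate_bits, lam=10.0):
    """Classic rate-distortion cost: J = D + lambda * R."""
    return distortion + lam * rate_bits

def best_mode(modes):
    """modes: {name: (distortion, rate_bits)} -> name with the smallest J."""
    return min(modes, key=lambda m: rd_cost(*modes[m]))

# Hypothetical per-mode (distortion, rate) measurements for one CU.
intra = {"DC": (500.0, 20), "SMOOTH": (430.0, 28)}
inter = {"NEARESTMV": (300.0, 12), "NEWMV": (220.0, 25)}

best_intra = best_mode(intra)   # compare within the intra type first
best_inter = best_mode(inter)   # then within the inter type
# Finally compare across types for the current CU:
winner = best_mode({best_intra: intra[best_intra],
                    best_inter: inter[best_inter]})
```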
- the inter-frame prediction modes include four single reference frame modes of NEARESTMV, NEARMV, GLOBALMV and NEWMV, and eight combined reference frame modes of NEAREST_NEARESTMV, NEAR_NEARMV, NEAREST_NEWMV, NEW_NEARESTMV, NEAR_NEWMV, NEW_NEARMV, GLOBAL_GLOBALMV and NEW_NEWMV.
- The NEARESTMV and NEARMV modes mean that the motion vector (MV) of the prediction block is derived from surrounding block information, with no need to transmit a motion vector difference (MVD); the NEWMV mode means that an MVD needs to be transmitted; and the GLOBALMV mode means that the MV information of the CU is derived from global motion.
- the NEARESTMV, NEARMV and NEWMV modes all depend on the derivation of MVP.
- the AV1 standard will calculate 4 MVPs according to the protocol rules.
- The 0th MV is used in NEARESTMV mode; the 1st to 3rd MVs are used in NEARMV mode; and the NEWMV mode uses one of the 0th to 2nd MVs.
- FIG. 3 shows a schematic diagram of a position of an MVP in a single reference frame mode corresponding to inter-frame prediction provided by an exemplary embodiment of the present application.
- The NEWMV mode uses one of the 0th MV 310, the 1st MV 320 and the 2nd MV 330 as the MVP.
- each prediction mode in the aforementioned inter-frame prediction types corresponds to a different reference frame.
- For illustration, please refer to Table 1 below.
- the combination relationship between the inter prediction mode and the reference frame may be as follows:
- Each combination corresponds to a maximum of 3 MVPs. Four processes then follow: performing motion estimation on the current MVP (motion estimation is performed only for modes containing the NEWMV mode), selecting the best combination mode type, selecting the best interpolation method, and selecting the best motion mode.
- the combination mode is used to fuse the predicted pixels of two reference frames.
- an optimal combination mode type may be selected from the combination modes, and predictive pixels of two reference frames are fused together based on the selected combination mode type.
- Each combination mode represents a fusion method of predicted pixels.
- the motion mode corresponding to the single reference frame mode is different from the motion mode corresponding to the combined reference frame mode.
- The single reference frame mode corresponds to four motion modes, namely: SIMPLE (simple motion compensation), OBMC (overlapped block motion compensation), WARPED (global and local warped motion compensation) and inter_intra (combined inter-intra prediction); the combined reference frame mode corresponds to the SIMPLE mode.
- The calculation complexity corresponding to a group of inter prediction modes is very large, especially for the NEWMV mode.
- In NEWMV mode, each CU needs to traverse 7 reference frames, with up to 3 MVPs per reference frame, and each MVP performs motion estimation, 9 types of interpolation, and 4 types of motion mode calculations. That is, in NEWMV mode, each CU corresponds to a maximum of 189 (7 × 3 × 9) interpolation calculations and 84 (7 × 3 × 4) motion mode calculations.
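The worst-case operation counts quoted above follow directly from the traversal structure:

```python
# Worst-case NEWMV work per CU in the related art.
REF_FRAMES = 7      # reference frames traversed per CU
MVPS_PER_REF = 3    # up to 3 MVPs per reference frame
INTERP_MODES = 9    # interpolation modes tried per MVP
MOTION_MODES = 4    # motion modes tried per MVP

interp_evals = REF_FRAMES * MVPS_PER_REF * INTERP_MODES   # 7 * 3 * 9 = 189
motion_evals = REF_FRAMES * MVPS_PER_REF * MOTION_MODES   # 7 * 3 * 4 = 84
```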
- FIG. 4 shows a schematic diagram of an optimal result selection process in NEWMV mode provided by the related art. As shown in FIG. 4 , the process includes the following steps.
- Step 401: the initial value of N is set to 0, where N represents the N-th MVP, so that each MVP can be traversed.
- Step 402: judging whether N is less than the number of MVPs.
- If N is less than the number of MVPs, the traversal of the MVPs of the current CU in the NEWMV mode has not ended.
- If N is greater than or equal to the number of MVPs, all MVPs of the current CU in the NEWMV mode have been traversed, and the MVPs in the next prediction mode can be traversed for the current CU.
- Step 404: motion estimation. That is, motion estimation is performed on the current MVP to obtain the optimal motion vector corresponding to the current MVP.
- Step 405: selecting the best interpolation mode for the optimal motion vector.
- The optimal interpolation mode corresponding to the optimal motion vector is selected from nine interpolation modes.
- Step 406: choosing the best motion mode.
- The optimal motion mode corresponding to the optimal motion vector is selected from the four motion modes.
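The related-art flow of Fig. 4 can be sketched as follows: for every MVP, motion estimation is followed immediately by interpolation-mode and motion-mode selection. The stub functions are hypothetical stand-ins for the real searches, used only to make the per-MVP cost visible.

```python
def related_art_newmv(mvps, motion_estimate, pick_interp, pick_motion_mode):
    """Fig. 4 flow: for EVERY MVP, run motion estimation and then select
    the best interpolation mode and motion mode for its optimal MV."""
    results = []
    for mvp in mvps:                       # steps 401-402: loop over MVPs
        mv = motion_estimate(mvp)          # step 404: motion estimation
        interp = pick_interp(mv)           # step 405: best interpolation mode
        motion = pick_motion_mode(mv)      # step 406: best motion mode
        results.append((mvp, mv, interp, motion))
    return results

# Counting stubs show the selection steps run once PER MVP.
calls = {"interp": 0, "motion": 0}
def pick_interp(mv):
    calls["interp"] += 1
    return "interp"
def pick_motion_mode(mv):
    calls["motion"] += 1
    return "motion"
results = related_art_newmv([(0, 0), (1, 1), (2, 2)],
                            lambda mvp: mvp, pick_interp, pick_motion_mode)
# calls == {"interp": 3, "motion": 3} -- three MVPs, three selections each
```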
- FIG. 5 is a schematic diagram of an overall process of an inter-frame prediction encoding method provided by an exemplary embodiment of the present application. As shown in FIG. 5 , the process includes the following steps.
- N the initial value of N is set to 0, and N represents the Nth MVP, so that each MVP can be traversed. It should be noted that, in the embodiment of the present application, it is only necessary to traverse the motion vectors corresponding to each MVP.
- Step 502 judging whether N is less than the number of MVPs.
- N is less than the number of MVPs, it means that the traversal of the MVPs of the current CU in the EWMV mode has not ended.
- N is greater than or equal to the number of MVPs, it means that all MVPs in the EWMV mode of the current CU have been traversed, and the MVPs in the next prediction mode can be traversed for the current CU.
- Taking M as the number of MVPs in the NEWMV mode as an example, the traversal runs from the 0th MVP to the (M-1)th MVP, i.e., M MVPs in total.
- Step 504 motion estimation. That is, motion estimation is performed on the current MVP to obtain the optimal MV corresponding to the current MVP (i.e., the candidate MV hereinafter). After obtaining the optimal MV corresponding to each MVP, step 505 is executed.
- Step 505 correction, to select the optimal combination of MVP and MV. For example, by adopting a correction strategy, based on each MVP and the optimal MV corresponding to each MVP, an optimal combination of MVP and MV is selected. This process will be described in detail below; details not described here can refer to the following embodiments and will not be repeated.
- Step 506 selecting the optimal interpolation mode. For example, for the optimal combination of MVP and MV, the optimal interpolation mode is selected.
- Step 507 selecting the optimal motion mode. For example, for the optimal combination of MVP and MV, the optimal motion mode is selected.
- the inter-frame prediction coding method provided in the embodiment of the present application continues the loop after motion estimation is performed; after the loop is completed, correction is performed to select the optimal combination of MVP and MV, and finally, based on the optimal combination of MVP and MV, the subsequent interpolation mode selection and motion mode selection are performed.
- Since the embodiment of the present application does not need to perform interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, the calculation amount of the rate-distortion cost and the computational complexity are reduced, thereby improving the coding efficiency in the NEWMV mode.
- Fig. 6 is a flow chart of an encoding method for inter-frame prediction provided by an exemplary embodiment of the present application.
- the application of the method in an encoder is used as an example for illustration. As shown in Fig. 6 , the method includes the following steps.
- Step 601 acquire an image frame to be encoded.
- the image frame is divided into coding units, and the coding units correspond to at least two prediction modes, and the at least two prediction modes include a specified inter-frame prediction mode.
- the image frame to be encoded is a non-key frame, that is, a P frame or a B frame.
- the video stream to be encoded corresponding to the image frame to be encoded is first acquired, and each image frame in the video stream to be encoded is sequentially acquired, so as to determine the current image frame to be encoded.
- each CU is associated with an intra prediction mode and an inter prediction mode.
- the inter prediction mode includes the NEWMV mode.
- the NEWMV mode is the designated inter-frame prediction mode provided in the embodiment of the present application. Since the NEWMV mode depends on the derivation of the MVP, in the NEWMV mode, it is necessary to determine the optimal MVP from at least two (usually 3) MVPs.
- the specified inter-frame prediction mode in the embodiment of the present application may be the NEWMV mode, or may also be a combination mode including the NEWMV mode, which is not limited in the embodiment of the present application.
- Step 602 In response to predicting the coding unit through the specified inter-frame prediction mode, perform motion estimation traversal on the motion vector prediction MVP in the specified inter-frame prediction mode to obtain candidate motion vectors.
- When performing motion estimation traversal on the MVPs, it is first necessary to obtain the number of MVPs; for the i-th MVP, in response to i being within the range of the number of MVPs, motion estimation is performed on the i-th MVP to obtain the i-th candidate motion vector, where i is an integer. For n MVPs, n candidate motion vectors are obtained, wherein the i-th MVP corresponds to the i-th candidate motion vector and n is the number of MVPs.
- the i-th candidate motion vector may refer to an optimal motion vector corresponding to the i-th MVP in motion estimation. The specific motion estimation method will be described in detail below, and will not be repeated here.
- For example, motion estimation is performed on the 0th MVP, then on the 1st MVP, and finally on the 2nd MVP, so as to complete the motion estimation traversal of the 3 MVPs.
- the candidate motion vectors are stored.
- the candidate motion vector is stored; or, the candidate motion vector and the distortion of the candidate motion vector are stored.
- the storage manner may include at least one of the following manners.
- a first array and a second array are constructed, and the first array and the second array are respectively used to store the candidate motion vector and the distortion of the candidate motion vector.
- the first array is used to store the distortion of the candidate motion vector corresponding to the MVP
- the second array is used to store the candidate motion vector corresponding to the MVP.
- the above candidate motion vectors include an optimal motion vector obtained by performing motion estimation by the MVP.
- the number of MVPs in the NEWMV mode is 3, that is, the 0th to 2nd MVPs shown in FIG. 7 above.
- the set array includes the first array and the second array, respectively as follows.
- the first array dist_bestmv_list[i], used to store the distortion of the optimal motion vector corresponding to each MVP.
- the second array: me_bestmv_list[i], is used to store the optimal motion vector corresponding to each MVP.
- i is used to represent the i-th MVP, and the value of i is 0, 1, 2.
- Ways of motion estimation may include integer-pixel motion estimation and sub-pixel motion estimation.
- integer-pixel motion estimation can include the TZ search method (introduced in the subsequent content), the NStep search method, the diamond search method (introduced in the subsequent content), the hexagonal search method, etc.; sub-pixel motion estimation can include diamond search, full search, etc.
- the distortion and optimal motion vector corresponding to the current MVP index can be obtained.
- the distortion and optimal motion vectors corresponding to the current MVP index are respectively recorded in the array.
- dist represents the distortion
- bestmv represents the optimal motion vector. That is, the distortion corresponding to the MVP index is stored under the MVP index of the first array, and the optimal motion vector under the MVP index is stored under the MVP index of the second array.
- For example, the distortion corresponding to the 0th MVP is stored under the index of the 0th MVP in the first array, i.e., dist_bestmv_list[0], and the optimal motion vector corresponding to the 0th MVP is stored under the index of the 0th MVP in the second array, i.e., me_bestmv_list[0].
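The two-array bookkeeping described above can be sketched as follows (a minimal Python sketch; `motion_estimate` is a hypothetical stub standing in for the real integer- and sub-pixel search):

```python
def motion_estimate(mvp):
    # Stub: a real encoder would run integer- and sub-pixel search here.
    # Returns (distortion, best_mv) for the given MVP starting point.
    best_mv = (mvp[0] + 1, mvp[1] - 1)   # pretend the search refined the MVP
    distortion = abs(best_mv[0]) + abs(best_mv[1])
    return distortion, best_mv

mvps = [(4, 2), (0, 0), (-3, 5)]         # the 0th..2nd MVPs in NEWMV mode

n = len(mvps)
dist_bestmv_list = [0] * n               # first array: distortion per MVP
me_bestmv_list = [None] * n              # second array: best MV per MVP

for i in range(n):                       # traverse the 0th..(n-1)th MVP
    dist, best_mv = motion_estimate(mvps[i])
    dist_bestmv_list[i] = dist           # store distortion under index i
    me_bestmv_list[i] = best_mv          # store best MV under index i
```

After the loop, index i of either array refers back to the i-th MVP, which is what the later correction step relies on.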
- Step 603 determine a motion vector group from the MVP and candidate motion vectors.
- the motion vector group includes the target MVP determined from the MVPs, and the target motion vector determined from the candidate motion vectors.
- the target MVP is the optimal MVP and the target motion vector is the optimal motion vector.
- the target MVP and the target motion vector form a corresponding motion vector group; that is, when the target MVP is the i-th MVP, the target motion vector is the i-th motion vector. In this case, the motion vector group can be determined directly according to the rate-distortion cost corresponding to each group of candidate motion vector groups.
- For example: determine the first rate-distortion cost of the 0th MVP and the 0th motion vector, the second rate-distortion cost of the 1st MVP and the 1st motion vector, and the third rate-distortion cost of the 2nd MVP and the 2nd motion vector.
- the set of motion vector groups with the smallest rate-distortion cost is determined; for example, when the first rate-distortion cost is the smallest, the 0th MVP and the 0th motion vector corresponding to the first rate-distortion cost are used as the motion vector group.
- a group of motion vector groups with the smallest rate-distortion cost is determined as an example for illustration.
- two sets of motion vector groups with the smallest rate-distortion costs may also be determined, and the MVPs and motion vectors corresponding to these two sets of motion vector groups are subjected to subsequent interpolation mode traversal and motion mode traversal, which is not limited in this embodiment of the present application.
- the number of motion vector groups can be determined through an external interface.
- the target MVP and the target motion vector form a motion vector group obtained after recombination; that is, when the target MVP is the i-th MVP, the target motion vector may be the i-th motion vector, or a motion vector determined from another MVP.
- the target MVP and the target motion vector are described as an example of a motion vector group obtained after recombination.
- each MVP is reorganized with each candidate motion vector in turn, so as to determine a set of motion vector groups from the obtained combinations, where the number of combinations obtained by recombining each MVP with each candidate motion vector is the product of the number of MVPs and the number of candidate motion vectors.
- the foregoing recombination manner of the MVP and the motion vector is only a schematic example, and the embodiment of the present application does not limit the specific recombination manner of the MVP and the motion vector.
- the motion vector group can be determined by means of correction (that is, reorganization) to eliminate the influence of local optimal values, thereby improving the accuracy of the acquisition of the target MVP and the target motion vector and further improving the quality of the motion vector group, which is beneficial to improving the encoding quality in the specified inter prediction mode.
- correction can be performed by recombining all MVPs and all candidate motion vectors.
- For example, motion estimation is performed on the 0th MVP to obtain the 0th candidate motion vector (that is, the optimal motion vector of the 0th MVP); motion estimation is performed on the 1st MVP to obtain the 1st candidate motion vector (that is, the optimal motion vector of the 1st MVP); and motion estimation is performed on the 2nd MVP to obtain the 2nd candidate motion vector (that is, the optimal motion vector of the 2nd MVP).
- a motion vector group is determined from these combinations, e.g., the combination with the smallest rate-distortion cost.
- a combination with the smallest rate-distortion cost is determined from these combinations as the motion vector group; or, multiple combinations with the smallest rate-distortion costs are determined from these combinations as motion vector groups, and interpolation mode traversal and motion mode traversal are performed respectively for the target MVP and the target motion vector corresponding to each of the multiple motion vector groups.
- the number of motion vector groups can be determined through the external interface.
- Step 604 Traverse the interpolation mode and motion mode of the coding unit based on the motion vector group to obtain the target interpolation mode and target motion mode corresponding to the coding unit.
- the optimal interpolation method and motion mode can be selected according to the target MVP (such as the optimal MVP) and the target motion vector (such as the optimal motion vector) in the motion vector group .
- In summary, with the inter-frame prediction coding method, for a scene in which an image frame is coded using the specified inter-frame prediction mode (such as the NEWMV mode), the combination of the target MVP and the target motion vector (such as the optimal combination) is determined first, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors. This reduces the calculation amount of the rate-distortion cost and the computational complexity, thereby improving the coding efficiency in the specified inter-frame prediction mode.
- the motion vector group is determined by means of correction (that is, reorganization) to eliminate the influence of local optimal values between the MVP and the candidate motion vector, thereby improving the acquisition accuracy of the target MVP and the target motion vector and further improving the quality of the motion vector group, which is beneficial to improving the coding quality in the specified inter-frame prediction mode.
- the above motion estimation method is introduced by taking the TZ search method and the diamond search method as representatives.
- the implementation of the TZ search method includes the following processes.
- The current MVP and the (0, 0) position are both used as candidate search starting points; the rate-distortion costs under the corresponding motion vectors are compared, and the motion vector with the smaller rate-distortion cost is used as the final search starting point.
- FIG. 7 shows a schematic diagram of a TZ search template provided by an exemplary embodiment of the present application.
- the TZ search method includes a rhombus template 710, and the search is performed within the search window range of the rhombus template 710, wherein the step size is incremented by integer powers of 2, and the point with the smallest rate-distortion cost is selected as the search result (the optimal motion vector is determined based on the search result).
- FIG. 8 shows a schematic diagram of a 2-point search provided by an exemplary embodiment of the present application.
- As shown in FIG. 8, for point 2, point 4, point 5, and point 7, the position points in the four positive directions of left, right, up, and down have already been calculated, so there is no need to add points.
- If the step size with the smallest rate-distortion cost obtained in step (3) is greater than 5, raster scanning is started, checking one point every 5 rows and 5 columns.
- FIG. 9 shows a partial schematic diagram of searching for location points in a raster scanning manner. As shown in FIG. 9 , every 5 rows and 5 columns are scanned at the marked point 910 to obtain search results.
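The raster-scan sampling described above can be illustrated as follows (the window bounds below are hypothetical; a real encoder scans the search window around the start point):

```python
def raster_scan_points(x_min, x_max, y_min, y_max, step=5):
    # Visit one candidate position every `step` rows and `step` columns.
    return [(x, y)
            for y in range(y_min, y_max + 1, step)
            for x in range(x_min, x_max + 1, step)]

# A 21x21 window around the origin yields a 5x5 grid of candidates.
points = raster_scan_points(-10, 10, -10, 10)
```

Each returned point would then be evaluated with the rate-distortion cost, and the best one kept as the search result.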
- the implementation of the diamond search method includes the following processes.
- the diamond search method, also known as the rhombus search method, uses two different matching templates: the large diamond and the small diamond.
- the large diamond-shaped matching template has 9 search points
- the small diamond-shaped matching template has 5 search points.
- a large diamond-shaped matching template with a larger step size is used for coarse search
- a small diamond-shaped matching template is used for fine search.
- the search steps may be as follows.
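The two-stage template search described above can be sketched as follows (a toy cost function stands in for the SAD/rate-distortion cost an encoder would evaluate):

```python
# Large diamond (9 points) for coarse search, small diamond (5 points) for
# fine search, matching the template sizes described above.
LDSP = [(0, 0), (0, 2), (0, -2), (2, 0), (-2, 0),
        (1, 1), (1, -1), (-1, 1), (-1, -1)]
SDSP = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]

def diamond_search(cost, start):
    center = start
    while True:
        # Coarse stage: evaluate the large diamond around the current center.
        candidates = [(center[0] + dx, center[1] + dy) for dx, dy in LDSP]
        best = min(candidates, key=cost)
        if best == center:          # minimum at the center: switch templates
            break
        center = best
    # Fine stage: one pass of the small diamond.
    candidates = [(center[0] + dx, center[1] + dy) for dx, dy in SDSP]
    return min(candidates, key=cost)

# Toy cost: squared distance to a known optimum at (6, -3).
best_mv = diamond_search(lambda p: (p[0] - 6) ** 2 + (p[1] + 3) ** 2, (0, 0))
```

The loop keeps moving the large diamond until its minimum falls on the center, then a single small-diamond pass refines the result.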
- FIG. 10 is a flow chart of an encoding method for inter-frame prediction provided by another exemplary embodiment of the present application.
- the application of the method in an encoder is used as an example for illustration. As shown in FIG. 10 , the method includes the following steps.
- Step 1001 acquire an image frame to be encoded.
- the image frame is divided into coding units, and the coding units correspond to at least two prediction modes, and the at least two prediction modes include a specified inter-frame prediction mode.
- Step 1002 acquiring the number of motion vector prediction MVPs.
- Step 1003 for the i-th MVP, in response to the fact that i is within the range of the number of MVPs, perform motion estimation on the i-th MVP to obtain an i-th candidate motion vector, where i is an integer.
- Step 1004 obtain n candidate motion vectors for n MVPs.
- the i-th MVP corresponds to the i-th candidate motion vector
- n is the number of MVPs.
- For example, motion estimation is performed on the 0th MVP to obtain the 0th candidate motion vector (that is, the optimal motion vector of the 0th MVP); motion estimation is performed on the 1st MVP to obtain the 1st candidate motion vector (that is, the optimal motion vector of the 1st MVP); and motion estimation is performed on the 2nd MVP to obtain the 2nd candidate motion vector (that is, the optimal motion vector of the 2nd MVP).
- Step 1005 sequentially reorganize each MVP and each candidate motion vector to obtain m combination relationships; wherein, the value of m is the square of n.
- a set of motion vectors is determined from these combinations, such as the combination with the smallest rate-distortion cost after combination.
- Step 1006 determine the rate-distortion costs corresponding to the m combination relationships respectively.
- the rate-distortion cost is used to represent the pixel error condition under the combination relationship. In some embodiments, the rate-distortion cost is used to represent the pixel encoding cost under the combination relationship, and the rate-distortion cost is determined by the distortion under the current combination relationship and the number of codeword bits occupied by encoding.
- i is used to indicate the i-th MVP, and j is used to indicate the j-th candidate motion vector; i and j each range from 0 to the number of MVPs (or, respectively, candidate motion vectors) minus one.
- mvp[i] represents the i-th MVP
- mvcost(me_bestmv_list[j]-mvp[i]) represents the rate-distortion cost corresponding to the difference between the j-th candidate motion vector and the i-th MVP.
- cost00 = dist_bestmv_list[0]+mvcost(me_bestmv_list[0]-mvp[0]);
- cost01 = dist_bestmv_list[1]+mvcost(me_bestmv_list[1]-mvp[0]);
- cost02 = dist_bestmv_list[2]+mvcost(me_bestmv_list[2]-mvp[0]);
- cost10 = dist_bestmv_list[0]+mvcost(me_bestmv_list[0]-mvp[1]);
- cost11 = dist_bestmv_list[1]+mvcost(me_bestmv_list[1]-mvp[1]);
- cost12 = dist_bestmv_list[2]+mvcost(me_bestmv_list[2]-mvp[1]);
- cost20 = dist_bestmv_list[0]+mvcost(me_bestmv_list[0]-mvp[2]);
- cost21 = dist_bestmv_list[1]+mvcost(me_bestmv_list[1]-mvp[2]);
- cost22 = dist_bestmv_list[2]+mvcost(me_bestmv_list[2]-mvp[2]);
- cost00 represents the combination relationship between the 0th MVP and the 0th candidate motion vector, cost01 represents the combination relationship between the 0th MVP and the 1st candidate motion vector, cost02 represents the combination relationship between the 0th MVP and the 2nd candidate motion vector, and so on.
- Step 1007 determine the motion vector group from the m combination relations based on the rate-distortion cost.
- a target combination relationship with the smallest rate-distortion cost is determined from the m combination relationships, and a motion vector group including the target MVP and the target motion vector in the target combination relationship is determined.
- For example, when the optimal MVP index is 2 and the optimal motion vector index is 0, the motion vector group is composed of the 2nd MVP and the 0th candidate motion vector; that is, the 2nd MVP is the target MVP and the 0th candidate motion vector is the target motion vector.
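The correction step above can be sketched as follows (the distortion values are toy numbers, and `mvcost` is a simple magnitude proxy standing in for the bit cost of coding the MV difference):

```python
def mvcost(mvd):
    # Stand-in for the MVD bit cost: larger differences cost more.
    return abs(mvd[0]) + abs(mvd[1])

mvp = [(0, 0), (2, 1), (4, 4)]                 # the 0th..2nd MVPs
me_bestmv_list = [(5, 4), (3, 2), (4, 3)]      # best MV per MVP
dist_bestmv_list = [10, 7, 9]                  # distortion per candidate MV

best_cost, best_i, best_j = None, -1, -1
for i in range(3):                             # MVP index
    for j in range(3):                         # candidate-MV index
        mvd = (me_bestmv_list[j][0] - mvp[i][0],
               me_bestmv_list[j][1] - mvp[i][1])
        # cost_ij = dist_bestmv_list[j] + mvcost(me_bestmv_list[j] - mvp[i])
        cost_ij = dist_bestmv_list[j] + mvcost(mvd)
        if best_cost is None or cost_ij < best_cost:
            best_cost, best_i, best_j = cost_ij, i, j

target_mvp, target_mv = mvp[best_i], me_bestmv_list[best_j]
```

Only the winning (target_mvp, target_mv) pair then proceeds to interpolation mode traversal and motion mode traversal.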
- Step 1008 Traverse the interpolation mode and motion mode of the coding unit based on the motion vector group to obtain the target interpolation mode and target motion mode corresponding to the coding unit.
- the optimal interpolation mode and motion mode are selected according to the target MVP (such as the optimal MVP) and the target motion vector (such as the optimal motion vector) in the motion vector group.
- In summary, with the inter-frame prediction coding method, for a scene in which an image frame is coded using the specified inter-frame prediction mode (such as the NEWMV mode), the combination of the target MVP and the target motion vector (such as the optimal combination) is determined first, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors. This reduces the calculation amount of the rate-distortion cost and the computational complexity, thereby improving the coding efficiency in the specified inter-frame prediction mode.
- the method provided in this embodiment corrects the MVP and the optimal motion vector corresponding to the MVP by reorganizing the MVP and the candidate motion vectors, so as to avoid the existence of a local optimum between the MVP and the optimal motion vector corresponding to the MVP.
- the determination accuracy of the target MVP and the target motion vector is improved, that is, the coding quality is further improved.
- the method provided in this embodiment can reduce the calculation amount of interpolation mode traversal and motion mode traversal in the NEWMV mode by 2/3, since with 3 MVPs only 1 of the 3 candidate combinations proceeds to these traversals.
- FIG. 11 is a flow chart of an encoding method for inter-frame prediction provided by another exemplary embodiment of the present application.
- the application of the method in an encoder is used as an example for illustration. As shown in FIG. 11 , the method includes the following steps.
- Step 1101 acquire an image frame to be encoded.
- the image frame is divided into coding units, and the coding units correspond to at least two prediction modes, and the at least two prediction modes include a specified inter-frame prediction mode.
- Step 1102 in response to predicting the CU through a specified inter-frame prediction mode, perform motion estimation traversal on the motion vector prediction MVP in the specified inter-frame prediction mode to obtain candidate motion vectors.
- n candidate motion vectors are obtained for n MVPs, wherein the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of MVPs.
- Step 1103 determine a motion vector group from the MVP and candidate motion vectors.
- the motion vector group includes the target MVP determined from the MVPs, and the target motion vector determined from the candidate motion vectors.
- the target MVP is the optimal MVP and the target motion vector is the optimal motion vector.
- the MVP and the candidate motion vector can be corrected by recombining to obtain the target MVP and the target motion vector.
- Step 1104 Traverse the interpolation mode and motion mode of the coding unit based on the motion vector group to obtain the target interpolation mode and target motion mode corresponding to the coding unit.
- the optimal interpolation mode and motion mode are selected according to the target MVP (such as the optimal MVP) and the target motion vector (such as the optimal motion vector) in the motion vector group.
- The selection of the optimal interpolation mode and the optimal motion mode is introduced below.
- the realization of the optimal interpolation mode includes the following process.
- Interpolation is a process for increasing the sampling rate.
- the reason for interpolation is that if the optimal motion vector contains sub-pixels, the predicted pixels cannot be obtained directly. Therefore, it is necessary to first obtain the reference pixel corresponding to the integer pixel position of the optimal motion vector, and then interpolate according to the sub-pixel coordinates to finally obtain the predicted pixel.
- For sub-pixel interpolation, AV1 designs three interpolation filters: REG (the regular interpolation filter), SMOOTH (the smooth interpolation filter), and SHARP (the sharp interpolation filter).
- the filter kernels all have 8 taps, and the three interpolation methods differ mainly in the coefficients of the filter kernels.
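As an illustration of 8-tap sub-pixel interpolation, the following sketch applies a generic half-pel kernel in one dimension (the coefficients are illustrative and are not AV1's actual REG/SMOOTH/SHARP tables):

```python
# Generic 8-tap half-pel kernel; the taps sum to 64, so results are
# normalised with a round-and-shift by 6 bits.
HALF_PEL_TAPS = [-1, 4, -11, 40, 40, -11, 4, -1]

def interp_half_pel(ref, x):
    # Interpolate the half-pel sample between ref[x] and ref[x + 1]
    # from the 8 surrounding integer-pel samples ref[x - 3 .. x + 4].
    acc = sum(t * ref[x - 3 + k] for k, t in enumerate(HALF_PEL_TAPS))
    return (acc + 32) >> 6
```

On a flat row the filter reproduces the input exactly, and on a linear ramp it lands midway between the two neighbouring pixels, which is the expected behaviour of an interpolation kernel.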
- the interpolation method with the smallest rate-distortion cost is the optimal interpolation method.
- the realization of optimal motion mode includes the following process.
- the motion modes mainly include the following four types: SIMPLE, OBMC, WARPED, and SIMPLE (inter_intra).
- the optimal motion mode is written to the bitstream to instruct the decoder which motion mode to use to recover the reconstructed data when decoding.
- SIMPLE (inter_intra) and the first SIMPLE are both SIMPLE modes, but there is a big difference.
- all four motion modes require the complete rate-distortion cost calculation, that is, the entire reconstruction process of transformation, quantization, inverse quantization, and inverse transformation.
- the difference is that the methods of obtaining predicted pixels are different.
- the first step is to obtain the predicted pixels.
- the predicted pixel is the predicted value obtained after interpolation.
- For OBMC mode: the predicted pixels obtained after interpolation are reprocessed. The prediction pixels of the adjacent block are obtained according to the adjacent block's MV, and then fused with the interpolated prediction value of the current block according to certain rules to obtain a new prediction value.
- For WARPED mode: the three available positions at the left, top, and top-right corner are referenced to construct an affine transformation MV, then a small-scale motion search is performed, and finally interpolation is performed to obtain the predicted pixels.
- For inter_intra mode: secondary processing is performed on the predicted pixels obtained after interpolation. Intra-frame prediction is first performed with the four intra modes DC, V, H, and SMOOTH to obtain the optimal intra prediction pixels, which are then fused with the inter prediction pixels to obtain a new prediction value.
- the second step is the complete rate-distortion calculation.
- the residual pixels are obtained.
- the acquisition of the number of bits is related to the context of entropy coding.
- the bit number (rate) in the rate-distortion cost formula covers many components, such as the number of bits consumed by the reference frame, by the MVP index, by the MVD, by the interpolation method, and by the motion mode.
- In addition, the number of bits consumed by the transformation type and by the transform unit (TU) partition type, together with the number of bits consumed by the residual coefficients, are used to evaluate the optimal transformation type and the optimal TU partition type of the current prediction block; the distortion and the number of bits corresponding to the optimal transformation type and the optimal TU partition type are then obtained, denoted as dist and rate respectively.
- rate-distortion cost rdcost is obtained, and the motion mode corresponding to the minimum rate-distortion cost is the optimal motion mode.
- rdcost = dist + rate × λ, where λ is a preset constant.
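Applying this formula, the motion mode selection can be sketched with toy numbers (the λ value and the distortion/bit figures below are illustrative, not values from the application):

```python
lam = 0.85                         # preset Lagrange constant (illustrative)

# (distortion, bits) per motion mode -- toy numbers for illustration.
modes = {
    "SIMPLE":      (1200, 40),
    "OBMC":        (1100, 90),
    "WARPED":      (1050, 160),
    "inter_intra": (1150, 70),
}

# rdcost = dist + rate * lambda, per the formula above.
rdcost = {m: d + r * lam for m, (d, r) in modes.items()}
best_mode = min(rdcost, key=rdcost.get)   # mode with minimum rdcost wins
```

Note how WARPED has the lowest distortion but loses on rdcost because its bit cost is weighted in, which is exactly the trade-off the Lagrangian formulation captures.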
- Step 1105 determine the reference frame corresponding to the target MVP.
- the reference frame index mode corresponding to the target MVP is determined, and the reference frame is obtained by using the reference frame index mode and the target MVP index.
- For the indexing method of the reference frame, please refer to Table 1 above.
- Step 1106 determine the number of bits consumed by the target MVP index.
- Step 1107 encode the CU based on the target MVP, the target interpolation method, the target motion mode, the reference frame and the number of bits.
- In summary, with the inter-frame prediction coding method, for a scene in which an image frame is coded using the specified inter-frame prediction mode (such as the NEWMV mode), the combination of the target MVP and the target motion vector (such as the optimal combination) is determined first, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors. This reduces the calculation amount of the rate-distortion cost and the computational complexity, thereby improving the coding efficiency in the specified inter-frame prediction mode.
- the method provided in this embodiment selects the target MVP and the corresponding target motion vector in advance through MVP correction, and only then performs interpolation mode traversal and motion mode traversal, which can save up to 2/3 of the calculation amount of interpolation mode traversal and motion mode traversal and achieves a high speed-up ratio, strengthening the inter-frame prediction framework of the single-reference-frame NEWMV mode, a core part of the AV1 encoder.
- the coding method of inter-frame prediction is applied to the AV1 compression protocol as an example for illustration.
- the coding method of inter-frame prediction provided in the embodiment of the present application can also be applied to other compression protocols.
- Fig. 12 is a structural block diagram of an encoding device for inter-frame prediction provided in an exemplary embodiment of the present application. As shown in Fig. 12 , the device includes:
- An acquisition module 1210 configured to acquire an image frame to be encoded, where the image frame is divided into encoding units;
- the prediction module 1220 is configured to perform motion estimation traversal on the motion vector prediction MVP in the specified inter prediction mode to obtain candidate motion vectors in response to predicting the coding unit through a specified inter prediction mode;
- a determining module 1230 configured to determine a motion vector group from the MVPs and the candidate motion vectors, the motion vector group including the target MVP determined from the MVPs and the target motion vector determined from the candidate motion vectors;
- the prediction module 1220 is further configured to traverse the interpolation mode and motion mode of the coding unit based on the motion vector group to obtain a target interpolation mode and a target motion mode corresponding to the coding unit.
- the acquiring module 1210 is also configured to acquire the number of MVPs
- the prediction module 1220 is further configured to perform motion estimation on the i-th MVP for the i-th MVP, in response to i being within the range of the number of MVPs, to obtain an i-th candidate motion vector, where i is an integer;
- the prediction module 1220 is further configured to obtain n candidate motion vectors for n MVPs; wherein, the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of the MVPs.
- the prediction module 1220 is further configured to sequentially reorganize each MVP and each candidate motion vector to obtain m combination relationships; wherein, the value of m is the square of n;
- the determination module 1230 is further configured to determine the rate-distortion costs corresponding to the m combination relationships, where the rate-distortion costs are used to represent the pixel error situation under the combination relationships, and to determine the motion vector group from the m combination relationships based on the rate-distortion costs.
- the determining module 1230 is further configured to determine the target combination relationship with the smallest rate-distortion cost from the m combination relationships, and to determine the motion vector group including the target MVP and the target motion vector in the target combination relationship.
- the device further includes:
- a construction module 1310 configured to construct a first array and a second array, where the first array is used to store the distortion of the candidate motion vector corresponding to the MVP, and the second array is used to store the candidate motion vector corresponding to the MVP;
- a storage module 1320 configured to store the distortion corresponding to the ith candidate motion vector into the first array
- the storage module 1320 is further configured to store the ith candidate motion vector into the second array.
- the determining module 1230 is further configured to determine a reference frame corresponding to the target MVP; determine the number of bits consumed by the target MVP index;
- the device also includes:
- An encoding module 1330 configured to encode the coding unit based on the target MVP, the target interpolation mode, the target motion mode, the reference frame, and the number of bits.
- the determining module 1230 is further configured to determine a reference frame index mode corresponding to the target MVP; obtain the reference frame by using the reference frame index mode and the target MVP index.
- the determining module 1230 is further configured to determine the difference between the target motion vector and the target MVP as the number of bits consumed by the target MVP index.
- in the encoding device for inter-frame prediction provided by the embodiments of the present application, for a scenario in which an image frame is encoded using a specified inter-frame prediction mode (for example, the NEWMV mode), a combination of a target MVP and a target motion vector (for example, an optimal combination) is first determined, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors. This reduces the amount of rate-distortion cost calculation and lowers the computational complexity, thereby improving the coding efficiency in the specified inter-frame prediction mode.
- the encoding device for inter-frame prediction provided by the above embodiment is illustrated only by the division of the above functional modules. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
- the encoding device for inter-frame prediction provided by the above embodiment belongs to the same concept as the embodiment of the encoding method for inter-frame prediction; for its specific implementation process, refer to the method embodiment, which is not repeated here.
- Fig. 14 shows a structural block diagram of a computer device 1400 provided by an exemplary embodiment of the present application.
- the computer device 1400 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or desktop computer, or a server.
- the computer device 1400 may also be called user equipment, portable terminal, laptop terminal, desktop terminal, or other names.
- a computer device 1400 includes: a processor 1401 and a memory 1402 .
- the processor 1401 may include one or more processing cores, for example, a 4-core processor or an 8-core processor.
- the processor 1401 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array).
- the processor 1401 may also include a main processor and a coprocessor; the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in the standby state.
- the processor 1401 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen.
- the processor 1401 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
- Memory 1402 may include one or more computer-readable storage media, which may be non-transitory.
- the memory 1402 may also include high-speed random access memory, and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
- the non-transitory computer-readable storage medium in the memory 1402 is used to store at least one instruction, which is executed by the processor 1401 to implement the above encoding method for inter-frame prediction.
- the structure shown in FIG. 14 does not constitute a limitation on the computer device 1400, which may include more or fewer components than shown in the figure, combine certain components, or adopt a different component arrangement.
- a computer-readable storage medium is also provided, and a computer program is stored in the storage medium, and when the computer program is executed by a processor, the above coding method for inter-frame prediction is implemented.
- the computer-readable storage medium may include a read-only memory (ROM), a random access memory (RAM), a solid-state drive (SSD), an optical disc, or the like.
- the random access memory may include a resistive random access memory (ReRAM) and a dynamic random access memory (DRAM).
- a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium.
- a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above encoding method for inter-frame prediction.
- the information (including but not limited to target device information, target personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in this application are all authorized by the subjects or fully authorized by all parties, and the collection, use, and processing of the relevant data comply with the relevant laws, regulations, and standards of the relevant countries and regions.
- for example, the image frames, videos, etc. involved in this application are all obtained with full authorization.
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An encoding method, apparatus, and device for inter-frame prediction, and a readable storage medium, relating to the field of video processing. The method comprises: acquiring an image frame to be encoded, the image frame being partitioned into coding units (601); in response to predicting a coding unit using a specified inter-frame prediction mode, performing motion estimation traversal on the motion vector predictions (MVPs) to obtain candidate motion vectors (602); determining a motion vector group from the MVPs and the candidate motion vectors (603); and performing interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group (604). Interpolation mode selection and motion mode selection are performed based on one group consisting of a target MVP and a target motion vector, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, which reduces the amount of rate-distortion cost calculation, lowers the computational complexity, and thereby improves the coding efficiency in the specified inter-frame prediction mode.
Description
This application claims priority to Chinese Patent Application No. 202110629001.2, entitled "Encoding Method, Apparatus, and Device for Inter-frame Prediction, and Readable Storage Medium", filed on June 7, 2021, the entire contents of which are incorporated herein by reference.
The embodiments of this application relate to the field of video processing, and in particular to an encoding method, apparatus, and device for inter-frame prediction, and a readable storage medium.
In the process of encoding a video, after an image frame is input to the encoder, the image frame is partitioned into multiple coding units (CUs), where each coding unit corresponds to multiple prediction modes and transform units. For example, each coding unit may correspond to intra-frame prediction modes and inter-frame prediction modes, and the inter-frame prediction modes may include four single-reference-frame modes: NEARESTMV, NEARMV, GLOBALMV, and NEWMV.
In the related art, for any reference frame in NEWMV mode, all motion vector predictions (MVPs) in NEWMV mode need to be obtained, and then motion estimation, traversal of 9 interpolation modes, and traversal of 4 motion modes are performed for each MVP, before finally selecting the optimal MVP together with its corresponding optimal motion vector, optimal interpolation mode, and optimal motion mode.
However, when inter-frame prediction in NEWMV mode is performed in the above manner, a rate-distortion cost calculation is required for every possible combination, resulting in high computational complexity and low coding efficiency.
Summary
The embodiments of this application provide an encoding method, apparatus, and device for inter-frame prediction, and a readable storage medium, which can improve the coding efficiency of inter-frame prediction in NEWMV mode. The technical solution includes the following.
In one aspect, an encoding method for inter-frame prediction is provided, the method comprising:
acquiring an image frame to be encoded, the image frame being partitioned into coding units;
in response to predicting a coding unit using a specified inter-frame prediction mode, performing motion estimation traversal on the motion vector predictions (MVPs) in the specified inter-frame prediction mode to obtain candidate motion vectors;
determining a motion vector group from the MVPs and the candidate motion vectors, the motion vector group comprising a target MVP determined from the MVPs and a target motion vector determined from the candidate motion vectors;
performing interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group, to obtain a target interpolation mode and a target motion mode corresponding to the coding unit.
In an optional embodiment, performing motion estimation traversal on the MVPs in the specified inter-frame prediction mode to obtain candidate motion vectors comprises:
acquiring the number of the MVPs;
for the i-th MVP, in response to i being within the range of the number of the MVPs, performing motion estimation on the i-th MVP to obtain an i-th candidate motion vector, i being an integer;
obtaining n candidate motion vectors for n MVPs, where the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of the MVPs.
In an optional embodiment, determining a motion vector group from the MVPs and the candidate motion vectors comprises:
recombining each MVP with each candidate motion vector in turn, to obtain m combination relationships, where the value of m is the square of n;
determining the rate-distortion costs respectively corresponding to the m combination relationships, a rate-distortion cost representing the pixel error under a combination relationship;
determining the motion vector group from among the m combination relationships based on the rate-distortion costs.
In an optional embodiment, determining the motion vector group from among the m combination relationships based on the rate-distortion costs comprises:
determining, from the m combination relationships, the target combination relationship with the smallest rate-distortion cost;
determining the motion vector group comprising the target MVP and the target motion vector in the target combination relationship.
In another aspect, an encoding apparatus for inter-frame prediction is provided, the apparatus comprising:
an acquiring module, configured to acquire an image frame to be encoded, the image frame being partitioned into coding units;
a prediction module, configured to, in response to predicting a coding unit using a specified inter-frame prediction mode, perform motion estimation traversal on the MVPs in the specified inter-frame prediction mode to obtain candidate motion vectors;
a determining module, configured to determine a motion vector group from the MVPs and the candidate motion vectors, the motion vector group comprising a target MVP determined from the MVPs and a target motion vector determined from the candidate motion vectors;
the prediction module being further configured to perform interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group, to obtain a target interpolation mode and a target motion mode corresponding to the coding unit.
In another aspect, a computer device is provided, comprising a processor and a memory, the memory storing a computer program that is loaded and executed by the processor to implement the above encoding method for inter-frame prediction.
In another aspect, a computer-readable storage medium is provided, storing a computer program that is loaded and executed by a processor to implement the above encoding method for inter-frame prediction.
In another aspect, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above encoding method for inter-frame prediction.
The beneficial effects brought by the technical solutions provided in the embodiments of this application include at least the following.
For a scenario in which an image frame is encoded using a specified inter-frame prediction mode (for example, the NEWMV mode), a combination comprising a target MVP and a target motion vector (for example, an optimal combination) is first determined, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, which reduces the amount of rate-distortion cost calculation, lowers the computational complexity, and thereby improves the coding efficiency in the specified inter-frame prediction mode.
FIG. 1 is a schematic diagram of a standard encoding framework provided by an exemplary embodiment of this application;
FIG. 2 is a schematic diagram of CU partition types provided by an exemplary embodiment of this application;
FIG. 3 is a schematic diagram of MVP positions in the single-reference-frame modes of inter-frame prediction, provided by an exemplary embodiment of this application;
FIG. 4 is a schematic diagram of the optimal-result selection process in NEWMV mode provided by the related art;
FIG. 5 is a schematic diagram of the overall process of the encoding method for inter-frame prediction provided by an exemplary embodiment of this application;
FIG. 6 is a flowchart of the encoding method for inter-frame prediction provided by an exemplary embodiment of this application;
FIG. 7 is a schematic diagram of the TZ search template provided based on the embodiment shown in FIG. 6;
FIG. 8 is a schematic diagram of supplementary search points provided based on FIG. 6;
FIG. 9 is a partial schematic diagram of search positions in the raster scan manner provided based on the embodiment shown in FIG. 6;
FIG. 10 is a flowchart of an encoding method for inter-frame prediction provided by another exemplary embodiment of this application;
FIG. 11 is a flowchart of an encoding method for inter-frame prediction provided by another exemplary embodiment of this application;
FIG. 12 is a structural block diagram of an encoding apparatus for inter-frame prediction provided by an exemplary embodiment of this application;
FIG. 13 is a structural block diagram of an encoding apparatus for inter-frame prediction provided by another exemplary embodiment of this application;
FIG. 14 is a structural block diagram of a computer device provided by an exemplary embodiment of this application.
First, the implementation environment of the method provided by the embodiments of this application is described.
The encoding method for inter-frame prediction provided by the embodiments of this application can be applied to a terminal or to a server.
Illustratively, when the method is applied to a terminal, the terminal includes but is not limited to a mobile phone, a computer, an intelligent voice interaction device, a smart home appliance, a vehicle-mounted terminal, and the like. When the terminal is implemented as a vehicle-mounted terminal, the method can be applied to in-vehicle scenarios, that is, video inter-frame prediction encoding is performed on the vehicle-mounted terminal as part of an Intelligent Traffic System (ITS). An intelligent traffic system effectively and comprehensively applies advanced science and technology (information technology, computer technology, data communication technology, sensor technology, electronic control technology, automatic control theory, operations research, artificial intelligence, etc.) to transportation, service control, and vehicle manufacturing, strengthening the connection among vehicles, roads, and users, thereby forming a comprehensive transportation system that ensures safety, improves efficiency, improves the environment, and saves energy.
When the method is applied to a server, the video can be encoded by the server, and the encoded video stream can be sent by the server to a terminal or another server. It should be noted that the above server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.
In some embodiments, the above server may also be implemented as a node in a blockchain system. A blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and cryptographic algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods, each data block containing information on a batch of network transactions, used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
Next, the encoding framework, CU partition types, intra-frame prediction modes, and MVP derivation process involved in the embodiments of this application are illustrated with examples.
FIG. 1 is a schematic diagram of a standard encoding framework provided by an exemplary embodiment of this application. As shown in FIG. 1, when an image frame 110 is fed into the encoder, the encoder first partitions the image frame 110 into Coding Tree Units (CTUs), and then performs depth partitioning on the coding tree units to obtain coding units (CUs); each CU may correspond to multiple prediction modes and Transform Units (TUs). The encoder predicts each CU using the prediction modes to obtain the prediction value (i.e., MVP) corresponding to each CU, where the prediction of each CU may include inter-frame prediction and intra-frame prediction.
In inter-frame prediction, Motion Estimation (ME) is first performed on the image frame 110 and a reference frame 120, and Motion Compensation (MC) is then performed on the motion estimation result to obtain the prediction value. The prediction value is subtracted from the input data (i.e., the actual Motion Vector (MV) value) to obtain the residual (Motion Vector Difference, MVD), and the residual is then transformed and quantized to obtain residual coefficients. The residual coefficients are fed into the entropy coding module to output the bitstream; at the same time, after inverse quantization and inverse transformation, the residual coefficients yield the residual values of the reconstructed image, which are added to the prediction value to obtain the reconstructed image. After filtering, the reconstructed image enters the reference frame queue to serve as a reference frame for the next image frame, so that subsequent frames are encoded in turn.
In intra-frame prediction, intra-frame prediction selection is first performed based on the image frame 110, and intra-frame prediction is performed based on the reconstructed image and the current frame to obtain the intra-frame prediction result.
FIG. 2 is a schematic diagram of CU partition types provided by an exemplary embodiment of this application. As shown in FIG. 2, the CU partition types 200 include: the NONE type 210, the SPLIT type 220, the HORZ type 230, the VERT type 240, the HORZ_4 type 250, the HORZ_A type 260, the HORZ_B type 270, the VERT_A type 280, the VERT_B type 290, and the VERT_4 type 201.
Based on the above CU partition types, a CU may correspond to 22 block sizes, namely 4×4, 4×8, 8×4, 8×8, 8×16, 16×8, 16×16, 16×32, 32×16, 32×32, 32×64, 64×32, 64×64, 64×128, 128×64, 128×128, 4×16, 16×4, 8×32, 32×8, 16×64, and 64×16.
The prediction modes of a CU include intra-frame prediction modes and inter-frame prediction modes. When determining the prediction type, comparisons are first made within the same prediction type among the different prediction modes under that type, to find the optimal prediction mode of that type. For example, within the intra-frame prediction type, the different intra-frame prediction modes are compared to determine the optimal intra-frame prediction mode; the comparison may be made by calculating rate-distortion costs so as to select the intra-frame prediction mode with the smallest rate-distortion cost. Within the inter-frame prediction type, the different inter-frame prediction modes are compared to determine the optimal inter-frame prediction mode, likewise by selecting the inter-frame prediction mode with the smallest rate-distortion cost. The optimal intra-frame prediction mode and the optimal inter-frame prediction mode are then compared to find the optimal prediction mode for the current CU; for example, the rate-distortion costs of the two are compared and the mode with the smaller cost is taken as the optimal prediction mode of the current CU. At the same time, TU transformation is performed on the CU: each CU corresponds to multiple transform types, from which the optimal transform type is found. Then different CU partition types are compared, and the optimal CU partition type is found according to the rate-distortion cost; finally, the image frame is partitioned into CUs.
In some embodiments, the inter-frame prediction modes include 4 single-reference-frame modes (NEARESTMV, NEARMV, GLOBALMV, and NEWMV) and 8 combined-reference-frame modes (NEAREST_NEARESTMV, NEAR_NEARMV, NEAREST_NEWMV, NEW_NEARESTMV, NEAR_NEWMV, NEW_NEARMV, GLOBAL_GLOBALMV, and NEW_NEWMV). In the NEARESTMV and NEARMV modes, the motion vector (MV) of the prediction block is derived from information of surrounding blocks and no motion vector difference (MVD) needs to be transmitted; the NEWMV mode means that an MVD needs to be transmitted; and in the GLOBALMV mode, the MV information of the CU is derived from global motion.
The NEARESTMV, NEARMV, and NEWMV modes all depend on the derivation of MVPs; for a given reference frame, the AV1 standard calculates 4 MVPs according to the protocol rules.
Illustratively, in the MVP derivation process, the CUs in the 1st, 3rd, and 5th columns to the left of the current CU and in the 1st, 3rd, and 5th rows above it are first scanned in a skipping manner, the CUs using the same reference frame are selected, and the MVs of those CUs are deduplicated. If the number of deduplicated MVs is less than (or equal to) 8, the requirement on the CUs is first relaxed to reference frames in the same direction and MVs continue to be added; if there are still fewer than (or equal to) 8 MVs, global motion vectors are used for padding. After 8 MVs are selected, they are sorted by importance, and the 4 most important MVs are obtained. The 0th MV is for NEARESTMV mode; the 1st to 3rd MVs are for NEARMV mode; and NEWMV mode uses one of the 0th to 2nd MVs. Illustratively, referring to FIG. 3, which shows a schematic diagram of MVP positions in the single-reference-frame modes of inter-frame prediction provided by an exemplary embodiment of this application: after the 4 most important MVs are selected based on the above MVP derivation process, NEWMV mode uses one of the 0th MV 310, the 1st MV 320, and the 2nd MV 330 as the MVP.
In some embodiments, each prediction mode of the above inter-frame prediction type corresponds to different reference frames. Illustratively, refer to Table 1 below.
Table 1
In some embodiments, the combination relationships between the inter-frame prediction modes and the reference frames may be as follows.
For the 4 single-reference-frame modes of the inter-frame prediction type, there are 7 reference frames each, namely LAST_FRAME, LAST2_FRAME, LAST3_FRAME, GOLDEN_FRAME, BWDREF_FRAME, ALTREF2_FRAME, and ALTREF_FRAME; there are thus 28 (4×7) combinations between the single-reference-frame modes and the reference frames.
For the 8 combined-reference-frame modes of the inter-frame prediction type, there are 16 reference frame combinations each, namely:
{LAST_FRAME,ALTREF_FRAME};
{LAST2_FRAME,ALTREF_FRAME};
{LAST3_FRAME,ALTREF_FRAME};
{GOLDEN_FRAME,ALTREF_FRAME};
{LAST_FRAME,BWDREF_FRAME};
{LAST2_FRAME,BWDREF_FRAME};
{LAST3_FRAME,BWDREF_FRAME};
{GOLDEN_FRAME,BWDREF_FRAME};
{LAST_FRAME,ALTREF2_FRAME};
{LAST2_FRAME,ALTREF2_FRAME};
{LAST3_FRAME,ALTREF2_FRAME};
{GOLDEN_FRAME,ALTREF2_FRAME};
{LAST_FRAME,LAST2_FRAME};
{LAST_FRAME,LAST3_FRAME};
{LAST_FRAME,GOLDEN_FRAME};
{BWDREF_FRAME,ALTREF_FRAME}。
Therefore, there are 156 (7×4+16×8) combinations in total between the inter-frame prediction modes and the reference frames.
In some embodiments, for any of the above combinations, the current combination corresponds to at most 3 MVPs, and four processes are then performed for the current MVP: motion estimation (only modes containing NEWMV perform motion estimation), combination mode type selection, interpolation mode selection, and motion mode selection.
The combination mode is used to fuse the prediction pixels of the 2 reference frames. For example, an optimal combination mode type may be selected from the combination modes, and the prediction pixels of the 2 reference frames are fused together based on the selected combination mode type. Each combination mode represents one way of fusing prediction pixels.
In some embodiments, the motion modes corresponding to the single-reference-frame modes differ from those corresponding to the combined-reference-frame modes. The single-reference-frame modes correspond to 4 motion modes: SIMPLE (simple motion compensation), OBMC (overlapped block motion compensation), WARPED (global and local warped motion compensation), and SIMPLE (inter_intra); the combined-reference-frame modes correspond to the SIMPLE mode.
In the related art, the computational complexity corresponding to a group of inter-frame prediction modes is very high, especially for NEWMV mode. In NEWMV mode, for each CU, 7 reference frames need to be traversed, at most 3 MVPs are traversed for each reference frame, and motion estimation, 9 interpolation modes, and 4 motion mode calculations are performed for each MVP. That is, in NEWMV mode, each CU corresponds to at most 189 (7×3×9) interpolations and 84 (7×3×4) motion mode calculations.
Illustratively, FIG. 4 shows a schematic diagram of the optimal-result selection process in NEWMV mode provided by the related art. As shown in FIG. 4, the process includes the following steps.
Step 401: set N = 0 and acquire the number of MVPs.
That is, the initial value of N is set to 0, and N denotes the N-th MVP, so that every MVP can be traversed. For example, the maximum value of N may be 7×3-1 = 20; that is, for one CU, there may be at most 21 MVPs in NEWMV mode.
Step 402: determine whether N is smaller than the number of MVPs.
When N is smaller than the number of MVPs, the traversal of the MVPs of the current CU in NEWMV mode has not finished.
Step 403: when N is smaller than the number of MVPs, acquire the MVP and set N = N+1.
Setting N = N+1 means that after the traversal of the current MVP is completed, the next MVP is traversed. When N is greater than or equal to the number of MVPs, all MVPs of the current CU in NEWMV mode have been traversed, and the MVP traversal for the next prediction mode can be performed for the current CU.
Step 404: motion estimation. That is, motion estimation is performed on the current MVP to obtain the optimal motion vector corresponding to the current MVP.
Step 405: interpolation mode selection under the optimal motion vector. For example, the optimal interpolation mode corresponding to the optimal motion vector is selected from the 9 interpolation modes.
Step 406: motion mode selection. For example, the optimal motion mode corresponding to the optimal motion vector is selected from the 4 motion modes.
In the above optimal-result selection process in NEWMV mode, motion estimation, 9 interpolation calculations, and 4 motion mode calculations must be performed for each MVP before the optimal motion vector, optimal interpolation mode, and optimal motion mode corresponding to the optimal MVP can be selected, which entails high computational complexity and low coding efficiency.
The embodiments of this application provide an encoding method for inter-frame prediction that can improve the coding efficiency of inter-frame prediction in NEWMV mode. The method mainly selects the optimal motion vector and MVP combination through a correction strategy, and then performs interpolation mode selection and motion mode selection only for the optimal motion vector and MVP combination. In contrast to FIG. 4 above, FIG. 5 is a schematic diagram of the overall process of the encoding method for inter-frame prediction provided by an exemplary embodiment of this application; as shown in FIG. 5, the process includes the following steps.
Step 501: set N = 0 and acquire the number of MVPs.
That is, the initial value of N is set to 0, and N denotes the N-th MVP, so that every MVP can be traversed. It should be noted that the embodiments of this application only need to traverse the motion vector corresponding to each MVP.
Step 502: determine whether N is smaller than the number of MVPs.
When N is smaller than the number of MVPs, the traversal of the MVPs of the current CU in NEWMV mode has not finished.
Step 503: when N is smaller than the number of MVPs, acquire the MVP and set N = N+1.
Setting N = N+1 means that after the traversal of the current MVP is completed, the next MVP is traversed. When N is greater than or equal to the number of MVPs, all MVPs of the current CU in NEWMV mode have been traversed, and the MVP traversal for the next prediction mode can be performed for the current CU. Taking the number of MVPs in NEWMV mode being M as an example, the traversal goes from the 0th MVP to the (M-1)-th MVP; that is, all M MVPs are traversed.
Step 504: motion estimation. That is, motion estimation is performed on the current MVP to obtain the optimal MV corresponding to the current MVP (i.e., the candidate MV below). After the optimal MV corresponding to each MVP is obtained, step 505 is performed.
Step 505: correction, selecting the optimal MVP and MV combination. For example, by adopting a correction strategy, the optimal MVP and MV combination is selected based on each MVP and the optimal MV corresponding to each MVP. This process is described in detail below; for details not described here, refer to the following embodiments, which are not repeated.
Step 506: interpolation mode selection under the optimal motion vector. For example, interpolation mode selection is performed for the optimal MVP and MV combination to obtain the optimal interpolation mode.
Step 507: motion mode selection. For example, motion mode selection is performed for the optimal MVP and MV combination to obtain the optimal motion mode.
As can be seen from FIG. 5, in the encoding method for inter-frame prediction provided by the embodiments of this application, the loop continues directly after motion estimation, and after the loop is completed, correction is performed to select the optimal MVP and MV combination; the subsequent interpolation mode selection and motion mode selection are finally performed based on the optimal MVP and MV combination. Compared with the related art, in NEWMV mode, since the embodiments of this application do not need to perform interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, the amount of rate-distortion cost calculation is reduced, the computational complexity is lowered, and the coding efficiency in NEWMV mode is thereby improved.
With reference to the above description, the encoding method for inter-frame prediction provided by the embodiments of this application is described in detail below.
FIG. 6 is a flowchart of the encoding method for inter-frame prediction provided by an exemplary embodiment of this application, described by taking the method being applied in an encoder as an example. As shown in FIG. 6, the method includes the following steps.
Step 601: acquire an image frame to be encoded.
In this embodiment, the image frame is partitioned into coding units, a coding unit corresponds to at least two prediction modes, and the at least two prediction modes include the specified inter-frame prediction mode.
In some embodiments, the image frame to be encoded is a non-key frame, that is, a P frame or a B frame. When acquiring the image frame to be encoded, the video stream to be encoded corresponding to the image frame is first acquired, and each image frame in the video stream is acquired in turn, so as to determine the current image frame to be encoded.
In some embodiments, each coding unit corresponds to intra-frame prediction modes and inter-frame prediction modes, where the inter-frame prediction modes include the NEWMV mode. Illustratively, the NEWMV mode is the specified inter-frame prediction mode provided in the embodiments of this application. Since the NEWMV mode depends on the derivation of MVPs, in NEWMV mode, the optimal MVP needs to be determined from at least two (usually 3) MVPs.
The specified inter-frame prediction mode in the embodiments of this application may be the NEWMV mode, or may be a combined mode containing the NEWMV mode, which is not limited in the embodiments of this application.
Step 602: in response to predicting a coding unit using the specified inter-frame prediction mode, perform motion estimation traversal on the motion vector predictions (MVPs) in the specified inter-frame prediction mode to obtain candidate motion vectors.
In some embodiments, when performing motion estimation traversal on the MVPs, the number of MVPs first needs to be acquired; for the i-th MVP, in response to i being within the range of the number of MVPs, motion estimation is performed on the i-th MVP to obtain an i-th candidate motion vector, i being an integer; n candidate motion vectors are obtained for n MVPs, where the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of MVPs. The i-th candidate motion vector may refer to the optimal motion vector corresponding to the i-th MVP in motion estimation. The specific motion estimation method is described in detail below and not repeated here.
Illustratively, assuming the number of MVPs is 3, motion estimation is performed on the 0th MVP; similarly, motion estimation is performed on the 1st MVP and the 2nd MVP, thereby implementing the motion estimation traversal of the 3 MVPs.
In some embodiments, after the candidate motion vectors respectively corresponding to the n MVPs are obtained, the candidate motion vectors are stored. Optionally, the candidate motion vectors are stored; or the candidate motion vectors and the distortions of the candidate motion vectors are stored.
In this embodiment, taking storing the candidate motion vectors and their distortions as an example, the storage manner may include at least one of the following.
First, constructing a first array and a second array, the first array and the second array being respectively used to store the candidate motion vectors and the distortions of the candidate motion vectors.
Second, constructing a database and storing the candidate motion vectors and their distortions in the database in the form of key-value pairs.
In the embodiments of this application, the description takes storing the candidate motion vectors and their distortions by constructing a first array and a second array as an example, where the first array is used to store the distortion of the candidate motion vector corresponding to an MVP, and the second array is used to store the candidate motion vector corresponding to an MVP. After motion estimation is performed on the i-th MVP to obtain the i-th candidate motion vector, the distortion corresponding to the i-th candidate motion vector is stored in the first array, and the i-th candidate motion vector is stored in the second array.
In some embodiments, the candidate motion vectors include the optimal motion vectors obtained by performing motion estimation on the MVPs.
Illustratively, performing motion estimation traversal on the MVPs in the specified inter-frame prediction mode may mainly include the following processes.
1. Acquire the number of MVPs.
In some embodiments, the number of MVPs in NEWMV mode is 3, that is, the 0th to 2nd MVPs shown in FIG. 3 above.
2. Set arrays for storing the data corresponding to each MVP.
The set arrays include a first array and a second array, as follows.
First array: dist_bestmv_list[i], used to store the distortion of the optimal motion vector corresponding to each MVP.
Second array: me_bestmv_list[i], used to store the optimal motion vector corresponding to each MVP, where i denotes the i-th MVP and i takes the values 0, 1, 2.
3. Perform motion estimation on each MVP in turn.
Motion estimation may include integer-pixel motion estimation and sub-pixel motion estimation. Integer-pixel motion estimation may include the TZ search method (introduced later), the NStep search method, the diamond search method (introduced later), the hexagon search method, etc.; sub-pixel motion estimation may include diamond search, full search, etc.
4. After the optimal motion vector corresponding to each MVP is obtained, the distortion and optimal motion vector corresponding to the current MVP index are obtained and recorded in the arrays respectively.
Illustratively, with mvp_index denoting the current MVP index and mvp_index taking values in [0, 2]: dist_bestmv_list[mvp_index] = dist; me_bestmv_list[mvp_index] = bestmv.
Here dist denotes the distortion and bestmv denotes the optimal motion vector. That is, the distortion corresponding to an MVP index is stored in the first array under that MVP index, and the optimal motion vector under that MVP index is stored in the second array under that MVP index. Illustratively, the distortion corresponding to the 0th MVP is stored in the first array under the index of the 0th MVP, i.e., dist_bestmv_list[0], and the optimal motion vector corresponding to the 0th MVP is stored in the second array under the index of the 0th MVP, i.e., me_bestmv_list[0].
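The per-MVP bookkeeping described above can be sketched in Python; the array names follow the document's dist_bestmv_list and me_bestmv_list, while `motion_estimate` is a hypothetical placeholder standing in for the actual integer/sub-pixel motion estimation (it is assumed to return the best MV and its distortion for a given MVP):

```python
def traverse_mvps(mvp_list, motion_estimate):
    """Run motion estimation once per MVP and record, under the same index,
    the distortion of the best MV (first array) and the best MV itself
    (second array). `motion_estimate(mvp) -> (bestmv, dist)` is assumed."""
    dist_bestmv_list = [None] * len(mvp_list)  # distortion per MVP index
    me_bestmv_list = [None] * len(mvp_list)    # best MV per MVP index
    for mvp_index, mvp in enumerate(mvp_list):
        bestmv, dist = motion_estimate(mvp)
        dist_bestmv_list[mvp_index] = dist
        me_bestmv_list[mvp_index] = bestmv
    return dist_bestmv_list, me_bestmv_list
```

This is only an illustration of the storage scheme, not encoder code; a real implementation would run the search methods described below inside `motion_estimate`.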
Step 603: determine a motion vector group from the MVPs and the candidate motion vectors.
The motion vector group includes a target MVP determined from the MVPs and a target motion vector determined from the candidate motion vectors.
In some embodiments, the target MVP is the optimal MVP, and the target motion vector is the optimal motion vector.
In some embodiments, the target MVP and the target motion vector form a corresponding motion vector group; that is, when the target MVP is the i-th MVP, the target motion vector is the i-th motion vector. In that case, the motion vector group can be determined directly according to the rate-distortion cost corresponding to each candidate motion vector group. Illustratively, a first rate-distortion cost of the 0th MVP and the 0th motion vector is determined, a second rate-distortion cost of the 1st MVP and the 1st motion vector is determined, and a third rate-distortion cost of the 2nd MVP and the 2nd motion vector is determined. Based on the first, second, and third rate-distortion costs, the motion vector group with the smallest rate-distortion cost is determined; for example, when the first rate-distortion cost is the smallest, the 0th MVP and the 0th motion vector corresponding to the first rate-distortion cost are taken as the motion vector group.
Notably, the above embodiment is described by taking determining the one motion vector group with the smallest rate-distortion cost as an example. In some embodiments, the two motion vector groups with the smallest rate-distortion costs may also be determined, and the MVPs and motion vectors respectively corresponding to those two motion vector groups may both undergo the subsequent interpolation mode traversal and motion mode traversal, which is not limited in the embodiments of this application. Optionally, the number of motion vector groups may be determined through an external interface.
In other embodiments, the target MVP and the target motion vector form a motion vector group obtained after recombination; that is, when the target MVP is the i-th MVP, the target motion vector may be the i-th motion vector, or may be a motion vector determined from another MVP. The embodiments of this application are described by taking the target MVP and the target motion vector being a motion vector group obtained after recombination as an example.
Illustratively, each MVP is recombined with each candidate motion vector in turn, so that one motion vector group is determined from the resulting combinations, where the number of combinations obtained by recombining each MVP with each candidate motion vector is the product of the number of MVPs and the number of candidate motion vectors.
Notably, the above recombination of MVPs and motion vectors is merely an illustrative example; the embodiments of this application do not limit the specific recombination manner of MVPs and motion vectors.
Among the MVPs and corresponding candidate motion vectors obtained in step 602 above, the matching between an MVP and its candidate motion vector may be a local optimum. Therefore, in the embodiments of this application, the motion vector group may be determined by means of correction (i.e., recombination) to eliminate the influence of local optima, thereby improving the accuracy of obtaining the target MVP and the target motion vector, and in turn the quality of the motion vector group, which is beneficial to improving the coding quality in the specified inter-frame prediction mode.
Optionally, the correction may be performed by recombining all MVPs and all candidate motion vectors.
Illustratively, the 0th candidate motion vector (i.e., the optimal motion vector of the 0th MVP) is obtained after motion estimation on the 0th MVP; the 1st candidate motion vector (i.e., the optimal motion vector of the 1st MVP) is obtained after motion estimation on the 1st MVP; and the 2nd candidate motion vector (i.e., the optimal motion vector of the 2nd MVP) is obtained after motion estimation on the 2nd MVP. The 3 MVPs and 3 candidate motion vectors are then recombined in turn, yielding the combinations of the 0th MVP with the 0th, 1st, and 2nd candidate motion vectors, of the 1st MVP with the 0th, 1st, and 2nd candidate motion vectors, and of the 2nd MVP with the 0th, 1st, and 2nd candidate motion vectors. The motion vector group is determined from these combinations, for example, the combination with the smallest rate-distortion cost.
In some embodiments, the one combination with the smallest rate-distortion cost is determined from these combinations as the motion vector group; alternatively, multiple combinations with the smallest rate-distortion costs are determined from these combinations as motion vector groups, and interpolation mode traversal and motion mode traversal are performed respectively according to the target MVPs and target motion vectors corresponding to the multiple motion vector groups. The number of motion vector groups may be determined through an external interface.
Step 604: perform interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group, to obtain the target interpolation mode and target motion mode corresponding to the coding unit.
In some embodiments, after the motion vector group is determined, interpolation mode selection and motion mode selection may be performed according to the target MVP (e.g., optimal MVP) and target motion vector (e.g., optimal motion vector) in the motion vector group.
In summary, in the encoding method for inter-frame prediction provided by the embodiments of this application, for a scenario in which an image frame is encoded using a specified inter-frame prediction mode (e.g., the NEWMV mode), a combination comprising a target MVP and a target motion vector (e.g., an optimal combination) is first determined, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, which reduces the amount of rate-distortion cost calculation, lowers the computational complexity, and thereby improves the coding efficiency in the specified inter-frame prediction mode.
In addition, the motion vector group is determined by means of correction (i.e., recombination) to eliminate the influence of local optima between MVPs and candidate motion vectors, which improves the accuracy of obtaining the target MVP and target motion vector and in turn the quality of the motion vector group, and is beneficial to improving the coding quality in the specified inter-frame prediction mode.
In some embodiments, the above motion estimation methods are introduced with the TZ search method and the diamond search method as representatives.
The implementation of the TZ search method includes the following process.
(1) Determine the search starting point.
The current MVP is used as a search starting point, together with the (0, 0) position; the rate-distortion costs under the motion vectors corresponding to the two are compared, and the motion vector with the smaller rate-distortion cost is taken as the final search starting point.
(2) Starting with step size 1, search within the search window.
Illustratively, referring to FIG. 7, which shows the TZ search template provided by an exemplary embodiment of this application: the TZ search method uses a diamond template 710, and the search is performed within the search window of the diamond template 710, where the step size increases in integer powers of 2, and the point with the smallest rate-distortion cost is selected as the search result (the optimal motion vector is determined based on the search result).
(3) If the step size corresponding to the point with the smallest rate-distortion cost is 1, a 2-point search is started. Illustratively, FIG. 8 shows a schematic diagram of the 2-point search provided by an exemplary embodiment of this application. As shown in FIG. 8, points 801 and 802 are supplemented at the position of point 1; points 803 and 804 at the position of point 3; points 805 and 806 at the position of point 6; and points 807 and 808 at the position of point 8. For the other positions, such as points 2, 4, 5, and 7, the points in the four cardinal directions (left, right, up, down) have all already been calculated, and no supplementary points are needed.
(4) If the step size with the smallest rate-distortion cost obtained in step (3) is greater than 5, all points at intervals of 5 rows and 5 columns are processed in a raster scan manner. Illustratively, referring to FIG. 9, which shows a partial schematic diagram of search positions in the raster scan manner: as shown in FIG. 9, scanning is performed at the marked points 910 at intervals of 5 rows and 5 columns to obtain the search result.
The implementation of the diamond search method includes the following process.
The diamond search method, also called the diamond search algorithm, has two different matching templates: a large diamond and a small diamond. The large diamond matching template has 9 search points, and the small diamond matching template has 5 search points. A coarse search is first performed using the large diamond matching template with a larger step size, followed by a fine search using the small diamond matching template. The search steps may be as follows.
Step 1: centered on the center point of the search window, according to the large diamond matching template, calculate the rate-distortion costs of the center point and its 8 surrounding points (9 points in total), and compare them to obtain the point with the smallest rate-distortion cost.
Step 2: if the center point of the search window is the point with the smallest rate-distortion cost, go to Step 3 and use the small diamond search template; otherwise, return to the search of Step 1.
Step 3: using the small diamond matching template with only 5 search points, calculate the rate-distortion costs of these 5 points, and take the point with the smallest rate-distortion cost as the best matching point, i.e., the optimal motion vector.
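The small-diamond refinement in the steps above can be sketched as follows; `cost((x, y))` is an assumed rate-distortion cost callback, and the loop bound is a hypothetical safety limit rather than anything prescribed by the document:

```python
def small_diamond_search(cost, start, max_iters=64):
    """Small-diamond (5-point) refinement sketch: evaluate the four axis
    neighbours of the current center and move to the cheapest one; stop
    when the center itself is the minimum (the best matching point)."""
    cx, cy = start
    best = cost((cx, cy))
    for _ in range(max_iters):
        candidates = [(cx + dx, cy + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        costs = [cost(p) for p in candidates]
        c, p = min(zip(costs, candidates))
        if c >= best:          # center wins: best match found
            break
        best, (cx, cy) = c, p
    return (cx, cy), best
```

A full encoder would first run the 9-point large-diamond coarse search and only then call a refinement like this one.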
In some embodiments, the above correction of MVPs and motion vectors is implemented by recombining the MVPs and candidate motion vectors. FIG. 10 is a flowchart of an encoding method for inter-frame prediction provided by another exemplary embodiment of this application, described by taking the method being applied in an encoder as an example. As shown in FIG. 10, the method includes the following steps.
Step 1001: acquire an image frame to be encoded.
The image frame is partitioned into coding units, a coding unit corresponds to at least two prediction modes, and the at least two prediction modes include the specified inter-frame prediction mode.
Step 1002: acquire the number of motion vector predictions (MVPs).
Step 1003: for the i-th MVP, in response to i being within the range of the number of MVPs, perform motion estimation on the i-th MVP to obtain an i-th candidate motion vector, i being an integer.
Illustratively, assuming the number of MVPs is 3: since the 0th MVP is within the range of the number of MVPs, motion estimation is performed on the 0th MVP; similarly, motion estimation is performed on the 1st MVP and the 2nd MVP.
Step 1004: obtain n candidate motion vectors for n MVPs.
The i-th MVP corresponds to the i-th candidate motion vector, and n is the number of MVPs.
Illustratively, the 0th candidate motion vector (i.e., the optimal motion vector of the 0th MVP) is obtained after motion estimation on the 0th MVP; the 1st candidate motion vector (i.e., the optimal motion vector of the 1st MVP) is obtained after motion estimation on the 1st MVP; and the 2nd candidate motion vector (i.e., the optimal motion vector of the 2nd MVP) is obtained after motion estimation on the 2nd MVP.
Step 1005: recombine each MVP with each candidate motion vector in turn to obtain m combination relationships, where the value of m is the square of n.
Illustratively, for the above MVPs and candidate motion vectors, the recombination yields the combinations of the 0th MVP with the 0th, 1st, and 2nd candidate motion vectors, of the 1st MVP with the 0th, 1st, and 2nd candidate motion vectors, and of the 2nd MVP with the 0th, 1st, and 2nd candidate motion vectors. The motion vector group is determined from these combinations, for example, the combination with the smallest rate-distortion cost after recombination.
Step 1006: determine the rate-distortion costs respectively corresponding to the m combination relationships.
In some embodiments, a rate-distortion cost represents the pixel error under a combination relationship. In some embodiments, the rate-distortion cost represents the pixel coding cost under a combination relationship; the rate-distortion cost is determined from the distortion under the current combination relationship and the number of codeword bits occupied by encoding.
Illustratively, the rate-distortion cost after recombining an MVP and a candidate motion vector is calculated as shown in Formula 1 below:
cost_ij = dist_bestmv_list[j] + mvcost(me_bestmv_list[j] - mvp[i])
where i indicates the i-th MVP and j indicates the j-th candidate motion vector, and m represents the maximum index value of the MVPs or candidate motion vectors, representing the number of MVPs and candidate motion vectors. In some embodiments, the index values range from 0 to the number of MVPs or candidate motion vectors minus 1. mvp[i] denotes the i-th MVP, and mvcost(me_bestmv_list[j] - mvp[i]) denotes the rate-distortion cost corresponding to the difference between the j-th candidate motion vector and the i-th MVP.
Illustratively, taking index values in [0, 2], i.e., 3 MVPs, the rate-distortion costs respectively corresponding to the 9 combination relationships can be obtained, as shown in the following formulas:
cost_00 = dist_bestmv_list[0] + mvcost(me_bestmv_list[0] - mvp[0]);
cost_01 = dist_bestmv_list[1] + mvcost(me_bestmv_list[1] - mvp[0]);
cost_02 = dist_bestmv_list[2] + mvcost(me_bestmv_list[2] - mvp[0]);
cost_10 = dist_bestmv_list[0] + mvcost(me_bestmv_list[0] - mvp[1]);
cost_11 = dist_bestmv_list[1] + mvcost(me_bestmv_list[1] - mvp[1]);
cost_12 = dist_bestmv_list[2] + mvcost(me_bestmv_list[2] - mvp[1]);
cost_20 = dist_bestmv_list[0] + mvcost(me_bestmv_list[0] - mvp[2]);
cost_21 = dist_bestmv_list[1] + mvcost(me_bestmv_list[1] - mvp[2]);
cost_22 = dist_bestmv_list[2] + mvcost(me_bestmv_list[2] - mvp[2]);
where cost_00 denotes the combination relationship of the 0th MVP and the 0th candidate motion vector; cost_01 denotes the combination relationship of the 0th MVP and the 1st candidate motion vector; cost_02 denotes the combination relationship of the 0th MVP and the 2nd candidate motion vector; and so on.
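The exhaustive cost evaluation over all MVP/candidate-MV pairs can be sketched in Python; the array names follow the document, while `mvcost` here is an illustrative stand-in (bits roughly proportional to the MVD magnitude), not the encoder's real bit-cost model:

```python
def mvcost(mvd):
    # Illustrative stand-in for the MVD bit cost: proportional to magnitude.
    return abs(mvd[0]) + abs(mvd[1])

def select_best_combination(mvp, me_bestmv_list, dist_bestmv_list):
    """Evaluate cost_ij = dist_bestmv_list[j] + mvcost(me_bestmv_list[j] - mvp[i])
    over all m = n*n pairs and return (min cost, target MVP index i,
    target MV index j)."""
    best = None
    for i, p in enumerate(mvp):                   # MVP index
        for j, mv in enumerate(me_bestmv_list):   # candidate MV index
            mvd = (mv[0] - p[0], mv[1] - p[1])
            cost = dist_bestmv_list[j] + mvcost(mvd)
            if best is None or cost < best[0]:
                best = (cost, i, j)
    return best
```

With n = 3 this evaluates the 9 combinations above and returns the indices of the target MVP and target motion vector forming the motion vector group.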
Step 1007: determine the motion vector group from among the m combination relationships based on the rate-distortion costs.
In some embodiments, the target combination relationship with the smallest rate-distortion cost is determined from the m combination relationships, and the motion vector group comprising the target MVP and the target motion vector in the target combination relationship is determined.
Illustratively, taking the above 9 combination relationships as an example: if cost_20 is the smallest, the optimal MVP index is 2 and the optimal motion vector index is 0; that is, the motion vector group is formed by the 2nd MVP and the 0th candidate motion vector, with the 2nd MVP being the target MVP and the 0th candidate motion vector being the target motion vector.
Step 1008: perform interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group, to obtain the target interpolation mode and target motion mode corresponding to the coding unit.
In some embodiments, after the motion vector group is determined, interpolation mode selection and motion mode selection are performed according to the target MVP (e.g., optimal MVP) and target motion vector (e.g., optimal motion vector) in the motion vector group.
In summary, in the encoding method for inter-frame prediction provided by the embodiments of this application, for a scenario in which an image frame is encoded using a specified inter-frame prediction mode (e.g., the NEWMV mode), a combination comprising a target MVP and a target motion vector (e.g., an optimal combination) is first determined, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, which reduces the amount of rate-distortion cost calculation, lowers the computational complexity, and thereby improves the coding efficiency in the specified inter-frame prediction mode.
In addition, in the method provided by this embodiment, the MVPs and the optimal motion vectors corresponding to the MVPs are corrected by recombining the MVPs and candidate motion vectors, avoiding local optima between an MVP and its corresponding optimal motion vector, which improves the accuracy of determining the target MVP and target motion vector and thus further improves the coding quality.
In addition, the method provided by this embodiment can reduce the amount of interpolation mode traversal and motion mode calculation in NEWMV mode by 2/3.
In some embodiments, after the motion vector group is determined, the reference frame, the number of bits consumed by the index, and other items also need to be determined. FIG. 11 is a flowchart of an encoding method for inter-frame prediction provided by another exemplary embodiment of this application, described by taking the method being applied in an encoder as an example. As shown in FIG. 11, the method includes the following steps.
Step 1101: acquire an image frame to be encoded.
The image frame is partitioned into coding units, a coding unit corresponds to at least two prediction modes, and the at least two prediction modes include the specified inter-frame prediction mode.
Step 1102: in response to predicting a coding unit using the specified inter-frame prediction mode, perform motion estimation traversal on the MVPs in the specified inter-frame prediction mode to obtain candidate motion vectors.
In some embodiments, when performing motion estimation traversal on the MVPs, the number of MVPs first needs to be acquired; for the i-th MVP, in response to i being within the range of the number of MVPs, motion estimation is performed on the i-th MVP to obtain the i-th candidate motion vector, i being an integer; n candidate motion vectors are obtained for n MVPs, where the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of MVPs.
Step 1103: determine a motion vector group from the MVPs and the candidate motion vectors.
The motion vector group includes a target MVP determined from the MVPs and a target motion vector determined from the candidate motion vectors.
In some embodiments, the target MVP is the optimal MVP, and the target motion vector is the optimal motion vector.
Among the MVPs and corresponding candidate motion vectors obtained in step 1102 above, the matching between an MVP and its candidate motion vector may be a local optimum. Therefore, in the embodiments of this application, the MVPs and candidate motion vectors may be corrected by recombination to obtain the target MVP and target motion vector.
Step 1104: perform interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group, to obtain the target interpolation mode and target motion mode corresponding to the coding unit.
In some embodiments, after the motion vector group is determined, interpolation mode selection and motion mode selection are performed according to the target MVP (e.g., optimal MVP) and target motion vector (e.g., optimal motion vector) in the motion vector group.
In some embodiments, interpolation mode selection and motion mode selection are introduced as follows.
The implementation of interpolation mode selection includes the following process.
Interpolation is a process for increasing the sampling rate. The reason for interpolation is that if the optimal motion vector contains sub-pixels, the prediction pixels cannot be obtained directly; therefore, the reference pixels at the integer-pixel positions of the optimal motion vector are first taken, interpolation is then performed according to the sub-pixel coordinates, and the prediction pixels are finally obtained.
In the interpolation calculation, horizontal interpolation is performed first, followed by vertical interpolation. AV1 designs three interpolation methods for sub-pixels, REG (regular interpolation), SMOOTH (smooth interpolation), and SHARP; all filter kernels are 8-tap, and the three interpolation methods differ mainly in the coefficients of the filter kernels.
Since the horizontal and vertical filters can be combined arbitrarily, 9 interpolation modes are obtained in total: REG_REG, REG_SMOOTH, REG_SHARP, SMOOTH_REG, SMOOTH_SMOOTH, SMOOTH_SHARP, SHARP_REG, SHARP_SMOOTH, and SHARP_SHARP.
The 9 interpolation modes are traversed and their rate-distortion costs are estimated; the interpolation mode with the smallest rate-distortion cost is the optimal interpolation mode.
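The traversal of the 9 horizontal/vertical filter combinations can be sketched as follows; `rd_cost(h, v)` is an assumed callback that estimates the rate-distortion cost of one filter pair (the actual estimation is encoder-internal):

```python
from itertools import product

FILTERS = ("REG", "SMOOTH", "SHARP")  # AV1 sub-pixel filter families

def best_interp_filter(rd_cost):
    """Try all 9 horizontal/vertical filter pairs, REG_REG through
    SHARP_SHARP, and keep the pair with the smallest estimated
    rate-distortion cost. `rd_cost(h, v)` is an assumed callback."""
    return min(product(FILTERS, FILTERS), key=lambda hv: rd_cost(*hv))
```

The point of the embodiments above is that this 9-way traversal is run only once, for the target MVP and target motion vector, rather than once per MVP.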
The implementation of motion mode selection includes the following process.
The motion modes mainly include the following 4: SIMPLE, OBMC, WARPED, and SIMPLE (inter_intra).
The optimal motion mode is written into the bitstream to indicate the motion mode used by the decoder to recover the reconstructed data during decoding. SIMPLE (inter_intra) shares the SIMPLE flag with the first SIMPLE mode but differs greatly from it; during decoding, the reference frame information in the syntax can be used to determine whether the mode is SIMPLE or SIMPLE (inter_intra), so using the same flag saves one bit.
All 4 motion modes require a full rate-distortion cost, i.e., the whole reconstruction process of transformation, quantization, inverse quantization, and inverse transformation; the difference lies in how the prediction pixels are obtained.
Step 1: obtain the prediction pixels.
For SIMPLE mode: the prediction pixels are the prediction values obtained after interpolation.
For OBMC mode: the prediction pixels obtained after interpolation are further processed. The prediction pixels of neighboring blocks are obtained according to the neighboring blocks' MVs, and are then fused with the interpolated prediction values of the current block according to certain rules, to obtain new prediction values.
For WARPED mode: with reference to the 3 available positions (left, top, and top-right), an affine-transformation MV is constructed, a small-range motion search is then performed, and interpolation is finally performed to obtain the prediction pixels.
For SIMPLE (inter_intra) mode: the prediction pixels obtained after interpolation are further processed. Intra-frame prediction is first performed in the 4 intra modes DC, V, H, and SMOOTH to obtain the optimal intra-frame prediction pixels, and the intra-frame and inter-frame prediction pixels are then fused to obtain new prediction values.
Step 2: full rate-distortion calculation.
First, the residual pixels are obtained from the input pixels and the prediction pixels.
Then, the number of bits under a given motion mode, excluding the residual coefficients, is obtained.
The bit count obtained is related to the entropy coding context. The bit count rate in the rate-distortion cost formula contains a lot of information, for example the bits consumed by the reference frame, by the MVP index, by the MVD, by the interpolation mode, by the motion mode, and by the residual; here the bit count is all the bits consumed by a given motion mode except for the transform, and different motion modes correspond to different bit counts.
Then, the SSE of the residual data is calculated.
The residual data needs to go through transformation, quantization, inverse quantization, and inverse transformation to obtain the reconstructed pixels. The transformation process needs the bit count from step 2 above, which, together with the bits consumed in the transformation by the transform type, the transform unit partition type, and the residual coefficients, is used to evaluate the optimal transform type and optimal TU partition type of the current prediction block; the distortion and bit count corresponding to the optimal transform type and optimal TU partition type are then obtained, denoted dist and rate respectively.
Finally, the rate-distortion cost rdcost is obtained; the motion mode corresponding to the smallest rate-distortion cost is the optimal motion mode.
Here rdcost = dist + rate × λ, where λ is a preset constant.
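The selection by rdcost = dist + rate × λ can be sketched as below; the (dist, rate) numbers per motion mode would come from the full reconstruction described above and are purely illustrative here:

```python
def best_motion_mode(modes, lam):
    """Pick the motion mode minimizing rdcost = dist + rate * lam.
    `modes` maps a mode name to its (dist, rate) pair obtained after the
    full transform/quantize/reconstruct pass; `lam` is the preset constant."""
    return min(modes, key=lambda m: modes[m][0] + modes[m][1] * lam)
```

The same formula is also what drives the interpolation-mode and combination-relationship comparisons elsewhere in the document; only the source of dist and rate differs.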
Notably, the above distortion calculation is described by taking SSE-based calculation as an example; in the embodiments of this application, it can also be implemented by SATD, SAD, and other methods, which is not limited in the embodiments of this application.
Step 1105: determine the reference frame corresponding to the target MVP.
In some embodiments, the reference frame index manner corresponding to the target MVP is determined, and the reference frame is obtained from the reference frame index manner and the target MVP index. For the reference frame index manner, refer to Table 1 above.
Step 1106: determine the number of bits consumed by the target MVP index.
Optionally, the difference between the target motion vector and the target MVP is determined as the number of bits consumed by the target MVP index. That is, mvd = best_mv - best_mvp, where best_mv denotes the optimal motion vector (i.e., the target motion vector in the embodiments of this application) and best_mvp denotes the optimal MVP (i.e., the target MVP in the embodiments of this application).
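The MVD computation mvd = best_mv - best_mvp is component-wise; a minimal sketch, with motion vectors assumed to be (x, y) integer pairs:

```python
def motion_vector_difference(best_mv, best_mvp):
    """mvd = best_mv - best_mvp, per component; per the document, this
    difference determines the signalled bits, so only the MVD and the MVP
    index need to be written, not the full motion vector."""
    return (best_mv[0] - best_mvp[0], best_mv[1] - best_mvp[1])
```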
Step 1107: encode the coding unit based on the target MVP, the target interpolation mode, the target motion mode, the reference frame, and the number of bits.
In summary, in the encoding method for inter-frame prediction provided by the embodiments of this application, for a scenario in which an image frame is encoded using a specified inter-frame prediction mode (e.g., the NEWMV mode), a combination comprising a target MVP and a target motion vector (e.g., an optimal combination) is first determined, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, which reduces the amount of rate-distortion cost calculation, lowers the computational complexity, and thereby improves the coding efficiency in the specified inter-frame prediction mode.
In addition, in the method provided by this embodiment, the target MVP and its corresponding target motion vector are selected in advance through MVP correction, and interpolation mode traversal and motion mode traversal are performed afterwards, which can save up to 2/3 of the calculation under interpolation mode traversal and motion mode traversal. The speedup ratio is very high, which lays a solid foundation for the inter-frame prediction framework of the single-reference-frame NEWMV mode and is a core part of the AV1 encoder.
Measured result: encoding 65 frames gives a 6% speedup, with a speedup ratio exceeding 50:1.
Notably, the above embodiments are described by taking the application of the encoding method for inter-frame prediction in the AV1 compression protocol as an example; the encoding method for inter-frame prediction provided by the embodiments of this application can also be applied to other compression protocols, such as the H.266 compression protocol and the AVS3 compression protocol.
FIG. 12 is a structural block diagram of an encoding apparatus for inter-frame prediction provided by an exemplary embodiment of this application. As shown in FIG. 12, the apparatus includes:
an acquiring module 1210, configured to acquire an image frame to be encoded, the image frame being partitioned into coding units;
a prediction module 1220, configured to, in response to predicting a coding unit using a specified inter-frame prediction mode, perform motion estimation traversal on the MVPs in the specified inter-frame prediction mode to obtain candidate motion vectors;
a determining module 1230, configured to determine a motion vector group from the MVPs and the candidate motion vectors, the motion vector group comprising a target MVP determined from the MVPs and a target motion vector determined from the candidate motion vectors;
the prediction module 1220 being further configured to perform interpolation mode traversal and motion mode traversal on the coding unit based on the motion vector group, to obtain a target interpolation mode and a target motion mode corresponding to the coding unit.
In an optional embodiment, the acquiring module 1210 is further configured to acquire the number of the MVPs;
the prediction module 1220 is further configured to, for the i-th MVP, in response to i being within the range of the number of the MVPs, perform motion estimation on the i-th MVP to obtain an i-th candidate motion vector, i being an integer;
the prediction module 1220 is further configured to obtain n candidate motion vectors for n MVPs, where the i-th MVP corresponds to the i-th candidate motion vector, and n is the number of the MVPs.
In an optional embodiment, the prediction module 1220 is further configured to recombine each MVP with each candidate motion vector in turn to obtain m combination relationships, where the value of m is the square of n;
the determining module 1230 is further configured to determine the rate-distortion costs respectively corresponding to the m combination relationships, a rate-distortion cost representing the pixel error under a combination relationship, and to determine the motion vector group from among the m combination relationships based on the rate-distortion costs.
In an optional embodiment, the determining module 1230 is further configured to determine, from the m combination relationships, the target combination relationship with the smallest rate-distortion cost, and to determine the motion vector group comprising the target MVP and the target motion vector in the target combination relationship.
In an optional embodiment, as shown in FIG. 13, the apparatus further includes:
a constructing module 1310, configured to construct a first array and a second array, the first array being used to store the distortion of the candidate motion vector corresponding to an MVP, and the second array being used to store the candidate motion vector corresponding to an MVP;
a storage module 1320, configured to store the distortion corresponding to the i-th candidate motion vector in the first array;
the storage module 1320 being further configured to store the i-th candidate motion vector in the second array.
In an optional embodiment, the determining module 1230 is further configured to determine the reference frame corresponding to the target MVP, and to determine the number of bits consumed by the target MVP index;
the apparatus further includes:
an encoding module 1330, configured to encode the coding unit based on the target MVP, the target interpolation mode, the target motion mode, the reference frame, and the number of bits.
In an optional embodiment, the determining module 1230 is further configured to determine the reference frame index manner corresponding to the target MVP, and to obtain the reference frame from the reference frame index manner and the target MVP index.
In an optional embodiment, the determining module 1230 is further configured to determine the difference between the target motion vector and the target MVP as the number of bits consumed by the target MVP index.
In summary, in the encoding apparatus for inter-frame prediction provided by the embodiments of this application, for a scenario in which an image frame is encoded using a specified inter-frame prediction mode (e.g., the NEWMV mode), a combination comprising a target MVP and a target motion vector (e.g., an optimal combination) is first determined, and interpolation mode selection and motion mode selection are then performed only for this combination, without performing interpolation mode traversal and motion mode traversal for all MVPs and motion vectors, which reduces the amount of rate-distortion cost calculation, lowers the computational complexity, and thereby improves the coding efficiency in the specified inter-frame prediction mode.
It should be noted that the encoding apparatus for inter-frame prediction provided by the above embodiment is illustrated only by the division of the above functional modules; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the encoding apparatus for inter-frame prediction provided by the above embodiment belongs to the same concept as the embodiment of the encoding method for inter-frame prediction; for its specific implementation process, refer to the method embodiment, which is not repeated here.
FIG. 14 shows a structural block diagram of a computer device 1400 provided by an exemplary embodiment of this application. The computer device 1400 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or desktop computer, or a server. The computer device 1400 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
Generally, the computer device 1400 includes a processor 1401 and a memory 1402.
The processor 1401 may include one or more processing cores, for example, a 4-core processor or an 8-core processor. The processor 1401 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 1401 may also include a main processor and a coprocessor; the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1401 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1401 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 1402 may include one or more computer-readable storage media, which may be non-transitory. The memory 1402 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1402 is used to store at least one instruction, which is executed by the processor 1401 to implement the above encoding method for inter-frame prediction.
Those skilled in the art can understand that the structure shown in FIG. 14 does not constitute a limitation on the computer device 1400, which may include more or fewer components than shown in the figure, combine certain components, or adopt a different component arrangement.
In an exemplary embodiment, a computer-readable storage medium is also provided, storing a computer program that, when executed by a processor, implements the above encoding method for inter-frame prediction.
Optionally, the computer-readable storage medium may include a read-only memory (ROM), a random access memory (RAM), a solid-state drive (SSD), an optical disc, or the like, where the random access memory may include a resistive random access memory (ReRAM) and a dynamic random access memory (DRAM). The serial numbers of the above embodiments of this application are for description only and do not represent the superiority or inferiority of the embodiments.
In an exemplary embodiment, a computer program product or computer program is also provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the above encoding method for inter-frame prediction.
It should be noted that the information (including but not limited to target device information, target personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in this application are all authorized by the subjects or fully authorized by all parties, and the collection, use, and processing of the relevant data comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the image frames, videos, etc. involved in this application are all obtained with full authorization.
The above are only preferred embodiments of this application and are not intended to limit this application; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of this application shall be included within the scope of protection of this application.
Claims (16)
- 一种帧间预测的编码方法,所述方法由计算机设备执行,所述方法包括:获取待编码的图像帧,所述图像帧划分有编码单元;响应于通过指定帧间预测模式对所述编码单元进行预测,对所述指定帧间预测模式下的运动向量预测MVP进行运动估计遍历,得到候选运动向量;从所述MVP和所述候选运动向量中确定运动向量组,所述运动向量组中包括从所述MVP中确定的目标MVP,以及从所述候选运动向量中确定的目标运动向量;基于所述运动向量组对所述编码单元进行插值方式的遍历和运动模式的遍历,得到所述编码单元对应的目标插值方式和目标运动模式。
- 根据权利要求1所述的方法,其中,所述对所述指定帧间预测模式下的运动向量预测MVP进行运动估计遍历,得到候选运动向量,包括:获取所述MVP的数量;针对第i个MVP,响应于i在所述MVP的数量范围内,对所述第i个MVP进行运动估计,得到第i个候选运动向量,i为整数;针对n个MVP得到n个候选运动向量;其中,所述第i个MVP对应所述第i个候选运动向量,n为所述MVP的数量。
- 根据权利要求2所述的方法,其中,所述从所述MVP和所述候选运动向量中确定运动向量组,包括:依次将每个MVP与各个候选运动向量进行重组,得到m个组合关系;其中,m的取值为n的平方;确定所述m个组合关系分别对应的率失真代价,所述率失真代价用于表示在所述组合关系下的像素误差情况;基于所述率失真代价从所述m个组合关系中确定所述运动向量组。
- 根据权利要求3所述的方法,其中,所述基于所述率失真代价从所述m个组合关系中确定所述运动向量组,包括:从所述m个组合关系中确定所述率失真代价最小的目标组合关系;确定包括所述目标组合关系中的所述目标MVP和所述目标运动向量的所述运动向量组。
- 根据权利要求2所述的方法,其中,所述方法还包括:构建第一数组和第二数组,所述第一数组用于存储所述MVP对应的候选运动向量的失真,所述第二数组用于存储所述MVP对应的候选运动向量;所述对所述第i个MVP进行运动估计,得到第i个候选运动向量之后,还包括:将所述第i个候选运动向量对应的失真存储至所述第一数组中;将所述第i个候选运动向量存储至所述第二数组中。
- 根据权利要求1至5任一所述的方法,其中,所述从所述MVP和所述候选运动向量中确定运动向量组之后,还包括:确定所述目标MVP对应的参考帧;确定所述目标MVP索引消耗的比特数;基于所述目标MVP、所述目标插值方式、所述目标运动模式、所述参考帧和所述比特数对所述编码单元进行编码。
- 根据权利要求6所述的方法,其中,所述确定所述目标MVP对应的参考帧,包括:确定所述目标MVP对应的参考帧索引方式;以所述参考帧索引方式和所述目标MVP索引得到所述参考帧。
- 根据权利要求6所述的方法,其中,所述确定所述目标MVP索引消耗的比特数,包括:将所述目标运动向量和所述目标MVP之差,确定为所述目标MVP索引消耗的比特数。
- 一种帧间预测的编码装置,所述装置包括:获取模块,用于获取待编码的图像帧,所述图像帧划分有编码单元;预测模块,用于响应于通过指定帧间预测模式对所述编码单元进行预测,对所述指定帧间预测模式下的运动向量预测MVP进行运动估计遍历,得到候选运动向量;确定模块,用于从所述MVP和所述候选运动向量中确定运动向量组,所述运动向量组中包括从所述MVP中确定的目标MVP,以及从所述候选运动向量中确定的目标运动向量;所述预测模块,还用于基于所述运动向量组对所述编码单元进行插值方式的遍历和运动模式的遍历,得到所述编码单元对应的目标插值方式和目标运动模式。
- 根据权利要求9所述的装置,其中,所述获取模块,还用于获取所述MVP的数量;所述预测模块,还用于针对第i个MVP,响应于i在所述MVP的数量范围内,对所述第i个MVP进行运动估计,得到第i个候选运动向量,i为整数;所述预测模块,还用于针对n个MVP得到n个候选运动向量,其中,所述第i个MVP对应所述第i个候选运动向量,n为所述MVP的数量。
- 根据权利要求10所述的装置,其中,所述预测模块,还用于依次将每个MVP与各个候选运动向量进行重组,得到m个组合关系;其中,m的取值为n的平方;所述确定模块,还用于确定所述m个组合关系分别对应的率失真代价,所述率失真代价用于表示在所述组合关系下的像素误差情况;基于所述率失真代价从所述m个组合关系中确定所述运动向量组。
- 根据权利要求11所述的装置,其中,所述确定模块,还用于从所述m个组合关系中确定所述率失真代价最小的目标组合关系;确定包括所述目标组合关系中的所述目标MVP和所述目标运动向量的所述运动向量组。
- 根据权利要求10所述的装置,其中,所述装置还包括:构建模块,用于构建第一数组和第二数组,所述第一数组用于存储所述MVP对应的候选运动向量的失真,所述第二数组用于存储所述MVP对应的候选运动向量;存储模块,用于将所述第i个候选运动向量对应的失真存储至所述第一数组中;所述存储模块,还用于将所述第i个候选运动向量存储至所述第二数组中。
- A computer device, wherein the computer device comprises a processor and a memory, the memory storing a computer program, and the computer program being loaded and executed by the processor to implement the inter prediction encoding method according to any one of claims 1 to 8.
- A computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is loaded and executed by a processor to implement the inter prediction encoding method according to any one of claims 1 to 8.
- A computer program product, wherein the computer program product comprises computer instructions, the computer instructions being stored in a computer-readable storage medium, and a processor reading the computer instructions from the computer-readable storage medium and executing the computer instructions to implement the inter prediction encoding method according to any one of claims 1 to 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22819274.6A EP4354858A4 (en) | 2021-06-07 | 2022-05-09 | CODING METHOD AND APPARATUS USING INTER-FRAME PREDICTION, DEVICE AND READABLE STORAGE MEDIUM |
US18/123,650 US20230232020A1 (en) | 2021-06-07 | 2023-03-20 | Inter prediction encoding method, apparatus, and device, and readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110629001.2 | 2021-06-07 | ||
CN202110629001.2A CN113079372B (zh) | 2021-06-07 | 2021-06-07 | Inter prediction encoding method, apparatus, and device, and readable storage medium
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/123,650 Continuation US20230232020A1 (en) | 2021-06-07 | 2023-03-20 | Inter prediction encoding method, apparatus, and device, and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022257674A1 (zh) | 2022-12-15 |
Family
ID=76617117
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/091617 WO2022257674A1 (zh) | Inter prediction encoding method, apparatus, and device, and readable storage medium | 2021-06-07 | 2022-05-09 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230232020A1 (zh) |
EP (1) | EP4354858A4 (zh) |
CN (1) | CN113079372B (zh) |
WO (1) | WO2022257674A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6243418B1 (en) * | 1998-03-30 | 2001-06-05 | Daewoo Electronics Co., Ltd. | Method and apparatus for encoding a motion vector of a binary shape signal |
WO2019192301A1 (zh) * | 2018-04-02 | 2019-10-10 | SZ DJI Technology Co., Ltd. | Video image processing method and apparatus |
CN111818342A (zh) * | 2020-08-28 | 2020-10-23 | Zhejiang Dahua Technology Co., Ltd. | Inter prediction method and prediction apparatus |
CN111953997A (zh) * | 2019-05-15 | 2020-11-17 | Huawei Technologies Co., Ltd. | Method and apparatus for obtaining a candidate motion vector list, and codec |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103621090B (zh) * | 2011-06-24 | 2017-09-15 | HFI Innovation Inc. | Method and apparatus of removing redundancy in motion vector predictors |
US10701393B2 (en) * | 2017-05-10 | 2020-06-30 | Mediatek Inc. | Method and apparatus of reordering motion vector prediction candidate set for video coding |
CN109672894B (zh) * | 2017-10-13 | 2022-03-08 | Tencent Technology (Shenzhen) Company Limited | Inter prediction method and apparatus, and storage medium |
US11057640B2 (en) * | 2017-11-30 | 2021-07-06 | Lg Electronics Inc. | Image decoding method and apparatus based on inter-prediction in image coding system |
WO2020070730A2 (en) * | 2018-10-06 | 2020-04-09 | Beijing Bytedance Network Technology Co., Ltd. | Size restriction based on affine motion information |
CA3190343A1 (en) * | 2018-12-12 | 2020-06-18 | Lg Electronics Inc. | Method and apparatus for processing video signal based on history based motion vector prediction |
CN112312131B (zh) * | 2020-12-31 | 2021-04-06 | Tencent Technology (Shenzhen) Company Limited | Inter prediction method, apparatus, and device, and computer-readable storage medium |
- 2021-06-07 CN CN202110629001.2A patent/CN113079372B/zh active Active
- 2022-05-09 EP EP22819274.6A patent/EP4354858A4/en active Pending
- 2022-05-09 WO PCT/CN2022/091617 patent/WO2022257674A1/zh active Application Filing
- 2023-03-20 US US18/123,650 patent/US20230232020A1/en active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP4354858A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP4354858A1 (en) | 2024-04-17 |
CN113079372B (zh) | 2021-08-06 |
EP4354858A4 (en) | 2024-09-18 |
CN113079372A (zh) | 2021-07-06 |
US20230232020A1 (en) | 2023-07-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220217337A1 (en) | Method, codec device for intra frame and inter frame joint prediction | |
TWI241532B (en) | Methods and systems for image intra-prediction mode estimation, communication, and organization | |
CN110290388B (zh) | 帧内预测方法、视频编码方法、计算机设备及存储装置 | |
KR20190117708A (ko) | 부호화유닛 심도 확정 방법 및 장치 | |
CN108921910B (zh) | 基于可伸缩卷积神经网络的jpeg编码压缩图像复原的方法 | |
WO2020140700A1 (zh) | 色度块的预测方法和装置 | |
CN111131830B (zh) | 重叠块运动补偿的改进 | |
CN118264802A (zh) | 图像解码和编码方法及数据的发送方法 | |
CN109688407B (zh) | 编码单元的参考块选择方法、装置、电子设备及存储介质 | |
TW202106003A (zh) | 使用基於矩陣之內預測及二次轉換之寫碼技術 | |
CN109819250B (zh) | 一种多核全组合方式的变换方法和系统 | |
CN105282558A (zh) | 帧内像素预测方法、编码方法、解码方法及其装置 | |
CN107046645A (zh) | 图像编解码方法及装置 | |
WO2016180129A1 (zh) | 预测模式选择方法、装置及设备 | |
WO2019072248A1 (zh) | 运动估计方法、装置、电子设备及计算机可读存储介质 | |
TWI727826B (zh) | 使用內預測之寫碼技術 | |
WO2022063265A1 (zh) | 帧间预测方法及装置 | |
JP7439219B2 (ja) | イントラ予測方法及び装置、コンピュータ可読記憶媒体 | |
EP4404562A1 (en) | Video compression method and apparatus, and computer device and storage medium | |
WO2021168817A1 (zh) | 视频处理的方法及装置 | |
WO2022063267A1 (zh) | 帧内预测方法及装置 | |
WO2022111233A1 (zh) | 帧内预测模式的译码方法和装置 | |
JP2023514215A (ja) | 画像およびビデオ圧縮のイントラ予測 | |
CN110677644B (zh) | 一种视频编码、解码方法及视频编码帧内预测器 | |
US20240089494A1 (en) | Video encoding and decoding method and apparatus, storage medium, electronic device, and computer program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22819274 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2022819274 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2022819274 Country of ref document: EP Effective date: 20240108 |