CN112995669B - Inter-frame prediction method and device, electronic equipment and storage medium - Google Patents

Publication number
CN112995669B
CN112995669B CN202110456467.7A CN202110456467A CN112995669B CN 112995669 B CN112995669 B CN 112995669B CN 202110456467 A CN202110456467 A CN 202110456467A CN 112995669 B CN112995669 B CN 112995669B
Authority
CN
China
Prior art keywords
motion information
block
candidate
coding
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110456467.7A
Other languages
Chinese (zh)
Other versions
CN112995669A (en
Inventor
曹亚曦
俞鸣园
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Huachuang Video Signal Technology Co Ltd
Original Assignee
Zhejiang Huachuang Video Signal Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Huachuang Video Signal Technology Co Ltd filed Critical Zhejiang Huachuang Video Signal Technology Co Ltd
Priority to CN202110456467.7A priority Critical patent/CN112995669B/en
Publication of CN112995669A publication Critical patent/CN112995669A/en
Application granted granted Critical
Publication of CN112995669B publication Critical patent/CN112995669B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/567Motion estimation based on rate distortion criteria

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses an inter-frame prediction method, an inter-frame prediction device, electronic equipment and a computer readable storage medium. When constructing a candidate adjacent block list required for inter-frame prediction, the method carries out self-adaptive adjustment on at least one of the scanning sequence of the spatial domain candidate motion information, the selection position of the temporal domain motion information, the insertion position of the candidate list and the binding mode of motion information and position in each direction according to the coding performance so as to improve the coding efficiency and accuracy.

Description

Inter-frame prediction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video encoding and decoding, and in particular, to an inter-frame prediction method, an inter-frame prediction device, an electronic device, and a computer-readable storage medium.
Background
The video coding process mainly comprises video acquisition, prediction, transformation quantization and entropy coding, wherein the prediction comprises intra-frame prediction and inter-frame prediction, which are respectively used for removing the spatial and temporal redundancy of video images.
Generally, the luminance and chrominance values of pixels in temporally adjacent frames are close to one another and strongly correlated. Inter-frame prediction uses methods such as motion search to find, in a reference frame, the matching block closest to the current block, and records the motion information between the current block and the matching block, such as the motion vector (MV) and the reference frame index. The motion information is encoded and transmitted to the decoding end. At the decoding end, once the MV of the current block has been parsed from the corresponding syntax elements, the decoder can locate the matching block of the current block. The pixel values of the matching block are then copied to the current block to form the inter-frame prediction value of the current block.
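The search-and-copy process described above can be illustrated with a minimal sketch. It assumes a full search over a small window with the sum of absolute differences (SAD) as the matching criterion; the function names, the SAD criterion and the toy frames are illustrative assumptions and are not taken from the application.

```python
def sad(cur, cx, cy, ref, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def motion_search(cur, ref, x, y, size, search_range):
    """Full search: return the (dx, dy) with the smallest SAD and its cost."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, x, y, ref, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def predict_block(ref, x, y, size, mv):
    """Decoder side: copy the matching block addressed by the MV."""
    dx, dy = mv
    return [row[x + dx:x + dx + size] for row in ref[y + dy:y + dy + size]]


# Toy data: a 6 x 6 reference frame with a unique value at every position.
ref = [[r * 10 + c for c in range(6)] for r in range(6)]
cur = [[0] * 6 for _ in range(6)]
# The 2 x 2 block at (x=2, y=2) in the current frame equals the reference
# block at (x=3, y=1), i.e. a motion vector of (1, -1).
cur[2][2], cur[2][3] = ref[1][3], ref[1][4]
cur[3][2], cur[3][3] = ref[2][3], ref[2][4]

mv, cost = motion_search(cur, ref, x=2, y=2, size=2, search_range=2)
prediction = predict_block(ref, 2, 2, 2, mv)
```

Only the motion vector (and, in a real codec, the reference frame index) would need to be transmitted; the decoder rebuilds the prediction from its own copy of the reference frame.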
At present, video conferences are more and more widely applied, videos shared in the video conferences are relatively regular, and how to utilize the regularity to further improve the coding efficiency becomes a key point and a difficult point of research in the industry.
Disclosure of Invention
The applicant creatively provides an inter-frame prediction method, an inter-frame prediction device, an electronic device and a computer readable storage medium.
According to a first aspect of embodiments of the present application, there is provided an inter-frame prediction method, including: constructing a candidate neighbor block list from at least one candidate frame aiming at a current block, wherein in the process of constructing the candidate neighbor block list, at least one of the scanning sequence of the spatial domain candidate motion information, the selection position of the temporal domain motion information, the insertion position of the candidate list and the binding mode of the motion information and the position in each direction is subjected to self-adaptive adjustment according to the coding performance; determining a first adjacent block matched with the current block from the candidate adjacent block list; and recording the index of the first adjacent block and the motion vector of the first adjacent block so as to predict the pixel information of the current block according to the index of the first adjacent block, the motion vector of the first adjacent block and the pixel of the first adjacent block.
According to an embodiment of the present application, after recording the index of the first neighboring block and the motion vector of the first neighboring block, the method further includes: encoding the index of the first neighboring block to obtain coding information, wherein the index is binarized using a truncated binary code.
According to an embodiment of the present application, adaptively adjusting a scanning order of spatial domain candidate motion information according to coding performance includes: scanning the spatial domain candidate motion information in different scanning orders according to a set rule to obtain at least one scanning result; determining, according to the coding performance, the scanning result with the optimal coding performance from the at least one scanning result as a first scanning result; and setting the scanning order that produced the first scanning result as the scanning order for scanning the spatial domain candidate motion information.
According to an embodiment of the present application, adaptively adjusting the selected position of the temporal motion information according to the coding performance includes: determining a coding unit with optimal coding performance as a first coding unit from coding units at four positions of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the co-located block of the current block; setting motion information of a first coding unit as motion information of a co-located block; and scaling the motion information of the co-located block to obtain time domain motion information.
According to an embodiment of the present application, the adaptive adjustment of the binding mode between the motion information and the position in each direction according to the coding performance includes: and selecting the motion information in different directions for the adjacent blocks of the same coding unit, and selecting the motion information in any direction for the rest adjacent blocks according to the coding performance.
According to an embodiment of the present application, determining a first neighbor block matching a current block from a candidate neighbor block list includes: determining, for each inter-frame angle weighted prediction mode, the motion information with the minimum rate distortion cost (RDCost) as first motion information, wherein the partition threshold and the weight range size of the inter-frame angle weighted prediction mode are configurable or obtained by adaptive adjustment; and determining the first neighboring block based on the first motion information.
According to an embodiment of the present application, determining the motion information with the minimum RDCost according to an inter-frame angle weighted prediction mode includes: performing rate distortion optimization (RDO) rough selection according to the inter-frame angle weighted prediction mode to obtain a candidate motion information set, wherein the number of candidate motion information items is obtained by adaptive adjustment; and performing RDO fine selection on the candidate motion information set to obtain the motion information with the minimum RDCost.
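A two-stage selection of this kind can be sketched as follows. The sketch assumes the common rate distortion cost J = D + λ·R for the fine stage and a cheaper proxy cost (here a hypothetical SATD value) for the rough stage; the field names, the value of λ and the sample numbers are illustrative assumptions only.

```python
def rd_cost(distortion, rate_bits, lam):
    """Rate distortion cost J = D + lambda * R (assumed cost model)."""
    return distortion + lam * rate_bits


def two_stage_rdo(candidates, rough_cost, full_cost, keep):
    """Rough selection keeps the `keep` cheapest candidates by a proxy cost;
    fine selection returns the survivor with the minimum full RD cost."""
    shortlist = sorted(candidates, key=rough_cost)[:keep]
    return min(shortlist, key=full_cost)


LAMBDA = 2.0  # hypothetical Lagrange multiplier
candidates = [
    {"name": "mv_a", "satd": 100, "distortion": 90, "rate": 10},
    {"name": "mv_b", "satd": 80, "distortion": 85, "rate": 30},
    {"name": "mv_c", "satd": 300, "distortion": 20, "rate": 5},
    {"name": "mv_d", "satd": 90, "distortion": 70, "rate": 12},
]

best = two_stage_rdo(
    candidates,
    rough_cost=lambda c: c["satd"],
    full_cost=lambda c: rd_cost(c["distortion"], c["rate"], LAMBDA),
    keep=3,
)
```

In this toy data the candidate mv_c has the lowest full cost but is pruned by the rough stage, which illustrates why the number of candidates kept after rough selection matters and why adjusting it adaptively can be useful.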
According to a second aspect of embodiments of the present application, there is provided an inter-frame prediction apparatus, including: a candidate neighbor block list construction module, configured to construct a candidate neighbor block list from at least one candidate frame for the current block, wherein in the process of constructing the candidate neighbor block list, at least one of the scanning order of the spatial domain candidate motion information, the selection position of the temporal domain motion information, the insertion position of the candidate list and the binding mode of the motion information and the position in each direction is adaptively adjusted according to the coding performance; a matching neighboring block determining module, configured to determine a first neighboring block matching the current block from the candidate neighbor block list; and an index and motion vector recording module, configured to record the index of the first neighboring block and the motion vector of the first neighboring block, so as to predict the pixel information of the current block according to the index of the first neighboring block, the motion vector of the first neighboring block and the pixels of the first neighboring block.
According to an embodiment of the present application, the apparatus further comprises: a coding module, configured to encode the index of the first neighboring block to obtain coding information, wherein the index is binarized using a truncated binary code.
According to an embodiment of the present application, the candidate neighbor block list constructing module includes: a scanning submodule, configured to scan the spatial domain candidate motion information in different scanning orders according to a set rule to obtain at least one scanning result; a first scanning result determining submodule, configured to determine, according to the coding performance, the scanning result with the optimal coding performance from the at least one scanning result as a first scanning result; and a scanning order setting submodule, configured to set the scanning order that produced the first scanning result as the scanning order for scanning the spatial domain candidate motion information.
According to an embodiment of the present application, the candidate neighbor block list constructing module includes: the first coding unit determining submodule is used for determining a coding unit with the optimal coding performance from coding units at four positions of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the co-located block of the current block as a first coding unit; a motion information setting sub-module of the co-located block for setting the motion information of the first encoding unit as the motion information of the co-located block; and the time domain motion information acquisition module is used for scaling the motion information of the co-located block to obtain the time domain motion information.
According to an embodiment of the present application, the candidate neighboring block list building module is specifically configured to select motion information in different directions for neighboring blocks of the same coding unit, and select motion information in any direction for the remaining neighboring blocks according to coding performance.
According to an embodiment of the present application, the matching neighbor block determining module includes: a first motion information determining submodule, configured to determine, for each inter-frame angle weighted prediction mode, the motion information with the minimum RDCost as first motion information, wherein the partition threshold and the weight range size of the inter-frame angle weighted prediction mode are configurable or obtained by adaptive adjustment; and a first neighboring block determining submodule, configured to determine the first neighboring block according to the first motion information.
According to an embodiment of the present application, the first motion information determination sub-module includes: an RDO rough selection unit, configured to perform RDO rough selection according to the inter-frame angle weighted prediction mode to obtain a candidate motion information set, wherein the number of candidate motion information items is obtained by adaptive adjustment; and an RDO fine selection unit, configured to perform RDO fine selection on the candidate motion information set to obtain the motion information with the minimum RDCost.
According to a third aspect of the embodiments of the present application, there is provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; a memory for storing a computer program; a processor for implementing the method steps of any of the above inter-frame prediction methods when executing a program stored in the memory.
According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the method steps of any one of the above-described inter-prediction methods.
The embodiment of the application provides an inter-frame prediction method, an inter-frame prediction device, electronic equipment and a computer readable storage medium. When constructing a candidate adjacent block list required for inter-frame prediction, the method carries out self-adaptive adjustment on at least one of the scanning sequence of the spatial domain candidate motion information, the selection position of the temporal domain motion information, the insertion position of the candidate list and the binding mode of motion information and position in each direction according to the coding performance so as to improve the coding efficiency and accuracy.
It is to be understood that the implementation of the present application does not require all of the above-described advantages to be achieved, but rather that certain technical solutions may achieve certain technical effects, and that other embodiments of the present application may also achieve other advantages not mentioned above.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
FIG. 1 is a schematic flowchart illustrating an implementation of an inter-frame prediction method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating positions of a current block and candidate neighboring blocks according to an embodiment of the inter prediction method of the present application;
FIG. 3 is a flowchart illustrating an implementation of another embodiment of an inter prediction method according to the present application;
FIG. 4 is a block diagram of an inter prediction apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Fig. 1 shows a flow of implementing an embodiment of the inter prediction method of the present application. Referring to fig. 1, the method includes: operation S110, constructing a candidate neighbor block list from at least one candidate frame for the current block, wherein in the process of constructing the candidate neighbor block list, at least one of a scanning sequence of the spatial domain candidate motion information, a selection position of the temporal domain motion information, an insertion position of the candidate list, and a binding manner of motion information and position in each direction is adaptively adjusted according to coding performance; operation S120 of determining a first neighbor block matching the current block from the candidate neighbor block list; in operation S130, the index of the first neighboring block and the motion vector of the first neighboring block are recorded to predict pixel information of the current block based on the index of the first neighboring block, the motion vector of the first neighboring block, and the pixels of the first neighboring block.
Generally, video information is encoded after inter-frame prediction, and the information to be encoded mainly includes the index of the neighboring block.
The index of a neighboring block depends on its position in the candidate neighbor block list: the earlier the position, the shorter the encoded index. The position of a neighboring block in the candidate neighbor block list is in turn closely related to the scanning order of the spatial domain candidate motion information, the selection position of the temporal domain motion information and the insertion position in the candidate list.
Take the positional relationship between the current block and its neighboring blocks A, B, C, D, F and G shown in fig. 2 as an example. If the spatial domain candidate motion information is scanned in a fixed order, for example FGCABD, the neighboring block that matches the current block may be located at F or at D. When it is located at F, it is scanned first and therefore inserted first into the candidate neighbor block list, so its index is small and the encoded index is short; conversely, when it is located at D, it is scanned last and inserted last into the candidate neighbor block list, so its index is large and the encoded index is long.
Similarly, the selection position of the temporal motion information and the insertion position of the candidate list both affect the ordering of the matching block in the candidate neighbor list, and correspondingly affect the coding length of the index.
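The dependence of the index, and hence of its coded length, on the scanning order can be illustrated with a small sketch. It assumes that every available neighbor is inserted in scan order and that the index is binarized with a truncated unary code; both the helper names and the unary binarization are illustrative assumptions, chosen only to make the "earlier means shorter" effect visible.

```python
def build_candidate_list(scan_order, motion_info):
    """Insert the motion information of available neighbors in scan order."""
    return [(pos, motion_info[pos]) for pos in scan_order
            if motion_info.get(pos) is not None]


def index_of(candidates, position):
    """Return the list index of a neighbor position, or None if absent."""
    for i, (pos, _) in enumerate(candidates):
        if pos == position:
            return i
    return None


def unary_bins(index, list_size):
    """Bins of a truncated unary code: index + 1, except for the last entry."""
    return index + 1 if index < list_size - 1 else list_size - 1


# Hypothetical motion vectors for the six neighbors of fig. 2.
motion_info = {"A": (2, 0), "B": (0, 3), "C": (1, 1),
               "D": (-4, 2), "F": (5, 5), "G": (0, -1)}

fixed = build_candidate_list("FGCABD", motion_info)
adapted = build_candidate_list("DAFBGC", motion_info)

bins_f_fixed = unary_bins(index_of(fixed, "F"), len(fixed))        # F scanned first
bins_d_fixed = unary_bins(index_of(fixed, "D"), len(fixed))        # D scanned last
bins_d_adapted = unary_bins(index_of(adapted, "D"), len(adapted))  # D scanned first
```

If the matching block is usually D, the second order places it at index 0 and the index costs fewer bins, which is the motivation for not fixing the scanning order.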
In addition, because motion information in different directions differs greatly while motion information in the same direction differs little, the way motion information in each direction is bound to positions affects the distribution of differences among the candidate motion information. The more uniform this distribution, the more accurate the finally selected matching block and the more accurate the inter-frame prediction result.
Therefore, in the inter-frame prediction method, in the process of constructing the candidate neighbor blocks, the scanning order of the spatial domain candidate motion information, the selection position of the temporal domain motion information, the insertion position of the candidate list, and the binding mode of the motion information and the position in each direction are not fixed, but at least one of the scanning order of the spatial domain candidate motion information, the selection position of the temporal domain motion information, the insertion position of the candidate list, and the binding mode of the motion information and the position in each direction is adaptively adjusted according to the coding performance (for example, coding length, coding distribution) and accuracy of real-time prediction.
For example, the spatial domain candidate motion information may be scanned multiple times according to a certain rule; the selection position of the temporal domain motion information and its insertion position in the candidate list may be varied; and the motion information in each direction may be inserted into suitable positions of the candidate neighbor block list in a better arrangement, so that the motion information of neighboring blocks on the same side preferably comes from motion information lists in different directions, while the motion information of different sides may come from the motion information list in the same direction. Several candidate neighbor block lists are constructed in this way, and the one with the best coding performance is selected as the basis for coding, which can further improve the accuracy and coding efficiency of inter-frame prediction.
It should be noted that the embodiment shown in fig. 1 is only one basic embodiment of the inter prediction method of the present application, and further refinement and expansion can be performed by an implementer on the basis of the embodiment.
According to an embodiment of the present application, after recording the index of the first neighboring block and the motion vector of the first neighboring block, the method further includes: encoding the index of the first neighboring block to obtain coding information, wherein the index is binarized using a truncated binary code.
In the prior art, when the index of the neighboring block is transmitted, for example when the index of the motion vector difference after the offset of the ultimate motion vector expression (UMVE) is encoded, the index is described by two syntax elements, a step and a direction (awp_mvr_cand_step and awp_mvr_cand_dir). The step takes values 0-4 and is binarized as 0: 1, 1: 01, 2: 001, 3: 0001, 4: 00001. The direction takes values 0-3 and is binarized as 0: 00, 1: 01, 2: 10, 3: 11.
The inventors of the present inter-frame prediction method observe that the variable index is a uniformly distributed symbol with a limited value range (0-19), and therefore propose binarizing it with a truncated binary code to reduce the number of syntax elements transmitted.
For example, index values less than 12 are encoded as 4-bit fixed-length binary codewords, and index values greater than or equal to 12 are encoded as 5-bit fixed-length binary codewords, as shown in table 1:
Index    Binarization method
0        0000
1        0001
2        0010
12       01100
13       01101
19       10011
TABLE 1
Thus, in the syntax representation shown in table 2, the original step and direction related syntax elements can be removed:
coding unit definition                              Descriptor(s)
if ((SkipFlag || DirectFlag) && AwpFlag) {
    awp_idx                                         ae(v)
    awp_mvr_cand_enable_flag0                       ae(v)
    if (AwpMvrCandFlag0) {
        awp_mvr_cand_step0                          ae(v)
        awp_mvr_cand_dir0                           ae(v)
        awp_mvr_idx0                                ae(v)
    }
    awp_mvr_cand_enable_flag1                       ae(v)
    if (AwpMvrCandFlag1) {
        awp_mvr_cand_step1                          ae(v)
        awp_mvr_cand_dir1                           ae(v)
        awp_mvr_idx1                                ae(v)
    }
TABLE 2
The index no longer depends on the removed syntax elements and is coded directly with a truncated binary code, which reduces the number of syntax elements transmitted.
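A truncated binary binarization for an alphabet of 20 index values can be sketched as follows. The sketch uses the textbook construction, in which the 5-bit codewords are offset by the number of short codewords so that no 4-bit codeword is a prefix of a 5-bit one; the codeword lengths agree with table 1 (4 bits below 12, 5 bits from 12 upward), but the 5-bit patterns produced here are an assumption of this sketch and differ from the plain 5-bit values listed in table 1. The function names are illustrative.

```python
def truncated_binary_encode(x, n):
    """Encode x in [0, n) with a truncated binary code; return a bit string."""
    if n < 1 or not 0 <= x < n:
        raise ValueError("x must satisfy 0 <= x < n")
    k = n.bit_length() - 1          # floor(log2(n))
    if k == 0:
        return ""                   # a single symbol needs no bits
    u = (1 << (k + 1)) - n          # number of short (k-bit) codewords
    if x < u:
        return format(x, f"0{k}b")
    return format(x + u, f"0{k + 1}b")


def truncated_binary_decode(bits, n):
    """Decode one symbol from the front of `bits`; return (value, bits used)."""
    k = n.bit_length() - 1
    if k == 0:
        return 0, 0
    u = (1 << (k + 1)) - n
    value = int(bits[:k], 2)
    if value < u:
        return value, k
    value = (value << 1) | int(bits[k])   # read one more bit
    return value - u, k + 1


codewords = {i: truncated_binary_encode(i, 20) for i in range(20)}
total_bins = sum(len(c) for c in codewords.values())
```

For 20 equally likely index values this gives 12 four-bin and 8 five-bin codewords, an average of 4.4 bins per index before entropy coding. Under the same uniform assumption, the step and direction binarization quoted earlier would average 3 + 2 = 5 bins.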
According to an embodiment of the present application, adaptively adjusting a scanning order of spatial domain candidate motion information according to coding performance includes: scanning the spatial domain candidate motion information in different scanning orders according to a set rule to obtain at least one scanning result; determining, according to the coding performance, the scanning result with the optimal coding performance from the at least one scanning result as a first scanning result; and setting the scanning order that produced the first scanning result as the scanning order for scanning the spatial domain candidate motion information.
For example, for the positional relationship between the current block and the neighboring blocks A, B, C, D, F and G shown in fig. 2, scanning in the following orders may be considered:
(1) Scanning the left neighboring blocks and then the upper neighboring blocks: FADBGC, FADCGB, DAFBGC, DAFCGB, ADFGBC, and the like.
(2) Scanning the upper neighboring blocks and then the left neighboring blocks: BGCFAD, BGCDAF, CGBFAD, and the like.
(3) Scanning the left and upper neighboring blocks alternately: FGACDB, AGFCDB, and the like.
The scanning result with the optimal coding performance is then determined, according to the coding performance, from the scanning results obtained with these scanning orders and taken as the first scanning result, and the scanning order that produced the first scanning result is set as the scanning order for scanning the spatial domain candidate motion information. In this way, the coding performance can be optimized as much as possible.
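Selecting among several scanning orders can be sketched as a minimum search over a cost function. The cost used here, the total number of truncated-unary-style bins spent on the indices of previously matched positions, is a stand-in assumption for whatever coding performance measure an encoder would actually use; the function names and the sample of matched positions are likewise hypothetical.

```python
def select_scan_order(candidate_orders, cost_of_order):
    """Return the scanning order with the lowest cost."""
    return min(candidate_orders, key=cost_of_order)


def total_index_bins(order, matched_positions):
    """Toy cost: each match costs (index in the order + 1) bins."""
    return sum(order.index(pos) + 1 for pos in matched_positions)


# One order from each family listed above, plus a second left-first order.
orders = ["FADCGB", "DAFBGC", "BGCFAD", "FGACDB"]
# Hypothetical positions of the matching block in recently coded blocks.
matched = ["D", "D", "A", "F", "D"]

costs = {o: total_index_bins(o, matched) for o in orders}
best_order = select_scan_order(orders, lambda o: total_index_bins(o, matched))
```

With these sample matches the order that scans D first wins, because D is the most frequent match and receives the shortest index.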
According to an embodiment of the present application, adaptively adjusting the selected position of the temporal motion information according to the coding performance includes: determining a coding unit with optimal coding performance as a first coding unit from coding units at four positions of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the co-located block of the current block; setting motion information of a first coding unit as motion information of a co-located block; and scaling the motion information of the co-located block to obtain time domain motion information.
The coding unit here generally refers to the smallest coding unit (SCU), typically a 4 × 4 sub-block. In existing schemes, the temporal motion information is located at the first position of the current candidate list.
The inter-frame prediction method of the present application changes this rule. It considers the coding units at four positions of the co-located block of the current block, namely the upper left corner, the upper right corner, the lower left corner and the lower right corner, and allows the position of the temporal candidate to be changed freely when it is inserted into the candidate list: before the spatial motion information, after the spatial motion information, or between items of spatial motion information. The coding performance of the various schemes is then compared, the motion information of the coding unit at the optimal position is selected as the motion information of the co-located block, and this motion information is scaled to obtain the temporal domain motion information. In this way, the coding performance can be optimized as much as possible.
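The corner selection, scaling and insertion steps can be sketched as follows. The sketch assumes that the motion vector is scaled linearly by the ratio of two temporal distances, as is common in block-based codecs, and that a cost for each corner is already available from trial encoding; the function names, the distance parameters and the cost table are illustrative assumptions rather than details given in the application.

```python
CORNERS = ("top_left", "top_right", "bottom_left", "bottom_right")


def scale_mv(mv, dist_cur, dist_col):
    """Scale a co-located MV by the ratio of temporal distances (assumed rule)."""
    if dist_col == 0:
        raise ValueError("co-located temporal distance must be non-zero")
    return tuple(round(c * dist_cur / dist_col) for c in mv)


def select_temporal_mv(corner_mvs, corner_cost, dist_cur, dist_col):
    """Pick the available corner with the lowest cost, then scale its MV."""
    available = [c for c in CORNERS if corner_mvs.get(c) is not None]
    if not available:
        return None, None
    best = min(available, key=corner_cost)
    return best, scale_mv(corner_mvs[best], dist_cur, dist_col)


def insert_temporal(spatial, temporal, position):
    """Insert the temporal candidate before, between or after spatial ones."""
    merged = list(spatial)
    merged.insert(position, temporal)
    return merged


# Hypothetical corner motion vectors (None marks an unavailable corner)
# and hypothetical costs obtained from trial encoding.
corner_mvs = {"top_left": (8, -4), "top_right": (16, 0),
              "bottom_left": None, "bottom_right": (6, -4)}
costs = {"top_left": 120.0, "top_right": 310.0, "bottom_right": 95.5}

corner, temporal_mv = select_temporal_mv(corner_mvs, costs.get, dist_cur=1, dist_col=2)
candidate_list = insert_temporal([(5, 5), (0, -1)], temporal_mv, position=1)
```

An encoder following the text would repeat the insertion for several positions and keep the arrangement with the best coding performance.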
According to an embodiment of the present application, adaptively adjusting the binding between the motion information in each direction and the positions according to the coding performance includes: selecting motion information in different directions for neighboring blocks that belong to the same coding unit, and selecting motion information in either direction for the remaining neighboring blocks according to the coding performance.
As mentioned above, the way in which motion information in each direction is bound to positions affects how the differences among the candidate motion information are distributed. If the motion information of neighboring blocks on one side comes from different motion information lists and the motion information of neighboring blocks on different sides comes from the same motion information list, the differences within the motion information candidate list are distributed more evenly and the samples are as diverse as possible. Assuming that the L0 list is the motion information list in one direction and the L1 list is the motion information list in the other direction, the spatial neighboring blocks may be bound, for example, as F (L0) → G (L0) → C (L1) → A (L1) → B (L1) → D (L0), or as F (L1) → G (L1) → C (L0) → A (L0) → B (L0) → D (L1), and so on. That is, in the candidate neighboring block list, the F neighbor takes the motion information in the L0 list, the G neighbor takes the motion information in the L0 list, the C neighbor takes the motion information in the L1 list, and so on.
Further, neighboring blocks from the same coding unit may select motion information from different lists, and the remaining neighboring blocks may select motion information from either list. For example, the F block and the A block may come from the same coding unit, and the B block and the G block may come from the same coding unit. In that case it is preferable to select motion information in different directions for the two blocks, so that the obtained motion information differs as much as possible. For the spatial neighboring blocks, the following binding may be used: F (L0) → A (L1), B (L0) → G (L1); for the remaining neighboring blocks D and C, motion information in either the L0 or the L1 list may be selected.
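The binding rule for neighboring blocks of the same coding unit can be sketched as a small lookup-table builder. The function name `bind_directions` and the `pick_free` callback are hypothetical; the callback stands in for the coding-performance comparison that the application leaves unspecified.

```python
def bind_directions(neighbors, same_unit_pairs, pick_free=lambda name: "L0"):
    """Map each neighboring block to the list (L0 or L1) it reads from.

    same_unit_pairs: pairs of blocks assumed to lie in the same coding unit;
                     the two members of a pair are bound to different lists.
    pick_free:       chooses a list for every block that is not in a pair.
    """
    binding = {}
    for first, second in same_unit_pairs:
        binding[first] = "L0"
        binding[second] = "L1"
    for name in neighbors:
        if name not in binding:
            binding[name] = pick_free(name)
    return binding


# F/A and B/G share a coding unit in the example from the text; C and D are free.
binding = bind_directions(
    ["F", "G", "C", "A", "B", "D"],
    same_unit_pairs=[("F", "A"), ("B", "G")],
)
print(binding)
```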
According to an embodiment of the present application, determining a first neighbor block matching a current block from a candidate neighbor block list includes: determining motion information with the minimum RDcost as first motion information according to each inter-frame angle weighted prediction mode, wherein the division threshold value and the weight range size of the inter-frame angle weighted prediction mode are configurable or obtained by self-adaptive adjustment; a first neighboring block is determined based on the first motion information.
In the existing inter-frame prediction method, especially in the inter-frame angle weighted prediction mode, the result of the implicit partition of the current coding unit is determined mainly by the derivation formula of the current coding unit, and depends on the partition threshold and the size of the weight range.
The inter-frame angle weighted prediction (AWP) mode is described in detail below as an example. AWP is a new prediction mode within merge mode, and supports block sizes from 8x8 to 64x64. It borrows the idea of intra angular prediction: reference weight values are first set for the positions surrounding the current block (integer-pixel and sub-pixel positions), the weight corresponding to each pixel position in the current block is then derived from the angle, and the resulting weight array is finally used to weight two different inter-frame prediction values.
The weight corresponding to each pixel position in the current block is determined by the angle region and by the reference weight configurations supported by that angle region. In the existing inter-frame angle weighted prediction mode, the number of angle regions is fixed at 8 and the number of reference weight configurations supported by each angle region is fixed at 7; the part of the block whose weight is greater than or equal to the threshold of 4 forms one sub-block, the part whose weight is less than 4 forms the other sub-block, and the weight takes values in the range 0-8.
In this case, the same partition granularity and weight representation are used regardless of the texture of the image, which may cause the following problems: for an image with fine texture, the fixed partition granularity may be too coarse, making the prediction result inaccurate; for an image with coarse texture, the fixed partition granularity may be too fine, lowering the prediction efficiency.
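The fixed configuration described above (threshold 4, weights 0-8) can be made concrete with a small sketch. The integer blending formula with rounding is an assumption made for illustration, since the application only says that the weight array is used to weight the two prediction values; the function names are hypothetical.

```python
def split_by_threshold(weights, threshold=4):
    """Label each pixel 0 if its weight is >= threshold, otherwise 1."""
    return [[0 if w >= threshold else 1 for w in row] for row in weights]


def blend(pred0, pred1, weights, max_weight=8):
    """Weighted average of two predictions with integer rounding (assumed form)."""
    half = max_weight // 2
    return [
        [(w * a + (max_weight - w) * b + half) // max_weight
         for a, b, w in zip(row0, row1, row_w)]
        for row0, row1, row_w in zip(pred0, pred1, weights)
    ]


# One row of weights falling from 8 to 0 across the implicit partition boundary.
weights = [[8, 6, 4, 2, 0]]
pred0 = [[100, 100, 100, 100, 100]]
pred1 = [[20, 20, 20, 20, 20]]

print(split_by_threshold(weights))
print(blend(pred0, pred1, weights))
```

Raising or lowering `threshold` moves the boundary between the two sub-blocks without changing the weights themselves, which is the degree of freedom the application proposes to exploit.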
To solve the above problems, the embodiments of the present application adjust the number of angle regions, the number of reference weight configurations supported by each angle region, and the weight threshold so as to change the partition result and thereby optimize the encoding process. For example, with the other parameters unchanged, the sub-block partition result can be changed by adjusting the weight threshold up or down, for example to 2, 3, 5 or 6, and the optimal threshold can be found by analyzing the coding results under the various thresholds. Alternatively, with the other parameters unchanged, the sub-block partition result can be changed by changing the value range of the weight, for example to 0-6 or 0-10, and the weight threshold is adjusted correspondingly after the weight range is changed.
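A threshold search and a matching threshold adjustment after a range change might look like the sketch below. Both functions are hypothetical: the application does not say how the threshold is re-derived when the weight range changes, so proportional rescaling is assumed here purely for illustration, and `encode_cost` stands in for a real encoding pass.

```python
def best_threshold(candidates, encode_cost):
    """Return the candidate threshold whose coding cost is lowest."""
    return min(candidates, key=encode_cost)


def rescale_threshold(threshold, old_max, new_max):
    """Keep the threshold at the same relative position in a new weight range."""
    return (threshold * new_max + old_max // 2) // old_max


# Hypothetical cost curve with its minimum at a threshold of 5.
chosen = best_threshold([2, 3, 4, 5, 6], encode_cost=lambda t: abs(t - 5))
print(chosen)

# The default threshold 4 of the 0-8 range, carried over to 0-6 and 0-10.
print(rescale_threshold(4, old_max=8, new_max=6))
print(rescale_threshold(4, old_max=8, new_max=10))
```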
In addition, the selection of the weight threshold can be guided by the actual texture feature distribution of the coding block. For example, the weight threshold may be adjusted downward for finer texture and upward for coarser texture. Common texture direction calculation methods include Gabor filtering, the gray-level co-occurrence matrix, the Sobel gradient and the like. In this way, the prediction result of the inter-frame prediction can be made more accurate.
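As one possible illustration of texture-guided threshold selection, the sketch below measures texture with a Sobel gradient and shifts the threshold by one step. The cutoff values `fine_cutoff` and `coarse_cutoff`, the one-step adjustment and the function names are all assumptions; the application names the Sobel gradient only as one of several usable texture measures.

```python
def sobel_energy(block):
    """Mean absolute Sobel gradient over the interior pixels of a 2-D block.

    The block must be at least 3 x 3 so that it has interior pixels.
    """
    h, w = len(block), len(block[0])
    total = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            total += abs(gx) + abs(gy)
    return total / ((h - 2) * (w - 2))


def threshold_from_texture(block, base=4, fine_cutoff=200.0, coarse_cutoff=20.0):
    """Lower the threshold for fine texture, raise it for coarse texture."""
    energy = sobel_energy(block)
    if energy > fine_cutoff:
        return base - 1
    if energy < coarse_cutoff:
        return base + 1
    return base


flat = [[128] * 8 for _ in range(8)]
stripes = [[255 if (x // 2) % 2 == 0 else 0 for x in range(8)] for _ in range(8)]

print(threshold_from_texture(flat))
print(threshold_from_texture(stripes))
```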
According to an embodiment of the present application, determining the motion information with the minimum RDCost according to the inter-frame angle weighted prediction mode includes: performing rate-distortion optimization (RDO) rough selection according to the inter-frame angle weighted prediction mode to obtain a candidate motion information set, wherein the number of candidate motion information items is obtained by adaptive adjustment; and performing RDO fine selection on the candidate motion information set to obtain the motion information with the minimum RDCost.
In the existing scheme, the number of candidates entering RDO fine selection is fixed: no matter how many candidates satisfying the condition are obtained by rough selection, only 7 candidates are finally passed on to fine selection.
In the embodiment of the present application, the number of candidates participating in RDO in AWP can be determined adaptively from the number of candidates obtained through rough selection. For example, according to statistical results, the number num of candidates that satisfy the condition after rough selection ranges from 0 to 896, where 896 = 56 × 4 × 4. Here 56 is the total number of prediction modes in the prior art (8 angle regions × 7 reference weight configurations); the first 4 is the number of minimum-RDCost candidates selected from the first motion information (at most two with a UMVE offset and two without a UMVE offset); and the second 4 is the number of minimum-RDCost candidates selected from the second motion information (likewise at most two with a UMVE offset and two without a UMVE offset).
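The two-stage selection can be sketched generically as a shortlist followed by a full evaluation. The names `two_stage_rdo`, `rough_cost` and `full_cost` are hypothetical, and using a cheaper cost for the rough stage is an assumption about typical encoder practice rather than something the application specifies.

```python
def two_stage_rdo(candidates, rough_cost, full_cost, keep):
    """Shortlist `keep` candidates by a rough cost, then pick the best by full cost.

    candidates must be non-empty and keep must be at least 1.
    """
    shortlist = sorted(candidates, key=rough_cost)[:keep]
    return min(shortlist, key=full_cost)


# Toy costs: the rough estimate favors 10, the full cost favors 8.
winner = two_stage_rdo(
    range(20),
    rough_cost=lambda c: abs(c - 10),
    full_cost=lambda c: abs(c - 8),
    keep=3,
)
print(winner)
```

With `keep=3` the true optimum (8) never reaches the fine stage, which illustrates why the number of candidates passed on matters; with `keep=5` it would be found.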
The range is divided into several intervals, uniformly or non-uniformly, and a different number of candidates is used for the RDO process depending on the interval in which num falls. For example, the range may be divided uniformly into 4 intervals, each of length 224, and the number of RDO candidates is determined according to the following rule:
(1) when num < 224, the RDO process is performed with at most 7 candidates;
(2) when 224 ≤ num < 448, the RDO process is performed with 8 candidates;
(3) when 448 ≤ num < 672, the RDO process is performed with 9 candidates;
(4) when num ≥ 672, the RDO process is performed with 10 candidates.
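The uniform rule above reduces to one integer division. The function name `rdo_candidate_count` and its parameter names are chosen for illustration; the defaults reproduce the interval length of 224 and the counts 7 to 10 from the text.

```python
def rdo_candidate_count(num, interval=224, base=7, max_count=10):
    """Number of candidates passed to RDO for `num` rough-selection survivors."""
    limit = min(base + num // interval, max_count)
    # "At most 7" in the first interval: there cannot be more candidates than num.
    return min(limit, num)


for n in (5, 223, 224, 448, 672, 896):
    print(n, rdo_candidate_count(n))
```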
In addition, the distribution of num can be collected statistically and the whole range divided non-uniformly according to that distribution, so that a more suitable number of RDO candidates is used for the intervals into which num falls most often, thereby ensuring the accuracy of the final prediction mode.
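A non-uniform division can be expressed as a boundary table and looked up with a binary search. The boundary values below are invented for the example; the application does not give concrete non-uniform intervals.

```python
from bisect import bisect_right


def candidate_count_nonuniform(num, boundaries, counts):
    """Look up the RDO candidate count for `num` in a non-uniform interval table.

    boundaries: sorted interval start points (excluding 0).
    counts:     one count per interval, so len(counts) == len(boundaries) + 1.
    """
    return counts[bisect_right(boundaries, num)]


# Hypothetical table: narrower intervals where num is assumed to be most frequent.
boundaries = [64, 192, 512]
counts = [7, 8, 9, 10]

for n in (63, 64, 300, 512):
    print(n, candidate_count_nonuniform(n, boundaries, counts))
```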
The above embodiments are exemplary illustrations of how to further refine and expand on the basis of the basic embodiment shown in fig. 1, and an implementer may combine various implementations in the above embodiments to form a new embodiment according to specific implementation conditions and needs, so as to achieve a more ideal implementation effect.
Fig. 3 shows an implementation flow of another embodiment of the inter prediction method of the present application, which combines various implementation manners of the above embodiments to finally form an embodiment with better implementation effect. The embodiment is applied to the inter-frame prediction process under the inter-frame angle weighted prediction mode, and mainly comprises the following operations:
operation S3010, configuring a threshold and a weight of an inter-frame angle weighted prediction mode;
operation S3020, deriving pixel-by-pixel weights according to the inter-frame angle weighted prediction mode;
operation S3030, determining a scanning order of the spatial domain candidate motion information according to the coding efficiency, and determining availability of the spatial domain candidate motion information;
operation S3040, selecting motion information in different directions for neighboring blocks of the same coding unit, selecting motion information in either direction for the remaining neighboring blocks according to coding performance, and adding the neighboring blocks to the candidate neighboring block list;
operation S3050, determining whether the length of the candidate neighboring block list is less than 5; if so, proceeding to operation S3060, and if not, proceeding to operation S3070;
operation S3060, determining, from the coding units at the four positions of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the co-located block of the current block, the coding unit with the best coding performance as the first coding unit, and expanding the candidate neighboring block list with it to obtain 5 candidate neighboring blocks;
operation S3070, modifying, according to the inter-frame angle weighted prediction mode and the UMVE offset scheme, the motion information of the candidate neighboring blocks in the candidate neighboring block list to obtain the two sets of motion information with the minimum RDCost, wherein the number of candidates entering the RDO fine selection stage is determined adaptively in the RDO rough selection stage;
operation S3080, predicting the motion information of the matching block of the current block according to the derived per-pixel weight matrix and the two sets of motion information with the minimum RDCost;
operation S3090, storing the motion information of the matching block of the current block;
operation S3100, encoding the index of the matching block of the current block, wherein the index is binarized using a truncated binary code.
It should be noted that the application shown in fig. 3 is only an exemplary illustration of the inter-frame prediction method of the present application and does not limit the embodiments or application scenarios of the method. The implementer may adopt any applicable implementation and apply it to any applicable scenario according to the specific implementation conditions.
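The truncated binary binarization named in operation S3100 can be sketched as follows, using the textbook definition of a truncated binary code for an alphabet of n symbols. The function name `truncated_binary` is hypothetical, and the alphabet size of 5 in the example is taken from the candidate list length in operation S3050; whether the encoder uses exactly this codeword assignment is not stated in the application.

```python
def truncated_binary(value, alphabet_size):
    """Return the truncated binary codeword of `value` as a bit string.

    With k = floor(log2(n)) and u = 2**(k + 1) - n, the first u symbols get
    k-bit codewords and the remaining n - u symbols get (k + 1)-bit codewords.
    """
    if not 0 <= value < alphabet_size:
        raise ValueError("value must lie in [0, alphabet_size)")
    if alphabet_size == 1:
        return ""  # a single possible symbol needs no bits
    k = alphabet_size.bit_length() - 1
    u = (1 << (k + 1)) - alphabet_size
    if value < u:
        return format(value, f"0{k}b")
    return format(value + u, f"0{k + 1}b")


# Codewords for an index into a list of 5 candidates.
codes = [truncated_binary(i, 5) for i in range(5)]
print(codes)
```

For 5 candidates, three indices cost 2 bits and two cost 3 bits, compared with 3 bits each for a fixed-length code.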
Further, according to an embodiment of the present application, there is provided an inter prediction apparatus, as shown in fig. 4, the apparatus 40 includes: a candidate neighbor block list constructing module 401, configured to construct a candidate neighbor block list from at least one candidate frame for a current block, where in the process of constructing the candidate neighbor block list, at least one of a scanning sequence of spatial candidate motion information, a selection position of temporal motion information, an insertion position of the candidate list, and a binding manner between motion information and positions in each direction is adaptively adjusted according to coding performance; a matching neighboring block determining module 402, configured to determine a first neighboring block matching the current block from the candidate neighboring block list; an index and motion vector recording module 403, configured to record an index of the first neighboring block and a motion vector of the first neighboring block, so as to predict pixel information of the current block according to the index of the first neighboring block, the motion vector of the first neighboring block, and pixels of the first neighboring block.
According to an embodiment of the present application, the apparatus 40 further includes: a coding module, configured to encode the index of the first neighboring block to obtain coding information, wherein the index is binarized using a truncated binary code.
According to an embodiment of the present application, the candidate neighbor block list constructing module 401 includes: a scanning submodule, configured to scan the spatial domain candidate motion information in different scanning orders according to a set rule to obtain at least one scanning result; a first scanning result determining submodule, configured to determine, according to the coding performance, the scanning result with the best coding performance from the at least one scanning result as a first scanning result; and a scanning order setting submodule, configured to set the scanning order that produced the first scanning result as the scanning order for scanning the spatial domain candidate motion information.
According to an embodiment of the present application, the candidate neighbor block list constructing module 401 includes: a first coding unit determining submodule, configured to determine, from the coding units at the four positions of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the co-located block of the current block, the coding unit with the best coding performance as a first coding unit; a co-located block motion information setting submodule, configured to set the motion information of the first coding unit as the motion information of the co-located block; and a temporal motion information acquisition submodule, configured to scale the motion information of the co-located block to obtain the temporal motion information.
According to an embodiment of the present application, the candidate neighboring block list building module 401 is specifically configured to select motion information in different directions for neighboring blocks of the same coding unit, and select motion information in any direction for the remaining neighboring blocks according to coding performance.
According to an embodiment of the present application, the matching neighboring block determining module 402 includes: the first motion information determining submodule is used for determining motion information with the minimum RDcost as first motion information according to each inter-frame angle weighted prediction mode, wherein the division threshold value and the weight range size of the inter-frame angle weighted prediction mode are obtained by configurable or adaptive adjustment; and the first adjacent block determining submodule is used for determining the first adjacent block according to the first motion information.
According to an embodiment of the present application, the first motion information determination sub-module includes: an RDO rough selection unit, configured to perform RDO rough selection according to the inter-frame angle weighted prediction mode to obtain a candidate motion information set, wherein the number of candidate motion information items is obtained by adaptive adjustment; and an RDO fine selection unit, configured to perform RDO fine selection on the candidate motion information set to obtain the motion information with the minimum RDCost.
According to a third aspect of the embodiments of the present application, there is provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; a memory for storing a computer program; a processor for implementing the method steps of any of the above inter-frame prediction methods when executing a program stored in the memory.
According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the method steps of any of the above-described inter-prediction methods.
It should be noted here that the above descriptions of the embodiments of the inter-frame prediction apparatus, the electronic device and the computer-readable storage medium are similar to the descriptions of the foregoing method embodiments and have similar beneficial effects, and are therefore not repeated. For technical details not disclosed in the descriptions of those embodiments, please refer to the descriptions of the foregoing method embodiments of the present application; for brevity, they are not described again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of a unit is only one logical function division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another device, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media capable of storing program codes, such as a removable storage medium, a Read Only Memory (ROM), a magnetic disk, and an optical disk.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof that contribute to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a removable storage medium, a ROM, a magnetic disk, an optical disk, or the like, which can store the program code.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of inter-prediction, the method comprising:
constructing a candidate neighbor block list from at least one candidate frame aiming at a current block, wherein in the process of constructing the candidate neighbor block list, at least one item of scanning sequence of spatial domain candidate motion information, selection position of time domain motion information, insertion position of the candidate list and binding mode of motion information and position in each direction is subjected to self-adaptive adjustment according to coding performance;
determining a first neighbor block matched with the current block from the candidate neighbor block list;
recording the index of the first neighboring block and the motion vector of the first neighboring block to predict pixel information of the current block according to the index of the first neighboring block, the motion vector of the first neighboring block and pixels of the first neighboring block.
2. The method of claim 1, wherein after said recording the index of the first neighbor block and the motion vector of the first neighbor block, the method further comprises:
coding the index of the first neighboring block to obtain coding information, wherein the index is binarized using a truncated binary code.
3. The method of claim 1, wherein adaptively adjusting the scanning order of the spatial domain candidate motion information according to the coding performance comprises:
scanning the spatial domain candidate motion information in different scanning orders according to a set rule to obtain at least one scanning result;
determining a scanning result with the optimal coding performance from the at least one scanning result according to the coding performance as a first scanning result;
and setting the scanning sequence of the first scanning result as the scanning sequence for scanning the spatial domain candidate motion information.
4. The method of claim 1, wherein adaptively adjusting the selected position of the temporal motion information according to the coding performance comprises:
determining a coding unit with optimal coding performance as a first coding unit from coding units at four positions of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the co-located block of the current block;
setting motion information of the first coding unit as motion information of the co-located block;
and scaling the motion information of the co-located block to obtain time domain motion information.
5. The method of claim 1, wherein adaptively adjusting the binding of the motion information and the position in each direction according to the coding performance comprises:
selecting motion information in different directions for neighboring blocks of the same coding unit, and selecting motion information in either direction for the remaining neighboring blocks according to the coding performance.
6. The method of claim 1, wherein determining the first neighbor block from the candidate neighbor block list that matches the current block comprises:
determining motion information with the minimum RDCost as first motion information according to each inter-frame angle weighted prediction mode, wherein the implicit partition threshold and the weight range size of the inter-frame angle weighted prediction mode are configurable or obtained by adaptive adjustment;
and determining a first adjacent block according to the first motion information.
7. The method of claim 6, wherein determining the motion information with the minimum RDCost according to the inter-frame angular weighted prediction mode comprises:
performing RDO rough selection according to the inter-frame angle weighted prediction mode to obtain a candidate motion information set, wherein the number of candidate motion information items is obtained by adaptive adjustment;
and performing RDO fine selection on the candidate motion information set to obtain the motion information with the minimum RDCost.
8. An inter-prediction apparatus, the apparatus comprising:
the candidate neighbor block list construction module is used for constructing a candidate neighbor block list from at least one candidate frame aiming at a current block, wherein in the process of constructing the candidate neighbor block list, at least one item of scanning sequence of space domain candidate motion information, selection position of time domain motion information, insertion position of the candidate list and binding mode of motion information and position in each direction is subjected to self-adaptive adjustment according to coding performance;
a matching neighboring block determining module, configured to determine, from the candidate neighboring block list, a first neighboring block that matches the current block;
and the index and motion vector recording module is used for recording the index of the first adjacent block and the motion vector of the first adjacent block so as to predict the pixel information of the current block according to the index of the first adjacent block, the motion vector of the first adjacent block and the pixel of the first adjacent block.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the method steps of any one of claims 1 to 7 when executing the program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN202110456467.7A 2021-04-27 2021-04-27 Inter-frame prediction method and device, electronic equipment and storage medium Active CN112995669B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110456467.7A CN112995669B (en) 2021-04-27 2021-04-27 Inter-frame prediction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112995669A CN112995669A (en) 2021-06-18
CN112995669B true CN112995669B (en) 2021-08-03

Family

ID=76340278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110456467.7A Active CN112995669B (en) 2021-04-27 2021-04-27 Inter-frame prediction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112995669B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113507609B (en) * 2021-09-09 2021-11-19 康达洲际医疗器械有限公司 Interframe image parallel coding method based on time-space domain prediction

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101854542A (en) * 2008-11-20 2010-10-06 联发科技股份有限公司 Scanning methods, processing apparatus and processing order determining method
CN110121883A (en) * 2016-12-05 2019-08-13 Lg电子株式会社 The method and apparatus that image is decoded in image encoding system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3849192B1 (en) * 2011-06-28 2023-01-11 LG Electronics, Inc. Method for deriving a motion vector for video decoding and video encoding

Also Published As

Publication number Publication date
CN112995669A (en) 2021-06-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant