CN110611820A - Video coding method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN110611820A
CN110611820A (application CN201910859788.4A)
Authority
CN
China
Prior art keywords
motion information
forward motion
candidate
block
time domain
Prior art date
Legal status
Pending
Application number
CN201910859788.4A
Other languages
Chinese (zh)
Inventor
黄跃
郑云飞
闻兴
陈宇聪
陈敏
王晓楠
黄晓政
赵明菲
郭磊
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910859788.4A
Publication of CN110611820A
Legal status: Pending


Classifications

    • H — Electricity; H04 — Electric communication technique; H04N — Pictorial communication, e.g. television; H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/105 — Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/14 — Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/176 — Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/577 — Motion compensation with bidirectional frame interpolation, i.e. using B-pictures

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The disclosure relates to a video encoding method and device, an electronic device, and a storage medium. The method comprises the following steps: determining a current prediction block corresponding to a bidirectional prediction frame in a video to be encoded; when the current prediction block meets the bidirectional prediction restriction condition, acquiring forward motion information of a target block corresponding to the current prediction block and taking the forward motion information as a merging candidate; and performing video encoding on the video to be encoded according to the merging candidate. Because the bidirectional prediction restriction is considered while the merging candidates are acquired, the occurrence of invalid candidates is significantly reduced and repeated candidates are effectively pruned; referring only to forward motion information reduces the computational complexity of video encoding and decoding, and the saved candidate-list positions can be used to try more candidates.

Description

Video coding method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of video encoding technologies, and in particular, to a video encoding method and apparatus, an electronic device, and a storage medium.
Background
Recently, demands for high-resolution, high-quality video, such as High Definition (HD) video and Ultra High Definition (UHD) video, have increased in various application fields. When video has higher resolution and higher quality, the amount of video data is larger than that of conventional video data. Thus, if such video data is transmitted over an existing medium such as a wired/wireless broadband line, or stored in an existing storage medium, transmission and storage costs increase. To avoid these problems with high-resolution, high-quality video data, efficient video compression techniques may be used.
In order to improve compression efficiency as much as possible, a prediction block in a B frame may reference both the forward and the backward direction during inter prediction, i.e., bi-directional prediction. Bi-directional prediction improves compression efficiency but doubles the computational complexity. Controlling decoder power consumption and memory is an important metric in the design and formulation of video standards. Therefore, the partition sizes for which bi-directional prediction is performed are generally limited.
Many blocks of video content have strong motion similarity with their neighboring blocks, and inter prediction mainly removes the redundancy arising from this motion similarity. In HEVC/H.265 and the ongoing VVC video coding standard, inter prediction includes both an AMVP mode and a merge mode. The merge mode [1] was introduced to directly reuse the motion information of neighboring blocks, so the cost of encoding video regions with high motion similarity is very small.
In HEVC/H.265, the motion information in the candidate list includes spatial and temporal candidates. Taking spatial candidates as an example, the processing method is to check the motion information of the prediction blocks at neighboring positions. Referring to fig. 1, a schematic diagram of spatial neighboring blocks in the prior art, the availability of the prediction blocks at 5 neighboring positions is checked in the order A1 (left) -> B1 (above) -> B0 (above-right) -> A0 (left-bottom) -> B2 (above-left), and a simple pruning (de-duplication) operation is performed (e.g., B1 is checked for equality against A1). After the merge candidate list is built, the bi-directional prediction restriction is applied uniformly according to the partition size of the current prediction block.
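The scan order described above can be sketched as follows. This is a hypothetical illustration, not text from the standard: motion information is modeled as simple `(mv_x, mv_y, ref_idx)` tuples, and the per-position pruning comparisons follow the HEVC convention of checking each position against specific earlier ones.

```python
# Hypothetical sketch of the HEVC spatial merge-candidate scan.
# A neighbor's motion info is a (mv_x, mv_y, ref_idx) tuple, or None
# when that position is unavailable.

HEVC_SCAN_ORDER = ["A1", "B1", "B0", "A0", "B2"]

# Each position is pruned against specific earlier positions
# (e.g. B1 is skipped if its motion info equals A1's).
PRUNE_AGAINST = {
    "A1": [],
    "B1": ["A1"],
    "B0": ["B1"],
    "A0": ["A1"],
    "B2": ["A1", "B1"],
}

def spatial_merge_candidates(neighbors, max_spatial=4):
    """Scan the five spatial neighbors in order, skipping unavailable
    positions and duplicates of the compared positions."""
    candidates = []
    for pos in HEVC_SCAN_ORDER:
        if len(candidates) >= max_spatial:
            break
        mi = neighbors.get(pos)
        if mi is None:
            continue  # position unavailable
        if any(neighbors.get(p) == mi for p in PRUNE_AGAINST[pos]):
            continue  # duplicate of a compared position
        candidates.append(mi)
    return candidates
```

For example, if B1 carries the same motion information as A1, it is pruned and only A1's entry enters the list.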
Separating merge candidate list construction from the bi-directional prediction restriction introduces redundant merge candidates, adds extra comparison and candidate-calculation steps, and may prevent useful candidates from entering the list because redundant candidates occupy its positions.
Disclosure of Invention
In order to overcome the problems in the related art that candidate-list construction is separated from the bi-directional prediction restriction, which introduces redundant merge candidates, adds extra comparison and candidate-calculation steps, and may prevent candidates from entering the list due to the redundant candidates, the present disclosure provides a video encoding method and device, an electronic device, and a storage medium.
According to a first aspect of embodiments of the present disclosure, there is provided a video encoding method, including:
determining a current prediction block corresponding to a bidirectional prediction frame in a video to be coded;
when the current prediction block meets the bidirectional prediction limiting condition, acquiring forward motion information of a target block corresponding to the current prediction block, and taking the forward motion information as a merging candidate;
and carrying out video coding on the video to be coded according to the merging candidate.
In one specific implementation of the present disclosure, the step of acquiring forward motion information of a target block corresponding to the current prediction block when the current prediction block satisfies a bi-directional prediction restriction condition includes:
when it is determined, according to the partition size of the current prediction block, that the current prediction block satisfies the bidirectional prediction restriction condition, acquiring a spatial domain adjacent block of the current prediction block;
detecting whether the spatial domain adjacent blocks use forward reference information or not;
when the spatial domain adjacent block uses forward reference information, acquiring spatial domain forward motion information of the spatial domain adjacent block, and taking the spatial domain forward motion information as the merging candidate;
when the number of the merging candidates does not reach a preset candidate number, acquiring a time domain associated block of the current prediction block;
and acquiring time domain forward motion information of the time domain associated block according to a forward time domain reference frame list corresponding to the time domain associated block, and taking the time domain forward motion information as the merging candidate.
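The steps above can be sketched as a minimal, hypothetical flow (the function name and data shapes are assumptions for illustration, not the patent's implementation): spatial neighbors contribute only their forward motion information, and a temporal candidate fills in if the list is still short.

```python
# Hedged sketch of the claimed forward-only candidate construction.
# Each spatial neighbor is a dict with a "forward_mv" entry that is
# None when the neighbor carries no forward reference information.

def build_forward_merge_list(spatial_neighbors, temporal_block_fwd_mv,
                             max_candidates=5):
    candidates = []
    # Step 1: spatial neighbors that use forward reference information.
    for neighbor in spatial_neighbors:
        fwd = neighbor.get("forward_mv")
        if fwd is not None and fwd not in candidates:
            candidates.append(fwd)
        if len(candidates) == max_candidates:
            return candidates
    # Step 2: if the list is not full, take the temporal associated
    # block's forward motion (obtained via its forward reference list).
    if temporal_block_fwd_mv is not None and temporal_block_fwd_mv not in candidates:
        candidates.append(temporal_block_fwd_mv)
    return candidates
```

Note how the bi-prediction restriction is folded into construction itself: backward motion never enters the list, so no later pass is needed to strip it.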
In one specific implementation of the present disclosure, after the step of using the spatial domain forward motion information as the merging candidate, the method further includes:
determining whether motion information identical to the spatial forward motion information exists in a candidate list;
adding the spatial domain forward motion information to the candidate list when the motion information identical to the spatial domain forward motion information does not exist in the candidate list;
after the step of using the time domain forward motion information as the merging candidate, the method further comprises:
adding the temporal forward motion information to the candidate list when the same motion information as the temporal forward motion information does not exist in the candidate list.
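The duplicate check described above, applied before either a spatial or a temporal forward candidate enters the list, can be sketched as follows (a hypothetical helper for illustration only):

```python
# Sketch of the de-duplication step: a candidate enters the list only
# if identical motion information is not already present.

def add_if_new(candidate_list, motion_info):
    """Append motion_info unless an identical entry exists; return True
    when it was actually added."""
    if motion_info in candidate_list:
        return False
    candidate_list.append(motion_info)
    return True
```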
In a specific implementation of the present disclosure, the step of obtaining time domain forward motion information of the time domain associated block according to the forward time domain reference frame list corresponding to the time domain associated block includes:
acquiring a time domain reference frame corresponding to the time domain associated block according to the forward time domain reference frame list;
and acquiring reference forward motion information corresponding to the time domain reference frame, and taking the reference forward motion information as the time domain forward motion information.
In one specific implementation of the present disclosure, after the step of using the time-domain forward motion information as the merging candidate, the method further includes:
when the number of the merging candidates does not reach the preset candidate number, acquiring a target candidate construction algorithm capable of generating forward motion information;
acquiring target forward motion information corresponding to the target candidate construction algorithm, and taking the target forward motion information as the merging candidate;
the target candidate construction algorithm comprises at least one of a motion information prediction candidate algorithm, a combined average candidate algorithm and a zero motion candidate algorithm.
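The fallback step can be illustrated as follows. This sketch shows only the zero-motion candidate algorithm (the motion-information prediction and combined-average candidates are analogous); the zero candidates' increasing forward reference indices, clamped to the number of forward references, follow the common HEVC-style convention and are an assumption here.

```python
# Hedged sketch of filling remaining candidate-list slots with
# zero-motion forward candidates. Candidates are (mv_x, mv_y, ref_idx).

def fill_with_fallbacks(candidates, max_candidates=5, num_fwd_refs=2):
    out = list(candidates)
    ref = 0
    # Zero-motion candidates point at successive forward reference
    # frames, clamped to the available forward reference count.
    while len(out) < max_candidates:
        out.append((0, 0, min(ref, num_fwd_refs - 1)))
        ref += 1
    return out
```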
According to a second aspect of the embodiments of the present disclosure, there is provided a video encoding apparatus, including:
a current prediction block determining module configured to determine a current prediction block corresponding to a bidirectional prediction frame in a video to be encoded;
a merging candidate generating module configured to acquire forward motion information of a target block corresponding to the current prediction block when the current prediction block satisfies a bi-directional prediction constraint condition, and use the forward motion information as a merging candidate;
and the video coding module to be coded is configured to perform video coding on the video to be coded according to the merging candidate.
In a specific implementation of the present disclosure, the merge candidate generation module includes:
a spatial domain neighboring block obtaining sub-module configured to obtain a spatial domain neighboring block of the current prediction block when it is determined that the current prediction block satisfies a bi-directional prediction condition according to the partition size of the current prediction block;
a forward reference information detection sub-module configured to detect whether the spatial neighboring block uses forward reference information;
a spatial domain motion information acquisition sub-module configured to acquire spatial domain forward motion information of the spatial domain neighboring block when the spatial domain neighboring block uses forward reference information, and to take the spatial domain forward motion information as the merge candidate;
a time domain associated block obtaining sub-module configured to obtain a time domain associated block of the current prediction block when the number of the merging candidates does not reach a preset candidate number;
and the time domain motion information acquisition sub-module is configured to acquire the time domain forward motion information of the time domain associated block according to the forward time domain reference frame list corresponding to the time domain associated block, and take the time domain forward motion information as the merging candidate.
In a specific implementation of the present disclosure, the apparatus further comprises:
a spatial motion information determination module configured to determine whether motion information identical to the spatial forward motion information exists in a candidate list;
a spatial motion information adding module configured to add the spatial forward motion information to the candidate list when there is no motion information in the candidate list that is the same as the spatial forward motion information;
the device further comprises:
a time domain motion information determining module configured to determine whether motion information identical to the time domain forward motion information exists in the candidate list;
a temporal motion information adding module configured to add the temporal forward motion information to the candidate list when there is no motion information in the candidate list that is the same as the temporal forward motion information.
In a specific implementation of the present disclosure, the time domain motion information obtaining sub-module includes:
a time domain reference frame obtaining sub-module configured to obtain a time domain reference frame corresponding to the time domain associated block according to the forward time domain reference frame list;
and the time domain forward motion information acquisition sub-module is configured to acquire reference forward motion information corresponding to the time domain reference frame and use the reference forward motion information as the time domain forward motion information.
In a specific implementation of the present disclosure, the apparatus further comprises:
a target candidate algorithm obtaining module configured to obtain a target candidate construction algorithm that can generate forward motion information when the number of the merge candidates does not reach the preset candidate number;
a target motion information acquisition module configured to acquire target forward motion information corresponding to the target candidate construction algorithm and take the target forward motion information as the merging candidate;
the target candidate construction algorithm comprises at least one of a motion information prediction candidate algorithm, a combined average candidate algorithm and a zero motion candidate algorithm.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory configured to store processor-executable instructions;
wherein the processor is configured to perform any of the video encoding methods described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having instructions which, when executed by a processor of a mobile terminal, enable the mobile terminal to perform any one of the video encoding methods described above.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
the embodiment of the disclosure obtains forward motion information corresponding to a target block corresponding to a current prediction block by determining the current prediction block corresponding to a bidirectional prediction frame in a video to be coded, and takes the forward motion information as a merging candidate, and performs video coding on the video to be coded according to the merging candidate when the current prediction block meets a bidirectional prediction limiting condition. The embodiment of the disclosure can significantly reduce the occurrence of invalid candidates and effectively crop repeated candidates by considering bidirectional prediction restriction in the process of acquiring merging candidates, and the embodiment of the disclosure can reduce the computational complexity of video encoding and decoding by only referring to forward motion information, and the saved candidate list position can be used for trying more candidates.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a diagram of a spatially adjacent block in the prior art;
FIG. 2 is a flow diagram illustrating a method of video encoding in accordance with an exemplary embodiment;
FIG. 3 is a flow chart illustrating a method of video encoding in accordance with an exemplary embodiment;
FIG. 4 is a block diagram illustrating a video encoding apparatus according to an example embodiment;
FIG. 5 is a block diagram illustrating a video encoding apparatus according to an example embodiment;
FIG. 6 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment;
fig. 7 is a block diagram illustrating an apparatus for video encoding according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
Example one
Fig. 2 is a flow chart illustrating a video encoding method according to an example embodiment, which may include the following steps, as shown in fig. 2.
In step S11, a current prediction block corresponding to a bidirectional prediction frame in the video to be encoded is determined.
The embodiment of the invention can be applied to the scenario of acquiring merge candidates corresponding to a prediction block in a B frame during inter-frame prediction.
Inter-frame prediction refers to predicting the pixels of the current frame from the pixel values of adjacent, already-encoded frames, so as to effectively remove temporal redundancy in the video. Since images rarely change drastically, the correlation between adjacent frames is generally stronger than the correlation between pixels within a frame, so the achievable compression ratio is larger. However, because inter-frame prediction mostly exploits motion characteristics in the video for compression, the larger compression ratio comes with increased complexity of the prediction method. The embodiments of the present disclosure aim to reduce the computational complexity of the prediction method.
The video to be encoded refers to video that needs to be subjected to video encoding.
The bi-prediction block is a prediction block in a B frame. The B-frame method commonly used in inter-frame prediction is an inter-frame compression algorithm based on bi-directional prediction. When a frame is compressed as a B frame, it is compressed according to the differences between the adjacent previous frame, the current frame, and the next frame; that is, only the differences between the current frame and its preceding and following frames are recorded. Only B-frame compression can achieve compression ratios as high as 200:1.
The current prediction block is a prediction block in a bi-directional prediction frame (i.e., a B-frame).
When video coding is performed, a B frame in the video frame may be acquired as a bidirectional prediction frame, and a current prediction block in the bidirectional prediction frame may be acquired.
After determining the current prediction block corresponding to the bidirectional prediction frame in the video to be encoded, step S12 is performed.
In step S12, when the current prediction block satisfies the bi-directional prediction restriction condition, the forward motion information of the target block corresponding to the current prediction block is obtained, and the forward motion information is used as a merging candidate.
Bi-directional prediction refers to prediction using a forward frame (forward reference) and a backward frame (backward reference) as reference frames, respectively.
The bi-directional prediction restriction condition refers to the condition under which the current prediction block is not allowed to perform forward prediction and backward prediction at the same time. Each prediction block in a bi-directional prediction frame is checked against this restriction condition so that bi-prediction is excluded where it is not allowed.
After the current prediction block is obtained, whether it satisfies the bi-directional prediction restriction condition may be determined, specifically according to the partition size of the current prediction block; this is described in detail in Embodiment Two below and is not repeated here.
Inter-frame prediction usually performs motion estimation first and then motion compensation. Most motion of objects in a video image is translational. Therefore, if it can be determined that some object in the current frame moved from a position in the forward frame, the pixels in the region near that position in the forward frame can be directly 'copied' to the corresponding position in the current frame as the predicted value of the block, and the computed translation is the motion vector MV of the block between the two frames.
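This "copy along the motion vector" idea can be shown with a toy example (an illustration only: integer-pel motion, in-bounds access, and no interpolation are assumed):

```python
# Toy motion compensation: the predicted block is read from the
# reference frame at the position offset by the motion vector.
# Frames are plain 2-D lists of pixel values.

def motion_compensate(reference, top, left, height, width, mv):
    """Copy a height x width block from `reference`, displaced by
    motion vector mv = (mv_x, mv_y) relative to (top, left)."""
    mv_x, mv_y = mv
    return [
        [reference[top + mv_y + r][left + mv_x + c] for c in range(width)]
        for r in range(height)
    ]
```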
The motion information mentioned in the embodiments of the present disclosure is a motion vector MV.
The forward motion information is a motion vector MV obtained by using a forward (preceding) frame of the bidirectional prediction frame as the reference frame.
When it is determined that the current prediction block does not satisfy the bi-directional prediction restriction condition, the block is not processed with the subsequent steps but is handled in the existing manner.
The target block refers to a forward reference block corresponding to the current prediction block.
And under the condition that the current prediction block is judged to meet the bidirectional prediction limiting condition, acquiring the forward motion information of the target block corresponding to the current prediction block, and taking the forward motion information as a merging candidate.
Of course, in the embodiment of the present disclosure, the forward motion information that is obtained may be spatial-domain forward motion information, temporal-domain forward motion information, and so on; this is described in detail in Embodiment Two below and is not repeated here.
The embodiment of the disclosure can significantly reduce the occurrence of invalid candidates and effectively prune duplicate candidates by considering the bi-directional prediction restriction while obtaining merging candidates, and can reduce the computational complexity of video encoding and decoding by considering only forward motion information; the saved candidate-list positions can be used for trying more candidates.
After acquiring the forward motion information of the target block corresponding to the current prediction block and using the forward motion information as a merging candidate, step S13 is performed.
In step S13, the video to be encoded is video-encoded according to the merge candidate.
After the merging candidates are obtained, the video to be encoded may be encoded according to them. It will be understood that the video to be encoded contains a plurality of current prediction blocks, each corresponding to a merging candidate. After the merging candidate of each current prediction block is obtained, the merging candidate is used to compression-encode the corresponding block, thereby encoding the video to be encoded. For example, suppose the video to be encoded contains 3 current prediction blocks, prediction block 1, prediction block 2, and prediction block 3, whose merging candidates are a, b, and c respectively; then prediction block 1 is compression-encoded with a, prediction block 2 with b, and prediction block 3 with c.
It is to be understood that the above examples are only examples set forth for a better understanding of the technical solutions of the embodiments of the present disclosure, and are not to be taken as the only limitations on the embodiments of the present disclosure.
The embodiment of the disclosure can reduce the computational complexity of video encoding and decoding, and save encoding and decoding time, by effectively pruning repeated candidates.
In the video encoding method provided by the embodiment of the present disclosure, a current prediction block corresponding to a bidirectional prediction frame in a video to be encoded is determined; when the current prediction block meets the bidirectional prediction restriction condition, forward motion information of a target block corresponding to the current prediction block is obtained and used as a merge candidate, and the video to be encoded is encoded according to the merge candidate. By considering the bidirectional prediction restriction while acquiring merging candidates, the embodiment significantly reduces the occurrence of invalid candidates and effectively prunes repeated candidates; referring only to forward motion information reduces the computational complexity of video encoding and decoding, and the saved candidate-list positions can be used to try more candidates.
Example two
Fig. 3 is a flow chart illustrating a video encoding method according to an example embodiment, which may include the following steps, as shown in fig. 3.
In step S21, a current prediction block corresponding to a bidirectional prediction frame in the video to be encoded is determined.
The embodiment of the invention can be applied to the inter-frame prediction process to acquire the scene of the merged option corresponding to the prediction block in the B frame.
The video to be encoded refers to video that needs to be subjected to video encoding.
The bi-prediction block is a prediction block in a B frame. The B-frame method commonly used in inter-frame prediction is an inter-frame compression algorithm based on bi-directional prediction. When a frame is compressed as a B frame, it is compressed according to the differences between the adjacent forward frame, the current frame, and the backward frame; that is, only the differences between the current frame and its preceding and following frames are recorded. Only B-frame compression can achieve compression ratios as high as 200:1.
The current prediction block is a prediction block in a bi-directional prediction frame (i.e., a B-frame).
When video coding is performed, a B frame in the video frame may be acquired as a bidirectional prediction frame, and a current prediction block in the bidirectional prediction frame may be acquired.
After determining the current prediction block corresponding to the bidirectional prediction frame in the video to be encoded, step S22 is performed.
In step S22, when it is determined according to the partition size of the current prediction block that the current prediction block satisfies the bidirectional prediction restriction condition, a spatial neighboring block of the current prediction block is obtained.
The partition size refers to the size into which a video block is partitioned in advance; typical partition sizes of a video block may be 8 × 4, 4 × 4, 8 × 8, 16 × 16, and so on.
After determining the current prediction block in the bi-directional prediction frame, the partition size of the current prediction block may be acquired.
Bidirectional prediction refers to prediction using both a forward frame (forward reference) and a backward frame (backward reference) as reference frames.
The bidirectional prediction restriction condition refers to a condition under which the current prediction block is not allowed to perform forward prediction and backward prediction at the same time. For each prediction block in the bidirectional prediction frame, whether the bidirectional prediction restriction condition is satisfied is determined, so that bidirectional prediction can be excluded for the blocks subject to the restriction.
After obtaining the partition size of the current prediction block, whether the current prediction block satisfies the bidirectional prediction restriction condition may be determined according to the partition size. For example, when the partition size of the current prediction block is 8 × 4 or 4 × 8, the current prediction block satisfies the bidirectional prediction restriction condition; when the partition size of the current prediction block is 4 × 4, the current prediction block does not satisfy the bidirectional prediction restriction condition.
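As a concrete illustration of the size check in this step, the following minimal Python sketch decides whether a block is subject to the restriction. The function name and the treatment of the 4 × 4 size are assumptions taken directly from the example sizes above, not from a normative specification:

```python
def satisfies_biprediction_restriction(width, height):
    # Following the example in the text: 8x4 and 4x8 partitions satisfy
    # the bidirectional prediction restriction (bi-prediction is excluded
    # and only forward motion information is used), while 4x4 does not.
    return (width, height) in {(8, 4), (4, 8)}
```

In a real codec the restricted partition set is fixed by the standard in use; here it simply mirrors the two example sizes given in the text.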
It is to be understood that the above examples are only examples set forth for a better understanding of the technical solutions of the embodiments of the present disclosure, and are not to be taken as the only limitations on the embodiments of the present disclosure.
In inter-frame prediction, motion estimation is usually performed first, followed by motion compensation. Most motion of objects in a video image is translational, so if it can be determined that an object in the current frame has moved from some position in the forward frame, the pixels in the corresponding region of the forward frame can be directly 'copied' to the position in the current frame as the predicted value of the macroblock; the calculated translation amount is called the motion vector (MV) of the macroblock between the two frames.
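The 'copy at a translated position' idea can be sketched as follows. This is a hypothetical integer-pel example (real codecs also interpolate fractional-pel positions), with a frame represented simply as a list of pixel rows:

```python
def predict_block(ref_frame, x, y, w, h, mv):
    # Copy a w x h region from the reference frame, displaced by the
    # motion vector (mv_x, mv_y), as the prediction for the block whose
    # top-left corner is at (x, y) in the current frame.
    mv_x, mv_y = mv
    return [row[x + mv_x : x + mv_x + w]
            for row in ref_frame[y + mv_y : y + mv_y + h]]
```

The encoder's motion estimation searches for the mv that minimizes the difference between this prediction and the actual block; only that difference (the residual) and the MV then need to be coded.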
The motion information mentioned in the embodiments of the present disclosure is a motion vector MV.
The forward motion information is a motion vector MV obtained by using a forward frame of the bidirectional prediction frame as a reference frame.
When it is determined that the current prediction block does not satisfy the bidirectional prediction restriction condition, the subsequent steps are not used, and the block is processed according to the existing processing mode.
In the case that the current prediction block satisfies the bidirectional prediction restriction condition, a spatial neighboring block of the current prediction block may be acquired. For example, as shown in fig. 1, the spatial neighboring blocks of the current prediction block include A1, A0, B1, B0, and B2.
It is to be understood that the above examples are only examples set forth for a better understanding of the technical solutions of the embodiments of the present disclosure, and are not to be taken as the only limitations on the embodiments of the present disclosure.
After the spatial neighboring block of the current prediction block is acquired, step S23 is performed.
In step S23, it is detected whether the spatial neighboring block uses forward reference information.
After the spatial neighboring block corresponding to the current prediction block is obtained, whether a forward reference frame list (i.e., forward L0) exists for the spatial neighboring block may be determined, and whether the spatial neighboring block uses forward reference information may be determined accordingly, so as to decide whether the spatial neighboring block is available. Specifically, when forward L0 exists for the spatial neighboring block and the forward reference information is used, the spatial neighboring block is an available spatial neighboring block; when the forward L0 of the spatial neighboring block is not available (i.e., the block uses only backward prediction), the motion information of the spatial neighboring block cannot provide forward reference information, and the candidate is excluded.
After detecting whether the spatial neighboring blocks use the forward reference information, step S24 is performed.
In step S24, when the spatial neighboring block uses the forward reference information, spatial forward motion information of the spatial neighboring block is obtained, and the spatial forward motion information is used as the merging candidate.
The spatial domain forward motion information refers to forward motion information acquired from spatial domain neighboring blocks.
When the forward L0 exists in the spatial neighboring block and the forward reference information is used, it indicates that the spatial neighboring block is available, and the spatial forward motion information of the spatial neighboring block can be obtained and used as a merging candidate.
Specifically, referring to the example above, the spatial neighboring blocks include A1, A0, B1, B0, and B2; when A1, A0, and B0 are available, the spatial forward motion information corresponding to each of A1, A0, and B0 may be acquired and used as merge candidates.
It is to be understood that the above examples are only examples set forth for a better understanding of the technical solutions of the embodiments of the present disclosure, and are not to be taken as the only limitations on the embodiments of the present disclosure.
The scheme of how to obtain spatial forward motion information of spatial neighboring blocks is well-known in the art, and this process is not described in detail in the embodiments of the present disclosure.
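The availability rule of steps S23 and S24 can be sketched as follows; the dictionary layout of a neighboring block is an assumption made purely for illustration:

```python
def spatial_forward_candidates(neighbours):
    # Scan the spatial neighbours (e.g. A1, A0, B1, B0, B2) in order and
    # keep the forward MV of each neighbour whose forward reference frame
    # list L0 exists and is actually used; other neighbours are excluded
    # as unavailable.
    return [n["mv"] for n in neighbours if n.get("uses_forward_l0")]
```

Only forward motion information is collected, which is what allows the method to skip backward and bidirectional candidates entirely for restricted blocks.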
After obtaining the spatial domain candidates, the spatial domain candidates may be added to the candidate list, specifically, described in detail in the following specific implementation.
In a specific implementation of the present disclosure, after the step S24, the method may further include:
step A1: determining whether the same motion information as the spatial forward motion information is present in a candidate list.
In the embodiments of the present disclosure, the candidate list refers to a list formed by prediction candidates corresponding to bidirectional prediction frames.
After the spatial forward motion information of at least one spatial neighboring block is obtained, each piece of spatial forward motion information may be added to the candidate list. Specifically, before adding the spatial forward motion information to the candidate list, it may be detected whether motion information identical to each piece of spatial forward motion information already exists in the candidate list.
After determining whether the same motion information as the spatial domain forward motion information exists in the candidate list, step a2 is performed.
Step A2: adding the spatial forward motion information to the candidate list when the same motion information as the spatial forward motion information does not exist in the candidate list.
When it is determined that the same motion information as each spatial forward motion information does not exist in the candidate list, all the spatial forward motion information may be added to the candidate list.
When motion information identical to one or more pieces of the spatial forward motion information exists in the candidate list, only the spatial forward motion information for which no identical motion information exists is added to the candidate list. For example, the spatial forward motion information includes information 1, information 2, information 3, and information 4; when motion information identical to information 2 and information 3 already exists in the candidate list, only information 1 and information 4 are added to the candidate list, and information 2 and information 3 are excluded.
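Steps A1 and A2 amount to a pruned insertion into the candidate list; a minimal sketch, with the function name and list-of-MVs representation being illustrative assumptions:

```python
def add_candidate(candidate_list, mv, max_candidates):
    # Add mv as a merge candidate only if no identical motion information
    # is already present (pruning duplicates) and the preset candidate
    # number has not yet been reached.
    if len(candidate_list) < max_candidates and mv not in candidate_list:
        candidate_list.append(mv)
    return candidate_list
```

The same pruned insertion is reused for temporal candidates in step C1 below: duplicates never occupy a list position, which is what frees positions for trying more candidates.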
After spatial domain forward motion information of the spatial domain neighboring blocks is acquired and the spatial domain forward motion information is used as a merge candidate, step S25 is performed.
In step S25, when the number of merging candidates does not reach a preset number of candidates, a time-domain associated block of the current prediction block is obtained.
The preset candidate number refers to the number that the merge candidates need to reach, preset by service personnel; the preset candidate number may be 5, 6, 7, and the like, and may specifically be determined according to service requirements, which is not limited in the embodiments of the present disclosure.
The temporal associated block refers to a video block in a reference frame associated with the current prediction block when the temporal candidate is obtained.
In the case that the number of merging candidates does not reach the preset number of candidates, the time domain associated block corresponding to the current prediction block may be acquired, and specifically, the time domain associated block corresponding to the current prediction block may be acquired according to a reference frame list corresponding to the current prediction block.
After the time-domain associated block corresponding to the current prediction block is acquired, step S26 is performed.
In step S26, time domain forward motion information of the time domain associated block is obtained according to the forward time domain reference frame list corresponding to the time domain associated block, and the time domain forward motion information is used as the merging candidate.
The forward temporal reference frame list refers to a list of temporal reference frames corresponding to the current prediction block.
The temporal forward motion information refers to forward motion information acquired from the temporal correlation block.
It can be understood that the time domain associated block exists in the time domain reference frame list corresponding to the current prediction block, and after the time domain associated block is obtained, the time domain forward motion information corresponding to the time domain associated block can be obtained according to the forward time domain reference frame list corresponding to the time domain associated block, and the time domain forward motion information is taken as a merging candidate.
For the process of acquiring temporal forward motion information, reference may be made to the following description of specific implementations.
In a preferred embodiment of the present disclosure, the step S26 may include:
substep B1: acquiring a time domain reference frame corresponding to the time domain associated block according to the forward time domain reference frame list;
substep B2: and acquiring reference forward motion information corresponding to the time domain reference frame, and taking the reference forward motion information as the time domain forward motion information.
In the embodiment of the present disclosure, the temporal reference frame refers to a video frame where the temporal association block is located, for example, when the temporal association block is one video block in the video frame a, the temporal reference frame is the video frame a.
It is to be understood that the above examples are only examples set forth for a better understanding of the technical solutions of the embodiments of the present disclosure, and are not to be taken as the only limitations on the embodiments of the present disclosure.
After the forward time domain reference frame list corresponding to the time domain associated block is obtained, the time domain reference frame corresponding to the time domain associated block may be obtained according to the forward time domain reference frame list.
After the time-domain reference frame is obtained, reference forward motion information corresponding to the time-domain reference frame may be obtained and used as the time-domain forward motion information.
The scheme of how to obtain the time domain forward motion information is a mature technology in the field, and is not described in detail again in the embodiments of the present disclosure.
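A minimal sketch of sub-steps B1 and B2, assuming each entry of the forward time domain reference frame list carries the MV field of that reference frame as a position-keyed dictionary (this data layout is an illustration, not the codec's actual one). Note that real codecs additionally scale the MV by picture-order-count distance, which is omitted here:

```python
def temporal_forward_mv(forward_ref_list, colocated_pos):
    # Sub-step B1: take the time domain reference frame from the forward
    # time domain reference frame list (here simply the first entry).
    if not forward_ref_list:
        return None
    ref_mv_field = forward_ref_list[0]
    # Sub-step B2: read the stored forward MV of the co-located block;
    # None is returned when no forward motion information is available.
    return ref_mv_field.get(colocated_pos)
```

The returned MV is then subject to the same duplicate check as the spatial candidates before entering the candidate list.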
After obtaining the temporal forward motion information, the temporal forward motion information may be added to the candidate list, and in particular, described in the following detailed implementation.
In a specific implementation of the present disclosure, after the step S26, the method may further include:
step C1: adding the temporal forward motion information to the candidate list when the same motion information as the temporal forward motion information does not exist in the candidate list.
After obtaining the time domain forward motion information corresponding to the time domain associated block, the time domain forward motion information may be added to the candidate list, and specifically, before adding the time domain forward motion information to the candidate list, it may be detected whether motion information identical to the time domain forward motion information exists in the candidate list.
When it is detected that motion information identical to the temporal forward motion information exists in the candidate list, the temporal forward motion information is excluded.
If it is determined that no motion information identical to the temporal forward motion information exists in the candidate list, the temporal forward motion information may be added to the candidate list.
After the time domain forward motion information of the time domain associated block is obtained according to the forward time domain reference frame list corresponding to the time domain associated block and is used as a merging candidate, step S27 is executed.
In step S27, when the number of merge candidates does not reach the preset candidate number, a target candidate construction algorithm that can generate forward motion information is acquired.
In the embodiment of the present disclosure, the target candidate construction algorithm may include one or more of a motion information prediction candidate algorithm, a combination average candidate algorithm, a zero motion candidate algorithm, and other candidate algorithms.
In the above process, when the total number of the acquired spatial forward motion information and temporal forward motion information does not reach the preset candidate number, the target candidate construction algorithm may be acquired.
The target candidate construction algorithm may include a plurality of algorithms. A specific acquisition flow may be as follows: first, a motion information prediction candidate algorithm is acquired, and when forward motion information exists for it, predicted forward motion information is acquired according to the prediction candidate algorithm and used as a merge candidate; when the number of merge candidates still does not reach the preset number, a combined average candidate algorithm is acquired, combined average forward motion information is obtained according to it and used as a merge candidate; and when the number of merge candidates still does not reach the preset number, a zero motion candidate algorithm is acquired, zero-motion forward motion information is obtained according to it and used as a merge candidate, and so on.
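The fallback flow of step S27 can be sketched as follows. The pairwise averaging and zero-motion padding shown here are simplified stand-ins for the combined average candidate algorithm and the zero motion candidate algorithm (the exact construction rules are not specified in this step):

```python
def fill_candidates(candidates, max_candidates):
    # Pad an under-full merge candidate list: first add pairwise averages
    # of the existing forward MVs, then fill the rest with zero MVs.
    out = list(candidates)
    # Combined-average-style candidates from pairs of existing MVs.
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            if len(out) >= max_candidates:
                return out
            avg = ((candidates[i][0] + candidates[j][0]) // 2,
                   (candidates[i][1] + candidates[j][1]) // 2)
            if avg not in out:
                out.append(avg)
    # Zero-motion candidates up to the preset candidate number.
    while len(out) < max_candidates:
        out.append((0, 0))
    return out
```

Because every constructed candidate is itself forward motion information, the list remains forward-only, consistent with the bidirectional prediction restriction.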
It should be understood that the above-mentioned schemes are only listed for better understanding of the technical schemes of the embodiments of the present disclosure, and are not to be taken as the only limitation of the embodiments of the present disclosure.
After obtaining the target candidate construction algorithm that can generate the forward motion information, step S28 is performed.
In step S28, target forward motion information corresponding to the target candidate construction algorithm is obtained, and the target forward motion information is used as the merging candidate.
After obtaining the target candidate construction algorithm, target forward motion information corresponding to the target candidate construction algorithm, such as the predicted forward motion information, the combined average forward motion information and the zero motion forward motion information, etc., as mentioned above in step S27, may be obtained.
After the target forward motion information is acquired, the target forward motion information may be taken as a merging candidate.
By taking the bidirectional prediction restriction into account in the process of acquiring merge candidates, the embodiment of the disclosure can significantly reduce the occurrence of invalid candidates and effectively prune duplicate candidates; by referring only to forward motion information, it can also reduce the computational complexity of video encoding and decoding, and the candidate list positions saved in this way can be used to try more candidates.
After the target forward motion information corresponding to the target candidate construction algorithm is acquired and is used as a merging candidate, step S29 is performed.
In step S29, the video to be encoded is video-encoded according to the merge candidate.
After the merge candidates are obtained, the video to be coded may be coded according to the merge candidates. It can be understood that the video to be coded contains a plurality of current prediction blocks, each corresponding to a merge candidate. After the merge candidate of each current prediction block in the video to be coded is obtained, the merge candidate is used to compression-code the corresponding current prediction block, so as to encode the video to be coded. For example, the video to be coded includes three current prediction blocks, namely prediction block 1, prediction block 2, and prediction block 3, where the merge candidate corresponding to prediction block 1 is a, the merge candidate corresponding to prediction block 2 is b, and the merge candidate corresponding to prediction block 3 is c; then prediction block 1 is compression-coded with a, prediction block 2 with b, and prediction block 3 with c.
It is to be understood that the above examples are only examples set forth for a better understanding of the technical solutions of the embodiments of the present disclosure, and are not to be taken as the only limitations on the embodiments of the present disclosure.
The embodiment of the disclosure can reduce the computational complexity of video coding and decoding and save video coding and decoding time by effectively pruning the duplicate candidates.
According to the video coding method provided by the embodiment of the disclosure, a current prediction block corresponding to a bidirectional prediction frame in the video to be coded is determined; when the current prediction block satisfies the bidirectional prediction restriction condition, the forward motion information of the target block corresponding to the current prediction block is acquired and used as a merge candidate, and the video to be coded is video-coded according to the merge candidate. By taking the bidirectional prediction restriction into account in the process of acquiring merge candidates, the embodiment of the disclosure can significantly reduce the occurrence of invalid candidates and effectively prune duplicate candidates; by referring only to forward motion information, it can also reduce the computational complexity of video encoding and decoding, and the candidate list positions saved in this way can be used to try more candidates.
EXAMPLE III
Fig. 4 is a block diagram illustrating a video encoding apparatus according to an example embodiment. Referring to fig. 4, the apparatus includes a current prediction block determination module 131, a merge candidate generation module 132, and a to-be-encoded video encoding module 133.
The current prediction block determination module 131 may be configured to determine a current prediction block corresponding to a bi-directionally predicted frame in the video to be encoded;
the merge candidate generation module 132 may be configured to, when the current prediction block satisfies a bi-directional prediction constraint condition, acquire forward motion information of a target block corresponding to the current prediction block, and use the forward motion information as a merge candidate;
the to-be-encoded video encoding module 133 may be configured to perform video encoding on the to-be-encoded video according to the merge candidate.
The video coding device provided by the embodiment of the disclosure determines a current prediction block corresponding to a bidirectional prediction frame in the video to be coded; when the current prediction block satisfies the bidirectional prediction restriction condition, the forward motion information of the target block corresponding to the current prediction block is acquired and used as a merge candidate, and the video to be coded is video-coded according to the merge candidate. By taking the bidirectional prediction restriction into account in the process of acquiring merge candidates, the embodiment of the disclosure can significantly reduce the occurrence of invalid candidates and effectively prune duplicate candidates; by referring only to forward motion information, it can also reduce the computational complexity of video encoding and decoding, and the candidate list positions saved in this way can be used to try more candidates.
Example four
Fig. 5 is a block diagram illustrating a video encoding apparatus according to an example embodiment. Referring to fig. 5, the apparatus includes a current prediction block determining module 141, a merge candidate generating module 142, a target candidate algorithm obtaining module 143, a target motion information obtaining module 144, and a to-be-encoded video encoding module 145;
the current prediction block determination module 141 may be configured to determine a current prediction block corresponding to a bi-directionally predicted frame in the video to be encoded;
the merge candidate generation module 142 may be configured to, when the current prediction block satisfies a bi-directional prediction constraint condition, acquire forward motion information of a target block corresponding to the current prediction block, and use the forward motion information as a merge candidate;
the target candidate algorithm obtaining module 143 may be configured to obtain a target candidate construction algorithm that may generate forward motion information when the number of merge candidates does not reach the preset candidate number;
the target motion information obtaining module 144 may be configured to obtain target forward motion information corresponding to the target candidate construction algorithm, and use the target forward motion information as the merging candidate; the target candidate construction algorithm comprises at least one of a motion information prediction candidate algorithm, a combined average candidate algorithm and a zero motion candidate algorithm;
the video to be encoded encoding module 145 may be configured to perform video encoding on the video to be encoded according to the merge candidate.
In a specific implementation of the present disclosure, the merge candidate generation module 142 includes: a space domain adjacent block obtaining sub-module 1421, a forward reference information detection sub-module 1422, a space domain motion information obtaining sub-module 1423, a time domain associated block obtaining sub-module 1424, and a time domain motion information obtaining sub-module 1425;
the spatial neighboring block obtaining sub-module 1421 may be configured to obtain a spatial neighboring block of the current prediction block when it is determined that the current prediction block satisfies a bi-directional prediction condition according to the partition size of the current prediction block;
the forward reference information detection sub-module 1422 may be configured to detect whether the spatially neighboring block uses forward reference information;
the spatial domain motion information obtaining sub-module 1423 may be configured to obtain spatial domain forward motion information of the spatial domain neighboring block when the spatial domain neighboring block uses forward reference information, and use the spatial domain forward motion information as the merging candidate;
the time domain association block obtaining sub-module 1424 may be configured to obtain the time domain association block of the current prediction block when the number of merging candidates does not reach a preset candidate number;
the time domain motion information obtaining sub-module 1425 may be configured to obtain the time domain forward motion information of the time domain associated block according to the forward time domain reference frame list corresponding to the time domain associated block, and use the time domain forward motion information as the merging candidate.
In a specific implementation of the present disclosure, the apparatus further comprises:
a spatial motion information determination module configured to determine whether motion information identical to the spatial forward motion information exists in a candidate list;
a spatial motion information adding module configured to add the spatial forward motion information to the candidate list when there is no motion information in the candidate list that is the same as the spatial forward motion information;
the device further comprises:
a temporal motion information adding module configured to add the temporal forward motion information to the candidate list when there is no motion information in the candidate list that is the same as the temporal forward motion information.
In a specific implementation of the present disclosure, the time domain motion information obtaining sub-module 1425 includes: a time domain reference frame obtaining submodule and a time domain forward motion information obtaining submodule;
the time domain reference frame obtaining sub-module may be configured to obtain, according to the forward time domain reference frame list, a time domain reference frame corresponding to the time domain associated block;
the time domain forward motion information obtaining sub-module may be configured to obtain reference forward motion information corresponding to the time domain reference frame, and use the reference forward motion information as the time domain forward motion information.
The video coding device provided by the embodiment of the disclosure determines a current prediction block corresponding to a bidirectional prediction frame in the video to be coded; when the current prediction block satisfies the bidirectional prediction restriction condition, the forward motion information of the target block corresponding to the current prediction block is acquired and used as a merge candidate, and the video to be coded is video-coded according to the merge candidate. By taking the bidirectional prediction restriction into account in the process of acquiring merge candidates, the embodiment of the disclosure can significantly reduce the occurrence of invalid candidates and effectively prune duplicate candidates; by referring only to forward motion information, it can also reduce the computational complexity of video encoding and decoding, and the candidate list positions saved in this way can be used to try more candidates.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 6 is a block diagram illustrating an electronic device 800 in accordance with an example embodiment. For example, the electronic device 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 6, electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 further includes a speaker configured to output audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors configured to provide status assessments of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect an open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800. The sensor assembly 814 may also detect a change in position of the electronic device 800 or a component thereof, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, configured for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, a carrier network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components configured to perform the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the electronic device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Fig. 7 is a block diagram illustrating a video encoding device 1900 according to an example embodiment. For example, the device 1900 may be provided as a server. Referring to Fig. 7, the device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources, represented by a memory 1932, configured to store instructions (e.g., application programs) executable by the processing component 1922. The application programs stored in the memory 1932 may include one or more modules, each of which corresponds to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the method described above: determining a current prediction block corresponding to a bi-directional prediction frame in a video to be encoded; when the current prediction block satisfies a bi-directional prediction constraint condition, acquiring forward motion information of a target block corresponding to the current prediction block, and taking the forward motion information as a merging candidate; and performing video encoding on the video to be encoded according to the merging candidate.
The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
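The merge-candidate construction described in the embodiments above (and recited in the claims below) can be sketched in Python. This is an illustrative reading only, not the patented implementation: the 8x8 partition threshold, the preset candidate count of 5, and all names such as `MotionInfo` and `build_forward_merge_candidates` are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MotionInfo:
    """Forward (list-0) motion information: a motion vector plus a reference index."""
    mv: tuple      # (mv_x, mv_y)
    ref_idx: int   # index into the forward reference frame list


def satisfies_constraint(width: int, height: int, threshold: int = 8) -> bool:
    """Bi-directional prediction restriction: small partitions are limited to
    forward prediction only. The 8x8 threshold is an illustrative assumption."""
    return width * height <= threshold * threshold


def build_forward_merge_candidates(
        spatial_neighbors: List[Optional[MotionInfo]],
        temporal_candidate: Optional[MotionInfo],
        preset_count: int = 5) -> List[MotionInfo]:
    """Collect forward-only merge candidates, skipping duplicates."""
    candidates: List[MotionInfo] = []
    # 1. Spatial domain neighboring blocks that use forward reference information.
    for info in spatial_neighbors:
        if info is not None and info not in candidates:
            candidates.append(info)
    # 2. Time domain associated block (forward reference list), if still short.
    if (len(candidates) < preset_count and temporal_candidate is not None
            and temporal_candidate not in candidates):
        candidates.append(temporal_candidate)
    # 3. Pad with zero-motion candidates until the preset number is reached.
    ref_idx = 0
    while len(candidates) < preset_count:
        zero = MotionInfo(mv=(0, 0), ref_idx=ref_idx)
        if zero not in candidates:
            candidates.append(zero)
        ref_idx += 1
    return candidates[:preset_count]
```

The ordering mirrors claim 2 below: spatial neighbors first, then the temporal candidate when the list has not reached the preset number, then fallback candidates, with duplicates skipped as in claim 3.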

Claims (10)

1. A video encoding method, comprising:
determining a current prediction block corresponding to a bidirectional prediction frame in a video to be coded;
when the current prediction block satisfies a bi-directional prediction constraint condition, acquiring forward motion information of a target block corresponding to the current prediction block, and taking the forward motion information as a merging candidate;
and carrying out video coding on the video to be coded according to the merging candidate.
2. The method according to claim 1, wherein the step of acquiring forward motion information of a target block corresponding to the current prediction block when the current prediction block satisfies the bi-directional prediction constraint condition comprises:
when it is determined, according to a partition size of the current prediction block, that the current prediction block satisfies the bi-directional prediction constraint condition, acquiring a spatial domain neighboring block of the current prediction block;
detecting whether the spatial domain neighboring block uses forward reference information;
when the spatial domain neighboring block uses forward reference information, acquiring spatial domain forward motion information of the spatial domain neighboring block, and taking the spatial domain forward motion information as the merging candidate;
when the number of the merging candidates does not reach a preset candidate number, acquiring a time domain associated block of the current prediction block;
and acquiring time domain forward motion information of the time domain associated block according to a forward time domain reference frame list corresponding to the time domain associated block, and taking the time domain forward motion information as the merging candidate.
3. The method according to claim 2, further comprising, after the step of taking the spatial domain forward motion information as the merging candidate:
determining whether motion information identical to the spatial domain forward motion information exists in a candidate list; and
adding the spatial domain forward motion information to the candidate list when no motion information identical to the spatial domain forward motion information exists in the candidate list;
wherein after the step of taking the time domain forward motion information as the merging candidate, the method further comprises:
adding the time domain forward motion information to the candidate list when no motion information identical to the time domain forward motion information exists in the candidate list.
4. The method according to claim 2, wherein the step of acquiring time domain forward motion information of the time domain associated block according to a forward time domain reference frame list corresponding to the time domain associated block comprises:
acquiring a time domain reference frame corresponding to the time domain associated block according to the forward time domain reference frame list;
and acquiring reference forward motion information corresponding to the time domain reference frame, and taking the reference forward motion information as the time domain forward motion information.
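As an illustration of the temporal derivation in claim 4: HEVC-style codecs typically scale the co-located block's motion vector by the ratio of picture-order-count (POC) distances between each frame and its forward reference. The claim itself does not recite this scaling step; the sketch below assumes it, and all names are hypothetical.

```python
def scale_temporal_mv(col_mv, cur_poc, cur_ref_poc, col_poc, col_ref_poc):
    """Scale the co-located (time domain associated) block's forward motion
    vector by the ratio of POC distances (HEVC-style, illustrative only)."""
    tb = cur_poc - cur_ref_poc   # distance: current frame -> its forward reference
    td = col_poc - col_ref_poc   # distance: co-located frame -> its forward reference
    if td == 0:
        return col_mv            # no scaling possible; reuse the vector as-is
    return (col_mv[0] * tb // td, col_mv[1] * tb // td)
```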
5. The method according to claim 2, further comprising, after the step of taking the time domain forward motion information as the merging candidate:
when the number of the merging candidates does not reach the preset candidate number, acquiring a target candidate construction algorithm capable of generating forward motion information;
acquiring target forward motion information corresponding to the target candidate construction algorithm, and taking the target forward motion information as the merging candidate;
the target candidate construction algorithm comprises at least one of a motion information prediction candidate algorithm, a combined average candidate algorithm and a zero motion candidate algorithm.
6. A video encoding apparatus, comprising:
a current prediction block determining module configured to determine a current prediction block corresponding to a bidirectional prediction frame in a video to be encoded;
a merging candidate generating module configured to acquire forward motion information of a target block corresponding to the current prediction block when the current prediction block satisfies a bi-directional prediction constraint condition, and use the forward motion information as a merging candidate;
and a video encoding module configured to perform video encoding on the video to be encoded according to the merging candidate.
7. The apparatus of claim 6, wherein the merge candidate generation module comprises:
a spatial domain neighboring block obtaining sub-module configured to obtain a spatial domain neighboring block of the current prediction block when it is determined, according to the partition size of the current prediction block, that the current prediction block satisfies the bi-directional prediction constraint condition;
a forward reference information detection sub-module configured to detect whether the spatial neighboring block uses forward reference information;
a spatial domain motion information acquisition sub-module configured to acquire spatial domain forward motion information of the spatial domain neighboring block when the spatial domain neighboring block uses forward reference information, and to take the spatial domain forward motion information as the merge candidate;
a time domain associated block obtaining sub-module configured to obtain a time domain associated block of the current prediction block when the number of the merging candidates does not reach a preset candidate number;
and the time domain motion information acquisition sub-module is configured to acquire the time domain forward motion information of the time domain associated block according to the forward time domain reference frame list corresponding to the time domain associated block, and take the time domain forward motion information as the merging candidate.
8. The apparatus of claim 7, further comprising:
a spatial domain motion information determination module configured to determine whether motion information identical to the spatial domain forward motion information exists in a candidate list; and
a spatial domain motion information adding module configured to add the spatial domain forward motion information to the candidate list when no motion information identical to the spatial domain forward motion information exists in the candidate list;
wherein the apparatus further comprises:
a time domain motion information adding module configured to add the time domain forward motion information to the candidate list when no motion information identical to the time domain forward motion information exists in the candidate list.
9. An electronic device, comprising:
a processor;
a memory configured to store processor-executable instructions;
wherein the processor is configured to perform the video encoding method of any one of claims 1 to 5.
10. A non-transitory computer-readable storage medium having instructions stored thereon that, when executed by a processor of a mobile terminal, enable the mobile terminal to perform the video encoding method of any one of claims 1 to 5.
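The "combined average candidate algorithm" and "zero motion candidate algorithm" named in claim 5 are not detailed in this publication. A plausible reading, similar in spirit to the pairwise-average merge candidate of later standards such as VVC, is sketched below; the function names and the floor-division rounding are assumptions made for illustration.

```python
def combined_average_candidate(mv0, mv1, ref_idx):
    """Assumed reading of the combined average candidate: component-wise
    average of two forward motion vectors already in the candidate list,
    reusing the first candidate's forward reference index."""
    avg_mv = ((mv0[0] + mv1[0]) // 2, (mv0[1] + mv1[1]) // 2)
    return avg_mv, ref_idx


def zero_motion_candidate(ref_idx=0):
    """Zero-motion fallback candidate pointing at a forward reference frame."""
    return (0, 0), ref_idx
```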
CN201910859788.4A 2019-09-11 2019-09-11 Video coding method and device, electronic equipment and storage medium Pending CN110611820A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910859788.4A CN110611820A (en) 2019-09-11 2019-09-11 Video coding method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN110611820A (en) 2019-12-24

Family

ID=68892775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910859788.4A Pending CN110611820A (en) 2019-09-11 2019-09-11 Video coding method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110611820A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104094605A (en) * 2012-02-08 2014-10-08 高通股份有限公司 Restriction of prediction units in b slices to uni-directional inter prediction
CN107852498A (en) * 2015-07-27 2018-03-27 高通股份有限公司 Bi-directional predicted method and system are limited in video coding
CN109963155A (en) * 2017-12-23 2019-07-02 华为技术有限公司 Prediction technique, device and the codec of the motion information of image block


Non-Patent Citations (1)

Title
ZHU Xiuchang, LIU Feng, HU Dong: "The New H.265/HEVC Video Coding Standard and Its Extensions" (《H.265/HEVC视频编码新标准及其扩展》), 30 July 2016 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN113766244A (en) * 2020-06-05 2021-12-07 Oppo广东移动通信有限公司 Inter-frame prediction method, encoder, decoder, and computer storage medium
CN112362164A (en) * 2020-11-10 2021-02-12 广东电网有限责任公司 Temperature monitoring method and device of equipment, electronic equipment and storage medium
CN112362164B (en) * 2020-11-10 2022-01-18 广东电网有限责任公司 Temperature monitoring method and device of equipment, electronic equipment and storage medium
CN113938690A (en) * 2021-12-03 2022-01-14 北京达佳互联信息技术有限公司 Video coding method and device, electronic equipment and storage medium
CN113938690B (en) * 2021-12-03 2023-10-31 北京达佳互联信息技术有限公司 Video encoding method, video encoding device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US12041261B2 (en) Methods and apparatus of motion vector rounding, clipping and storage for interprediction
CN110708559B (en) Image processing method, device and storage medium
CN110611820A (en) Video coding method and device, electronic equipment and storage medium
CN116506609B (en) Method and apparatus for signaling merge mode in video coding
US20240146952A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
CN109120929B (en) Video encoding method, video decoding method, video encoding device, video decoding device, electronic equipment and video encoding system
CN115297333B (en) Inter-frame prediction method and device of video data, electronic equipment and storage medium
US20230126175A1 (en) Methods and devices for prediction dependent residual scaling for video coding
CN109660794B (en) Decision method, decision device and computer readable storage medium for intra prediction mode
US20220124313A1 (en) Motion compensation using combined inter and intra prediction
CN111225208B (en) Video coding method and device
CN112954293B (en) Depth map acquisition method, reference frame generation method, encoding and decoding method and device
WO2020257766A1 (en) History-based motion vector prediction for video coding
US20240089489A1 (en) Methods and apparatus of motion vector rounding, clipping and storage for inter prediction
CN110611813A (en) Optimal candidate obtaining method and device under video merging coding scene
CN112738524B (en) Image encoding method, image encoding device, storage medium, and electronic apparatus
WO2021062283A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
WO2021007133A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
WO2021021698A1 (en) Methods and apparatuses for decoder-side motion vector refinement in video coding
CN116170593A (en) Video data encoding method, apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191224