CN115514965A - Video encoding method and device, electronic equipment and storage medium


Info

Publication number: CN115514965A
Application number: CN202211078216.0A (filed by Beijing Dajia Internet Information Technology Co Ltd)
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 钟婷婷, 谷嘉文, 闻兴
Current Assignee: Beijing Dajia Internet Information Technology Co Ltd
Legal status: Pending


Classifications

    • H: Electricity
    • H04: Electric communication technique
    • H04N: Pictorial communication, e.g. television
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/573: Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H04N19/96: Tree coding, e.g. quad-tree coding


Abstract

The present disclosure relates to a video encoding method, apparatus, electronic device, and storage medium. The method includes: when the reference frames corresponding to a current coding block of a video include multiple frames, determining, for each reference frame, a first reference loss produced when the current coding block references that reference frame alone; assigning each reference frame a corresponding propagation weight according to its first reference loss, where the propagation weight is negatively correlated with the first reference loss; and encoding the current coding block based on the propagation weights. According to the technical scheme of the present disclosure, the propagation weights of the CU-tree are dynamically adjusted based on sequence characteristics, which improves video compression performance and yields a good effect in every bitrate segment.

Description

Video encoding method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a video encoding method, an apparatus, an electronic device, a storage medium, and a computer program product.
Background
A CU-tree (Coding Unit tree) is an adaptive quantization technique that adjusts the QP (Quantization Parameter) value of the current block by predicting how much information future frames will reference from it during inter prediction (measured by a propagation cost). The more information the current block contributes to the frames that reference it, the more its coding quality should be improved and its QP value reduced; otherwise, the QP value of the current block is increased.
In the related art, when the current frame to which the current block belongs has multiple reference frames, a propagation weight is typically assigned to each reference frame according to the distance between the current frame and that reference frame, relying on the reference structure between them. The total propagation cost is then distributed among the reference frames according to these propagation weights, yielding the propagation cost the current block passes to each reference frame. However, the reference structure is usually fixed, so different coding blocks within the same frame easily end up with identical, fixed propagation weights, and block-level features cannot be propagated accurately.
Disclosure of Invention
The present disclosure provides a video encoding method, apparatus, electronic device, storage medium, and computer program product, so as to at least solve the problem in the related art that the CU-tree, as a block-level propagation tool, cannot accurately propagate block-level features. The technical scheme of the present disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a video encoding method, including:
when the reference frames corresponding to a current coding block of a video include multiple frames, determining, for each reference frame, a first reference loss produced when the current coding block references that reference frame alone;
assigning a corresponding propagation weight to the reference block in each reference frame according to the first reference loss corresponding to that reference frame, wherein the propagation weight is negatively correlated with the first reference loss;
and encoding the current coding block based on the propagation weights.
In one embodiment, the allocating, according to the first reference loss corresponding to each of the reference frames, a corresponding propagation weight to a reference block in each of the reference frames includes:
determining a second reference loss produced when the current coding block references the multiple reference frames simultaneously;
and distributing corresponding propagation weights to the reference blocks in the reference frames according to the first reference loss and the second reference loss corresponding to the reference frames.
In one embodiment, the allocating, according to the first reference loss and the second reference loss corresponding to each of the reference frames, a corresponding propagation weight to a reference block in each of the reference frames includes:
determining a loss difference value between a first reference loss and a second reference loss corresponding to each reference frame;
and distributing corresponding propagation weights to the reference blocks in the reference frames according to the loss difference values corresponding to the reference frames.
In one embodiment, the allocating, according to the loss difference value corresponding to each of the reference frames, a corresponding propagation weight to a reference block in each of the reference frames includes:
for any one reference frame, obtaining the sum of the loss difference values of the other reference frames, wherein the other reference frames are the reference frames other than that reference frame among the plurality of reference frames;
obtaining the sum of loss difference values of a plurality of reference frames;
and generating a propagation weight corresponding to the reference block in any reference frame according to the sum of the loss difference values of other reference frames and the sum of the loss difference values of a plurality of reference frames.
In one embodiment, the determining the loss difference value between the first reference loss and the second reference loss corresponding to each of the reference frames includes:
and acquiring the difference between the first reference loss and the second reference loss corresponding to each reference frame as the loss difference value corresponding to each reference frame.
In one embodiment, the determining a first reference loss corresponding to each of the reference frames, which is generated by the current coding block referring to each of the reference frames respectively, includes:
dividing a current frame to which the current coding block belongs and each reference frame to obtain a plurality of blocks of the current frame and each reference frame;
for any block among the plurality of blocks of the current frame, performing a motion search among the plurality of blocks of each reference frame, and determining the reference loss between that block and each block in each reference frame;
and determining the first reference loss according to the reference loss between each block in the current frame and each block in each reference frame.
In one embodiment, in the case that the current frame to which the current coding block belongs is a B frame, the reference frames of the plurality of frames include a forward reference frame and a backward reference frame of the current frame.
According to a second aspect of the embodiments of the present disclosure, there is provided an encoding apparatus of a video, including:
the loss determining module is configured to, when the reference frames corresponding to a current coding block of a video include multiple frames, determine for each reference frame a first reference loss produced when the current coding block references that reference frame alone;
the weight determining module is configured to execute distribution of corresponding propagation weights for the reference blocks in each reference frame according to the first reference loss corresponding to each reference frame, wherein the propagation weights and the first reference loss are in a negative correlation relationship;
an encoding module configured to perform encoding of the current encoding block based on the propagation weights.
In one embodiment, the weight determining module includes:
a loss determining unit configured to determine a second reference loss produced when the current coding block references the multiple reference frames simultaneously;
and the weight distribution unit is configured to distribute corresponding propagation weights to the reference blocks in the reference frames according to the first reference loss and the second reference loss corresponding to the reference frames.
In one embodiment, the weight assignment unit includes:
a difference value generation subunit configured to perform determining a loss difference value between the first reference loss and the second reference loss corresponding to each of the reference frames;
and the weight distribution subunit is configured to distribute corresponding propagation weights to the reference blocks in the reference frames according to the loss difference values corresponding to the reference frames.
In one embodiment, the weight assignment subunit is configured to: for any one reference frame, obtain the sum of the loss difference values of the other reference frames, where the other reference frames are the reference frames other than that reference frame among the plurality of reference frames; obtain the sum of the loss difference values of all the reference frames; and generate the propagation weight corresponding to the reference block in that reference frame from the sum of the loss difference values of the other reference frames and the sum of the loss difference values of all the reference frames.
In one embodiment, the difference value generating subunit is configured to obtain a difference between the first reference loss and the second reference loss corresponding to each of the reference frames as the difference value of the loss corresponding to each of the reference frames.
In one embodiment, the loss determining module is configured to perform partitioning on a current frame to which the current coding block belongs and each of the reference frames to obtain a plurality of blocks corresponding to the current frame and each of the reference frames;
for any block among the plurality of blocks of the current frame, perform a motion search among the plurality of blocks of each reference frame, and determine the reference loss between that block and each block in each reference frame;
and determining the first reference loss according to the reference loss between each block in the current frame and each block in each reference frame.
In one embodiment, in the case that the current frame to which the current coding block belongs is a B frame, the multiple reference frames include a forward reference frame and a backward reference frame of the current frame.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video encoding method of any one of the above first aspects.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method of any one of the first aspect.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer program product, including instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method of any one of the first aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
when the reference frames corresponding to a current coding block of a video include multiple frames, a first reference loss produced when the current coding block references each reference frame alone is determined for each reference frame, and the propagation weights of the CU-tree are dynamically adjusted based on sequence characteristics according to these first reference losses. Compared with the related art, in which all coding blocks of a frame use the same propagation weight based on inter-frame distance, this improves video compression performance and yields a good effect in every bitrate segment (for example, a BD-rate gain, an objective index for measuring coding performance).
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is a flow chart illustrating a method of encoding video according to an exemplary embodiment.
Fig. 2 is a flow chart illustrating another method of encoding video according to an example embodiment.
Fig. 3 is a block diagram illustrating an apparatus for encoding video according to an example embodiment.
FIG. 4 is a block diagram illustrating an electronic device in accordance with an example embodiment.
FIG. 5 is a block diagram illustrating another electronic device in accordance with an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as detailed in the appended claims.
It should also be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are both information and data that are authorized by the user or sufficiently authorized by various parties.
The MB-Tree (MacroBlock-Tree) algorithm is a technique introduced in x264 (an open-source H.264/AVC encoder). Its main idea is to consider, when encoding the current image frame, the influence of each coding decision on the coding quality of future image frames, which can effectively improve the subjective and objective quality of the video. The algorithm is also inherited in x265, where it is named CU-tree.
The CU-tree is a block-level QP (Quantization Parameter) control method. For each CU (coding block), a certain number of image frames are looked ahead, and the coding block is quantized with a corresponding QP delta depending on how it is referenced in those frames. The more information the current coding block contributes to subsequent image frames, the more important it is: its coding quality should be improved and its quantization parameter value reduced; otherwise, its quantization parameter value is increased.
The core of the CU-tree algorithm includes the calculation of the propagation cost and the calculation of the QP increment. The calculation flow is as follows:
(1) Obtain the intra_cost and inter_cost of the current coding block, and calculate the propagation fraction (propagate_fraction) from the current coding block to its reference coding blocks:

propagate_fraction = 1 - inter_cost / intra_cost

where inter_cost is the SATD (Sum of Absolute Transformed Differences) loss cost between the predicted and original values estimated in inter mode, and intra_cost is the SATD loss cost between the predicted and original values estimated in intra mode.

(2) Calculate the total propagation amount (propagate_cost_out) from the current coding block to its reference coding blocks (i.e., reference blocks):

propagate_cost_out = propagate_fraction * (intra_cost + propagate_cost_in)

where propagate_cost_in is the total propagation amount passed to the current coding block by the blocks that reference it.
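The two formulas translate directly into code. Below is a minimal Python sketch, with hypothetical numeric values purely for illustration (function names mirror the quantities above; this is not x264/x265 source):

    def propagate_fraction(intra_cost: float, inter_cost: float) -> float:
        # Share of the block's information that is propagated onward: the
        # cheaper inter prediction is relative to intra, the larger the share.
        return 1.0 - inter_cost / intra_cost

    def propagate_cost_out(intra_cost: float, inter_cost: float,
                           propagate_cost_in: float) -> float:
        # Total amount the current block propagates to its reference blocks:
        # its own intra cost plus the amount propagated onto it, scaled by
        # the propagation fraction.
        return propagate_fraction(intra_cost, inter_cost) * (intra_cost + propagate_cost_in)

    # Hypothetical values purely for illustration.
    print(propagate_cost_out(intra_cost=100.0, inter_cost=40.0, propagate_cost_in=30.0))
    # -> 0.6 * 130.0 = 78.0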
(3) When the reference blocks belong to multiple reference frames, and since the reference blocks do not lie on a single coding unit, the total propagation amount is allocated to each reference block according to the reference region area ratio (overlap_area/CU_area) and the distance between the current frame to which the current coding block belongs and each reference frame, yielding the propagation cost of each reference block.
Take the case where the current image frame is a B frame as an example. A B frame is a bidirectionally predicted frame; using a preceding I frame (intra-coded frame) or P frame (forward predictive coded frame) and a following P frame as reference frames, the propagation costs of the reference blocks in the two reference frames (forward reference frame and backward reference frame) are obtained as follows:
propagate_cost_ref1 = (d2 / (d1 + d2)) * propagate_cost_out * (overlap_area / CU_area)

propagate_cost_ref2 = (d1 / (d1 + d2)) * propagate_cost_out * (overlap_area / CU_area)

where propagate_cost_ref1 is the propagation cost of the reference block in the forward reference frame; propagate_cost_ref2 is the propagation cost of the reference block in the backward reference frame; d1 is the inter-frame distance from the current frame to the forward reference frame; d2 is the inter-frame distance from the current frame to the backward reference frame; d2/(d1 + d2) is the propagation weight of the reference block in the forward reference frame; d1/(d1 + d2) is the propagation weight of the reference block in the backward reference frame; and overlap_area/CU_area is the reference region area ratio.
(4) Determine the QP increments.

Continuing with the B-frame example, after the propagation costs of the reference blocks in the forward and backward reference frames are obtained, the QP increment of each reference block is calculated from its propagation cost:

DeltaQP_1 = -strength * log2(1 + propagate_cost_ref1 / intra_cost)

DeltaQP_2 = -strength * log2(1 + propagate_cost_ref2 / intra_cost)

where DeltaQP_1 is the QP increment of the reference block in the forward reference frame; DeltaQP_2 is the QP increment of the reference block in the backward reference frame; and strength is an adjustable strength value.
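For orientation, the related-art flow of steps (3) and (4) can be sketched in Python as follows, assuming the costs, inter-frame distances, and overlap ratio are already available as plain numbers (all values hypothetical):

    import math

    def distance_weights(d1: float, d2: float) -> tuple:
        # Related-art weights depend only on inter-frame distances, so every
        # block in the frame receives the same pair of weights.
        return d2 / (d1 + d2), d1 / (d1 + d2)

    def delta_qp(propagate_cost_ref: float, intra_cost: float, strength: float) -> float:
        # The more propagation cost a reference block receives, the more
        # negative its QP delta, i.e. the better it is coded.
        return -strength * math.log2(1.0 + propagate_cost_ref / intra_cost)

    # Binary B-frame structure: d1 == d2, so both weights collapse to 1/2.
    w1, w2 = distance_weights(d1=1.0, d2=1.0)
    cost_out, overlap_ratio = 120.0, 0.8       # propagate_cost_out, overlap_area/CU_area
    cost_ref1 = w1 * cost_out * overlap_ratio  # forward reference block
    cost_ref2 = w2 * cost_out * overlap_ratio  # backward reference block
    dqp1 = delta_qp(cost_ref1, intra_cost=90.0, strength=2.0)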
As the formulas in step (3) show, encoders such as x264 and x265 set the CU-tree propagation weights based on the distance between the current frame and its reference frames. The propagation weights thus depend on a fixed reference structure, which easily makes the propagation weights of different coding blocks in the current frame identical. Taking the current frame as a B frame as an example, the bidirectional prediction structure is usually fixed, mostly a binary frame structure, so the propagation weight is mostly the fixed value 1/2. Yet the CU-tree is a block-level propagation tool; if all blocks use frame-level fixed parameters, block-level features cannot be propagated accurately.
In order to solve the above problem, the present disclosure provides a video encoding method, which may be applied to an electronic device, where the electronic device may be a terminal, a server, or a system formed by a terminal and a server, and the method is implemented by the terminal and the server interactively. In one embodiment, the electronic device is installed with a client capable of publishing videos. The client can be a live broadcast client, an e-commerce client, a social contact client, an instant messaging client and the like. After the terminal acquires the video shot by the user, the video can be compressed and encoded by the video encoding method provided by the disclosure, the compressed video is sent to the server, and the server sends the compressed video to other clients. The other client is any client playing the video. After receiving the compressed video, the other clients can decode the video in a decoding mode corresponding to the compression coding to play the video.
The terminal can be but not limited to various personal computers, notebook computers, smart phones, tablet computers, internet of things equipment and portable wearable equipment, and the internet of things equipment can be smart sound boxes, smart televisions, smart air conditioners, smart vehicle-mounted equipment and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server may be implemented as a stand-alone server or as a server cluster consisting of a plurality of servers.
Fig. 1 is a flowchart illustrating a video encoding method according to an exemplary embodiment, where the video encoding method is used in an electronic device as illustrated in fig. 1 and includes steps S110 to S130.
In step S110, when the reference frames corresponding to a current coding block of a video include multiple frames, the first reference loss produced when the current coding block references each reference frame alone is determined for each reference frame.
The first reference loss, produced when the current coding block references each reference frame individually, may be characterized by loss indexes such as SATD, SSE (Sum of Squared Errors), MSE (Mean Squared Error), SAD (Sum of Absolute Differences), and the like. It can be understood that the first reference loss is a loss in inter mode.
Specifically, when the current coding block corresponds to a plurality of reference blocks and the reference frames to which those reference blocks belong span multiple frames, the electronic device, after generating the total propagation amount between the current coding block and the plurality of reference blocks, calculates for each reference frame the first reference loss produced when the current coding block references the reference block in that reference frame alone. The specific calculation of the total propagation amount is described above and is not repeated here.
One way of obtaining the first reference loss is explained below:
Downsample the current frame and each reference frame, and divide the current frame and each reference frame into blocks of a preset size, for example 8x8. For a given reference frame, perform a motion search over the divided reference frame using each current block of the current frame, determine the reference loss between the current block and each block of the reference frame, and select the block with the minimum loss as the current block's best matching block. After all blocks of the current frame have been searched, the sum of the reference losses between all blocks and their respective best matching blocks is taken as the first reference loss.
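A rough sketch of this search, assuming the frames are equal-sized grayscale NumPy arrays that have already been downsampled, and using a plain SAD in place of SATD for brevity (a real encoder would Hadamard-transform the residual before summing):

    import numpy as np

    def block_cost(a: np.ndarray, b: np.ndarray) -> float:
        # SAD stand-in for SATD: sum of absolute differences of the residual.
        return float(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

    def first_reference_loss(cur: np.ndarray, ref: np.ndarray, bs: int = 8) -> float:
        # For every bs x bs block of the current frame, search the reference
        # frame's block grid for the best match and accumulate the minima.
        h, w = cur.shape
        total = 0.0
        for y in range(0, h - bs + 1, bs):
            for x in range(0, w - bs + 1, bs):
                blk = cur[y:y + bs, x:x + bs]
                total += min(
                    block_cost(blk, ref[ry:ry + bs, rx:rx + bs])
                    for ry in range(0, h - bs + 1, bs)
                    for rx in range(0, w - bs + 1, bs)
                )
        return total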
In step S120, corresponding propagation weights are assigned to the reference blocks in each reference frame according to the first reference loss corresponding to each reference frame.
The propagation weight is negatively correlated with the first reference loss; that is, the greater the first reference loss corresponding to a reference frame, the less important that reference frame is as a reference, and the smaller the propagation weight that should be set for the reference block in that reference frame, so as to reduce the QP quantization loss of the reference block.
In one possible embodiment, a mapping relationship between the propagation weight and the first reference loss may be preset, such that the propagation weight obtained from the mapping is negatively correlated with the first reference loss. After the first reference loss corresponding to each reference frame is obtained, the corresponding propagation weight can be obtained from the mapping relationship.
In step S130, the current coding block is encoded based on the propagation weights.
Specifically, after obtaining the propagation weight corresponding to each reference frame, the electronic device may allocate the total propagation amount to the reference blocks in the reference frame according to the propagation weight of each reference frame, so as to obtain the propagation cost corresponding to each reference block. And then calculating the QP increment of each reference block according to the propagation cost of each reference block. The current coding block is encoded based on the QP delta for each reference block.
In the above video encoding method, when the reference frames corresponding to the current coding block of a video include multiple frames, the first reference loss produced when the current coding block references each reference frame alone is determined for each reference frame, and the propagation weights of the CU-tree are dynamically adjusted based on sequence characteristics according to these first reference losses. Compared with the related art, in which all coding blocks of a frame use the same distance-based propagation weight, this improves video compression performance and yields better gains in every bitrate segment.
In an exemplary embodiment, step S120, allocating a corresponding propagation weight to a reference block in each reference frame according to a first reference loss corresponding to each reference frame, includes: determining a second reference loss generated by the current coding block simultaneously referencing the multi-frame reference frame; and distributing corresponding propagation weights for the reference blocks in each reference frame according to the first reference loss and the second reference loss corresponding to each reference frame.
The second reference loss, produced when the current coding block references multiple reference frames at the same time, may likewise be characterized by indexes such as SATD, SSE, MSE, and SAD.
Specifically, if the current coding block corresponds to only one reference block, it may be marked as having no multiple reference blocks, and the step of calculating the second reference loss is not performed. If the current coding block corresponds to a plurality of reference blocks that belong to multiple reference frames, with different reference blocks belonging to different reference frames, then after generating the total propagation amount between the current coding block and the plurality of reference blocks, the electronic device calculates the second reference loss produced when the current coding block references the multiple reference frames simultaneously. It can therefore be understood that the second reference loss is less than or equal to the first reference loss corresponding to any one reference frame. The electronic device may calculate, based on the first reference loss and the second reference loss corresponding to each reference frame, a propagation weight negatively correlated with that reference frame's first reference loss.
In one possible embodiment, for any reference frame, the electronic device may determine the propagation weight of the reference block in each reference frame based on the ratio of the first reference loss to the second reference loss of the other reference frames. Wherein, the other reference frames are all the other reference frames except the any one reference frame in the plurality of reference frames.
In this embodiment, the propagation weights of the reference frames are calculated based on the second reference loss generated by the current coding block simultaneously referencing multiple reference frames and the first reference loss of each reference frame, so that the propagation weights of the reference blocks in each reference frame can be dynamically adjusted by reasonably utilizing the loss caused by actual referencing of each reference frame, thereby being beneficial to obtaining the propagation weights with higher accuracy.
In an exemplary embodiment, after obtaining the first reference loss and the second reference loss corresponding to each reference frame, the electronic device may calculate a loss difference value between the first reference loss and the second reference loss corresponding to each reference frame, and further calculate a propagation weight of the reference block in each reference frame according to the loss difference value corresponding to each reference frame. Wherein, the loss difference value can be represented by difference value, ratio value and the like.
In one possible embodiment, for any reference frame, the electronic device may determine the propagation weight of the reference block in each reference frame based on a ratio of the total amount of loss difference values of other reference frames to the total amount of loss difference values of all reference frames. The total loss difference value can be calculated by adopting addition, multiplication and other operation modes.
Take the case where the loss difference value is a difference and the total loss difference value is an additive sum. If there are n reference frames (n being a positive integer greater than 1), then for any one reference frame, the electronic device obtains the sum of the loss difference values of the other reference frames and the sum of the loss difference values of all the reference frames, and generates the propagation weight of the reference block in that reference frame from these two sums. The propagation weight w_i of the reference block in the i-th reference frame can be expressed as:

w_i = (sum over j ≠ i of (ref_cost_j - ref_cost)) / (sum over all j of (ref_cost_j - ref_cost))

where ref_cost_i is the first reference loss corresponding to the i-th reference frame and ref_cost is the second reference loss.
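A minimal Python sketch of this weight computation (for n = 2 it reduces to the two-frame weights given in step S208 below; it assumes the loss difference values have a non-zero sum):

    def propagation_weights(first_losses: list, joint_loss: float) -> list:
        # diffs[i] = ref_cost_i - ref_cost; the weight of frame i is the sum of
        # the other frames' differences over the sum of all differences, so a
        # frame whose individual loss is close to the joint loss (it supplied
        # most of the joint prediction) receives the larger weight.
        diffs = [c - joint_loss for c in first_losses]
        total = sum(diffs)
        return [(total - d) / total for d in diffs]

    # Two reference frames: matches w_1 = (ref_cost_2 - ref_cost) / (...).
    print(propagation_weights([100.0, 140.0], 80.0))  # -> [0.75, 0.25]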
In this embodiment, the propagation weights of the reference blocks in the reference frames are calculated based on the second reference loss generated by the current coding block referring to multiple reference frames at the same time and the first reference loss of each reference frame, so that the propagation weights of the reference frames can be dynamically adjusted by reasonably utilizing the loss caused by actual reference of each reference frame, thereby being beneficial to obtaining the propagation weights with higher accuracy.
In an exemplary embodiment, as shown in fig. 2, a video encoding method is provided, explained here taking the current frame to which the current coding block belongs as a B frame. When the current frame is a B frame, the multiple reference frames include a forward reference frame (assumed to be a P0 frame) and a backward reference frame (assumed to be a P1 frame, where P1 is later than P0). The method can be realized by the following steps:
in step S202, intra _ cost and inter _ cost of the current coding block are obtained, and a propagation factor from the current coding block to the reference coding block is calculated according to the intra _ cost and the inter _ cost.
In step S204, the total amount of propagation from the current coding block to its forward and backward reference frames is calculated.
In step S206, the following are determined: the first reference loss corresponding to the forward reference frame, ref_cost_1 = D(p0, b, b), produced when the current coding block references the forward reference frame; the first reference loss corresponding to the backward reference frame, ref_cost_2 = D(b, b, p1), produced when the current coding block references the backward reference frame; and the second reference loss, ref_cost = D(p0, b, p1), produced when the current coding block references the forward and backward reference frames simultaneously.
The following describes a method of determining the first reference loss of the P0 frame:
(1) Downsample the P0 frame and the B frame by 1/2, and divide the downsampled P0 frame and B frame into 8 × 8 blocks respectively.
(2) Using the value of the current block I of the B frame as the search value, perform a motion search in the P0 frame and find the block with the minimum SATD loss as the best matching block I'.
(3) After every block of the B frame has been searched, add up the SATD losses of all blocks of the current frame to obtain D(p0, b, b).
The first reference loss of the P1 frame may be determined by referring to the above-mentioned procedure, which is not specifically described herein.
The following describes a manner of determining the second reference loss:
(1) Downsample the P0 frame, the B frame, and the P1 frame by 1/2, and divide each downsampled frame into 8 × 8 blocks.
(2) Using the value of the current block I of the B frame as the search value, perform a motion search in the P0 frame and find the block with the minimum SATD as the best matching block I'. The SATD loss of the current block at this point is denoted SATD1.
(3) Using the value of 2 × I - I' as the search value, perform a motion search in the P1 frame, find the block with the minimum SATD as the best matching block, and denote its loss SATD2. If SATD1 > SATD2, take SATD2 as the SATD loss of the current block; otherwise keep SATD1. Here 2 × I means multiplying each pixel in block I by 2, and 2 × I - I' means subtracting the pixels of block I' from those of 2 × I point by point.
(4) After every block of the B frame has been searched, add up the SATD losses of all blocks of the current frame to obtain D(p0, b, p1).
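Under the same assumptions as the earlier search sketch (grayscale NumPy arrays, SAD standing in for SATD, exhaustive block-grid search), the second reference loss can be sketched as:

    import numpy as np

    def block_cost(a: np.ndarray, b: np.ndarray) -> float:
        # SAD stand-in for SATD, as in the earlier sketch.
        return float(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

    def best_match(blk: np.ndarray, ref: np.ndarray, bs: int = 8):
        # Exhaustive grid search; returns (minimum cost, matching block).
        h, w = ref.shape
        cands = (ref[y:y + bs, x:x + bs]
                 for y in range(0, h - bs + 1, bs)
                 for x in range(0, w - bs + 1, bs))
        return min(((block_cost(blk, c), c) for c in cands), key=lambda t: t[0])

    def second_reference_loss(p0: np.ndarray, b: np.ndarray, p1: np.ndarray,
                              bs: int = 8) -> float:
        # D(p0, b, p1): per block, forward-predict from P0, then search P1
        # with the template 2*I - I' and keep the smaller of the two losses.
        h, w = b.shape
        total = 0.0
        for y in range(0, h - bs + 1, bs):
            for x in range(0, w - bs + 1, bs):
                blk = b[y:y + bs, x:x + bs].astype(np.int32)
                satd1, match = best_match(blk, p0, bs)
                template = 2 * blk - match.astype(np.int32)
                satd2, _ = best_match(template, p1, bs)
                total += min(satd1, satd2)
        return total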
In step S208, from the first reference loss ref_cost_1 corresponding to the forward reference frame, the first reference loss ref_cost_2 corresponding to the backward reference frame, and the second reference loss ref_cost, the propagation weights of the reference blocks in the forward and backward reference frames are calculated as:

w_1 = (ref_cost_2 - ref_cost) / (ref_cost_1 + ref_cost_2 - 2 * ref_cost)

w_2 = (ref_cost_1 - ref_cost) / (ref_cost_1 + ref_cost_2 - 2 * ref_cost)
Suppose ref_cost_1 - ref_cost < ref_cost_2 - ref_cost. This indicates that the reference block in the forward reference frame is the more important reference, so it should be given the larger propagation weight to reduce the QP quantization loss of the forward reference frame, allowing the current coding block to exploit its similarity to the greatest extent; conversely, if the reference block in the forward reference frame is of little importance as a reference, its propagation weight needs to be reduced.
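For concreteness, take hypothetical values ref_cost_1 = 100, ref_cost_2 = 140, and ref_cost = 80. Then w_1 = (140 - 80) / (100 + 140 - 2 × 80) = 60 / 80 = 0.75 and w_2 = (100 - 80) / 80 = 0.25; since ref_cost_1 - ref_cost = 20 is less than ref_cost_2 - ref_cost = 60, the forward reference block is indeed the more important reference and receives the larger weight.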
In step S210, propagation costs corresponding to the reference blocks in the forward reference frame and the backward reference frame are calculated according to the propagation weights of the reference blocks in the forward reference frame and the backward reference frame.
propagate_cost_ref1 = w_1 * propagate_cost_out * (overlap_area / CU_area)

propagate_cost_ref2 = w_2 * propagate_cost_out * (overlap_area / CU_area)

where propagate_cost_ref1 is the propagation cost corresponding to the forward reference frame and propagate_cost_ref2 is the propagation cost corresponding to the backward reference frame.
In step S212, QP increments corresponding to the reference blocks in the forward reference frame and the backward reference frame are determined.
DeltaQP_1 = -strength * log2(1 + propagate_cost_ref1 / intra_cost)

DeltaQP_2 = -strength * log2(1 + propagate_cost_ref2 / intra_cost)

where DeltaQP_1 is the QP increment corresponding to the reference block in the forward reference frame; DeltaQP_2 is the QP increment corresponding to the reference block in the backward reference frame; and strength is an adjustable strength value.
In step S214, the current coding block is coded based on the respective QP increments corresponding to the reference blocks in the forward reference frame and the backward reference frame.
It should be understood that, although the steps in the above flowcharts are displayed in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be executed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages, which are not necessarily executed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential: they may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
It is understood that the same/similar parts between the embodiments of the method described above in this specification can be referred to each other, and each embodiment focuses on the differences from the other embodiments, and it is sufficient that the relevant points are referred to the descriptions of the other method embodiments.
Fig. 3 is a block diagram illustrating an apparatus 300 for encoding video according to an example embodiment. Referring to fig. 3, the apparatus includes a loss determination module 302, a weight determination module 304, and an encoding module 306.
A loss determining module 302 configured to, when the reference frames corresponding to a current coding block of a video include multiple frames, determine for each reference frame the first reference loss produced when the current coding block references that reference frame alone; a weight determining module 304 configured to assign corresponding propagation weights to the reference blocks in each reference frame according to each reference frame's first reference loss, the propagation weight being negatively correlated with the first reference loss; and an encoding module 306 configured to encode the current coding block based on the propagation weights.
In an exemplary embodiment, the weight determination module 304 includes: a loss determining unit configured to determine a second reference loss produced when the current coding block references the multiple reference frames simultaneously; and a weight distribution unit configured to assign corresponding propagation weights to the reference blocks in each reference frame according to the first reference loss and the second reference loss corresponding to each reference frame.
In an exemplary embodiment, the weight assignment unit includes: a difference value generation subunit configured to perform determining a loss difference value between a first reference loss and a second reference loss corresponding to each reference frame; and the weight distribution subunit is configured to distribute corresponding propagation weights to the reference blocks in the reference frames according to the loss difference values corresponding to the reference frames.
In an exemplary embodiment, the weight assignment subunit is configured to: for any one reference frame, obtain the sum of the loss difference values of the other reference frames, where the other reference frames are the reference frames other than that reference frame among the plurality of reference frames; obtain the sum of the loss difference values of all the reference frames; and generate the propagation weight corresponding to the reference block in that reference frame from the two sums.
In an exemplary embodiment, the difference value generating subunit is configured to perform obtaining a difference between the first reference loss and the second reference loss corresponding to each reference frame as the loss difference value corresponding to each reference frame.
In an exemplary embodiment, the loss determining module 302 is configured to divide the current frame to which the current coding block belongs and each reference frame to obtain a plurality of blocks for the current frame and each reference frame; for any block among the plurality of blocks of the current frame, perform a motion search among the plurality of blocks of each reference frame and determine the reference loss between that block and each block in each reference frame; and determine the first reference loss from the reference losses between each block in the current frame and each block in each reference frame.
In an exemplary embodiment, in the case where the current frame to which the current coding block belongs is a B frame, the multi-frame reference frames include a forward reference frame and a backward reference frame of the current frame.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
Fig. 4 is a block diagram illustrating an electronic device Z00 for encoding video in accordance with an example embodiment. For example, the electronic device Z00 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to fig. 4, electronic device Z00 may include one or more of the following components: a processing component Z02, a memory Z04, a power component Z06, a multimedia component Z08, an audio component Z10, an interface for input/output (I/O) Z12, a sensor component Z14 and a communication component Z16.
The processing component Z02 generally controls the overall operation of the electronic device Z00, such as operations associated with display, telephone calls, data communication, camera operations and recording operations. The processing component Z02 may comprise one or more processors Z20 to execute instructions to perform all or part of the steps of the method described above. Furthermore, the processing component Z02 may include one or more modules that facilitate interaction between the processing component Z02 and other components. For example, the processing component Z02 may comprise a multimedia module to facilitate interaction between the multimedia component Z08 and the processing component Z02.
The memory Z04 is configured to store various types of data to support operations at the electronic device Z00. Examples of such data include instructions for any application or method operating on electronic device Z00, contact data, phonebook data, messages, pictures, videos, and the like. The memory Z04 may be implemented by any type or combination of volatile or non-volatile storage devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, optical disk, or graphene memory.
The power supply component Z06 provides power to the various components of the electronic device Z00. Power component Z06 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device Z00.
The multimedia component Z08 comprises a screen providing an output interface between said electronic device Z00 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component Z08 includes a front facing camera and/or a rear facing camera. When the electronic device Z00 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component Z10 is configured to output and/or input an audio signal. For example, the audio component Z10 includes a Microphone (MIC) configured to receive an external audio signal when the electronic device Z00 is in an operating mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in a memory Z04 or transmitted via a communication component Z16. In some embodiments, the audio component Z10 further comprises a speaker for outputting audio signals.
The I/O interface Z12 provides an interface between the processing component Z02 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly Z14 includes one or more sensors for providing various aspects of state evaluation for the electronic device Z00. For example, the sensor assembly Z14 can detect the open/closed status of the electronic device Z00, the relative positioning of the components, such as the display and keypad of the electronic device Z00, the sensor assembly Z14 can also detect a change in the position of the electronic device Z00 or components of the electronic device Z00, the presence or absence of user contact with the electronic device Z00, the orientation or acceleration/deceleration of the device Z00, and a change in the temperature of the electronic device Z00. Sensor assembly Z14 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly Z14 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly Z14 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component Z16 is configured to facilitate wired or wireless communication between the electronic device Z00 and other devices. The electronic device Z00 may access a wireless network based on a communication standard, such as WiFi, an operator network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component Z16 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component Z16 further comprises a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, ultra Wideband (UWB) technology, bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device Z00 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components for performing the above-described methods.
In an exemplary embodiment, a computer-readable storage medium comprising instructions, such as the memory Z04 comprising instructions, executable by the processor Z20 of the electronic device Z00 to perform the above method is also provided. For example, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which comprises instructions executable by the processor Z20 of the electronic device Z00 to perform the above-mentioned method.
Fig. 5 is a block diagram illustrating another electronic device S00 for encoding video in accordance with an example embodiment. For example, the electronic device S00 may be a server. Referring to FIG. 5, the electronic device S00 includes a processing component S20, which further includes one or more processors, and memory resources, represented by memory S22, for storing instructions, such as applications, executable by the processing component S20. The application stored in the memory S22 may include one or more modules each corresponding to a set of instructions. Furthermore, the processing component S20 is configured to execute instructions to perform the above-described method.
The electronic device S00 may further include: a power supply component S24 configured to perform power management of the electronic device S00, a wired or wireless network interface S26 configured to connect the electronic device S00 to a network, and an input/output (I/O) interface S28. The electronic device S00 may operate based on an operating system stored in the memory S22, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
In an exemplary embodiment, a computer-readable storage medium comprising instructions, such as the memory S22 comprising instructions, executable by a processor of the electronic device S00 to perform the above-described method is also provided. The storage medium may be a computer-readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which includes instructions executable by a processor of the electronic device S00 to perform the above method.
It should be noted that the descriptions of the apparatus, the electronic device, the computer-readable storage medium, the computer program product, and the like above may also include other embodiments; for specific implementations, reference may be made to the descriptions of the related method embodiments, which are not repeated in detail here.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for encoding video, comprising:
in a case where reference frames corresponding to a current coding block of a video include a plurality of frames, determining, for each of the reference frames, a first reference loss that is generated by the current coding block referencing the reference frame and corresponds to the reference frame;
assigning a corresponding propagation weight to a reference block in each of the reference frames according to the first reference loss corresponding to the reference frame, wherein the propagation weight is negatively correlated with the first reference loss; and
encoding the current coding block based on the propagation weights.
2. The method of claim 1, wherein the assigning a corresponding propagation weight to a reference block in each of the reference frames according to the first reference loss corresponding to the reference frame comprises:
determining a second reference loss generated by the current coding block referencing the plurality of reference frames simultaneously; and
assigning the corresponding propagation weight to the reference block in each of the reference frames according to the first reference loss corresponding to the reference frame and the second reference loss.
3. The method of claim 2, wherein the assigning the corresponding propagation weight to the reference block in each of the reference frames according to the first reference loss corresponding to the reference frame and the second reference loss comprises:
determining a loss difference value between the first reference loss corresponding to each of the reference frames and the second reference loss; and
assigning the corresponding propagation weight to the reference block in each of the reference frames according to the loss difference value corresponding to the reference frame.
4. The method of claim 3, wherein the assigning the corresponding propagation weight to the reference block in each of the reference frames according to the loss difference value corresponding to the reference frame comprises:
for any one of the reference frames, obtaining a sum of the loss difference values corresponding to other reference frames, wherein the other reference frames are the reference frames other than the any one reference frame among the plurality of reference frames;
obtaining a sum of the loss difference values of the plurality of reference frames; and
generating the propagation weight corresponding to the reference block in the any one reference frame according to the sum of the loss difference values of the other reference frames and the sum of the loss difference values of the plurality of reference frames.
5. The method of claim 3, wherein the determining a loss difference value between the first reference loss corresponding to each of the reference frames and the second reference loss comprises:
obtaining, as the loss difference value corresponding to each of the reference frames, a difference between the first reference loss corresponding to the reference frame and the second reference loss.
6. The method of any one of claims 1 to 5, wherein the determining, for each of the reference frames, a first reference loss generated by the current coding block referencing the reference frame comprises:
dividing a current frame to which the current coding block belongs and each of the reference frames to obtain a plurality of blocks of the current frame and a plurality of blocks of each of the reference frames;
for any block among the plurality of blocks of the current frame, performing a motion search among the plurality of blocks of each of the reference frames, and determining a reference loss between the block and each block of each of the reference frames; and
determining the first reference loss according to the reference losses between the blocks of the current frame and the blocks of each of the reference frames.
7. The method of any one of claims 1 to 5, wherein, in a case where the current frame to which the current coding block belongs is a B frame, the plurality of reference frames include a forward reference frame and a backward reference frame of the current frame.
8. An apparatus for encoding video, comprising:
a loss determination module configured to determine, in a case where reference frames corresponding to a current coding block of a video include a plurality of frames, for each of the reference frames, a first reference loss that is generated by the current coding block referencing the reference frame and corresponds to the reference frame;
a weight determination module configured to assign a corresponding propagation weight to a reference block in each of the reference frames according to the first reference loss corresponding to the reference frame, wherein the propagation weight is negatively correlated with the first reference loss; and
an encoding module configured to encode the current coding block based on the propagation weights.
9. An electronic device, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the method for encoding video of any one of claims 1 to 7.
10. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method for encoding video of any one of claims 1 to 7.
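
To make the weighting in claims 3 to 5 concrete: writing $L_i$ for the first reference loss of reference frame $i$ among $N$ reference frames, $L_m$ for the second reference loss of claim 2, and $d_i = L_i - L_m$ for the loss difference value of claim 5, one natural reading of claim 4 is

$$w_i = \frac{\sum_{j \neq i} d_j}{\sum_{j=1}^{N} d_j},$$

which, for the two reference frames of claim 7, reduces to $w_0 = d_1/(d_0 + d_1)$ and $w_1 = d_0/(d_0 + d_1)$: the frame with the smaller individual loss receives the larger weight, matching the negative correlation required by claim 1. The exact normalization is an assumption; the claims fix only which sums enter the weight.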
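The sketch below, in Python, illustrates one way the method of claims 1 to 6 could be realized. It is illustrative only, not the patent's implementation: the sum of absolute differences (SAD) is assumed as the loss metric, the joint prediction of claim 2 is approximated by averaging each frame's best candidate block, and every name (split_blocks, propagation_weights, BLOCK, and so on) is hypothetical.

```python
import numpy as np

BLOCK = 16  # assumed block size; the claims do not fix one

def split_blocks(frame):
    """Claim 6: divide a frame into non-overlapping BLOCK x BLOCK blocks."""
    h, w = frame.shape
    return [frame[y:y + BLOCK, x:x + BLOCK]
            for y in range(0, h - BLOCK + 1, BLOCK)
            for x in range(0, w - BLOCK + 1, BLOCK)]

def sad(a, b):
    """Sum of absolute differences, assumed here as the reference loss."""
    return float(np.abs(a.astype(np.float64) - b.astype(np.float64)).sum())

def first_reference_loss(cur_frame, ref_frame):
    """Claim 6: for each block of the current frame, search the blocks of
    one reference frame for the best match; the frame-level first
    reference loss is the sum of the per-block minima."""
    ref_blocks = split_blocks(ref_frame)
    return sum(min(sad(b, rb) for rb in ref_blocks)
               for b in split_blocks(cur_frame))

def second_reference_loss(cur_frame, ref_frames):
    """Claim 2: loss when all reference frames are referenced together.
    The joint prediction is approximated by averaging the best candidate
    from each reference frame -- an assumption, not the patent's method."""
    ref_block_lists = [split_blocks(r) for r in ref_frames]
    total = 0.0
    for b in split_blocks(cur_frame):
        best = [min(rbs, key=lambda rb: sad(b, rb)) for rbs in ref_block_lists]
        joint = np.mean(np.stack(best).astype(np.float64), axis=0)
        total += sad(b, joint)
    return total

def propagation_weights(cur_frame, ref_frames):
    """Claims 3-5: weight of frame i = (sum of the other frames' loss
    differences) / (sum over all frames), so a smaller individual loss
    yields a larger weight (negative correlation, claim 1)."""
    l_multi = second_reference_loss(cur_frame, ref_frames)
    diffs = [max(0.0, first_reference_loss(cur_frame, r) - l_multi)
             for r in ref_frames]            # clamp guards the approximation
    total = sum(diffs)
    if total == 0.0:                         # degenerate case: equal weights
        return [1.0 / len(ref_frames)] * len(ref_frames)
    return [(total - d) / total for d in diffs]
```

Under this assumed setup, a forward reference frame that closely matches the current frame receives the larger propagation weight:

```python
rng = np.random.default_rng(0)
cur = rng.integers(0, 256, (32, 32)).astype(np.uint8)
fwd = np.clip(cur.astype(int) + rng.integers(-3, 4, cur.shape), 0, 255).astype(np.uint8)  # near copy
bwd = rng.integers(0, 256, (32, 32)).astype(np.uint8)                                     # unrelated
print(propagation_weights(cur, [fwd, bwd]))  # fwd's weight > bwd's weight
```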
CN202211078216.0A 2022-09-05 2022-09-05 Video encoding method and device, electronic equipment and storage medium Pending CN115514965A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211078216.0A CN115514965A (en) 2022-09-05 2022-09-05 Video encoding method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211078216.0A CN115514965A (en) 2022-09-05 2022-09-05 Video encoding method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115514965A (en) 2022-12-23

Family

ID=84502738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211078216.0A Pending CN115514965A (en) 2022-09-05 2022-09-05 Video encoding method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115514965A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination