CN109040753B - Prediction mode selection method, device and storage medium - Google Patents

Publication number: CN109040753B
Application number: CN201810941893.8A
Authority: CN (China)
Inventor: 黄书敏
Current assignee: Guangzhou Kugou Computer Technology Co Ltd
Legal status: Active
Classifications

    • H04N 19/103 — Selection of coding mode or of prediction mode (methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding)
    • H04N 19/176 — Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock

Abstract

The invention discloses a prediction mode selection method, device, and storage medium, belonging to the technical field of video coding. The method comprises the following steps: acquiring a video frame to be encoded in a target video; acquiring a conditional probability distribution of candidate prediction modes with respect to pixel information, the conditional probability distribution representing the probability that a pixel block to be encoded adopts any candidate prediction mode on the condition that the pixel block satisfies any item of pixel information; for each pixel block in the video frame, acquiring the probability that the pixel block adopts each candidate prediction mode according to the conditional probability distribution and the pixel information of the pixel block; and selecting the prediction mode of the pixel block from multiple candidate prediction modes according to the probability that the pixel block adopts each candidate prediction mode. With the provided selection scheme, the prediction mode of a pixel block can be determined directly, without calculating a rate-distortion value for every candidate prediction mode, thereby reducing the amount of calculation, saving encoding time, and improving encoding efficiency.

Description

Prediction mode selection method, device and storage medium
Technical Field
The present invention relates to the field of video coding technologies, and in particular, to a prediction mode selection method, a prediction mode selection device, and a storage medium.
Background
Video coding refers to converting video in one format into another format by a specific compression method. Intra-frame prediction is a commonly used video coding technique: exploiting the spatial correlation between adjacent pixels in a video frame, any pixel block in the frame can be encoded, according to a prediction mode determined for that pixel block, from the pixel values of adjacent pixel blocks that have already been encoded.
In the related art, a video frame to be encoded is divided into a plurality of pixel blocks. For the target pixel block currently to be encoded, the encoding process under each candidate prediction mode can be simulated: the target pixel block is encoded according to each candidate prediction mode, the difference between the original pixel values and the predicted pixel values of the target pixel block is determined from its encoded value, and a rate-distortion value, which represents the prediction accuracy of the target pixel block, is calculated from that difference. The prediction mode with the smallest rate-distortion value is then selected from the multiple candidate prediction modes as the prediction mode of the target pixel block. Finally, according to the determined prediction mode, the target pixel block is encoded from the pixel values of the adjacent, already-encoded pixel blocks to obtain its encoded value.
For each pixel block, a rate-distortion value must be calculated for every candidate prediction mode before one mode can be selected from among them. Because the number of candidate prediction modes is large, this causes a large amount of calculation, an overly long encoding time, and very low encoding efficiency.
Disclosure of Invention
The embodiment of the invention provides a prediction mode selection method, a prediction mode selection device and a storage medium, which can solve the problems in the related art. The technical scheme is as follows:
in a first aspect, a prediction mode selection method is provided, the method including:
acquiring a video frame to be coded in a target video;
acquiring conditional probability distribution of candidate prediction modes about pixel information, wherein the conditional probability distribution is determined according to the pixel information and the prediction modes of a plurality of pixel blocks which are coded in the target video, the conditional probability distribution is used for expressing the probability that the pixel block to be coded adopts any candidate prediction mode under the condition of meeting any pixel information, and the pixel information of the pixel block comprises at least one of the gradient direction of the pixel block, the prediction mode of a reference pixel block of the pixel block and the position identification of the pixel block;
for each pixel block in the video frame, acquiring the probability of each alternative prediction mode adopted by the pixel block according to the conditional probability distribution and the pixel information of the pixel block;
and selecting the prediction mode of the pixel block from a plurality of candidate prediction modes according to the probability of adopting each candidate prediction mode by the pixel block.
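The pixel information above includes the gradient direction of a pixel block. As a minimal sketch of how such a direction could be quantized into a discrete value usable as a conditioning feature — the patent does not fix a particular gradient operator, so the finite-difference computation and the bin count below are illustrative assumptions:

```python
import numpy as np

def block_gradient_direction(block, num_bins=8):
    # Dominant gradient direction of a pixel block, quantized into one of
    # num_bins discrete directions. The finite-difference gradient and the
    # bin count are illustrative assumptions, not the patent's definition.
    block = np.asarray(block, dtype=np.float64)
    gy, gx = np.gradient(block)             # vertical / horizontal gradients
    angle = np.arctan2(gy.sum(), gx.sum())  # dominant angle in [-pi, pi]
    bin_width = 2.0 * np.pi / num_bins
    return int((angle + np.pi) / bin_width) % num_bins
```

The returned bin index can then serve directly as the "gradient direction" value when looking up the conditional probability distribution.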
Optionally, the obtaining, according to the conditional probability distribution and the pixel information of the pixel block, the probability that the pixel block adopts each of the candidate prediction modes includes at least one of:
acquiring the probability of each candidate prediction mode adopted by the pixel block according to the first conditional probability distribution of the candidate prediction modes about the gradient direction and the gradient direction of the pixel block;
acquiring the probability of each candidate prediction mode adopted by the pixel block according to the second conditional probability distribution of the candidate prediction mode relative to the prediction mode of the reference pixel block and the prediction mode of the reference pixel block of the pixel block;
and acquiring the probability of the pixel block adopting each alternative prediction mode according to the third conditional probability distribution of the alternative prediction modes about the position identifications and the position identifications of the pixel block.
Optionally, the obtaining, according to the conditional probability distribution and the pixel information of the pixel block, the probability that the pixel block adopts each of the candidate prediction modes includes:
for each candidate prediction mode, when the pixel information includes multiple items of the gradient direction of the pixel block, the prediction mode of a reference pixel block of the pixel block, and the position identification of the pixel block, calculating the product of the probabilities that the pixel block adopts the candidate prediction mode under each item of pixel information as the probability that the pixel block adopts the candidate prediction mode.
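When several items of pixel information are available, the step above multiplies the per-item conditional probabilities. A minimal sketch, assuming each conditional distribution is stored as a nested mapping (the feature names and the data layout are hypothetical):

```python
def combined_mode_probability(mode, pixel_info, cond_dists):
    # P(mode | all items) approximated as the product of the per-item
    # conditional probabilities, as in the step above; this implicitly
    # treats the items of pixel information as independent.
    # cond_dists[feature][value][mode] = P(mode | feature == value)
    prob = 1.0
    for feature, value in pixel_info.items():
        prob *= cond_dists[feature][value].get(mode, 0.0)
    return prob
```

A mode never observed under some item of pixel information gets probability 0.0 here; a real encoder might smooth the counts instead.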
Optionally, the selecting a prediction mode of the pixel block from a plurality of candidate prediction modes according to the probability that the pixel block adopts each of the candidate prediction modes includes:
according to the probability that the pixel block adopts each candidate prediction mode, selecting the candidate prediction mode with the highest probability from the multiple candidate prediction modes as the prediction mode of the pixel block; alternatively,
selecting a preset number of candidate prediction modes from the multiple candidate prediction modes in descending order of the probability that the pixel block adopts each candidate prediction mode, encoding the pixel block according to each selected candidate prediction mode to obtain the corresponding rate-distortion value of the pixel block, and selecting the candidate prediction mode with the smallest rate-distortion value as the prediction mode of the pixel block.
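The two selection strategies above — take the most probable mode directly, or shortlist the most probable modes and break the tie with rate-distortion values — can be sketched as follows; `rd_cost` stands in for the simulated encoding and is an assumed callable:

```python
def select_by_probability(probabilities):
    # Strategy 1: the candidate mode with the highest probability.
    return max(probabilities, key=probabilities.get)

def select_by_shortlist(probabilities, rd_cost, preset_number=3):
    # Strategy 2: shortlist the preset_number most probable modes, compute a
    # rate-distortion value only for those, and keep the smallest.
    shortlist = sorted(probabilities, key=probabilities.get,
                       reverse=True)[:preset_number]
    return min(shortlist, key=rd_cost)
```

Strategy 2 still calls the rate-distortion computation, but only `preset_number` times instead of once per candidate mode, which is where the claimed saving comes from.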
Optionally, the method further comprises:
acquiring a coded video frame in the target video;
acquiring pixel information and a prediction mode of each pixel block in the video frame, and taking each acquired prediction mode as an alternative prediction mode;
and acquiring the conditional probability distribution of the candidate prediction mode about the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame.
Optionally, the obtaining, according to the pixel information and the prediction mode of the plurality of pixel blocks in the video frame, a conditional probability distribution of the candidate prediction mode with respect to the pixel information includes:
acquiring the number of pixel blocks corresponding to any one of the alternative prediction modes and any one of the gradient directions according to the gradient directions and the prediction modes of the plurality of pixel blocks;
determining the probability of any one candidate prediction mode with respect to any one gradient direction according to the number of pixel blocks corresponding to that candidate prediction mode and that gradient direction and according to the total number of the pixel blocks;
and composing the probabilities of the plurality of candidate prediction modes with respect to a plurality of gradient directions into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
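The counting construction above can be sketched as follows. One interpretive assumption: the text normalizes by the total number of pixel blocks, which yields a joint frequency; the sketch normalizes per gradient direction, the usual conditional form P(mode | direction):

```python
from collections import Counter, defaultdict

def first_conditional_distribution(encoded_blocks):
    # encoded_blocks: iterable of (gradient_direction, prediction_mode)
    # pairs taken from already-encoded pixel blocks.
    counts = defaultdict(Counter)   # direction -> mode -> count
    for direction, mode in encoded_blocks:
        counts[direction][mode] += 1
    # Relative frequency within each gradient direction.
    return {d: {m: n / sum(modes.values()) for m, n in modes.items()}
            for d, modes in counts.items()}
```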
Optionally, the obtaining, according to the pixel information and the prediction mode of the plurality of pixel blocks in the video frame, a conditional probability distribution of the candidate prediction mode with respect to the pixel information includes:
forming a prediction mode combination by the prediction mode of any pixel block in the video frame and the prediction mode of the reference pixel block of any pixel block to obtain various prediction mode combinations;
acquiring the occurrence frequency of each prediction mode combination in the video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency;
and composing the probability of each prediction mode combination into a second conditional probability distribution of the candidate prediction mode with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being used for representing the probability that the pixel block to be encoded adopts any one of the prediction modes on the condition that the reference pixel block of the pixel block to be encoded adopts any one of the prediction modes.
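The combination-frequency construction above can be sketched under the same relative-frequency assumption:

```python
from collections import Counter

def second_conditional_distribution(mode_pairs):
    # mode_pairs: list of (reference_block_mode, block_mode) combinations
    # observed among the pixel blocks of the encoded video frame.
    pair_counts = Counter(mode_pairs)
    ref_totals = Counter(ref for ref, _ in mode_pairs)
    # P(block mode | reference block mode) by relative frequency.
    return {pair: n / ref_totals[pair[0]] for pair, n in pair_counts.items()}
```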
Optionally, the obtaining, according to the pixel information and the prediction mode of the plurality of pixel blocks in the video frame, a conditional probability distribution of the candidate prediction mode with respect to the pixel information includes:
calculating the number of pixel blocks corresponding to any position identification in any alternative prediction mode according to the position identifications and the prediction modes of the pixel blocks;
determining the probability of any one candidate prediction mode with respect to any one position identification according to the number of pixel blocks corresponding to that candidate prediction mode and that position identification and according to the total number of the pixel blocks;
and forming a third conditional probability distribution of the candidate prediction modes with respect to the position identifications by using the probabilities of the candidate prediction modes with respect to the position identifications.
In a second aspect, a prediction mode selection apparatus is provided, the apparatus including:
the first video frame acquisition module is used for acquiring a video frame to be coded in a target video;
a first distribution obtaining module, configured to obtain a conditional probability distribution of candidate prediction modes with respect to pixel information, where the conditional probability distribution is determined according to the pixel information and the prediction modes of a plurality of pixel blocks that have been encoded in the target video, and the conditional probability distribution is used to represent the probability that a pixel block to be encoded adopts any one of the candidate prediction modes under the condition that any item of the pixel information is satisfied, where the pixel information of the pixel block includes at least one of the gradient direction of the pixel block, the prediction mode of a reference pixel block of the pixel block, and the position identification of the pixel block;
a probability obtaining module, configured to obtain, for each pixel block in the video frame, a probability that the pixel block adopts each candidate prediction mode according to the conditional probability distribution and pixel information of the pixel block;
and the selection module is used for selecting the prediction mode of the pixel block from a plurality of candidate prediction modes according to the probability that the pixel block adopts each candidate prediction mode.
Optionally, the probability obtaining module includes at least one of:
a first probability obtaining unit, configured to obtain a probability that the pixel block adopts each candidate prediction mode according to a first conditional probability distribution of the candidate prediction modes with respect to a gradient direction and the gradient direction of the pixel block;
a second probability obtaining unit, configured to obtain, according to a second conditional probability distribution of candidate prediction modes with respect to a prediction mode of a reference pixel block and a prediction mode of the reference pixel block of the pixel block, a probability that the pixel block adopts each of the candidate prediction modes;
and the third probability acquisition unit is used for acquiring the probability of the pixel block adopting each alternative prediction mode according to the third conditional probability distribution of the alternative prediction modes about the position identifications and the position identifications of the pixel block.
Optionally, the probability obtaining module includes:
a calculating unit, configured to, for each candidate prediction mode, when the pixel information includes multiple items of a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a position identification of the pixel block, calculate a product of probabilities that the pixel block adopts the candidate prediction mode under each item of pixel information as a probability that the pixel block adopts the candidate prediction mode.
Optionally, the selecting module is configured to:
according to the probability that the pixel block adopts each candidate prediction mode, selecting the candidate prediction mode with the highest probability from the multiple candidate prediction modes as the prediction mode of the pixel block; alternatively,
selecting a preset number of candidate prediction modes from the multiple candidate prediction modes in descending order of the probability that the pixel block adopts each candidate prediction mode, encoding the pixel block according to each selected candidate prediction mode to obtain the corresponding rate-distortion value of the pixel block, and selecting the candidate prediction mode with the smallest rate-distortion value as the prediction mode of the pixel block.
Optionally, the apparatus further comprises:
the second video frame acquisition module is used for acquiring the video frames which are coded in the target video;
the information acquisition module is used for acquiring the pixel information and the prediction mode of each pixel block in the video frame and taking each acquired prediction mode as an alternative prediction mode;
and the second distribution acquisition module is used for acquiring the conditional probability distribution of the candidate prediction mode about the pixel information according to the pixel information and the prediction mode of the pixel blocks in the video frame.
Optionally, the second distribution obtaining module includes:
a first obtaining unit, configured to obtain, according to the gradient direction and the prediction mode of the plurality of pixel blocks, the number of pixel blocks corresponding to any one of the candidate prediction modes and any one of the gradient directions; determining the probability of any one candidate prediction mode about any one gradient direction according to the number of pixel blocks corresponding to the any one candidate prediction mode and the total number of the pixel blocks; and composing the probabilities of the plurality of candidate prediction modes with respect to a plurality of gradient directions into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
Optionally, the second distribution obtaining module includes:
the second acquisition unit is used for forming a prediction mode combination by the prediction mode of any pixel block in the video frame and the prediction mode of a reference pixel block of any pixel block to obtain a plurality of prediction mode combinations; acquiring the occurrence frequency of each prediction mode combination in the video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency; and composing the probability of each prediction mode combination into a second conditional probability distribution of the candidate prediction mode with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being used for representing the probability that the pixel block to be encoded adopts any one of the prediction modes on the condition that the reference pixel block of the pixel block to be encoded adopts any one of the prediction modes.
Optionally, the second distribution obtaining module includes:
the third acquisition unit is used for calculating the number of pixel blocks corresponding to any position identifier in any alternative prediction mode according to the position identifiers and the prediction modes of the plurality of pixel blocks; determining the probability of any one candidate prediction mode about any one position identifier according to the number of pixel blocks corresponding to any one position identifier of any one candidate prediction mode and the total number of the pixel blocks; and forming a third conditional probability distribution of the candidate prediction modes with respect to the position identifications by using the probabilities of the candidate prediction modes with respect to the position identifications.
In a third aspect, a prediction mode selection apparatus is provided, the apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, the instruction being loaded and executed by the processor to implement the operations performed in the prediction mode selection method according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, the instruction being loaded and executed by a processor to implement the operations performed in the prediction mode selection method according to the first aspect.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
According to the method, the apparatus, and the storage medium provided by the embodiments of the invention, the conditional probability distribution of the candidate prediction modes with respect to pixel information is determined according to the pixel information and the prediction modes of a plurality of encoded pixel blocks. From this conditional probability distribution, the probability that a pixel block adopts any candidate prediction mode under the condition that any item of pixel information is satisfied can be determined. The probability that a pixel block to be encoded adopts each candidate prediction mode can therefore be obtained from the conditional probability distribution and the pixel information of that pixel block, and a prediction mode can be determined for the pixel block without calculating a rate-distortion value, which reduces the amount of calculation and improves the encoding efficiency.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of a prediction mode selection method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a prediction mode selection method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an operational flow provided by an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a prediction mode selection apparatus according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a server according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the related art, when video encoding is performed, for each pixel block to be encoded in a video frame of a target video, an encoding process according to each candidate prediction mode may be simulated, so as to obtain a rate distortion value corresponding to each prediction mode, select a prediction mode with the smallest rate distortion value from among the multiple candidate prediction modes, and perform encoding according to the prediction mode. However, the number of candidate prediction modes is large, which results in a large amount of calculation in the process of determining the prediction mode, too long encoding time and low encoding efficiency.
In the embodiments of the present invention, the conditional probability distribution of the candidate prediction modes with respect to pixel information is determined according to the pixel information and the prediction modes of a plurality of encoded pixel blocks. From this conditional probability distribution, the probability that a pixel block adopts any candidate prediction mode under the condition that any item of pixel information is satisfied can be determined. The probability that a pixel block to be encoded adopts each candidate prediction mode can therefore be obtained from the conditional probability distribution and the pixel information of that pixel block, and a prediction mode can be determined for the pixel block without calculating a rate-distortion value, thereby reducing the amount of calculation and improving the encoding efficiency.
The embodiment of the invention can be applied to a scene of coding a video, for example, the method provided by the embodiment of the invention is adopted to determine the prediction mode of each pixel block, and each pixel block is coded according to the corresponding prediction mode, so that the coding process of the video is completed, and at the moment, the video file obtained after coding can be stored, so that the redundant information in the video can be reduced, and the storage space is saved. Or, the video data obtained after coding is sent, so that the data volume of transmission can be reduced, and the network resources are saved.
Fig. 1 is a flowchart of a prediction mode selection method according to an embodiment of the present invention. The execution subject of the embodiment of the invention is a coding device, and referring to fig. 1, the method comprises the following steps:
101. and acquiring a video frame to be coded in the target video.
102. A conditional probability distribution of the alternative prediction modes with respect to the pixel information is obtained.
The conditional probability distribution is determined according to the pixel information and the prediction mode of a plurality of pixel blocks which are coded in the target video, and is used for representing the probability that the pixel block to be coded adopts any alternative prediction mode under the condition that any pixel information is met.
Wherein the pixel information of the pixel block comprises at least one of the gradient direction of the pixel block, the prediction mode of a reference pixel block of the pixel block, and the position identification of the pixel block.
103. And for each pixel block in the video frame, acquiring the probability of each alternative prediction mode adopted by the pixel block according to the conditional probability distribution and the pixel information of the pixel block.
104. The prediction mode of the pixel block is selected from a plurality of candidate prediction modes according to the probability that the pixel block adopts each candidate prediction mode.
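Steps 101-104 can be sketched end to end as follows; for brevity a single item of pixel information is used per block, and all names are illustrative assumptions:

```python
def select_modes_for_frame(frame_blocks, cond_dist):
    # frame_blocks: mapping block id -> pixel information value for each
    # pixel block of the video frame to be encoded (step 101).
    # cond_dist[value][mode] = P(mode | pixel information == value),
    # the conditional probability distribution of step 102.
    chosen = {}
    for block_id, value in frame_blocks.items():
        probs = cond_dist[value]                       # step 103
        chosen[block_id] = max(probs, key=probs.get)   # step 104
    return chosen
```

This uses the highest-probability variant of step 104; the shortlist-plus-rate-distortion variant described later would replace the `max` call.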
According to the method provided by the embodiment of the invention, the conditional probability distribution of the candidate prediction modes with respect to pixel information is determined according to the pixel information and the prediction modes of a plurality of pixel blocks that have been encoded in the target video, and this conditional probability distribution represents the probability that a pixel block to be encoded adopts any candidate prediction mode under the condition that any item of pixel information is satisfied. The probability that a pixel block adopts each candidate prediction mode can thus be obtained directly from the conditional probability distribution and the pixel information of the pixel block, so the prediction mode of the pixel block can be determined without simulating the encoding of the pixel block or calculating a rate-distortion value for each candidate prediction mode, which reduces the amount of calculation, saves encoding time, and improves encoding efficiency.
Optionally, obtaining the probability of each candidate prediction mode adopted by the pixel block according to the conditional probability distribution and the pixel information of the pixel block, where the probability includes at least one of:
acquiring the probability of each alternative prediction mode adopted by a pixel block according to the first conditional probability distribution of the alternative prediction modes about the gradient direction and the gradient direction of the pixel block;
acquiring the probability of each alternative prediction mode adopted by the pixel block according to the second conditional probability distribution of the alternative prediction mode about the prediction mode of the reference pixel block and the prediction mode of the reference pixel block of the pixel block;
and acquiring the probability of each alternative prediction mode adopted by the pixel block according to the third conditional probability distribution of the alternative prediction modes about the position identifications and the position identifications of the pixel block.
Optionally, obtaining the probability of each candidate prediction mode adopted by the pixel block according to the conditional probability distribution and the pixel information of the pixel block, includes: for each candidate prediction mode, when the pixel information includes a plurality of items among the gradient direction of the pixel block, the prediction mode of the reference pixel block of the pixel block, and the position identification of the pixel block, calculating the product of the probabilities that the pixel block adopts the candidate prediction mode under each item of pixel information as the probability that the pixel block adopts the candidate prediction mode.
Optionally, selecting a prediction mode of the pixel block from a plurality of candidate prediction modes according to the probability that the pixel block adopts each candidate prediction mode includes:
according to the probability that the pixel block adopts each candidate prediction mode, selecting the candidate prediction mode with the maximum probability from the multiple candidate prediction modes as the prediction mode of the pixel block; or,
selecting a preset number of candidate prediction modes from the multiple candidate prediction modes in descending order of the probability that the pixel block adopts each candidate prediction mode, encoding the pixel block according to each selected candidate prediction mode to obtain rate-distortion values of the pixel block, and selecting the candidate prediction mode with the minimum rate-distortion value as the prediction mode of the pixel block.
Optionally, acquiring a video frame that has been encoded in the target video;
acquiring pixel information and a prediction mode of each pixel block in a video frame, and taking each acquired prediction mode as an alternative prediction mode;
and acquiring the conditional probability distribution of the candidate prediction mode relative to the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame.
Optionally, obtaining a conditional probability distribution of the candidate prediction mode with respect to the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame includes:
acquiring the number of pixel blocks corresponding to any one of the alternative prediction modes and any one of the gradient directions according to the gradient directions and the prediction modes of the plurality of pixel blocks;
determining the probability of any one candidate prediction mode about any one gradient direction according to the number of pixel blocks corresponding to any one gradient direction and the total number of the plurality of pixel blocks of any one candidate prediction mode;
the probabilities of the plurality of candidate prediction modes with respect to the plurality of gradient directions are combined into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
Optionally, obtaining a conditional probability distribution of the candidate prediction mode with respect to the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame includes:
the prediction mode of any pixel block in the video frame and the prediction mode of a reference pixel block of any pixel block form a prediction mode combination to obtain various prediction mode combinations;
acquiring the occurrence frequency of each prediction mode combination in a video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency;
the probabilities of the prediction mode combinations constitute a second conditional probability distribution of the candidate prediction modes with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being used to represent the probability that the pixel block to be encoded adopts any one prediction mode on the condition that the reference pixel block of the pixel block to be encoded adopts any one prediction mode.

Optionally, obtaining a conditional probability distribution of the candidate prediction mode with respect to the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame includes:
calculating the number of pixel blocks corresponding to any position identification in any alternative prediction mode according to the position identifications and the prediction modes of the pixel blocks;
determining the probability of any one candidate prediction mode about any one position identifier according to the number of pixel blocks corresponding to any one position identifier and the total number of the plurality of pixel blocks of any one candidate prediction mode;
the probabilities of the plurality of candidate prediction modes with respect to the plurality of location identities are formed into a third conditional probability distribution of the candidate prediction modes with respect to the location identities.
All the above-mentioned optional technical solutions can be combined arbitrarily to form the optional embodiments of the present invention, and are not described herein again.
Fig. 2 is a flowchart of a prediction mode selection method according to an embodiment of the present invention. The execution subject of the embodiment of the invention is a coding device, and referring to fig. 2, the method includes:
201. When the video frame to be currently encoded is acquired, the video frame is divided into a training video frame or a prediction video frame; if the video frame is divided into a training video frame, steps 202-203 are executed, and if the video frame is divided into a prediction video frame, steps 204-206 are executed.
The encoding device may be a terminal such as a mobile phone, a computer, or a tablet computer, or may be a server. The target video is a video to be encoded, and may be a video file stored on a terminal or a server, or may also be a video sent by the server to the terminal and played by the terminal. The target video includes a plurality of video frames, each of which includes a plurality of pixels, and when the target video is encoded, each of the pixels in each of the video frames needs to be encoded.
In the embodiment of the invention, in order to reduce the calculation amount and shorten the encoding time, the encoding device can perform predictive encoding on the uncoded video frame according to the encoded video frame. For this purpose, a plurality of video frames in the target video are divided into training video frames and prediction video frames, and the training video frames can be coded to obtain the conditional probability distribution of the alternative prediction modes about the pixel information, and the conditional probability distribution can be applied to the coding process of the prediction video frames, so that the coding time of the prediction video frames is saved.
Optionally, positions of a plurality of video frames in the target video have a sequence, and a dividing manner of a training video frame position and a prediction video frame position may be predetermined before encoding, so that in the process of encoding the target video, each time a certain video frame is to be encoded, a position of the video frame in the target video is determined, and according to the position of the video frame and the predetermined position dividing manner, it is determined whether the video frame is divided into the training video frame or the prediction video frame.
For example, the division manner may be as follows: a first number of positions are used as positions of training video frames, the following second number of positions are used as positions of prediction video frames, the next first number of positions are again used as positions of training video frames, and so on, until each position in the target video is determined to be either a training-video-frame position or a prediction-video-frame position. The first number and the second number may or may not be equal.
For example, the first to third video frames in the target video are taken as training video frames, the fourth to sixth video frames are taken as prediction video frames, the seventh to ninth video frames are taken as training video frames, and so on.
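The alternating division described above can be sketched as follows. This is a minimal illustration; the function name, 1-indexed positions, and the default counts of 3 are assumptions taken from the example, not part of the embodiment:

```python
def frame_role(position, n_train=3, n_predict=3):
    """Classify a 1-indexed frame position as 'train' or 'predict'
    under the alternating scheme: a first number of training frames,
    then a second number of prediction frames, repeating."""
    offset = (position - 1) % (n_train + n_predict)
    return "train" if offset < n_train else "predict"
```

With the defaults, frames 1-3 are training frames, frames 4-6 are prediction frames, frames 7-9 are training frames again, matching the example above.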
Optionally, in order to make the ratio between the training video frame and the prediction video frame obtained by division meet the requirement, the division probability of the training video frame may be set to be a first probability, the division probability of the prediction video frame may be set to be a second probability, and the sum of the first probability and the second probability is 1. In the process of encoding the target video, when a certain video frame is to be encoded each time, the video frame is randomly divided according to the first probability and the second probability, and whether the video frame is a training video frame or a prediction video frame is determined. Therefore, the proportion between the training video frame and the prediction video frame which are divided in the target video can meet the requirement, and the training video frame and the prediction video frame can be guaranteed to realize cross division due to the randomness of the probability.
Optionally, in order to implement the cross-partition between the training video frame and the prediction video frame, the partition manner may be preset as follows: in the target video, the video frame in the first playing time length is divided into training video frames, the video frame in the second playing time length is divided into prediction video frames, the video frame in the first playing time length is divided into training video frames, and the like. In the process of encoding the target video, each time a certain video frame is to be encoded, it is determined whether the video frame is a training video frame or a prediction video frame according to the playing time point of the video frame in the target video.
In addition, the encoding apparatus may also use other methods to partition the training video frame and the prediction video frame, which is not described herein again in the embodiments of the present invention.
202. Acquiring pixel information and a prediction mode of each pixel block in a training video frame, and taking each acquired prediction mode as an alternative prediction mode.
When the current video frame to be encoded is a training video frame, dividing the training video frame into a plurality of pixel blocks, and acquiring a plurality of candidate prediction modes, wherein each candidate prediction mode is used for specifying a reference pixel block of any pixel block and a specific mode for calculating an encoding value of any pixel block according to a pixel value of the reference pixel block.
For each pixel block to be encoded, a process of encoding according to each candidate prediction mode may be simulated, that is, the pixel block is encoded according to each candidate prediction mode, and a difference between an original pixel value and a predicted pixel value of the pixel block may be determined according to an encoding value of the pixel block, so as to calculate a rate-distortion value of the pixel block, where the rate-distortion value may represent a prediction accuracy of the pixel block. Then, a prediction mode with the minimum rate distortion value is selected from the multiple candidate prediction modes to serve as the prediction mode of the pixel block, and the pixel block is encoded according to the determined prediction mode and the pixel value of the reference pixel block of the pixel block to obtain the encoded value of the pixel block.
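The exhaustive search described above can be sketched as follows. Here `rd_cost` is a caller-supplied stand-in for the simulated encoding and rate-distortion measurement, whose real form depends on the codec:

```python
def select_mode_by_rd(pixel_block, candidate_modes, rd_cost):
    """Simulate encoding the block with every candidate prediction
    mode and keep the one with the minimum rate-distortion value.
    rd_cost(block, mode) stands in for the real measurement."""
    costs = {m: rd_cost(pixel_block, m) for m in candidate_modes}
    best = min(costs, key=costs.get)
    return best, costs[best]
```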
The pixel block is the minimum unit of encoding in a video frame, and pixel blocks of different sizes may adopt different prediction modes. For example, in the H.264 standard (a highly compressed digital video codec standard), 4 × 4 pixel blocks may adopt 9 prediction modes, and 16 × 16 pixel blocks may adopt another 4 prediction modes.
Then, when encoding, a maximum pixel block size may be determined according to pixel block sizes corresponding to the plurality of candidate prediction modes, and the video frame may be divided into a plurality of macroblocks according to the maximum pixel block size. When each macroblock is subjected to analog coding, the macroblock can be subdivided into pixel blocks meeting the size specified by the candidate prediction mode according to the currently selected candidate prediction mode.
For example, the video frame may first be divided into 16 × 16 macroblocks, and for each macroblock the rate-distortion values of the 4 candidate prediction modes with a size of 16 × 16 are calculated, yielding 4 rate-distortion values for the macroblock. In addition, the macroblock may be further divided into 16 sub-macroblocks of 4 × 4; each sub-macroblock calculates the rate-distortion values of the 9 candidate prediction modes with a size of 4 × 4, obtains 9 rate-distortion values, and determines the minimum of the 9 together with its corresponding candidate prediction mode. The sum of the minimum rate-distortion values of the 16 sub-macroblocks is taken as a fifth rate-distortion value of the macroblock. Finally, the 5 rate-distortion values obtained for the macroblock are compared, and the prediction mode corresponding to the minimum rate-distortion value is selected.
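The five-way comparison in this example can be sketched as follows. The cost dictionaries stand in for rate-distortion values that a real encoder would measure; the function and mode names are illustrative:

```python
def choose_macroblock_partition(mb_costs, sub_costs):
    """Compare encoding a 16x16 macroblock whole (one rate-distortion
    value per 16x16 mode) against splitting it into 16 4x4 sub-blocks,
    where each sub-block independently takes its cheapest 4x4 mode.

    mb_costs:  dict mode -> cost for the 16x16 modes
    sub_costs: list of 16 dicts, each mode -> cost for one sub-block
    Returns ('whole', best_mode) or ('split', per-sub-block modes).
    """
    best_whole_mode = min(mb_costs, key=mb_costs.get)
    whole_cost = mb_costs[best_whole_mode]
    # Each sub-block keeps the minimum of its own candidate modes.
    split_modes = [min(c, key=c.get) for c in sub_costs]
    split_cost = sum(c[m] for c, m in zip(sub_costs, split_modes))
    if whole_cost <= split_cost:
        return ("whole", best_whole_mode)
    return ("split", split_modes)
```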
In addition, the reference pixel block of each pixel block refers to a pixel block adjacent to the pixel block and having been already encoded. The adjacent pixel blocks of each pixel block can be multiple, and the specific adoption of which coded pixel block as the reference pixel block can be determined according to the prediction mode. For example, the prediction mode 0 specifies that any pixel block has the last line of pixel blocks above the pixel block as a reference pixel block.
After the training video frame is encoded, a prediction mode adopted by each pixel block in the training video frame can be obtained, in addition, pixel information of each pixel block can also be obtained, at this time, each prediction mode obtained in the training video frame can be used as an alternative prediction mode, statistics can be carried out according to the pixel information and the prediction mode of the same pixel block, and the association relationship between each pixel information and each alternative prediction mode can be learned.
Wherein the pixel information of the pixel block comprises at least one of a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a location identification of the pixel block. The gradient direction is used to indicate a direction in which the pixel block has the largest change in pixel value in the neighborhood, and the manner of calculating the gradient direction may be a square gradient method or a gradient histogram method. The reference pixel block of the pixel block may be a left-adjacent pixel block, an upper-left adjacent pixel block, or a pixel block at another position, and when the pixel block is encoded, the prediction mode of the reference pixel block may be used to determine the prediction mode for the current pixel block, considering that the prediction modes between the adjacent pixel blocks have relevance. The position identifier of the pixel block is used to indicate the position of the pixel block in the video frame, and may be a position number or the coordinates of the start point of the pixel block in the video frame.
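As an illustration of the gradient-direction item only, the sketch below uses simple finite differences with square-gradient-style accumulation and quantizes the dominant direction into a configurable number of bins; the exact method in the embodiment (square gradient or gradient histogram) may differ:

```python
import math

def gradient_direction(block, n_bins=8):
    """Estimate the dominant gradient direction of a 2-D pixel block
    (a list of rows of pixel values), quantized to n_bins directions.
    A simplified stand-in for the square-gradient or gradient-histogram
    methods mentioned in the text."""
    h, w = len(block), len(block[0])
    gx_sum = gy_sum = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = block[y][x + 1] - block[y][x]  # horizontal difference
            gy = block[y + 1][x] - block[y][x]  # vertical difference
            gx_sum += gx * abs(gx)              # signed-square accumulation
            gy_sum += gy * abs(gy)
    # Direction is modulo 180 degrees: opposite gradients are one direction.
    angle = math.atan2(gy_sum, gx_sum) % math.pi
    return int(angle / math.pi * n_bins) % n_bins
```

A block whose values increase only left-to-right falls in bin 0 (horizontal change), while one whose values increase only top-to-bottom falls in the middle bin.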
203. And acquiring a conditional probability distribution of the candidate prediction mode about the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the training video frame, and executing step 204.
The encoding device may obtain a plurality of pixel information and a plurality of candidate prediction modes after obtaining the pixel information and the prediction modes of the plurality of pixel blocks, and at this time, in order to count the association relationship between each pixel information and each candidate prediction mode, may combine the plurality of pixel information and the plurality of candidate prediction modes two by two to obtain a combination including any one of the pixel information and any one of the candidate prediction modes. For each combination, the pixel block which satisfies the pixel information in the combination and adopts the candidate prediction mode in the combination is the pixel block corresponding to the combination, and the more the pixel blocks corresponding to the combination, the stronger the association between the pixel information and the candidate prediction mode is, and the less the pixel blocks corresponding to the combination, the weaker the association between the pixel information and the candidate prediction mode is.
Therefore, the probability of the candidate prediction mode about the pixel information can be determined according to the number of the pixel blocks corresponding to the combination, and then the probabilities corresponding to a plurality of combinations are combined into the conditional probability distribution of the candidate prediction mode about the pixel information. The probability may represent the probability that the pixel block adopts the alternative prediction mode under the condition that the pixel information is satisfied, and the conditional probability distribution may represent the probability that the pixel block adopts any alternative prediction mode under the condition that any pixel information is satisfied.
In addition, the encoding apparatus may also adopt other statistical methods to obtain the conditional probability distribution according to the pixel information and the prediction mode of the plurality of pixel blocks, and the embodiment of the present invention is not described herein again.
It should be noted that, in the embodiment of the present invention, the encoding device may sequentially encode each video frame in the target video according to the sequence of each video frame, and after a certain training video frame is encoded, the obtained conditional probability distribution may be updated according to the pixel information and the prediction mode of the plurality of pixel blocks in the training video frame. The updating process comprises the following steps: for each combination, obtaining the updated pixel block number of the combination according to the pixel block number corresponding to the combination before the coding of the training video frame and the pixel block number corresponding to the combination in the training video frame coded this time, re-determining the probability of the candidate prediction mode about the pixel information according to the updated pixel block number of the combination, and further obtaining the updated conditional probability distribution.
In practical applications, a sliding window may be used to update the conditional probability distribution. For example, if the length of the sliding window is 500 frames, every time the conditional probability distribution is to be updated, the 500 frames before the current video frame are selected, and the conditional probability distribution is updated according to the pixel information and the prediction mode of each pixel block in the training video frames among those 500 frames. To ensure the accuracy of the conditional probability distribution, the prediction video frames contained within these 500 frames are not considered, nor are the video frames preceding these 500 frames. Then, as time goes on, the sliding window slides continuously, so each update of the conditional probability distribution uses training video frames with a small time difference from the current video frame and discards those with a large time difference, thereby improving the accuracy of the conditional probability distribution.
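The sliding-window update can be sketched as follows. Storing per-frame co-occurrence counts and subtracting the oldest frame's counts as the window slides is one possible realization, not necessarily the embodiment's; names and the 500-frame default follow the example above:

```python
from collections import Counter, deque

class SlidingWindowStats:
    """Keep co-occurrence counts of (pixel_info, mode) pairs from
    training frames inside a sliding window of the last `window`
    frames, dropping the oldest frame's counts as the window slides."""

    def __init__(self, window=500):
        self.frames = deque()
        self.window = window
        self.totals = Counter()

    def add_frame(self, pair_counts):
        """pair_counts: Counter of (pixel_info, mode) for one training frame."""
        self.frames.append(pair_counts)
        self.totals += pair_counts
        while len(self.frames) > self.window:
            # Counter subtraction drops entries that reach zero.
            self.totals -= self.frames.popleft()

    def conditional_probability(self, mode, info):
        """P(mode | info) estimated from counts inside the window."""
        info_total = sum(c for (i, _), c in self.totals.items() if i == info)
        if info_total == 0:
            return 0.0
        return self.totals[(info, mode)] / info_total
```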
Optionally, the step 203 comprises at least one of the following steps 2031-2033:
2031. calculating the number of pixel blocks corresponding to any gradient direction in any candidate prediction mode according to the gradient directions and the prediction modes of the pixel blocks; determining the probability of any one candidate prediction mode about any one gradient direction according to the number of pixel blocks corresponding to any one gradient direction and the total number of the pixel blocks; the probabilities of the plurality of candidate prediction modes with respect to the plurality of gradient directions are combined into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
Alternatively, the ratio between the number of pixel blocks corresponding to any one of the candidate prediction modes and any one of the gradient directions and the total number of the plurality of pixel blocks may be used as the probability of the any one of the candidate prediction modes with respect to the any one of the gradient directions.
For example, let D denote the gradient direction and Q the prediction mode. Suppose the gradient directions occurring in the encoded training video frames include m kinds {D1, D2, …, Dm}, and the adopted prediction modes include n kinds {Q1, Q2, …, Qn}, where m and n are positive integers. Then there are m × n combinations of gradient direction and prediction mode, and a corresponding probability can be calculated from the number of pixel blocks corresponding to each combination to obtain the conditional probability distribution. The conditional probability distribution can be shown as the following matrix:
    [ P(Q1|D1)  P(Q2|D1)  …  P(Qn|D1) ]
    [ P(Q1|D2)  P(Q2|D2)  …  P(Qn|D2) ]
    [    ⋮          ⋮             ⋮    ]
    [ P(Q1|Dm)  P(Q2|Dm)  …  P(Qn|Dm) ]
the conditional probability distribution can be stored in the form of the matrix, a two-dimensional array or other manners.
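The counting in step 2031 can be sketched as follows. A nested dictionary stands in for the matrix or two-dimensional array mentioned above, and, following the text, each entry is the count of a (direction, mode) combination divided by the total number of pixel blocks; for a fixed gradient direction this ranks the modes identically to the normalized conditional probability:

```python
from collections import Counter

def first_conditional_distribution(blocks):
    """Build the first conditional probability distribution from a
    list of (gradient_direction, prediction_mode) pairs of encoded
    pixel blocks. dist[direction][mode] is the ratio of the number of
    blocks with that combination to the total number of blocks."""
    counts = Counter(blocks)
    total = len(blocks)
    dist = {}
    for (direction, mode), c in counts.items():
        dist.setdefault(direction, {})[mode] = c / total
    return dist
```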
2032. The prediction mode of any pixel block in the video frame and the prediction mode of a reference pixel block of any pixel block form a prediction mode combination to obtain various prediction mode combinations; acquiring the occurrence frequency of each prediction mode combination in a video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency; the probabilities of each combination of prediction modes constitute a second conditional probability distribution of the candidate prediction modes with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being intended to represent the probability that the pixel block to be encoded adopts any one of the prediction modes on condition that the reference pixel block of the pixel block to be encoded adopts any one of the prediction modes.
The reference pixel block of the pixel block may include one or more pixel blocks, such as a left pixel block of the pixel block, an upper pixel block of the pixel block, and an upper left pixel block of the pixel block. When encoding a video frame, the pixel blocks may be sequentially encoded from left to right and from top to bottom, and when encoding each pixel block, the pixel value of the left or upper-left reference pixel block may be referred to. Of course, the reference pixel block may also include a right pixel block, an upper pixel block, and an upper-right pixel block of the pixel block, or pixel blocks at other positions.
Optionally, the number of occurrences of each prediction mode combination in the video frame is obtained, and the ratio of the number of occurrences of each prediction mode combination to the total number of occurrences of all prediction mode combinations is calculated as the probability of each prediction mode combination.
For example, let X denote the prediction mode of the reference pixel block and Q the prediction mode of the current pixel block to be encoded. Suppose the prediction modes adopted by the pixel blocks include n kinds {Q1, Q2, …, Qn}, where n is a positive integer. Then there are n × n combinations of the prediction mode of the reference pixel block and the prediction mode of the pixel block, and a corresponding probability can be calculated from the number of pixel blocks corresponding to each combination to obtain the conditional probability distribution. The conditional probability distribution can be shown as the following matrix:
    [ P(Q1|X1)  P(Q2|X1)  …  P(Qn|X1) ]
    [ P(Q1|X2)  P(Q2|X2)  …  P(Qn|X2) ]
    [    ⋮          ⋮             ⋮    ]
    [ P(Q1|Xn)  P(Q2|Xn)  …  P(Qn|Xn) ]
2033. calculating the number of pixel blocks corresponding to any position identification in any alternative prediction mode according to the position identifications and the prediction modes of the pixel blocks; determining the probability of any one candidate prediction mode about any one position identifier according to the number of pixel blocks corresponding to any one position identifier and the total number of the plurality of pixel blocks of any one candidate prediction mode; the probabilities of the plurality of candidate prediction modes with respect to the plurality of location identities are formed into a third conditional probability distribution of the candidate prediction modes with respect to the location identities.
Alternatively, the location identifier is used to indicate the location of the pixel block in the video frame, and may be a serial number, coordinates, or other identifier of the pixel block. For example, the pixel blocks in the video frame are arranged in the order from left to right and from top to bottom, and the sequence number of each pixel block can be determined as the position identifier. Or a coordinate system is established by taking the position of the first pixel point at the upper left corner of the video frame as an origin, and the coordinate of each pixel block in the coordinate system can be used as a position identifier.
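The sequence-number form of the position identifier can be illustrated as follows. The block size and frame width below are assumed values for the sketch:

```python
def position_id(x, y, frame_width, block_size=16):
    """Raster-scan (left-to-right, top-to-bottom) sequence number of
    the pixel block whose top-left corner is at pixel (x, y)."""
    blocks_per_row = frame_width // block_size
    return (y // block_size) * blocks_per_row + (x // block_size)
```

For a 64-pixel-wide frame with 16 × 16 blocks, the blocks of the first row get identifiers 0-3 and the second row starts at 4.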
Optionally, a ratio between the number of pixel blocks corresponding to any one candidate prediction mode and any one position identifier and the total number of the plurality of pixel blocks is used as a probability of any one candidate prediction mode with respect to any one position identifier.
For example, let H denote the position identifier and Q the prediction mode. Suppose the encoded training video frames contain m pixel blocks, i.e. the position identifiers include m kinds {H1, H2, …, Hm}, and the adopted prediction modes include n kinds {Q1, Q2, …, Qn}, where m and n are positive integers. Then there are m × n combinations of position identifier and prediction mode, and a corresponding probability can be calculated from the number of pixel blocks corresponding to each combination to obtain the conditional probability distribution. The conditional probability distribution can be shown as the following matrix:
    [ P(Q1|H1)  P(Q2|H1)  …  P(Qn|H1) ]
    [ P(Q1|H2)  P(Q2|H2)  …  P(Qn|H2) ]
    [    ⋮          ⋮             ⋮    ]
    [ P(Q1|Hm)  P(Q2|Hm)  …  P(Qn|Hm) ]
204. a conditional probability distribution of the alternative prediction modes with respect to the pixel information is obtained.
When the encoding device acquires the video frame to be currently encoded and the video frame is a prediction video frame, the rate-distortion-based method used for training video frames in step 202 is not applied; instead, the conditional probability distribution is acquired, so that the prediction mode to be used can be determined directly from the conditional probability distribution.
205. And for each pixel block in the video frame, acquiring the probability of each alternative prediction mode adopted by the pixel block according to the conditional probability distribution of the alternative prediction modes about the pixel information and the pixel information of the pixel block.
Since the probability of each candidate prediction mode with respect to each pixel information is already determined in the conditional probability distribution, that is, the probability of each candidate prediction mode being adopted by the pixel block under the condition that each pixel information is satisfied is determined. Therefore, when the pixel information of the pixel block is acquired, the probability of each candidate prediction mode adopted by the pixel block can be determined according to the conditional probability distribution.
Based on the above steps 2031-2033, the step 205 may include at least one of the following steps 2051-2053:
2051. and acquiring the probability of each alternative prediction mode adopted by the pixel block according to the first conditional probability distribution of the alternative prediction modes about the gradient direction and the gradient direction of the pixel block.
Since the probability of each candidate prediction mode with respect to each gradient direction has been determined in the first conditional probability distribution, i.e. the probability of each candidate prediction mode being used by a block of pixels having each gradient direction is determined. Therefore, when the gradient direction of the pixel block is obtained, the probability that the pixel block adopts each alternative prediction mode can be determined according to the first conditional probability distribution.
Based on the example in step 2031, assume that the gradient direction of the pixel block is D1. Then, according to the first conditional probability distribution, the probabilities that the pixel block adopts the candidate prediction modes can be determined to be {P(Q1|D1), P(Q2|D1), …, P(Qn|D1)} respectively.
2052. And acquiring the probability of each alternative prediction mode adopted by the pixel block according to the second conditional probability distribution of the alternative prediction mode relative to the prediction mode of the reference pixel block and the prediction mode of the reference pixel block of the pixel block.
Based on the example in step 2032 above, assume that the prediction mode of the reference pixel block of the pixel block is X1. Then, according to the second conditional probability distribution, the probabilities that the pixel block adopts the candidate prediction modes can be determined to be {P(Q1|X1), P(Q2|X1), …, P(Qn|X1)} respectively.
2053. And acquiring the probability of each alternative prediction mode adopted by the pixel block according to the third conditional probability distribution of the alternative prediction modes about the position identifications and the position identifications of the pixel block.
Based on the example in step 2033 above, assume that the position identifier of the pixel block is H1. Then, according to the third conditional probability distribution, the probabilities that the pixel block adopts the candidate prediction modes can be determined to be {P(Q1|H1), P(Q2|H1), …, P(Qn|H1)} respectively.
It should be noted that, in the embodiment of the present invention, the pixel information of the pixel block may include one or more of a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a position identifier of the pixel block, and when the pixel information includes only a certain item, only the probability that the pixel block adopts each candidate prediction mode needs to be obtained according to the item of pixel information.
When the pixel information includes multiple items of the gradient direction of the pixel block, the prediction mode of the reference pixel block of the pixel block, and the position identification of the pixel block, the probability that the pixel block adopts each candidate prediction mode under each item of pixel information may be respectively obtained according to the conditional probability distribution of the candidate prediction mode on each item of pixel information and each item of pixel information of the pixel block, and the product of the probabilities that the pixel block adopts each candidate prediction mode under each item of pixel information is calculated as the probability that the pixel block adopts each candidate prediction mode.
The probability that the pixel block adopts each candidate prediction mode should be the probability that the pixel block adopts each candidate prediction mode under the condition that a plurality of items of pixel information are simultaneously met. According to the bayesian classification method, assuming that each item of pixel information is independent, the probability of the pixel block adopting each candidate prediction mode can be the product of the probabilities of the pixel block adopting each candidate prediction mode under the condition of satisfying each item of pixel information.
For example, if the gradient direction of the pixel block is D1, then according to the first conditional probability distribution, the probability of adopting candidate prediction mode Q under this condition is P(Q|D=D1); if the prediction modes of the left, upper, and upper-left reference pixel blocks of the pixel block are X1, Y1, and Z1 respectively, then according to the second conditional probability distribution, the probabilities of adopting candidate prediction mode Q under the corresponding conditions are P(Q|X=X1), P(Q|Y=Y1), and P(Q|Z=Z1); and if the position identifier of the pixel block is H1, then according to the third conditional probability distribution, the probability of adopting candidate prediction mode Q under this condition is P(Q|H=H1).
When the pixel information items (the gradient direction, the prediction modes of the reference pixel blocks, and the position identifier) are mutually independent, P(Q|D=D1, X=X1, Y=Y1, Z=Z1, H=H1) = P(Q|D=D1)·P(Q|X=X1)·P(Q|Y=Y1)·P(Q|Z=Z1)·P(Q|H=H1).
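As an illustrative sketch (the function name and the lookup-table layout are assumptions, not part of the embodiment), the product rule above can be written as:

```python
def mode_probability(q, d1, x1, y1, z1, h1, p_d, p_x, p_y, p_z, p_h):
    """Return P(Q=q | D=d1, X=x1, Y=y1, Z=z1, H=h1) under the
    independence assumption of the Bayesian classification method.

    Each p_* argument is a lookup table mapping (mode, condition)
    to a conditional probability, e.g. p_d[(q, d1)] == P(Q=q | D=d1).
    """
    # Product of the per-item conditional probabilities.
    return (p_d[(q, d1)] * p_x[(q, x1)] * p_y[(q, y1)]
            * p_z[(q, z1)] * p_h[(q, h1)])
```

In an encoder, these tables would correspond to the first, second, and third conditional probability distributions accumulated from the already-encoded pixel blocks.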
206. The prediction mode of the pixel block is selected from a plurality of candidate prediction modes according to the probability that the pixel block adopts each candidate prediction mode.
Optionally, according to the probability that the pixel block adopts each candidate prediction mode, the candidate prediction mode with the highest probability is selected from the multiple candidate prediction modes as the prediction mode of the pixel block. If at least two candidate prediction modes share the highest probability, any one of them may be selected at random; alternatively, the pixel block may be encoded according to each of these candidate prediction modes to obtain the corresponding rate distortion values, and the candidate prediction mode with the lowest rate distortion value is selected as the prediction mode of the pixel block.
Alternatively, a preset number of candidate prediction modes may be selected from the multiple candidate prediction modes in descending order of the probability that the pixel block adopts each candidate prediction mode; the pixel block is then encoded according to each selected candidate prediction mode to obtain the corresponding rate distortion value, and the candidate prediction mode with the minimum rate distortion value is selected as the prediction mode of the pixel block. The preset number may be determined from the total number of candidate prediction modes and a selection ratio, and the selection ratio may be set by weighing the required coding accuracy against the coding efficiency.
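A minimal sketch of the two selection strategies above, with a caller-supplied rate-distortion function standing in for the encoding pass described in step 202 (all names and the ratio value are hypothetical):

```python
def select_mode(probs, preset_ratio=0.25, rd_cost=None):
    """Select a prediction mode from `probs`, a dict mapping each
    candidate prediction mode to its probability.

    With rd_cost=None, the most probable candidate mode is returned.
    Otherwise a preset number of the most probable candidates is kept
    and the one with minimum rate-distortion cost wins.
    """
    ranked = sorted(probs, key=probs.get, reverse=True)
    if rd_cost is None:
        # Strategy 1: take the single most probable candidate mode.
        return ranked[0]
    # Strategy 2: keep a preset number of candidates (total count times
    # a selection ratio), then pick the one with minimum RD cost.
    preset = max(1, int(len(ranked) * preset_ratio))
    return min(ranked[:preset], key=rd_cost)
```

Here `rd_cost` abstracts the encode-and-measure step; in a real encoder it would actually encode the pixel block with each candidate mode.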
The manner of determining the prediction mode by obtaining the rate distortion value is similar to that in step 202, and is not described herein again.
207. The pixel block is encoded according to the determined prediction mode and the pixel values of the reference pixel block of the pixel block, to obtain the encoded value of the pixel block.
After the current prediction video frame is encoded, the next video frame in the target video can be encoded in the same way, until all video frames in the target video have been encoded.
According to the method provided by the embodiment of the present invention, the conditional probability distribution of the candidate prediction modes with respect to the pixel information is determined from the pixel information and prediction modes of a plurality of already-encoded pixel blocks in the target video. The conditional probability distribution represents the probability that a pixel block to be encoded adopts any candidate prediction mode under the condition that any item of pixel information is satisfied, so the probability that a pixel block adopts each candidate prediction mode can be obtained directly from the conditional probability distribution and the pixel information of the pixel block. The prediction mode of the pixel block can thus be determined without simulating the encoding of the pixel block or computing a rate distortion value for every candidate prediction mode, which reduces the amount of calculation, saves encoding time, and improves encoding efficiency.
Fig. 3 is a flowchart of an operation provided by an embodiment of the present invention. Referring to Fig. 3, when encoding a target video, the video frames of the target video are first divided into two parts: training video frames and prediction video frames.
If the current video frame to be encoded is a training video frame, it is encoded in the conventional manner; the prediction mode of each pixel block in the training video frame is determined during encoding, and the three items of pixel information of each pixel block are calculated, so that the conditional probability distributions of the candidate prediction modes with respect to the three items of pixel information can be updated respectively.
If the current video frame to be encoded is a prediction video frame, the three items of pixel information of each pixel block (namely the gradient direction, the prediction mode of the reference pixel block, and the position identifier) are first obtained. The probability of each candidate prediction mode under each of the three items of pixel information is then obtained, yielding the probability that the pixel block adopts each candidate prediction mode. A certain number of prediction modes are selected according to these probabilities, the rate distortion value of each of the selected candidate prediction modes is calculated, and the prediction mode to be adopted by the pixel block is determined according to these rate distortion values.
After the frame is encoded, if the current video frame is the last video frame in the target video, encoding ends; otherwise, the next video frame is encoded in the same manner, until the last video frame in the target video has been encoded.
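The workflow of Fig. 3 can be sketched at a high level as follows; every function name here is a hypothetical placeholder for the corresponding stage of the embodiment, not an actual API:

```python
def encode_video(frames, is_training_frame, encode_conventionally,
                 update_distributions, encode_with_distributions):
    """Encode frames in order: training frames are encoded in the
    conventional way and feed the conditional distributions, while
    prediction frames reuse those distributions for mode selection."""
    for frame in frames:
        if is_training_frame(frame):
            # Conventional encoding yields each block's prediction mode
            # and its three items of pixel information.
            stats = encode_conventionally(frame)
            # Update the three conditional probability distributions.
            update_distributions(stats)
        else:
            # Probability-guided candidate narrowing, then RD selection.
            encode_with_distributions(frame)
```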
Fig. 4 is a schematic structural diagram of a prediction mode selection apparatus according to an embodiment of the present invention, referring to fig. 4, the apparatus includes:
a first video frame acquiring module 401, configured to acquire a video frame to be encoded in a target video;
a first distribution obtaining module 402, configured to obtain a conditional probability distribution of candidate prediction modes with respect to pixel information, where the conditional probability distribution is determined according to pixel information and a prediction mode of a plurality of encoded pixel blocks in a target video, the conditional probability distribution is used to indicate a probability that a pixel block to be encoded adopts any one of the candidate prediction modes under a condition that any one of the pixel information is satisfied, and the pixel information of the pixel block includes at least one of a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a position identifier of the pixel block;
a probability obtaining module 403, configured to obtain, for each pixel block in the video frame, a probability that the pixel block adopts each candidate prediction mode according to the conditional probability distribution and the pixel information of the pixel block;
a selection module 404, configured to select a prediction mode for a pixel block from a plurality of candidate prediction modes according to a probability that the pixel block adopts each candidate prediction mode.
Optionally, the probability obtaining module 403 includes at least one of:
a first probability obtaining unit, configured to obtain, according to a first conditional probability distribution of the candidate prediction modes with respect to a gradient direction and the gradient direction of the pixel block, a probability that the pixel block adopts each of the candidate prediction modes;
a second probability obtaining unit configured to obtain a probability that the pixel block adopts each of the candidate prediction modes, based on a second conditional probability distribution of the candidate prediction modes with respect to the prediction mode of the reference pixel block and the prediction mode of the reference pixel block of the pixel block;
and the third probability acquisition unit is used for acquiring the probability of each alternative prediction mode adopted by the pixel block according to the third conditional probability distribution of the alternative prediction modes relative to the position identifications and the position identifications of the pixel block.
Optionally, the probability obtaining module 403 includes:
and a calculation unit configured to calculate, for each candidate prediction mode, a product of probabilities that the pixel block adopts the candidate prediction mode under each item of pixel information as probabilities that the pixel block adopts the candidate prediction mode, when the pixel information includes a plurality of items among a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a position identification of the pixel block.
Optionally, a selecting module 404, configured to:
according to the probability of adopting each alternative prediction mode by the pixel block, selecting the alternative prediction mode with the maximum probability from the multiple alternative prediction modes as the prediction mode of the pixel block; or,
selecting a preset number of alternative prediction modes from the multiple alternative prediction modes according to the sequence of the probability of each alternative prediction mode adopted by the pixel block from large to small, respectively encoding the pixel block according to each selected alternative prediction mode, obtaining the rate distortion value of the pixel block, and selecting the alternative prediction mode with the minimum rate distortion value as the prediction mode of the pixel block.
Optionally, the apparatus further comprises:
the second video frame acquisition module is used for acquiring a video frame which is coded in the target video;
the information acquisition module is used for acquiring the pixel information and the prediction mode of each pixel block in the video frame and taking each acquired prediction mode as an alternative prediction mode;
and the second distribution acquisition module is used for acquiring the conditional probability distribution of the candidate prediction mode about the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame.
Optionally, the second distribution obtaining module includes:
the first acquisition unit is used for acquiring the pixel block number of any one alternative prediction mode corresponding to any one gradient direction according to the gradient direction and the prediction mode of the pixel blocks; determining the probability of any one candidate prediction mode about any one gradient direction according to the number of pixel blocks corresponding to any one gradient direction and the total number of the pixel blocks; the probabilities of the plurality of candidate prediction modes with respect to the plurality of gradient directions are combined into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
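The counting scheme of the first acquisition unit can be sketched as below. The translated text leaves the normalization slightly ambiguous, so this sketch, as one natural reading, normalizes the (gradient direction, mode) counts per gradient direction, giving a conditional probability P(Q|D); the function name and data layout are assumptions:

```python
from collections import Counter

def first_conditional_distribution(blocks):
    """blocks: list of (gradient_direction, prediction_mode) pairs,
    one per already-encoded pixel block.

    Returns a dict mapping (mode, direction) to P(mode | direction).
    """
    pair_counts = Counter(blocks)                     # (direction, mode) -> count
    direction_totals = Counter(d for d, _ in blocks)  # direction -> count
    return {(mode, direction): count / direction_totals[direction]
            for (direction, mode), count in pair_counts.items()}
```

The same counting-and-normalizing pattern applies to the other items of pixel information, with the gradient direction replaced by the corresponding conditioning item.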
Optionally, the second distribution obtaining module includes:
the second acquisition unit is used for forming a prediction mode combination by the prediction mode of any pixel block in the video frame and the prediction mode of a reference pixel block of any pixel block to obtain a plurality of prediction mode combinations; acquiring the occurrence frequency of each prediction mode combination in a video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency; the probabilities of each combination of prediction modes constitute a second conditional probability distribution of the candidate prediction modes with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being intended to represent the probability that the pixel block to be encoded adopts any one of the prediction modes on condition that the reference pixel block of the pixel block to be encoded adopts any one of the prediction modes.
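One reading of the second acquisition unit is sketched below: combination frequencies are converted into probabilities, and the conditional probability P(Q|X) is obtained by dividing each combination probability by the probability of its reference-block mode. This is an illustrative assumption about the normalization, not a verbatim transcription of the embodiment:

```python
from collections import Counter

def second_conditional_distribution(combinations):
    """combinations: list of (reference_block_mode, block_mode) pairs,
    one per already-encoded pixel block.

    Returns a dict mapping (mode, ref_mode) to P(mode | ref_mode).
    """
    combo_counts = Counter(combinations)
    total = len(combinations)
    combo_probs = {c: n / total for c, n in combo_counts.items()}  # P(X, Q)
    ref_counts = Counter(x for x, _ in combinations)               # counts of X
    # P(Q | X) = P(X, Q) / P(X), with P(X) = ref_counts[x] / total.
    return {(q, x): combo_probs[(x, q)] * total / ref_counts[x]
            for (x, q) in combo_probs}
```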
Optionally, the second distribution obtaining module includes:
the third acquisition unit is used for calculating the number of pixel blocks corresponding to any position identifier of any alternative prediction mode according to the position identifiers and the prediction modes of the plurality of pixel blocks; determining the probability of any one candidate prediction mode about any one position identifier according to the number of pixel blocks corresponding to any one position identifier and the total number of the plurality of pixel blocks of any one candidate prediction mode; the probabilities of the plurality of candidate prediction modes with respect to the plurality of location identities are formed into a third conditional probability distribution of the candidate prediction modes with respect to the location identities.
All the above-mentioned optional technical solutions can be combined arbitrarily to form the optional embodiments of the present invention, and are not described herein again.
It should be noted that: in the prediction mode selection apparatus provided in the foregoing embodiment, when the prediction mode is selected, only the division of the functional modules is illustrated, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the encoding apparatus is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the prediction mode selection apparatus and the prediction mode selection method provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments and are not described herein again.
Fig. 5 is a schematic structural diagram of a server according to an embodiment of the present invention, where the server 500 may generate a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 501 and one or more memories 502, where the memory 502 stores at least one instruction, and the at least one instruction is loaded and executed by the processors 501 to implement the methods provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input/output, and the server may also include other components for implementing the functions of the device, which are not described herein again.
The server 500 may be configured to perform the steps performed by the encoding apparatus in the prediction mode selection method described above.
Fig. 6 is a schematic structural diagram of a terminal 600 according to an embodiment of the present invention. The terminal 600 may be a portable mobile terminal such as: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, a desktop computer, a head-mounted device, or any other intelligent terminal. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 600 includes: a processor 601 and a memory 602.
The processor 601 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), is a processor for processing data in the awake state; the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for handling computational operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 602 is used to store at least one instruction, which is to be loaded and executed by the processor 601 to implement the prediction mode selection methods provided by the method embodiments herein.
In some embodiments, the terminal 600 may further optionally include: a peripheral interface 603 and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 603 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 604, a touch screen display 605, a camera 606, an audio circuit 607, a positioning component 608, and a power supply 609.
The peripheral interface 603 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 601 and the memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 604 communicates with communication networks and other communication devices via electromagnetic signals. The RF circuit 604 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 604 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the RF circuit 604 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 also has the ability to capture touch signals on or over the surface of the display screen 605. The touch signal may be input to the processor 601 as a control signal for processing. At this point, the display 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 605 may be one, providing the front panel of the terminal 600; in other embodiments, the display 605 may be at least two, respectively disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display 605 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 600. Even more, the display 605 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display 605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 606 is used to capture images or video. Optionally, camera assembly 606 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 606 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
Audio circuitry 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 601 for processing or inputting the electric signals to the radio frequency circuit 604 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 600. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 607 may also include a headphone jack.
The positioning component 608 is used for positioning the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 609 is used to provide power to the various components in terminal 600. The power supply 609 may be ac, dc, disposable or rechargeable. When the power supply 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, the terminal 600 also includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyro sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
The acceleration sensor 611 may detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 601 may control the touch screen display 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 612 may detect a body direction and a rotation angle of the terminal 600, and the gyro sensor 612 and the acceleration sensor 611 may cooperate to acquire a 3D motion of the user on the terminal 600. The processor 601 may implement the following functions according to the data collected by the gyro sensor 612: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 613 may be disposed on a side frame of the terminal 600 and/or on a lower layer of the touch display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, a user's holding signal of the terminal 600 can be detected, and the processor 601 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the touch display screen 605, the processor 601 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 605. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 614 is used for collecting a fingerprint of a user, and the processor 601 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the identity of the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 601 authorizes the user to have relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal 600. When a physical key or vendor Logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical key or vendor Logo.
The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, processor 601 may control the display brightness of touch display 605 based on the ambient light intensity collected by optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 605 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 605 is turned down. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
A proximity sensor 616, also known as a distance sensor, is typically disposed on the front panel of the terminal 600. The proximity sensor 616 is used to collect the distance between the user and the front surface of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually decreases, the processor 601 controls the touch display 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front surface of the terminal 600 gradually increases, the processor 601 controls the touch display 605 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 6 is not intended to be limiting of terminal 600 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
The embodiment of the present invention further provides a prediction mode selection apparatus, where the prediction mode selection apparatus includes a processor and a memory, where the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the operations performed in the prediction mode selection method of the foregoing embodiment.
The embodiment of the present invention also provides a computer-readable storage medium, where at least one instruction is stored in the computer-readable storage medium, and the instruction is loaded and executed by a processor to implement the operations performed in the prediction mode selection method of the foregoing embodiment.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (18)

1. A method of prediction mode selection, the method comprising:
acquiring a video frame to be coded in a target video;
acquiring conditional probability distribution of candidate prediction modes about pixel information, wherein the conditional probability distribution is determined according to the pixel information and the prediction modes of a plurality of pixel blocks which are coded in the target video, the conditional probability distribution is used for expressing the probability that the pixel block to be coded adopts any candidate prediction mode under the condition of meeting any pixel information, and the pixel information of the pixel block comprises at least one of the gradient direction of the pixel block, the prediction mode of a reference pixel block of the pixel block and the position identification of the pixel block;
for each pixel block in the video frame, acquiring the probability of each alternative prediction mode adopted by the pixel block according to the conditional probability distribution and the pixel information of the pixel block;
and selecting the prediction mode of the pixel block from a plurality of candidate prediction modes according to the probability of adopting each candidate prediction mode by the pixel block.
2. The method according to claim 1, wherein said obtaining the probability of said pixel block adopting said each candidate prediction mode according to said conditional probability distribution and the pixel information of said pixel block comprises at least one of:
acquiring the probability of each candidate prediction mode adopted by the pixel block according to the first conditional probability distribution of the candidate prediction modes about the gradient direction and the gradient direction of the pixel block;
acquiring the probability of each candidate prediction mode adopted by the pixel block according to the second conditional probability distribution of the candidate prediction mode relative to the prediction mode of the reference pixel block and the prediction mode of the reference pixel block of the pixel block;
and acquiring the probability that the pixel block adopts each candidate prediction mode according to the third conditional probability distribution of the candidate prediction modes about position identifications and the position identification of the pixel block.
3. The method according to claim 2, wherein said obtaining the probability of the pixel block adopting the each candidate prediction mode according to the conditional probability distribution and the pixel information of the pixel block comprises:
for each candidate prediction mode, when the pixel information includes a plurality of items among a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a position identification of the pixel block, calculating a product of probabilities that the pixel block adopts the candidate prediction mode under each item of pixel information as a probability that the pixel block adopts the candidate prediction mode.
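Claim 3's combination rule multiplies the per-item conditional probabilities, which amounts to a naive-independence assumption across the items of pixel information. A minimal sketch (function name is illustrative, not from the patent):

```python
import math

def joint_mode_probability(per_item_probs):
    """Combine the conditional probabilities of one candidate mode under each
    available item of pixel information (gradient direction, reference-block
    mode, position identification) by taking their product, as in claim 3."""
    return math.prod(per_item_probs)

# Hypothetical per-item probabilities for one candidate mode:
p = joint_mode_probability([0.5, 0.4, 0.25])  # 0.5 * 0.4 * 0.25 = 0.05
```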
4. The method according to claim 1, wherein said selecting the prediction mode of said block of pixels from a plurality of candidate prediction modes based on the probability of said block of pixels adopting said each candidate prediction mode comprises:
according to the probability of the pixel block adopting each candidate prediction mode, selecting the candidate prediction mode with the highest probability from the multiple candidate prediction modes as the prediction mode of the pixel block; or,
selecting a preset number of candidate prediction modes from the multiple candidate prediction modes in descending order of the probability that the pixel block adopts each candidate prediction mode, respectively encoding the pixel block according to each selected candidate prediction mode to obtain the rate-distortion value of the pixel block, and selecting the candidate prediction mode with the minimum rate-distortion value as the prediction mode of the pixel block.
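The second branch of claim 4 (shortlist by probability, then decide by rate-distortion) could be sketched as below; `encode_rd_cost` stands in for the encoder's actual rate-distortion measurement and is an assumed callable, not an API from the patent:

```python
def select_by_rate_distortion(mode_probs, encode_rd_cost, preset_n):
    """Keep the preset number of most-probable candidate modes, 'encode' the
    block with each, and return the mode with the smallest RD value."""
    shortlist = sorted(mode_probs, key=mode_probs.get, reverse=True)[:preset_n]
    return min(shortlist, key=encode_rd_cost)

# Hypothetical probabilities and RD costs for four candidate modes:
probs = {"DC": 0.5, "H": 0.3, "V": 0.15, "PLANAR": 0.05}
rd_costs = {"DC": 120.0, "H": 90.0, "V": 40.0, "PLANAR": 10.0}
# With a shortlist of 2, only DC and H are actually encoded; H wins on RD
# cost, even though V and PLANAR have lower costs outside the shortlist.
best = select_by_rate_distortion(probs, rd_costs.get, 2)
```

The shortlist keeps the expensive RD search bounded: only `preset_n` trial encodes are needed instead of one per candidate mode.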
5. The method of claim 1, further comprising:
acquiring a coded video frame in the target video;
acquiring pixel information and a prediction mode of each pixel block in the video frame, and taking each acquired prediction mode as a candidate prediction mode;
and acquiring the conditional probability distribution of the candidate prediction mode about the pixel information according to the pixel information and the prediction mode of a plurality of pixel blocks in the video frame.
6. The method according to claim 5, wherein said obtaining a conditional probability distribution of the candidate prediction mode with respect to pixel information according to pixel information and prediction modes of a plurality of pixel blocks in the video frame comprises:
acquiring the number of pixel blocks corresponding to any one of the candidate prediction modes and any one of the gradient directions according to the gradient directions and the prediction modes of the plurality of pixel blocks;
determining the probability of any one candidate prediction mode about any one gradient direction according to the number of pixel blocks corresponding to the any one candidate prediction mode and the total number of the pixel blocks;
and composing the probabilities of the plurality of candidate prediction modes with respect to a plurality of gradient directions into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
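Claim 6's counting step can be sketched as follows. This is an assumed reading, not the patent's code: each coded block contributes a (gradient direction, mode) pair, and, following the claim's wording, each pair count is normalised by the total number of pixel blocks:

```python
from collections import Counter

def first_conditional_distribution(coded_blocks):
    """Estimate the first conditional probability distribution of claim 6
    from coded blocks given as (gradient_direction, mode) pairs; each
    (direction, mode) count is divided by the total block count."""
    pair_counts = Counter(coded_blocks)
    total = len(coded_blocks)
    return {key: n / total for key, n in pair_counts.items()}

# Hypothetical coded blocks from one frame:
blocks = [("horiz", "H"), ("horiz", "H"), ("horiz", "DC"), ("vert", "V")]
dist = first_conditional_distribution(blocks)  # {("horiz", "H"): 0.5, ...}
```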
7. The method according to claim 5, wherein said obtaining a conditional probability distribution of the candidate prediction mode with respect to pixel information according to pixel information and prediction modes of a plurality of pixel blocks in the video frame comprises:
forming a prediction mode combination by the prediction mode of any pixel block in the video frame and the prediction mode of the reference pixel block of any pixel block to obtain various prediction mode combinations;
acquiring the occurrence frequency of each prediction mode combination in the video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency;
and composing the probability of each prediction mode combination into a second conditional probability distribution of the candidate prediction mode with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being used for representing the probability that the pixel block to be encoded adopts any one of the prediction modes on the condition that the reference pixel block of the pixel block to be encoded adopts any one of the prediction modes.
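Claim 7's distribution over prediction-mode combinations can be sketched the same way (names are illustrative): each coded block contributes the pair (reference-block mode, own mode), and the probability of each combination is its relative frequency of occurrence:

```python
from collections import Counter

def second_conditional_distribution(mode_pairs):
    """Estimate the second conditional probability distribution of claim 7
    from (reference_block_mode, block_mode) combinations observed in the
    coded frame, as relative frequencies of occurrence."""
    combo_counts = Counter(mode_pairs)
    total = len(mode_pairs)
    return {combo: n / total for combo, n in combo_counts.items()}

# Hypothetical (reference mode, block mode) pairs from a coded frame:
pairs = [("DC", "DC"), ("DC", "H"), ("DC", "DC"), ("V", "V")]
dist = second_conditional_distribution(pairs)  # {("DC", "DC"): 0.5, ...}
```

The third distribution of claim 8 is built analogously, with the position identification of the block taking the place of the reference-block mode.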
8. The method according to claim 5, wherein said obtaining a conditional probability distribution of the candidate prediction mode with respect to pixel information according to pixel information and prediction modes of a plurality of pixel blocks in the video frame comprises:
calculating the number of pixel blocks corresponding to any position identification in any candidate prediction mode according to the position identifications and the prediction modes of the plurality of pixel blocks;
determining the probability of any one candidate prediction mode about any one position identification according to the number of pixel blocks corresponding to the any one position identification of the any one candidate prediction mode and the total number of the pixel blocks;
and forming a third conditional probability distribution of the candidate prediction modes with respect to the position identifications by using the probabilities of the candidate prediction modes with respect to the position identifications.
9. A prediction mode selection apparatus, the apparatus comprising:
the first video frame acquisition module is used for acquiring a video frame to be coded in a target video;
a first distribution obtaining module, configured to obtain a conditional probability distribution of candidate prediction modes with respect to pixel information, where the conditional probability distribution is determined according to pixel information and a prediction mode of a plurality of pixel blocks that have been encoded in the target video, and the conditional probability distribution is used to represent a probability that a pixel block to be encoded adopts any one of the candidate prediction modes under a condition that any one of the pixel information is satisfied, where the pixel information of the pixel block includes at least one of a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a location identifier of the pixel block;
a probability obtaining module, configured to obtain, for each pixel block in the video frame, a probability that the pixel block adopts each candidate prediction mode according to the conditional probability distribution and pixel information of the pixel block;
and the selection module is used for selecting the prediction mode of the pixel block from a plurality of candidate prediction modes according to the probability that the pixel block adopts each candidate prediction mode.
10. The apparatus of claim 9, wherein the probability obtaining module comprises at least one of:
a first probability obtaining unit, configured to obtain a probability that the pixel block adopts each candidate prediction mode according to a first conditional probability distribution of the candidate prediction modes with respect to a gradient direction and the gradient direction of the pixel block;
a second probability obtaining unit, configured to obtain, according to a second conditional probability distribution of candidate prediction modes with respect to a prediction mode of a reference pixel block and a prediction mode of the reference pixel block of the pixel block, a probability that the pixel block adopts each of the candidate prediction modes;
and the third probability acquisition unit is used for acquiring the probability that the pixel block adopts each candidate prediction mode according to the third conditional probability distribution of the candidate prediction modes about position identifications and the position identification of the pixel block.
11. The apparatus of claim 10, wherein the probability obtaining module comprises:
a calculation unit configured to, for each candidate prediction mode, calculate, as a probability that the pixel block adopts the candidate prediction mode, a product of probabilities that the pixel block adopts the candidate prediction mode for each item of pixel information when the pixel information includes a plurality of items among a gradient direction of the pixel block, a prediction mode of a reference pixel block of the pixel block, and a position identification of the pixel block.
12. The apparatus of claim 9, wherein the selection module is configured to:
according to the probability of the pixel block adopting each candidate prediction mode, selecting the candidate prediction mode with the highest probability from the multiple candidate prediction modes as the prediction mode of the pixel block; or,
selecting a preset number of candidate prediction modes from the multiple candidate prediction modes in descending order of the probability that the pixel block adopts each candidate prediction mode, respectively encoding the pixel block according to each selected candidate prediction mode to obtain the rate-distortion value of the pixel block, and selecting the candidate prediction mode with the minimum rate-distortion value as the prediction mode of the pixel block.
13. The apparatus of claim 9, further comprising:
the second video frame acquisition module is used for acquiring the video frames which are coded in the target video;
the information acquisition module is used for acquiring the pixel information and the prediction mode of each pixel block in the video frame and taking each acquired prediction mode as a candidate prediction mode;
and the second distribution acquisition module is used for acquiring the conditional probability distribution of the candidate prediction mode about the pixel information according to the pixel information and the prediction mode of the pixel blocks in the video frame.
14. The apparatus of claim 13, wherein the second distribution obtaining module comprises:
a first obtaining unit, configured to obtain, according to the gradient direction and the prediction mode of the plurality of pixel blocks, the number of pixel blocks corresponding to any one of the candidate prediction modes and any one of the gradient directions; determining the probability of any one candidate prediction mode about any one gradient direction according to the number of pixel blocks corresponding to the any one candidate prediction mode and the total number of the pixel blocks; and composing the probabilities of the plurality of candidate prediction modes with respect to a plurality of gradient directions into a first conditional probability distribution of the candidate prediction modes with respect to the gradient directions.
15. The apparatus of claim 13, wherein the second distribution obtaining module comprises:
the second acquisition unit is used for forming a prediction mode combination by the prediction mode of any pixel block in the video frame and the prediction mode of a reference pixel block of any pixel block to obtain a plurality of prediction mode combinations; acquiring the occurrence frequency of each prediction mode combination in the video frame, and acquiring the probability of each prediction mode combination according to the acquired occurrence frequency; and composing the probability of each prediction mode combination into a second conditional probability distribution of the candidate prediction mode with respect to the prediction mode of the reference pixel block, the second conditional probability distribution being used for representing the probability that the pixel block to be encoded adopts any one of the prediction modes on the condition that the reference pixel block of the pixel block to be encoded adopts any one of the prediction modes.
16. The apparatus of claim 13, wherein the second distribution obtaining module comprises:
the third acquisition unit is used for calculating the number of pixel blocks corresponding to any position identification in any candidate prediction mode according to the position identifications and the prediction modes of the plurality of pixel blocks; determining the probability of any one candidate prediction mode about any one position identification according to the number of pixel blocks corresponding to the any one position identification of the any one candidate prediction mode and the total number of the pixel blocks; and forming a third conditional probability distribution of the candidate prediction modes with respect to the position identifications by using the probabilities of the candidate prediction modes with respect to the position identifications.
17. A prediction mode selection apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, the instruction being loaded and executed by the processor to carry out the operations carried out in the prediction mode selection method according to any one of claims 1 to 8.
18. A computer-readable storage medium having stored therein at least one instruction, which is loaded and executed by a processor to perform operations performed in the prediction mode selection method according to any one of claims 1 to 8.
CN201810941893.8A 2018-08-17 2018-08-17 Prediction mode selection method, device and storage medium Active CN109040753B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810941893.8A CN109040753B (en) 2018-08-17 2018-08-17 Prediction mode selection method, device and storage medium

Publications (2)

Publication Number Publication Date
CN109040753A CN109040753A (en) 2018-12-18
CN109040753B true CN109040753B (en) 2020-12-22

Family

ID=64631042

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810941893.8A Active CN109040753B (en) 2018-08-17 2018-08-17 Prediction mode selection method, device and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110826496B (en) * 2019-11-07 2023-04-07 腾讯科技(深圳)有限公司 Crowd density estimation method, device, equipment and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN102025995A (en) * 2010-12-10 2011-04-20 浙江大学 Spatial enhancement layer rapid mode selection method of scalable video coding
CN106851298A (en) * 2017-03-22 2017-06-13 腾讯科技(深圳)有限公司 A kind of efficient video coding method and device
CN104333754B (en) * 2014-11-03 2017-06-13 西安电子科技大学 Based on the SHVC enhancement-layer video coding methods that predictive mode is quickly selected

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US7953152B1 (en) * 2004-06-28 2011-05-31 Google Inc. Video compression and encoding method

Non-Patent Citations (2)

Title
Fast HEVC intra-prediction mode decision based on conditional selection with hybrid cost ranking; Hao Liu; 2015 IEEE Workshop on Signal Processing Systems; 2015-10-16; entire document *
A fast algorithm for HEVC intra prediction (一种HEVC帧内预测快速算法); Zhang Xinchen; Computer Engineering; 2014-10-15; entire document *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant