CN108024111B - Frame type judgment method and device

Info

Publication number: CN108024111B
Application number: CN201610966433.1A
Authority: CN
Inventors: 张贤国; 朱政; 金星; 张二丽; 范娟婷
Original assignee: Beijing Kingsoft Cloud Network Technology Co Ltd; Beijing Kingsoft Cloud Technology Co Ltd
Current assignee: Beijing Kingsoft Cloud Network Technology Co Ltd; Beijing Kingsoft Cloud Technology Co Ltd
Priority date: 2016-10-28
Filing date: 2016-10-28
Publication date: 2019-12-06
Anticipated expiration: 2036-10-28
Also published as: CN108024111A

Abstract

An embodiment of the invention discloses a frame type judgment method and device. The method is applied to an encoder and comprises the following steps: determining any I frame coded before an image to be coded as a target I frame; analyzing the coding result of the first type of video frame to obtain a first statistical model value; analyzing the coding result of the second type of video frame to obtain a second statistical model value; judging, according to the first statistical model value and the second statistical model value, whether the image to be coded meets a preset I frame selection condition; and if so, judging the frame type of the image to be coded as an I frame. Applying the scheme provided by the embodiment of the invention to video coding can increase the coding speed, improve the video compression efficiency, reduce the video coding loss, and improve the video coding efficiency.

Description

Frame type judgment method and device

Technical Field

The present invention relates to the field of video coding technologies, and in particular, to a method and an apparatus for determining a frame type.

Background

With the continuous development of digital video services in multimedia applications and the growing demand for video cloud computing, the bandwidth and storage resources of existing wired and wireless transmission networks cannot bear the large data volume of the original video source. Data compression of video signals during transmission and storage has therefore become one of the hot spots of research and application at home and abroad. Video data compression, also known as video coding, aims to eliminate the various data redundancies of video signals.

Over the last decades, standardization working organizations at home and abroad have successively established various video coding standards. In order to reduce the transmission bandwidth and storage space occupied by video data, these standards use prediction, transform, scanning, quantization, entropy coding, and similar techniques to effectively reduce the various data redundancies. Prediction is a common technique; according to the inter-frame or intra-frame prediction reference relation, it divides coded image frames into three frame types: I frames, P frames, and B frames. An I frame is an intra-frame prediction image. In a typical encoder implementation it has a small quantization parameter, its mode decision is biased toward low-loss modes, and its objective coding loss is small, so it can improve the compression efficiency of a subsequent image when used as that image's prediction reference. Therefore, in addition to providing random access, I frames often serve to periodically recover video coding loss and prevent video quality from degrading.

In the prior art, the interval between two consecutive I frames is generally set in advance. When I frames are set in this manner for video encoding, the interval is generally set to a large value, so that many P frames and B frames exist between two consecutive I frames. When the interval between two consecutive I frames is large, the encoding efficiency is low.

Disclosure of Invention

The embodiments of the invention disclose a frame type judgment method and a frame type judgment device, which are used for improving the coding efficiency of a video. The technical scheme is as follows:

To achieve the above object, in a first aspect, an embodiment of the present invention provides a frame type determining method applied to an encoder, where the method includes:

determining any I frame coded before an image to be coded as a target I frame;

analyzing the coding result of a first type of video frame to obtain a first statistical model value, wherein the first type of video frame is: the I frames from the target I frame to the image to be coded, in coding order;

analyzing the coding result of a second type of video frame to obtain a second statistical model value, wherein the second type of video frame is: the non-I frames from the target I frame to the image to be coded, in coding order;

judging whether the image to be coded meets a preset I frame selection condition according to the first statistical model value and the second statistical model value;

and if so, judging the frame type of the image to be coded as an I frame.

Preferably, the judging whether the image to be coded meets a preset I frame selection condition according to the first statistical model value and the second statistical model value includes:

calculating a difference value and/or a ratio value between the first statistical model value and the second statistical model value to obtain a calculation result;

and judging whether the image to be coded meets the preset I frame selection condition according to the calculation result.

Preferably, the method further comprises:

determining whether a scene change occurs between images in a target image set, wherein the target image set consists of a first preset number of consecutive frame images, in coding order, that include the image to be coded;

the judging whether the image to be coded meets a preset I frame selection condition according to the first statistical model value and the second statistical model value includes:

judging whether the following relation is satisfied, and if so, judging that the image to be coded meets the preset I frame selection condition:

the first statistical model value and the second statistical model value meet a preset numerical relationship, and no scene change occurs between the images in the target image set.

Preferably, the determining whether a scene change occurs between the images in the target image set includes:

determining whether a first target image has a scene change relative to a second target image, and if so, judging that a scene change occurs between the images in the target image set; wherein

the first target image is: a second preset number of frame images located before and/or after the image to be coded in coding order;

the second target image is: a reference image of the first target image, or the image of the frame preceding the first target image in coding order.

Preferably, the first target image is: a third preset number of frame images in the image group where the image to be coded is currently located;

the determining whether the first target image has a scene change relative to the second target image includes:

calculating the sum of the prediction distortion of the first target image relative to the second target image to obtain a first sum value;

calculating the sum of the intra-frame prediction distortion of the second target image to obtain a second sum value;

calculating the ratio of the first sum value to the second sum value;

judging whether the ratio is within a preset value interval;

if so, judging that the first target image has a scene change relative to the second target image;

and if not, judging that the first target image has no scene change relative to the second target image.
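The ratio test above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the per-block distortion lists, the closed interval, and the handling of a zero second sum are all hypothetical and are not specified by the text.

```python
from typing import Sequence, Tuple


def scene_change_by_distortion_ratio(
    prediction_distortions: Sequence[float],
    intra_distortions: Sequence[float],
    interval: Tuple[float, float],
) -> bool:
    """Return True when first_sum / second_sum lies in the preset interval."""
    # First sum: prediction distortion of the first target image
    # relative to the second target image.
    first_sum = sum(prediction_distortions)
    # Second sum: intra-frame prediction distortion of the second target image.
    second_sum = sum(intra_distortions)
    if second_sum == 0:
        # Assumption: without an intra distortion baseline, report no change.
        return False
    ratio = first_sum / second_sum
    low, high = interval  # hypothetical preset value interval, inclusive
    return low <= ratio <= high
```

For instance, with a hypothetical interval of (0.9, infinity), a first target image that is predicted about as poorly from the second target image as intra prediction would be treated as a scene change.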

Preferably, the first target image is: a fourth preset number of frame images in the image group where the image to be coded is currently located;

the determining whether the first target image has a scene change relative to the second target image includes:

obtaining the motion amplitude of each image block in each first target image relative to the second target image corresponding to that first target image;

counting, according to the obtained motion amplitudes, the image blocks in each first target image whose motion amplitude is larger than a first preset threshold;

calculating, for each first target image, the proportion of the whole frame occupied by the counted image blocks;

and judging whether the first target image has a scene change relative to the second target image according to the calculated proportion.
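A minimal sketch of the motion-amplitude test follows. It assumes, hypothetically, that each image block has one motion vector, that the motion amplitude is the Euclidean length of that vector, that all blocks have equal size, and that a scene change is reported when the proportion exceeds a second, hypothetical proportion threshold; the text itself only says the judgment is made according to the calculated proportion.

```python
import math
from typing import Sequence, Tuple


def high_motion_proportion(
    motion_vectors: Sequence[Tuple[float, float]],
    amplitude_threshold: float,
) -> float:
    """Fraction of blocks whose motion amplitude exceeds the threshold."""
    if not motion_vectors:
        return 0.0
    # Assumption: amplitude is the Euclidean norm of the block motion vector.
    count = sum(
        1 for dx, dy in motion_vectors
        if math.hypot(dx, dy) > amplitude_threshold
    )
    # Assumption: equal-sized blocks, so block count stands in for frame area.
    return count / len(motion_vectors)


def scene_change_by_motion(
    motion_vectors: Sequence[Tuple[float, float]],
    amplitude_threshold: float,
    proportion_threshold: float,
) -> bool:
    """Hypothetical decision rule: too many fast-moving blocks."""
    proportion = high_motion_proportion(motion_vectors, amplitude_threshold)
    return proportion > proportion_threshold
```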

Preferably, the determining whether a scene change occurs between the images in the target image set includes:

determining whether a subsequent image of the image to be coded has a scene change relative to the image to be coded, and if so, judging that a scene change occurs between the images in the target image set, wherein the subsequent image is a fifth preset number of frame images following the image to be coded in coding order.

Preferably, the determining whether a subsequent image of the image to be coded has a scene change relative to the image to be coded includes:

obtaining the motion amplitude of each image block in each subsequent image relative to the image to be coded;

counting, according to the obtained motion amplitudes, the image blocks in each subsequent image whose motion amplitude is larger than a second preset threshold;

calculating, for each subsequent image, the proportion of the whole frame occupied by the counted image blocks;

and judging whether the subsequent image of the image to be coded has a scene change relative to the image to be coded according to the calculated proportion.

Preferably, the determining whether a subsequent image of the image to be coded has a scene change relative to the image to be coded includes:

calculating, for each frame of the subsequent images, a first prediction distortion relative to the image to be coded;

and judging whether the subsequent image of the image to be coded has a scene change relative to the image to be coded according to all the first prediction distortions.

Preferably, the determining whether a scene change occurs between the images in the target image set includes:

determining, based on the current reference relationship, whether the content of the image to be coded changes relative to its reference image, and if so, judging that a scene change occurs between the images in the target image set.

Preferably, the determining whether the content of the image to be coded changes relative to its reference image based on the current reference relationship includes:

calculating, based on the current reference relationship, a target motion value of the image to be coded relative to its nearest reference image, wherein the nearest reference image is: among all the reference images corresponding to the image to be coded, the reference image closest to the image to be coded in display order; and the target motion value is: the proportion of the whole frame occupied by the data blocks in the image to be coded whose motion amplitude is larger than a third preset threshold;

judging whether the target motion value is larger than a fourth preset threshold;

if so, judging that the content of the image to be coded changes relative to the reference image;

and if not, judging that the content of the image to be coded does not change relative to the reference image.
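The two ingredients of this test, picking the nearest reference image in display order and comparing the target motion value with the fourth threshold, can be sketched as below. The function names and the use of integer display indices are assumptions made for illustration; equal-sized data blocks are also assumed.

```python
from typing import Sequence


def nearest_reference(
    current_display_index: int,
    reference_display_indices: Sequence[int],
) -> int:
    """Pick the reference image closest to the current image in display order."""
    # On a tie, min() keeps the first candidate listed (an arbitrary choice).
    return min(
        reference_display_indices,
        key=lambda index: abs(index - current_display_index),
    )


def content_changed(
    block_amplitudes: Sequence[float],
    third_threshold: float,
    fourth_threshold: float,
) -> bool:
    """Compare the target motion value with the fourth preset threshold."""
    if not block_amplitudes:
        return False
    moving = sum(1 for a in block_amplitudes if a > third_threshold)
    # Target motion value: share of blocks whose amplitude exceeds the
    # third threshold (assumes equal-sized data blocks).
    target_motion_value = moving / len(block_amplitudes)
    return target_motion_value > fourth_threshold
```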

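Preferably, the determining whether the content of the image to be coded changes relative to its reference image based on the current reference relationship includes:

calculating, based on the current reference relationship, a second prediction distortion of the image to be coded relative to its nearest reference image, wherein the nearest reference image is, among all the reference images corresponding to the image to be coded, the reference image closest to the image to be coded in video display order;

judging whether the second prediction distortion is larger than a fifth preset threshold;

if so, judging that the content of the image to be coded changes relative to the reference image;

and if not, judging that the content of the image to be coded does not change relative to the reference image.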

Preferably, after the judging the frame type of the image to be coded as an I frame, the method further includes:

resetting the frame type of each image whose frame type has already been determined but whose encoding has not yet started.

Preferably, the first statistical model value and the second statistical model value each include at least one of: an average bit number, a total bit number, a maximum peak signal-to-noise ratio, a minimum peak signal-to-noise ratio, an average peak signal-to-noise ratio, a maximum structure similarity value, a minimum structure similarity value, an average structure similarity value, a maximum residual square sum of data blocks of a preset size, a minimum residual square sum of data blocks of the preset size, or an average residual square sum of data blocks of the preset size.

In a second aspect, an embodiment of the present invention further provides a frame type determining apparatus applied to an encoder, where the apparatus includes:

a first determining module, configured to determine any I frame coded before the image to be coded as a target I frame;

a first analysis module, configured to analyze a coding result of a first type of video frame to obtain a first statistical model value, where the first type of video frame is: the I frames from the target I frame to the image to be coded, in coding order;

a second analysis module, configured to analyze a coding result of a second type of video frame to obtain a second statistical model value, where the second type of video frame is: the non-I frames from the target I frame to the image to be coded, in coding order;

a judging module, configured to judge whether the image to be coded meets a preset I frame selection condition according to the first statistical model value and the second statistical model value;

and a determination module, configured to determine the frame type of the image to be coded as an I frame when the judgment result of the judging module is yes.

Preferably, the judging module includes:

a first judgment submodule, configured to calculate a difference value and/or a ratio between the first statistical model value and the second statistical model value to obtain a calculation result;

and a second judgment submodule, configured to judge whether the image to be coded meets the preset I frame selection condition according to the calculation result.

Preferably, the apparatus further comprises:

a second determining module, configured to determine whether a scene change occurs between images in a target image set, wherein the target image set consists of a first preset number of consecutive frame images, in coding order, that include the image to be coded;

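the judging module is specifically configured to judge whether the following relationship holds, and if so, judge that the image to be coded meets the preset I frame selection condition:

the first statistical model value and the second statistical model value meet a preset numerical relationship, and no scene change occurs between the images in the target image set.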

Preferably, the second determining module is specifically configured to:

The second determining module includes:

The first calculation submodule is used for calculating the sum of the prediction distortion of the first target image compared with the second target image to obtain a first sum value;

the second calculation submodule is used for calculating the sum of the intra-frame prediction distortion of the second target image to obtain a second sum value;

a third calculation submodule for calculating a ratio of the first sum and the second sum;

the third judgment submodule is used for judging whether the ratio is in a preset value interval or not;

The first judgment sub-module is used for correspondingly judging that the scene change of the first target image relative to the second target image occurs under the condition that the judgment result of the third judgment sub-module is yes; and correspondingly judging that the first target image has no scene change relative to the second target image under the condition that the judgment result of the third judgment submodule is negative.

The second determining module includes:

a first obtaining submodule, configured to obtain the motion amplitude of each image block in each first target image relative to the second target image corresponding to that first target image;

a first statistics submodule, configured to count, according to the obtained motion amplitudes, the image blocks in each first target image whose motion amplitude is larger than a first preset threshold;

a fourth calculation submodule, configured to calculate, for each first target image, the proportion of the whole frame occupied by the counted image blocks;

and a fourth judgment submodule, configured to judge whether the first target image has a scene change relative to the second target image according to the calculated proportion.

Preferably, the second determining module is specifically configured to:

Preferably, the second determining module includes:

a second obtaining submodule, configured to obtain the motion amplitude of each image block in each subsequent image relative to the image to be coded;

The second counting submodule is used for counting the image blocks of which the motion amplitudes are larger than a second preset threshold in each subsequent image according to the obtained motion amplitudes;

the fifth calculation submodule is used for respectively calculating the proportion of the image blocks to the whole frame of image, which is obtained by statistics, aiming at each subsequent image;

and the fifth judgment submodule is used for judging whether the scene change of the subsequent image of the image to be coded relative to the image to be coded occurs according to the calculated proportion.

Preferably, the second determining module includes:

a sixth calculation submodule, configured to calculate, for each frame of the subsequent images, a first prediction distortion relative to the image to be coded;

and a sixth judgment submodule, configured to judge whether the subsequent image of the image to be coded has a scene change relative to the image to be coded according to all the first prediction distortions.

preferably, the second determining module is specifically configured to:

preferably, the second determining module includes:

A seventh calculating sub-module, configured to calculate, based on the current reference relationship, a target motion value of the image to be encoded relative to a nearest reference image corresponding to the image to be encoded, where the nearest reference image is: according to the display sequence, in all the reference images corresponding to the image to be coded, the reference image closest to the image to be coded, and the target motion value is as follows: in the image to be coded, the data block with the motion amplitude larger than a third preset threshold value accounts for the proportion of the whole frame of image;

A seventh judging submodule, configured to judge whether the target motion value is greater than a fourth preset threshold;

The second judging submodule is used for judging that the content of the image to be coded changes relative to the reference image of the image to be coded under the condition that the judgment result of the seventh judging submodule is positive; and under the condition that the judgment result of the seventh judgment submodule is negative, judging that the content of the image to be coded does not change relative to the reference image.

Preferably, the second determining module includes:

the eighth calculation submodule is used for calculating second prediction distortion of the image to be coded relative to a corresponding nearest reference image based on the current reference relationship, wherein the nearest reference image is a reference image which is closest to the image to be coded in all reference images corresponding to the image to be coded according to a video display sequence;

an eighth determining submodule, configured to determine whether the second prediction distortion is greater than a fifth preset threshold;

the third judging submodule is used for judging that the content of the image to be coded changes relative to the reference image of the image to be coded under the condition that the judgment result of the eighth judging submodule is positive; and under the condition that the judgment result of the eighth judgment submodule is negative, judging that the content of the image to be coded does not change relative to the reference image.

Preferably, the apparatus further comprises:

a resetting module, configured to reset, after the frame type of the image to be coded is judged as an I frame, the frame type of each image whose frame type has already been determined but whose encoding has not yet started.

As can be seen from the above, in the embodiment of the present invention, when the frame type of an image to be coded is determined, any I frame coded before the image to be coded is first determined as a target I frame; the coding result of the first type of video frame is analyzed to obtain a first statistical model value; the coding result of the second type of video frame is analyzed to obtain a second statistical model value; whether the image to be coded meets a preset I frame selection condition is judged according to the first statistical model value and the second statistical model value; and if so, the frame type of the image to be coded is judged as an I frame. Here, the first type of video frame is the I frames from the target I frame to the image to be coded, in coding order, and the second type of video frame is the non-I frames from the target I frame to the image to be coded, in coding order. Therefore, when judging the frame type of the image to be coded, the embodiment of the invention takes the coding results of already coded video frames into account and determines from them whether the image to be coded is coded as an I frame. Applying the frame type judgment method provided by the embodiment of the invention to video coding can increase the coding speed, improve the video compression efficiency, reduce the video coding loss, and improve the video coding efficiency.

Of course, it is not necessary for any product or method of practicing the invention to achieve all of the above-described advantages at the same time.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart illustrating a frame type determining method according to a first embodiment of the present invention;

Fig. 2 is a flowchart illustrating a frame type determining method according to a second embodiment of the present invention;

Fig. 3 is a flowchart illustrating a frame type determining method according to a third embodiment of the present invention;

Fig. 4 is a flowchart illustrating a frame type determining method according to a fourth embodiment of the present invention;

Fig. 5 is a flowchart illustrating a frame type determining method according to a fifth embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a frame type determining apparatus according to an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of a frame type determining apparatus according to another embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Hereinafter, technical terms related to the embodiments of the present invention will be briefly described.

Frame type: current video coding standards mainly include three frame types: I frames, P frames, and B frames. An I frame is an intra-frame predicted image that can be decoded using only its own data. A P frame is a forward predicted frame whose coding requires reference to a previous I or P frame. A B frame is a bidirectionally predicted frame whose coding requires reference to previous and subsequent I or P frames.

As known to those skilled in the art, encoding is generally lossy: the data obtained after encoding each frame loses information relative to the original image data, so the reference frame of each frame is itself lossy. The greater the loss of the reference frame, the greater the loss of the frame encoded from it; and when that frame in turn serves as the reference frame of a subsequent frame, the loss grows further, which degrades video encoding efficiency. To ensure the coding efficiency of subsequent frames, it is therefore necessary to insert I frames appropriately, that is, to code some images as I frames.

In existing coding standards, a given I frame together with all the image frames that precede the next I frame forms a Group of Pictures (GOP). A group of pictures contains exactly one I frame, for example: I, B1, B2, P1, B3, B4, P2, B5, B6.
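As a small illustration of this definition, the sketch below splits a sequence of frame type letters into groups of pictures, starting a new group at every I frame. The function name and the single-letter representation of frame types are assumptions made for the example.

```python
from typing import List, Sequence


def split_into_gops(frame_types: Sequence[str]) -> List[List[str]]:
    """Split a frame type sequence into groups that each start at an I frame."""
    gops: List[List[str]] = []
    for frame_type in frame_types:
        # An I frame opens a new group; so does the very first frame,
        # even if the sequence happens not to start with an I frame.
        if frame_type == "I" or not gops:
            gops.append([])
        gops[-1].append(frame_type)
    return gops
```

Applied to the letters of "IBBPBBPBBIBBP", this yields two groups, the first with nine frames and the second with four, each containing exactly one I frame.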

In addition, before video encoding, a frame type setting rule may be predetermined, for example, that one group of pictures contains 15 frames, that two B frames are inserted between adjacent reference frames, and the like. Specifically, for example, the frame types of the images in one group of pictures are set in advance to I, B, B, P, B, B, P, B, B in display order. During video encoding, the encoder may also adjust the frame type of an image. For example, in conventional scene change detection, the encoder detects in advance the change in video image content between adjacent frames and decides from the change amplitude whether an I frame needs to be inserted; an I frame is inserted when the scene changes. For example, in a group of pictures with frame types I, B, B, P, B, B, P, B, B, if, when the image that was to be coded as the second P frame is coded, the change amplitude of the video image content between adjacent frames is detected to exceed a preset threshold, that image is coded as an I frame.

The present invention will be described in detail below with reference to specific examples.

Fig. 1 is a flowchart illustrating a frame type determining method according to a first embodiment of the present invention. It can be understood that the method is applicable to each frame of image in a video, and the method is applied to an encoder. The method comprises:

S101: determining any I frame coded before the image to be coded as a target I frame.

It can be understood that the target I frame is the k-th I frame in the whole video that is coded before the image to be coded, where k > 0; for example, the third I frame coded before the image to be coded.

S102: analyzing the coding result of the first type of video frame to obtain a first statistical model value, wherein the first type of video frame is: the I frames from the target I frame to the image to be coded, in coding order.

S103: analyzing the coding result of the second type of video frame to obtain a second statistical model value, wherein the second type of video frame is: the non-I frames from the target I frame to the image to be coded, in coding order.

It should be noted that no particular execution order is required for steps S102 and S103: step S102 may be executed before step S103, step S103 may be executed before step S102, or the two steps may be executed simultaneously.
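Steps S101 to S103 can be pictured with the following sketch, which selects the two types of video frame from a list of already coded frames. The `EncodedFrame` record, the function name, and the choice to count the target I frame itself among the first type are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EncodedFrame:
    frame_type: str  # "I", "P" or "B"
    bits: int        # size of the coded frame in bits


def partition_since_target(
    encoded: List[EncodedFrame],
    target_index: int,
) -> Tuple[List[EncodedFrame], List[EncodedFrame]]:
    """Split frames coded from the target I frame onward into I / non-I sets.

    `encoded` holds, in coding order, every frame coded before the image
    to be coded; `target_index` points at the chosen target I frame.
    """
    if encoded[target_index].frame_type != "I":
        raise ValueError("the target frame must be an I frame")
    # Assumption: the window includes the target I frame itself.
    window = encoded[target_index:]
    first_type = [f for f in window if f.frame_type == "I"]
    second_type = [f for f in window if f.frame_type != "I"]
    return first_type, second_type
```

Because the two lists are built independently from the same window, they could equally be computed in either order or in parallel, consistent with the note on S102 and S103.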

In an embodiment of the present invention, the first statistical model value and the second statistical model value each include at least one of: an average bit number, a total bit number, a maximum peak signal-to-noise ratio, a minimum peak signal-to-noise ratio, an average peak signal-to-noise ratio, a maximum structure similarity value, a minimum structure similarity value, an average structure similarity value, a maximum residual square sum of data blocks of a preset size, a minimum residual square sum of data blocks of the preset size, or an average residual square sum of data blocks of the preset size.

For a specific calculation method of the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM) of the image, reference may be made to the prior art.
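For orientation, the commonly used definition of PSNR computes it from the mean squared error between the original and reconstructed images as 10·log10(MAX² / MSE). The sketch below follows that standard definition; the function name and the default peak value of 255 (8-bit samples) are assumptions, and SSIM is not shown.

```python
import math


def psnr(mse: float, max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```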

It should be noted that, once the first statistical model value is determined to be one or more of the above model values, the second statistical model value is necessarily of the same type as the first. For example, if the first statistical model value is the total bit number of the I frames between the target I frame and the image to be coded in coding order, then the second statistical model value is the total bit number of the non-I frames between the target I frame and the image to be coded. If the first statistical model value is the average bit number and the average peak signal-to-noise ratio of the I frames between the target I frame and the image to be coded in coding order, then the second statistical model value is the average bit number and the average peak signal-to-noise ratio of the non-I frames between the target I frame and the image to be coded.
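One way to guarantee that both statistical model values are of the same type is to compute them with the same function and the same list of requested statistics. The sketch below assumes, hypothetically, that each coded frame is summarized as a dictionary with `bits` and `psnr` entries; the statistic names are also made up for the example.

```python
from statistics import mean
from typing import Dict, Iterable, List


def model_values(frames: List[dict], kinds: Iterable[str]) -> Dict[str, float]:
    """Compute the requested statistics over one class of coded frames."""
    extractors = {
        "total_bits": lambda fs: sum(f["bits"] for f in fs),
        "avg_bits": lambda fs: mean(f["bits"] for f in fs),
        "avg_psnr": lambda fs: mean(f["psnr"] for f in fs),
    }
    return {kind: extractors[kind](frames) for kind in kinds}


# Hypothetical coding results for the two types of video frame.
i_frames = [{"bits": 1000, "psnr": 42.0}, {"bits": 900, "psnr": 41.0}]
non_i_frames = [{"bits": 200, "psnr": 38.0}, {"bits": 100, "psnr": 36.0}]

kinds = ("avg_bits", "avg_psnr")
first_value = model_values(i_frames, kinds)       # first statistical model value
second_value = model_values(non_i_frames, kinds)  # second statistical model value
```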

In the embodiment of the present invention, the selection of the statistical model value is not limited to the above types; other statistical model values may be used, provided they reflect the ratio of the total number of bits of the coded I frames to the total number of bits of all the coded video images, or the coding quality of the coded non-I frames.

S104: judging whether the image to be coded meets a preset I frame selection condition according to the first statistical model value and the second statistical model value.

In the embodiment of the present invention, the obtained first statistical model value and the second statistical model value may be utilized to further obtain some target values that can determine the compression rate and the coding loss of the current coded video image, and the target values may indicate that an I frame needs to be inserted under the current situation, that is, the image to be coded needs to be coded as an I frame.

In an embodiment of the present invention, the judging whether the image to be coded meets a preset I frame selection condition according to the first statistical model value and the second statistical model value (S104) may include:

calculating the difference and/or ratio between the first statistical model value and the second statistical model value to obtain a calculation result;

In the embodiment of the present invention, when the first statistical model value and the second statistical model value are, respectively, the average bit number or the total bit number of the I frames and of the non-I frames between the target I frame and the image to be coded in coding order, the ratio of the first statistical model value to the second statistical model value can be calculated. If the ratio is smaller than a preset threshold, the image to be coded can be judged to meet the preset I frame selection condition and can be coded as an I frame.
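A minimal sketch of this bit-count test, using total bit numbers, might look as follows. The function name, the threshold value used in the usage note, and the treatment of the case with no non-I frames are assumptions, not details given in the text.

```python
from typing import Sequence


def meets_bit_ratio_condition(
    i_frame_bits: Sequence[int],
    non_i_frame_bits: Sequence[int],
    ratio_threshold: float,
) -> bool:
    """True when (total I-frame bits) / (total non-I-frame bits) < threshold."""
    first_value = sum(i_frame_bits)       # first statistical model value
    second_value = sum(non_i_frame_bits)  # second statistical model value
    if second_value == 0:
        # Assumption: with no non-I frames coded yet, do not request an I frame.
        return False
    return first_value / second_value < ratio_threshold
```

With a hypothetical threshold of 0.6, one I frame of 1000 bits followed by ten non-I frames of 200 bits each gives a ratio of 0.5, so the condition holds; after only two such non-I frames the ratio is 2.5 and it does not.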

when the first statistical model value and the second statistical model value are respectively: correspondingly, according to the coding sequence, when the maximum peak signal-to-noise ratio, the minimum peak signal-to-noise ratio, the average peak signal-to-noise ratio, the maximum structure similarity value, the minimum structure similarity value or the average structure similarity value of the I frame and the non-I frame between the target I frame and the image to be coded is obtained, the ratio of the first statistical model value to the second statistical model value can be calculated, if the ratio exceeds a certain preset threshold value, the image to be coded can be judged to meet the preset I frame selection condition, and the image to be coded can be coded into the I frame.

When the first statistical model value and the second statistical model value are respectively: correspondingly, according to the coding sequence, when the maximum peak signal-to-noise ratio, the minimum peak signal-to-noise ratio, the average peak signal-to-noise ratio, the maximum structure similarity value, the minimum structure similarity value or the average structure similarity value of the I frame and the non-I frame between the target I frame and the image to be coded is obtained, the difference value between the first statistical model value and the second statistical model value can be calculated, if the difference value exceeds a certain preset threshold value, the image to be coded can be judged to meet the preset I frame selection condition, and the image to be coded can be coded into the I frame.

When the first statistical model value and the second statistical model value are respectively: correspondingly, according to the coding sequence, when the maximum residual square sum of the data blocks with preset sizes from the target I frame to the I frame and the non-I frame between the target I frame and the image to be coded, the minimum residual square sum of the data blocks with preset sizes or the average residual square sum of the data blocks with preset sizes are obtained, the ratio of the first statistical model value to the second statistical model value can be calculated, if the ratio is smaller than a certain preset threshold value, the image to be coded can be judged to meet the preset I frame selection condition, and then the image to be coded can be coded into the I frame.
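As an illustration only, two of the threshold rules above (the bit-number ratio and the peak signal-to-noise ratio difference) can be sketched in Python. The function names, the use of averages, and all numeric values are assumptions made for this sketch and are not taken from the embodiment.

```python
from statistics import mean


def meets_bit_ratio_rule(i_frame_bits, non_i_frame_bits, max_ratio):
    """Bit-number rule: (average I-frame bits / average non-I-frame bits) < threshold."""
    ratio = mean(i_frame_bits) / mean(non_i_frame_bits)
    return ratio < max_ratio


def meets_psnr_difference_rule(i_frame_psnr, non_i_frame_psnr, min_gap_db):
    """Quality rule: (average I-frame PSNR - average non-I-frame PSNR) > threshold."""
    gap = mean(i_frame_psnr) - mean(non_i_frame_psnr)
    return gap > min_gap_db


# Hypothetical statistics gathered from the target I frame up to the image to be coded.
bits_ok = meets_bit_ratio_rule([40000], [9000, 11000, 10000], max_ratio=5.0)
psnr_ok = meets_psnr_difference_rule([41.0], [37.5, 38.5], min_gap_db=2.0)
print(bits_ok, psnr_ok)
```

Whether one rule or several are evaluated, and how their results are combined, is left to the encoder, as the following paragraphs describe.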

It should be noted that, in an actual process, the encoder may choose to count only one kind of statistical model value. For example, it may count only a first average bit number and a second average bit number, corresponding respectively to the I frames and the non-I frames from the target I frame to the image to be encoded according to the encoding order, and calculate the ratio of the first average bit number to the second average bit number; if the ratio is smaller than a preset threshold, it may determine that the image to be encoded meets the preset I frame selection condition.

The encoder may also count at least two kinds of statistical model values. In that case, when at least one of the selected pairs of first and second statistical model values meets its preset condition, the encoder may likewise judge that the image to be encoded meets the preset I frame selection condition.

For example, the encoder counts a first total bit number and a first average peak signal-to-noise ratio corresponding to an I frame from a target I frame to the image to be encoded according to the encoding sequence, and also counts a second total bit number and a second average peak signal-to-noise ratio corresponding to a non-I frame from the target I frame to the image to be encoded according to the encoding sequence, and if a ratio of the first total bit number to the second total bit number is not less than a preset threshold, and a difference between the first average peak signal-to-noise ratio and the second average peak signal-to-noise ratio exceeds another preset threshold, it may be determined that the image to be encoded satisfies a preset I frame selection condition.

If so, executing S105: judging the frame type of the image to be coded as an I frame.

it is understood that after the frame type of the image to be encoded is determined as an I frame, the encoder encodes the image to be encoded into the I frame. For example, according to a predetermined frame type setting rule, an image to be encoded will be encoded as a B frame, and when the frame type of the image to be encoded is determined as an I frame after the method provided by the embodiment of the present invention is used, the encoder encodes the image to be encoded as an I frame.

in this embodiment of the present invention, after the determining the frame type of the image to be encoded as an I frame (S105), the method may further include:

As is well known to those skilled in the art, after an image to be encoded is encoded into an I frame, according to the encoding sequence, the reference relationship and the image frame type of a plurality of frames of images that have been pre-read in the following frames of the image to be encoded need to be reset on the basis of the current image as the I frame. For example, the image to be encoded is an image Pic1 in an image group having a display order of Pic1, Pic1, Pic1 and Pic1, and the frame types corresponding to the images Pic1, Pic1 and Pic1 are 1 and B, respectively, and it is obvious that the encoding order of the image group is Pic1, Pic1 and Pic1, and when the image Pic1 is determined as an I frame, the frame type of the image Pic1 is still set as a B frame, but the reference frame of the image Pic1 is changed into an image Pic 1.

The frame type judgment method provided by the embodiment of the present invention improves compression efficiency particularly well for conference video. In addition, the video generated by the coding conforms to existing video coding standards and can be decoded directly by a common player.

With respect to the method embodiment shown in fig. 1, the method may further include:

S106: determining whether a scene change occurs between images in a target image set, wherein the target image set consists of a first preset number of consecutive frame images, in coding sequence, that include the image to be coded.

It should be noted that whether a scene change occurs between images may be determined by the prior art, and the embodiment of the present invention does not limit the determination method. Moreover, within the target image set, a scene change may be judged to have occurred between the images in the set as long as a scene change occurs between any two of the images.

The above target image set should be determined according to the encoding order, for example, for the images of the same image group, assuming that the display order of the images is I, B1, B2, P1, B3, B4, P2, B5, and B6, the encoding order of the image group is I, P1, B1, B2, P2, B3, B4, B5, and B6, assuming that the image to be encoded is B2, and the first preset number is 5, the target image set may be an image set composed of I, P1, B1, B2, and P2.
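The relation between display order and coding order in this example can be sketched as follows. The sketch assumes a simplified reordering rule (each I or P frame is coded before the B frames that precede it in display order) and a hypothetical `frames_before` parameter that fixes where the window sits around the image to be coded; neither is specified by the embodiment.

```python
def coding_order(display_order, is_anchor):
    """Reorder frames so each I/P frame precedes the B frames displayed before it."""
    ordered, pending_b = [], []
    for frame in display_order:
        if is_anchor(frame):
            ordered.append(frame)      # the anchor frame is coded first
            ordered.extend(pending_b)  # then the B frames that were waiting for it
            pending_b = []
        else:
            pending_b.append(frame)
    ordered.extend(pending_b)          # trailing B frames with no later anchor
    return ordered


def target_image_set(order, image, count, frames_before):
    """Window of `count` consecutive frames in coding order that contains `image`."""
    start = order.index(image) - frames_before
    if start < 0 or frames_before >= count:
        raise ValueError("window does not fit or does not contain the image")
    return order[start:start + count]


display = ["I", "B1", "B2", "P1", "B3", "B4", "P2", "B5", "B6"]
order = coding_order(display, lambda f: not f.startswith("B"))
window = target_image_set(order, "B2", count=5, frames_before=3)
print(order)
print(window)
```

With the values above, the window reproduces the set I, P1, B1, B2, P2 given in the example.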

when step S106 is included, steps S101 to S103 in the embodiment of the present invention are the same as the method embodiment shown in fig. 1, and different from the method embodiment shown in fig. 1, when step S106 is included, the above-mentioned determining whether the image to be encoded satisfies the preset I-frame selection condition according to the first statistical model value and the second statistical model value (S104) may be:

Certainly, the embodiment of the present invention does not limit the order in which step S106 and the judgment of whether the first statistical model value and the second statistical model value satisfy the preset numerical relationship are executed. The numerical relationship may be judged first, with step S106 executed only when the relationship is satisfied; alternatively, step S106 may be executed first, with the numerical relationship judged only when it is determined that no scene change occurs between the images in the target image set, as shown in fig. 2.

Fig. 2 is a flowchart illustrating a frame type determining method according to a second embodiment of the present invention.

As shown in fig. 2, after step S106 is executed, the method for determining a frame type according to the embodiment of the present invention further includes:

S1041: judging whether the first statistical model value and the second statistical model value meet a preset numerical relationship.

It is understood that the determination in step S1041 herein can be implemented by the method provided in the embodiment of the method shown in fig. 1, and whether the first statistical model value and the second statistical model value satisfy the preset numerical relationship, that is: corresponding to the embodiment of the method in fig. 1, if the determination result in step S104 is yes, the first statistical model value and the second statistical model value satisfy the preset numerical relationship, and if the determination result in step S104 is no, the first statistical model value and the second statistical model value do not satisfy the preset numerical relationship. The specific determination method of the target condition that no scene change occurs between images in the target image set can be implemented by the following method in addition to the prior art.

In a first possible target condition determination method, as shown in fig. 3, the determining whether a scene change occurs between images in the target image set (S106) may be:

S1061: determining whether the first target image has a scene change relative to the second target image, and if so, judging that a scene change occurs between the images in the target image set.

wherein the first target image is: according to the coding sequence, second preset number of frame images positioned in front of and/or behind the image to be coded; the second target image is: the reference picture of the first target picture or a picture of a frame preceding the first target picture in coding order.

for example, the image to be encoded is an image B3 in an image group with display order I, B1, B2, P1, B3, B4, P2, B5, and B6, the encoding order of the image group is I, P1, B1, B2, P2, B3, B4, B5, and B6, if the second preset number is 3, the first target image may be at least one frame image of B2, P2, and B4, for example, the first target image is images B2 and B4, the second target image is a previous frame image corresponding to the first target image in the encoding order, and the second target images corresponding to images B2 and B4 are B1 and B3, respectively. If a scene change occurs between image B2 with respect to image B1 and/or B4 with respect to image B3, it can be determined that a scene change has occurred between images within the target set of images.
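The pairing used in this example, where each first target image is compared with the frame coded immediately before it, can be sketched as follows. The helper names and the stub detector are hypothetical; any scene change test may be plugged in.

```python
def previous_in_coding_order(order, first_targets):
    """Map each first target image to the frame coded immediately before it."""
    pairs = {}
    for frame in first_targets:
        idx = order.index(frame)
        if idx == 0:
            raise ValueError(f"{frame} has no preceding frame in coding order")
        pairs[frame] = order[idx - 1]
    return pairs


def scene_change_in_set(pairs, has_scene_change):
    """A change in any (first, second) pair counts as a change in the whole set."""
    return any(has_scene_change(first, second) for first, second in pairs.items())


order = ["I", "P1", "B1", "B2", "P2", "B3", "B4", "B5", "B6"]
pairs = previous_in_coding_order(order, ["B2", "B4"])

# Stub detector for illustration: pretend only B4 differs from its predecessor.
changed = scene_change_in_set(pairs, lambda a, b: (a, b) == ("B4", "B3"))
print(pairs, changed)
```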

It should be emphasized that, when encoding a video, since the frame type setting rule is predetermined, when encoding the image to be encoded, there is a reference image corresponding to the image to be encoded, for example, for a group of images with display order of I, B1, B2, P1, B3, B4, P2, B5, B6, the encoding order is I, P1, B1, B2, P2, B3, B4, B5, B6, when encoding the image B3, that is, when the image B3 is the image to be encoded, the reference image currently corresponding to the image includes images P1 and P2.

in a first method for determining whether a scene change occurs in a first target image relative to a second target image, the first target image may be: and a third preset number of frame images in the image group where the image to be coded is currently located, wherein the third preset number of frame images are contained in the second preset number of frame images. At this time, the above-mentioned determining whether the first target image has a scene change with respect to the second target image (S1061) may be:

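calculating a first sum, namely the sum of the prediction distortions of each first target image relative to its corresponding second target image, and a second sum, namely the sum of the intra-frame prediction distortions of each first target image;

calculating a ratio of the first sum and the second sum;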

judging whether the ratio is in a preset value interval or not;

if so, correspondingly judging that the first target image has scene change relative to the second target image;

If not, correspondingly judging that the first target image has no scene change relative to the second target image.

it should be noted that the specific calculation methods of the prediction distortion and the intra-frame prediction distortion described above belong to the prior art, and are not described herein again in the embodiments of the present invention.

In general, a ratio of the first sum to the second sum that is close to 1 indicates that the first target image has a scene change relative to the second target image. In actual use, the preset value interval may therefore be [1, v], where v may be selected according to actual needs, for example v = 1.5.

For example, the preset value interval is [1, 1.6], the image to be encoded is an image P in a group of images with a display order of I, B1, B2, P, B3 and B4, the encoding order of the group of images is I, P, B1, B2, B3 and B4, if the third preset number is 2, the first target image may be images B1 and B2, and the second target images corresponding to images B1 and B2 are P and B1, respectively.

At this time, a prediction distortion x of the image B1 with respect to the image P and a prediction distortion y of the image B2 with respect to the image B1 are calculated, and a sum z of x and y is calculated; calculating the sum of the intra-frame prediction distortion of the images B1 and B2 to obtain a sum value w; then, the ratio of the sum z to the sum w is calculated, and assuming that the ratio is 1.1, it is determined that the scene change occurs in the first target image relative to the second target image.
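The ratio test in this example can be sketched as follows. The distortion values are hypothetical and chosen only so that the ratio comes out at 1.1; how the distortions themselves are measured is outside the sketch.

```python
def scene_change_by_distortion(inter_distortions, intra_distortions, interval=(1.0, 1.6)):
    """Return (changed, ratio), where ratio = sum(inter) / sum(intra)."""
    first_sum = sum(inter_distortions)   # z = x + y in the example
    second_sum = sum(intra_distortions)  # w in the example
    ratio = first_sum / second_sum
    low, high = interval
    return low <= ratio <= high, ratio


# x: B1 relative to P, y: B2 relative to B1; intra distortions of B1 and B2.
changed, ratio = scene_change_by_distortion([600, 500], [450, 550])
print(changed, ratio)
```

A ratio well below 1 means inter-frame prediction is much cheaper than intra-frame prediction, which is what one would expect when the content has not changed.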

In a second method for determining whether a scene change occurs in a first target image relative to a second target image, the first target image is: a fourth preset number of frame images in the image group where the image to be coded is located currently; of course, the fourth predetermined number of frame images is included in the second predetermined number of frame images. At this time, the above-mentioned determining whether the first target image has a scene change with respect to the second target image (S1061) may be performed in four steps as follows:

Obtaining the motion amplitude of each image block in each first target image relative to a second target image corresponding to each frame of the first target image;

It should be noted that, when the number of the first target images exceeds one, for each obtained proportion corresponding to each frame of the first target image, a maximum value may be selected from the proportions, and it is determined whether the maximum value is greater than a preset threshold, and if so, it is determined that the first target image has a scene change relative to the second target image; otherwise, judging that the first target image has no scene change relative to the second target image; or calculating the average value of each proportion, judging whether the average value is larger than another preset threshold value, and if so, judging that the scene change of the first target image relative to the second target image occurs; otherwise, the first target image is judged to have no scene change relative to the second target image.

For example, suppose that whether the first target image has a scene change relative to the second target image is determined by the maximum value of the respective proportions, that the preset threshold is 40%, and that the first preset threshold is 16 integer pixel values. The second target images corresponding to the first target images E and F are G and H, respectively. Relative to image G, the image blocks in image E whose motion amplitude is larger than 16 integer pixel values account for 38.6% of the frame; relative to image H, the image blocks in image F whose motion amplitude is larger than 16 integer pixel values account for 49.2%. The maximum value is therefore 49.2%, and since 49.2% is greater than the preset threshold of 40%, it is determined that the first target image has a scene change relative to the second target image.
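A minimal sketch of the maximum-proportion rule follows. It assumes that the per-block motion amplitudes are already available as a flat list; the function names and the sample amplitudes are illustrative only.

```python
def moving_block_ratio(block_motion_amplitudes, amplitude_threshold):
    """Share of blocks whose motion amplitude exceeds the amplitude threshold."""
    moving = sum(1 for a in block_motion_amplitudes if a > amplitude_threshold)
    return moving / len(block_motion_amplitudes)


def scene_change_by_max_ratio(ratios, ratio_threshold):
    """Scene change if the largest per-frame proportion exceeds the threshold."""
    return max(ratios) > ratio_threshold


# Hypothetical amplitudes for a four-block frame, with a 16-pixel threshold.
example_ratio = moving_block_ratio([20, 3, 17, 16], amplitude_threshold=16)

# Proportions from the example: image E relative to G, image F relative to H.
changed = scene_change_by_max_ratio([0.386, 0.492], ratio_threshold=0.40)
print(example_ratio, changed)
```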

It should be noted that the above-mentioned motion amplitude and proportion calculation method belongs to the prior art, and is not described herein again in the embodiments of the present invention.

Of course, in the embodiment of the present invention, it is not limited to use the obtained average value or the maximum value corresponding to all the proportions to determine whether the first target image has a scene change relative to the second target image, and a person skilled in the art may also determine through other prior arts, and the embodiment of the present invention is not described herein again.

In a second possible target condition determination method, as shown in fig. 4, the determining whether a scene change occurs between images in the target image set (S106) may include:

S1062: determining whether a subsequent image of the image to be coded has a scene change relative to the image to be coded, and if so, judging that a scene change occurs between the images in the target image set. The subsequent images are a fifth preset number of frame images that follow the image to be coded according to the coding sequence.

It can be understood that the second feasible target condition determination method is mainly used for determining whether the image to be encoded has a reference value relative to its subsequent images.

It should be noted that when the fifth preset number is greater than 1, it may be determined that a scene change occurs between images in the target image set as long as any one of subsequent images of the images to be encoded has a scene change with respect to the images to be encoded.

for example, the image to be encoded is an image B3 in an image group with a display order of I, B1, B2, B3, P, B4, B5, and B6, the encoding order of the image group is I, P, B1, B2, B3, B4, B5, and B6, if the fifth preset number of frame images selected are images B4 and B5, it is determined whether a scene change occurs in images B4 and B5 with respect to image B3, and if it is determined that a scene change does not occur in image B4 with respect to image B3 and a scene change occurs in image B5 with respect to image B3, it is determined that a scene change occurs between images in the target image set.

in the embodiment of the present invention, the following two methods are adopted to determine whether a scene change occurs in a subsequent image of the image to be encoded relative to the image to be encoded, but of course, the embodiment of the present invention is not limited to the following two methods to determine whether a scene change occurs in a subsequent image of the image to be encoded relative to the image to be encoded.

in a first method for determining whether a scene change occurs in a picture subsequent to the picture to be encoded relative to the picture to be encoded, the determining whether the scene change occurs in the picture subsequent to the picture to be encoded relative to the picture to be encoded (S1062) may include the following four steps:

it should be noted that, when the number of the subsequent images of the image to be encoded exceeds one, for each obtained proportion corresponding to each frame of the subsequent images, a maximum value may be selected from the proportions, and it is determined whether the maximum value is greater than a preset threshold, and if so, it is determined that the subsequent images of the image to be encoded have a scene change relative to the image to be encoded; otherwise, judging that the subsequent image of the image to be coded does not have scene change relative to the image to be coded; or calculating the average value of each proportion, judging whether the average value is larger than another preset threshold value, and if so, judging that the scene change of the subsequent image of the image to be coded relative to the image to be coded occurs; otherwise, judging that the subsequent image of the image to be coded has no scene change relative to the image to be coded.

For example, suppose that whether a scene change occurs in a subsequent image of the image to be encoded relative to the image to be encoded is determined by the average value of the respective proportions, that the preset threshold is 45%, and that the second preset threshold is 12 integer pixel values. For the subsequent images E, F and G, the image blocks whose motion amplitude relative to the image to be coded is larger than 12 integer pixel values account for 30.2%, 49% and 53.8% of the whole frame image, respectively. The average value of the proportions is therefore 44.3%, and since 44.3% is smaller than the preset threshold of 45%, it is determined that no scene change occurs in the subsequent images of the image to be encoded relative to the image to be encoded.
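The arithmetic of the average-proportion rule in this example can be checked with a short sketch; the function name is hypothetical.

```python
from statistics import mean


def scene_change_by_mean_ratio(ratios, ratio_threshold):
    """Return (changed, average), comparing the average proportion with the threshold."""
    average = mean(ratios)
    return average > ratio_threshold, average


# Proportions for subsequent images E, F and G from the example.
changed, average = scene_change_by_mean_ratio([0.302, 0.49, 0.538], ratio_threshold=0.45)
print(changed, round(average * 100, 1))
```

Note that the maximum rule would give the opposite answer for the same data, since 53.8% exceeds 45%; the choice between the two rules therefore matters in practice.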

of course, in the embodiment of the present invention, it is not limited to determining whether a scene change occurs in a subsequent image of the image to be encoded relative to the image to be encoded by using the obtained average value or the maximum value corresponding to all the proportions, and a person skilled in the art may also use other determination methods provided in the prior art, which is not described herein again in the embodiment of the present invention.

in a second method for determining whether a scene change occurs in a subsequent picture of the picture to be encoded with respect to the picture to be encoded, the determining whether a scene change occurs in the subsequent picture of the picture to be encoded with respect to the picture to be encoded (S1062) may be:

calculating first prediction distortion of each frame of subsequent image relative to the image to be coded;

It should be noted that the above-mentioned method for calculating prediction distortion belongs to the prior art, and is not described herein again in the embodiments of the present invention.

in the embodiment of the present invention, when the number of the subsequent images is one frame, it may be directly determined whether the prediction distortion of the one frame of the subsequent images with respect to the image to be encoded is greater than a preset threshold, and if so, it is determined that the scene change occurs in the subsequent images of the image to be encoded with respect to the image to be encoded.

In addition, if the number of the subsequent images exceeds one frame, the prediction distortion of each frame of the subsequent images relative to the image to be coded can be calculated and obtained; then, obtaining a maximum value from the obtained prediction distortion, judging whether the maximum value is greater than a preset threshold value, if so, judging that the subsequent image of the image to be coded has scene change relative to the image to be coded, otherwise, judging that the subsequent image of the image to be coded has no scene change relative to the image to be coded; or, an average value of all the obtained prediction distortions may be calculated, whether the average value is greater than another preset threshold is judged, if so, it is judged that the subsequent image of the image to be encoded has a scene change relative to the image to be encoded, otherwise, it is judged that the subsequent image of the image to be encoded has no scene change relative to the image to be encoded.

of course, in the embodiment of the present invention, when the number of subsequent images exceeds one frame, it is not limited to use the obtained average value or maximum value corresponding to all prediction distortions to determine whether the subsequent image of the image to be encoded has a scene change relative to the image to be encoded, and a person skilled in the art may also determine by using other prior art, which is not described herein again in the embodiment of the present invention.

In a third possible target condition determination method, as shown in fig. 5, the determining whether a scene change occurs between images in the target image set (S106) may be:

S1063: determining, based on the current reference relationship, whether the content of the image to be coded changes relative to the reference image of the image to be coded, and if so, judging that a scene change occurs between the images in the target image set.

It should be noted that a content change is substantially similar to a scene change; the two differ only in the degree to which the data content changes between images.

in the first method for determining whether the content of the to-be-encoded picture changes from the reference picture thereof, the determining whether the content of the to-be-encoded picture changes from the reference picture thereof based on the current reference relationship (S1063) may include:

based on the current reference relationship, calculating a target motion value of the image to be coded relative to its corresponding nearest reference image, wherein the nearest reference image is: among all the reference images corresponding to the image to be coded, the reference image closest to the image to be coded according to the display sequence; and the target motion value is: the proportion of the whole frame image occupied by data blocks in the image to be coded whose motion amplitude is larger than a third preset threshold value;

If yes, judging that the content of the image to be coded changes relative to the reference image;

if not, the image to be coded is judged to have no content change relative to the reference image.

It can be understood that, when video coding is performed, the image to be coded may have more than one reference image. If there are multiple reference images, the nearest reference image is the one closest to the image to be coded in the display sequence. For example, for a group of images with a display order of I, B1, B2, P1, B3, B4, P2, B5, and B6, if the current image to be encoded is image B4 and the reference images corresponding to image B4 are images P1 and P2, the nearest reference image corresponding to image B4 is image P2.
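Selecting the nearest reference image by display distance can be sketched as follows. The function name is hypothetical, and ties are resolved in favour of the reference listed first, which is an assumption of this sketch rather than something the embodiment specifies.

```python
def nearest_reference(display_order, image, references):
    """Pick the reference image closest to `image` in display order."""
    position = display_order.index(image)
    return min(references, key=lambda ref: abs(display_order.index(ref) - position))


display = ["I", "B1", "B2", "P1", "B3", "B4", "P2", "B5", "B6"]
print(nearest_reference(display, "B4", ["P1", "P2"]))
print(nearest_reference(display, "B3", ["P1", "P2"]))
```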

the target motion values are: and relative to the nearest reference image corresponding to the image to be coded, the data block with the motion amplitude larger than a third preset threshold value in the image to be coded accounts for the proportion of the whole frame of image. The method for calculating the target motion value may refer to the prior art, and the embodiment of the present invention does not describe the method for calculating the target motion value in detail.

in a second method for determining whether the content of the to-be-encoded picture changes from the reference picture thereof, the determining whether the content of the to-be-encoded picture changes from the reference picture thereof based on the current reference relationship (S1063) may include:

calculating second prediction distortion of the image to be coded relative to a corresponding nearest reference image based on the current reference relationship, wherein the nearest reference image is a reference image which is closest to the image to be coded in all reference images corresponding to the image to be coded according to a video display sequence;

Judging whether the second prediction distortion is larger than a fifth preset threshold value;

for example, the image to be encoded is an image B3 in a group of images with display orders of I, B1, B2, P1, B3, B4, P2, B5 and B6, the nearest reference image corresponding to the image B3 is an image P1, the prediction distortion of the image B3 relative to the image P1 is calculated, whether the prediction distortion is greater than a fifth preset threshold value is judged, if so, the image to be encoded is judged to have content change relative to the reference image thereof, otherwise, the image to be encoded is judged not to have content change relative to the reference image thereof.

In the embodiment of the present invention, in addition to using the above two methods to determine whether the content of the image to be encoded changes relative to its reference image, other prior art techniques may also be used.

it should be emphasized that, in practical applications, one or more than one of the three determination methods of the target condition may be selected, and when more than one determination method is used, it should be ensured that each determination method used determines that no scene change occurs between images in the target image set, so as to determine that the target condition is satisfied.

For example, when the determination methods of S1061 and S1062 are used together, and at least one of steps S1061 and S1062 determines that a scene change has occurred between images in the target image set, the encoder finally determines that the target condition is not satisfied; only when both steps S1061 and S1062 determine that no scene change has occurred between the images in the target image set does the encoder finally determine that the target condition is satisfied.
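The combination rule amounts to requiring that none of the chosen methods reports a scene change. A minimal sketch, with a hypothetical function name and hypothetical results keyed by step label:

```python
def target_condition_met(scene_change_results):
    """Target condition holds only if no selected method reports a scene change."""
    return not any(scene_change_results.values())


print(target_condition_met({"S1061": False, "S1062": False}))
print(target_condition_met({"S1061": False, "S1062": True}))
```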

As in the method embodiment shown in fig. 1, in the embodiment of the present invention, after determining the frame type of the image to be encoded as an I frame (S105), the method may further include:

In addition, because the image to be encoded is encoded into an I frame only when the first statistical model value and the second statistical model value satisfy the preset numerical relationship and no scene change occurs between the images in the target image set, the insertion of too many I frames within a short period, for example within 1 s or 2 s, can be effectively avoided. Too many I frames in a short period would make the total bit number of the encoded I frames too large a share of the total bit number of all images encoded in that period, causing excessive code rate fluctuation; the method therefore also avoids the unclear video display that such fluctuation causes.

the embodiments of the present invention will be described in detail below by way of specific examples.

for an image to be encoded in a video, the image to be encoded is an image Pic8 in an image group with the display sequence of Pic1, Pic2, Pic3, Pic4, Pic5, Pic6, Pic7, Pic8 and Pic9, and the frame types corresponding to the images Pic1, Pic2, Pic3, Pic4, Pic5, Pic6, Pic7, Pic8 and Pic9 are I, B, B, P, B, B, P, B and B, respectively. It is understood that the encoding order of the group of pictures is Pic1, Pic4, Pic2, Pic3, Pic7, Pic5, Pic6, Pic8 and Pic9, and when the picture Pic8 is encoded, the pictures Pic1, Pic4, Pic2, Pic3, Pic7, Pic5 and Pic6 have been encoded.

In this example, the previously encoded picture Pic1 is first determined to be the target I frame. Then the number of bits of the picture Pic1 encoded as an I frame is determined, the total number of bits of the pictures Pic4, Pic2, Pic3, Pic7, Pic5 and Pic6 after encoding is counted, and the ratio of the number of bits of the picture Pic1 to that total number of bits is obtained.

In addition, according to the coding order, it is judged whether the pictures Pic3, Pic7, Pic5 and Pic6 have scene changes relative to their respective previous pictures. Specifically, for each of the pictures Pic3, Pic7, Pic5 and Pic6, it is determined whether the data blocks whose motion amplitude relative to the respective previous picture is greater than 16 integer pixel values account for more than 30% of the whole frame picture. At the same time, it is also determined whether the picture to be encoded Pic8 has a content change relative to its reference picture Pic7, specifically by determining whether the data blocks of the picture Pic8 whose motion amplitude relative to the picture Pic7 is greater than 8 integer pixel values account for more than 50% of the whole frame of the picture Pic8.

Then, in the frame type determination process, it is checked whether the following conditions are satisfied simultaneously:

(1) the above ratio is less than 5%; (2) none of the pictures Pic3, Pic7, Pic5 and Pic6 has more than 30% of its whole frame made up of data blocks whose motion amplitude relative to the respective previous picture is greater than 16 integer pixels; (3) the picture to be encoded, Pic8, has no content change relative to its reference picture Pic7.

If so, the frame type of the picture Pic8 may be re-determined as an I frame, and the picture Pic8 may be encoded as an I frame. The reference frame of the picture Pic9 is thereafter changed to the picture Pic8.
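The combination of the three conditions can be sketched as follows. The `Observation` container and its field names are hypothetical; the 5% limit is the value used in this example, and the scene-change and content-change flags are assumed to have been computed beforehand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """Inputs to the frame type decision (hypothetical container)."""
    bit_ratio: float                 # I-frame bits / total non-I bits
    scene_change_flags: List[bool]   # one flag per checked previous picture
    content_changed: bool            # picture to be encoded vs. its reference


def should_encode_as_i(obs, max_ratio=0.05):
    """True only when conditions (1), (2) and (3) hold simultaneously."""
    return (
        obs.bit_ratio < max_ratio
        and not any(obs.scene_change_flags)
        and not obs.content_changed
    )


# Pic8: ratio of 4%, no scene change in Pic3, Pic7, Pic5, Pic6, no content change.
pic8 = Observation(bit_ratio=0.04,
                   scene_change_flags=[False, False, False, False],
                   content_changed=False)
frame_type = "I" if should_encode_as_i(pic8) else "B"
print(frame_type)
```

If any single condition fails, the picture keeps its previously determined frame type in this sketch.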

In this example, using the above frame type determination method improves the coding efficiency of different videos as shown in the following table.

TABLE 1

In the table, the leftmost column lists the video names; the Y, U, V and YUV columns give the change in code rate at the same quality for the Y component, the U component, the V component and the combined YUV, respectively, where negative values represent savings and positive values represent increases. As can be seen from the table, the frame type determination method provided by the embodiment of the present invention improves the compression efficiency of most of the videos.

Corresponding to the method embodiment shown in fig. 1, as shown in fig. 6, an embodiment of the present invention provides a frame type determining apparatus applied to an encoder, where the apparatus includes:

A first determining module 110, configured to determine any I frame encoded before an image to be encoded as a target I frame;

A first analysis module 120, configured to analyze an encoding result of a first type of video frame to obtain a first statistical model value, where the first type of video frame is: the I frames, in coding order, from the target I frame up to the image to be encoded;

A second analysis module 130, configured to analyze an encoding result of a second type of video frame to obtain a second statistical model value, where the second type of video frame is: the non-I frames, in coding order, from the target I frame up to the image to be encoded;

A judging module 140, configured to judge whether the image to be encoded meets a preset I-frame selection condition according to the first statistical model value and the second statistical model value;

A determining module 150, configured to determine the frame type of the image to be encoded as an I frame if the judgment result of the judging module 140 is yes.

Specifically, the judging module 140 may include a first judging sub-module and a second judging sub-module (not shown in the figure):

The second judging sub-module is configured to judge, according to the calculation result, whether the image to be encoded meets the preset I-frame selection condition.

In practical use, the apparatus may further include:

A resetting module (not shown in the figure), configured to reset, after the frame type of the image to be encoded is determined as an I frame, the frame types of the images whose frame types have already been determined but whose encoding has not yet started.

Specifically, the first statistical model value and the second statistical model value each include at least one of: an average bit number, a total bit number, a maximum peak signal-to-noise ratio, a minimum peak signal-to-noise ratio, an average peak signal-to-noise ratio, a maximum structure similarity value, a minimum structure similarity value, an average structure similarity value, a maximum sum of squared residuals of data blocks of a preset size, a minimum sum of squared residuals of the data blocks of the preset size, or an average sum of squared residuals of the data blocks of the preset size.
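As an illustration of how some of the listed statistics could be derived from encoding results, the following sketch computes the bit-count and peak signal-to-noise ratio variants for the two classes of frames. The dictionary layout, the function name `model_values` and the sample numbers are assumptions; the structure similarity and residual statistics would follow the same pattern.

```python
from statistics import mean


def model_values(frames):
    """Summarise the encoding results of one class of frames.

    frames: list of dicts with "bits" and "psnr" keys (hypothetical layout).
    """
    if not frames:
        raise ValueError("no encoded frames in this class")
    bits = [f["bits"] for f in frames]
    psnr = [f["psnr"] for f in frames]
    return {
        "total_bits": sum(bits),
        "average_bits": mean(bits),
        "max_psnr": max(psnr),
        "min_psnr": min(psnr),
        "average_psnr": mean(psnr),
    }


# Made-up encoding results in coding order, starting at the target I frame.
encoded = [
    {"type": "I", "bits": 4000, "psnr": 40.0},
    {"type": "P", "bits": 30000, "psnr": 38.0},
    {"type": "B", "bits": 10000, "psnr": 36.0},
]
first_value = model_values([f for f in encoded if f["type"] == "I"])
second_value = model_values([f for f in encoded if f["type"] != "I"])
print(first_value["total_bits"], second_value["total_bits"])
```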

Corresponding to the method embodiments shown in fig. 2 to fig. 5, and on the basis of fig. 6, as shown in fig. 7, the apparatus may further include:

A second determining module 160, configured to determine whether a scene change occurs between images in a target image set, where the target image set is composed of a first preset number of consecutive frame images including the image to be encoded according to an encoding sequence;

The judging module 140 is specifically configured to judge whether the following relationship holds and, if so, to judge that the image to be encoded meets the preset I-frame selection condition:

In practical applications, the second determining module 160 may be specifically configured to:

In practical applications, more specifically, the first target image is: a third preset number of frame images in the group of pictures in which the image to be encoded is currently located;

The second determining module 160 may include a first calculating sub-module, a second calculating sub-module, a third calculating sub-module, a first judging sub-module and a first deciding sub-module (not shown in the figure):

The third calculating sub-module is configured to calculate the ratio of the first sum value to the second sum value;

The first judging sub-module is configured to judge whether the ratio is within a preset value interval;

The first deciding sub-module is configured to decide that a scene change has occurred in the first target image relative to the second target image when the judgment result of the first judging sub-module is yes, and to decide that no scene change has occurred in the first target image relative to the second target image when the judgment result of the first judging sub-module is no.
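The ratio test performed by these sub-modules can be sketched as follows. How the first sum value and the second sum value are formed is not restated in this passage, so they are treated as given inputs; the interval bounds, the inclusive comparison and the function name are assumptions.

```python
def scene_changed_by_ratio(first_sum, second_sum, interval):
    """Decide a scene change when first_sum / second_sum lies in the interval.

    first_sum, second_sum: the two sum values, treated as given inputs.
    interval: (low, high) preset value interval; bounds are assumed inclusive.
    """
    if second_sum == 0:
        raise ValueError("second sum value must be non-zero")
    low, high = interval
    ratio = first_sum / second_sum
    return low <= ratio <= high


# Made-up sums and a made-up interval, for illustration only.
changed = scene_changed_by_ratio(50, 100, interval=(0.4, 1.0))
unchanged = scene_changed_by_ratio(10, 100, interval=(0.4, 1.0))
print(changed, unchanged)
```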

In practical applications, more specifically, the first target image is: a fourth preset number of frame images in the group of pictures in which the image to be encoded is currently located;

The second determining module 160 may include a first obtaining sub-module, a first statistic sub-module, a fourth calculating sub-module, and a second determining sub-module (not shown):

The first statistic sub-module is configured to count, according to the obtained motion amplitudes, the image blocks in each first target image whose motion amplitude is greater than a first preset threshold;

The second determining sub-module is configured to determine, according to the calculated ratio, whether the first target image has a scene change relative to the second target image.

In practical applications, the second determining module 160 may be further specifically configured to:

More specifically, the second determining module 160 may include a second obtaining sub-module, a second statistics sub-module, a fifth calculating sub-module, and a third determining sub-module (not shown in the figure):

The third determining sub-module is configured to determine, according to the calculated proportion, whether a scene change occurs in the images subsequent to the image to be encoded relative to the image to be encoded.

More specifically, the second determining module 160 may further include a sixth calculating sub-module and a fourth determining sub-module (not shown in the figure):

The sixth calculating sub-module is configured to calculate a first prediction distortion of each frame of subsequent image relative to the image to be encoded;

The fourth determining sub-module is configured to determine, according to all the first prediction distortions, whether a scene change occurs in the images subsequent to the image to be encoded relative to the image to be encoded.

More specifically, the second determining module 160 may include a seventh calculating sub-module, a fifth judging sub-module and a second judging sub-module (not shown in the figure):

The seventh calculating sub-module is configured to calculate, based on the current reference relationship, a target motion value of the image to be encoded relative to its nearest reference image, where the nearest reference image is the reference image that, among all reference images of the image to be encoded, is closest to the image to be encoded in display order, and the target motion value is the proportion of the whole frame accounted for by the data blocks of the image to be encoded whose motion amplitude is greater than a third preset threshold;

The fifth judging sub-module is configured to judge whether the target motion value is greater than a fourth preset threshold;

The second judging sub-module is configured to judge that the content of the image to be encoded has changed relative to its reference image when the judgment result of the fifth judging sub-module is yes, and to judge that the content of the image to be encoded has not changed relative to the reference image when the judgment result of the fifth judging sub-module is no.
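Selecting the nearest reference image and applying the threshold can be sketched as follows. The candidate reference list `[4, 7]`, the tie-breaking behaviour and the function names are assumptions for illustration; the target motion value is assumed to have been computed already as a proportion of the whole frame.

```python
def nearest_reference(current_display_index, reference_display_indices):
    """Pick the reference image closest to the current image in display order.

    On a tie, the first candidate in the list is returned (an assumption of
    this sketch, not something stated in the description).
    """
    if not reference_display_indices:
        raise ValueError("the image has no reference image")
    return min(reference_display_indices,
               key=lambda index: abs(index - current_display_index))


def content_changed(target_motion_value, fourth_threshold):
    """True when the target motion value exceeds the fourth preset threshold."""
    return target_motion_value > fourth_threshold


# Pic8 with hypothetical candidate references Pic4 and Pic7.
nearest = nearest_reference(8, [4, 7])
changed = content_changed(target_motion_value=0.2, fourth_threshold=0.5)
print(nearest, changed)
```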

More specifically, the second determining module 160 may further include an eighth calculating sub-module, a sixth judging sub-module and a third determining sub-module (not shown in the figure):

The eighth calculating sub-module is configured to calculate, based on the current reference relationship, a second prediction distortion of the image to be encoded relative to its nearest reference image, where the nearest reference image is the reference image that, among all reference images of the image to be encoded, is closest to the image to be encoded in display order;

The sixth judging sub-module is configured to judge whether the second prediction distortion is greater than a fifth preset threshold;

The third determining sub-module is configured to determine that the content of the image to be encoded has changed relative to its reference image when the judgment result of the sixth judging sub-module is yes, and to determine that the content of the image to be encoded has not changed relative to the reference image when the judgment result of the sixth judging sub-module is no.
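A distortion-based variant of the content-change test can be sketched as follows. The passage does not say which distortion measure is used, so the sum of absolute differences between co-located samples, without motion compensation, is an assumption of this sketch, as are the flattened sample lists, the threshold value and the function names.

```python
def sum_of_absolute_differences(current, reference):
    """SAD between two equally sized, flattened lists of sample values.

    SAD is an assumed stand-in for the prediction distortion; no motion
    compensation is applied in this sketch.
    """
    if len(current) != len(reference):
        raise ValueError("pictures must have the same number of samples")
    return sum(abs(c - r) for c, r in zip(current, reference))


def content_changed_by_distortion(current, reference, fifth_threshold):
    """True when the distortion exceeds the fifth preset threshold."""
    return sum_of_absolute_differences(current, reference) > fifth_threshold


# Made-up 2x2 luma samples for the image to be encoded and its reference.
pic8_samples = [100, 102, 98, 101]
pic7_samples = [101, 100, 98, 105]
distortion = sum_of_absolute_differences(pic8_samples, pic7_samples)
changed = content_changed_by_distortion(pic8_samples, pic7_samples,
                                        fifth_threshold=20)
print(distortion, changed)
```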

In addition, the image to be encoded is encoded as an I frame only when the first statistical model value and the second statistical model value satisfy the preset numerical relationship and no scene change occurs between the images in the target image set. This effectively avoids inserting too many I frames into the same video, which would make the I frames' share of the total bits in the video code stream too large and the non-I frames' share too small, and thereby avoids the resulting unclear video display.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner; the same or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant points, refer to the corresponding description of the method embodiment.

Those skilled in the art will appreciate that all or part of the steps in the above method embodiments may be implemented by a program to instruct relevant hardware to perform the steps, and the program may be stored in a computer-readable storage medium, which is referred to herein as a storage medium, such as: ROM/RAM, magnetic disk, optical disk, etc.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A frame type decision method applied to an encoder, the method comprising:

Determining any I frame coded before an image to be coded as a target I frame;

and if so, judging the frame type of the image to be coded as an I frame.

2. The method according to claim 1, wherein said determining whether the image to be encoded satisfies a preset I-frame selection condition according to the first statistical model value and the second statistical model value comprises:

3. The method of claim 1, further comprising:

Determining whether scene change occurs between images in a target image set, wherein the target image set consists of a first preset number of frame images containing the images to be coded, and the coding sequence of the first preset number of frame images is continuous;

4. The method of claim 3, wherein determining whether a scene change has occurred between images within the target image set comprises:

The first target image is: a second preset number of frame images that are located, in coding order, before and/or after the image to be encoded in the target image set;

The second target image is: a reference image of the first target image, or the image in the target image set that immediately precedes the first target image in coding order.

5. The method of claim 4, wherein the first target image is: a third preset number of frame images in the image group where the image to be coded is located currently, wherein the third preset number of frame images are contained in the second preset number of frame images;

Calculating a ratio of the first sum and the second sum;

Judging whether the ratio is in a preset value interval or not;

6. The method of claim 4, wherein the first target image is: a fourth preset number of frame images in the image group where the image to be coded is located, wherein the fourth preset number of frame images are located in the second preset number of frame images;

7. The method of claim 3, wherein determining whether a scene change has occurred between images within the target image set comprises:

8. The method according to claim 7, wherein the determining whether a scene change occurs in a picture subsequent to the picture to be encoded with respect to the picture to be encoded comprises:

9. The method according to claim 7, wherein the determining whether a scene change occurs in a picture subsequent to the picture to be encoded with respect to the picture to be encoded comprises:

10. The method of claim 3, wherein determining whether a scene change has occurred between images within the target image set comprises:

11. The method according to claim 10, wherein said determining whether the content of the picture to be encoded changes from its reference picture based on the current reference relationship comprises:

12. The method according to claim 10, wherein said determining whether the content of the picture to be encoded changes from its reference picture based on the current reference relationship comprises:

13. The method according to claim 1, wherein after determining the frame type of the image to be encoded as an I-frame, the method further comprises:

14. The method of claim 1, wherein the first statistical model value and the second statistical model value each comprise: at least one of an average bit number, a total bit number, a maximum peak signal-to-noise ratio, a minimum peak signal-to-noise ratio, an average peak signal-to-noise ratio, a maximum structure similarity value, a minimum structure similarity value, an average structure similarity value, a maximum sum of squared residuals of data blocks of a preset size, a minimum sum of squared residuals of the data blocks of the preset size, or an average sum of squared residuals of the data blocks of the preset size.

15. A frame type decision device applied to an encoder, the device comprising:

16. The apparatus of claim 15, wherein the determining module comprises:

17. The apparatus of claim 15, further comprising:

A second determining module, configured to determine whether a scene change occurs between images in a target image set, where the target image set is composed of a first preset number of frame images including the image to be encoded, and an encoding sequence of the first preset number of frame images is continuous;

18. The apparatus of claim 17, wherein the second determining module is specifically configured to:

19. The apparatus of claim 18, wherein the first target image is: a third preset number of frame images in the image group where the image to be coded is located currently, wherein the third preset number of frame images are contained in the second preset number of frame images;

The second determining module includes:

20. The apparatus of claim 18, wherein the first target image is: a fourth preset number of frame images in the image group where the image to be coded is located, wherein the fourth preset number of frame images are located in the second preset number of frame images;

The second determining module includes:

21. The apparatus of claim 17, wherein the second determining module is specifically configured to:

22. The apparatus of claim 21, wherein the second determining module comprises:

23. The apparatus of claim 21, wherein the second determining module comprises:

24. The apparatus of claim 17, wherein the second determining module is specifically configured to:

25. The apparatus of claim 24, wherein the second determining module comprises:

26. The apparatus of claim 24, wherein the second determining module comprises:

27. The apparatus of claim 15, further comprising:

28. The apparatus of claim 15, wherein the first statistical model value and the second statistical model value each comprise: at least one of an average bit number, a total bit number, a maximum peak signal-to-noise ratio, a minimum peak signal-to-noise ratio, an average peak signal-to-noise ratio, a maximum structure similarity value, a minimum structure similarity value, an average structure similarity value, a maximum sum of squared residuals of data blocks of a preset size, a minimum sum of squared residuals of the data blocks of the preset size, or an average sum of squared residuals of the data blocks of the preset size.