CN107820095B

CN107820095B - Long-term reference image selection method and device

Info

Publication number: CN107820095B
Application number: CN201610827470.4A
Authority: CN
Inventors: 张贤国; 朱政; 金星
Original assignee: Beijing Kingsoft Cloud Network Technology Co Ltd; Beijing Kingsoft Cloud Technology Co Ltd
Current assignee: Beijing Kingsoft Cloud Network Technology Co Ltd; Beijing Kingsoft Cloud Technology Co Ltd
Priority date: 2016-09-14
Filing date: 2016-09-14
Publication date: 2020-01-03
Anticipated expiration: 2036-09-14
Also published as: CN107820095A

Abstract

An embodiment of the invention discloses a long-term reference image selection method and device. The method comprises: obtaining a target attribute value of a predetermined image attribute of an image to be encoded, the image to be encoded being an image in a target video; judging whether the target attribute value is a preset attribute value; if the target attribute value is the preset attribute value, judging, according to a first data relationship, whether the image to be encoded meets the determination condition of a long-term reference image, wherein the first data relationship is a data relationship between the image to be encoded and similar images, or a data relationship among the similar images, and the similar images are: in an image sequence of the target video obtained according to a preset sorting manner, a first preset number of frame images before the image to be encoded and/or a second preset number of frame images after the image to be encoded; and if so, generating a reconstructed image corresponding to the image to be encoded and determining the reconstructed image as a long-term reference image. The scheme can improve video coding efficiency.

Description

Long-term reference image selection method and device

Technical Field

The present invention relates to the field of video coding technologies, and in particular, to a long-term reference picture selection method and apparatus.

Background

With the continuous development of video services in multimedia applications and the growing demands of video cloud computing, existing network transmission bandwidth and storage resources find it increasingly difficult to carry raw video sources with their huge data volume. As a result, video coding has gradually become one of the hot spots of research and application at home and abroad.

Standardization organizations at home and abroad have established various video coding standards, most of which remove redundant information from video image data through techniques such as prediction, transform, scanning, quantization and entropy coding, so as to reduce transmission bandwidth and storage space. Prediction is divided into intra-frame prediction and inter-frame prediction. Inter-frame prediction exploits the correlation between frames of a video to compress images; in brief, an already encoded image is used to predict an image to be encoded. As an inter-frame prediction optimization technique, long-term reference image prediction is supported in multiple coding standards, such as the H.264/AVC (Advanced Video Coding) standard and the HEVC (High Efficiency Video Coding) standard. Long-term reference image prediction performs video coding based on a long-term reference image, that is, a reconstructed image which stays resident in the reference image buffer during video compression and can be referenced by subsequent images over a longer period.

Specifically, as shown in FIG. 1, inter-frame prediction using the long-term reference picture prediction technique includes the following steps:

S101, selecting an image to be coded;

S102, generating a reconstructed image of the image to be coded and determining the reconstructed image as a long-term reference picture; that is, the image to be coded is encoded to remove redundant information from the image data and thereby compress the image, decoding and reconstruction are then performed, and the resulting reconstructed image is determined as the long-term reference picture;

S103, moving the determined long-term reference picture into a long-term reference picture buffer for reference by subsequent uncoded pictures, where the long-term reference picture remains in the long-term reference picture buffer and is not moved out until long-term reference prediction is cancelled or the long-term reference picture is updated and replaced;

S104, performing long-term reference prediction using the long-term reference picture; that is, the long-term reference picture is selected as a prediction reference when inter-frame prediction is performed on each subsequent image to be coded.

As shown in FIG. 2, the reference picture buffer includes: a long-term reference picture 1, a short-term reference picture 2, a long-term reference picture buffer 3 and a short-term reference picture buffer 4, wherein the long-term reference picture 1 is stored in the long-term reference picture buffer 3, and the short-term reference picture 2 is stored in the short-term reference picture buffer 4. In practical applications, the number of long-term reference pictures and short-term reference pictures is not limited to the number shown in FIG. 2.
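The buffer arrangement of FIG. 2 and steps S103 and S104 can be illustrated with a minimal Python sketch. The class name, the string picture labels, the short-term capacity and the drop-oldest policy for short-term pictures are all assumptions made for illustration; they are not taken from the patent or from any particular encoder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReferenceBuffer:
    """Toy model of a reference picture buffer (hypothetical names)."""
    short_term_capacity: int = 4  # assumed capacity, not from the source
    long_term: List[str] = field(default_factory=list)
    short_term: List[str] = field(default_factory=list)

    def add_short_term(self, picture: str) -> None:
        # Assumed policy: the oldest short-term picture is dropped when full.
        self.short_term.append(picture)
        if len(self.short_term) > self.short_term_capacity:
            self.short_term.pop(0)

    def add_long_term(self, picture: str) -> None:
        # S103: a long-term picture stays until it is explicitly removed.
        self.long_term.append(picture)

    def remove_long_term(self, picture: str) -> None:
        self.long_term.remove(picture)

    def candidates(self) -> List[str]:
        # S104: long-term and short-term pictures are both usable as references.
        return self.long_term + self.short_term


buf = ReferenceBuffer(short_term_capacity=2)
buf.add_long_term("recon_0")
for i in range(1, 5):
    buf.add_short_term(f"recon_{i}")
```

In this sketch the long-term picture `recon_0` is still available after four later pictures have passed through the short-term part of the buffer, which is the property the long-term reference technique relies on.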

In the prior art, the reconstructed image corresponding to the first frame image of the video to be encoded is usually selected directly as the long-term reference image. Although this approach is relatively simple for the encoder to control, it cannot adapt well to the multi-scene characteristics of video, and the video coding efficiency is therefore low.

Disclosure of Invention

Embodiments of the present invention provide a method and an apparatus for selecting a long-term reference picture to improve video coding efficiency. The specific technical scheme is as follows:

In a first aspect, an embodiment of the present invention provides a long-term reference image selection method, where the method includes:

obtaining a target attribute value of a predetermined image attribute of an image to be encoded; the image to be coded is an image in a target video;

judging whether the target attribute value is a preset attribute value or not;

under the condition that the target attribute value is a preset attribute value, judging whether the image to be coded meets the determination condition of a long-term reference image according to a first data relationship; wherein the first data relationship is: a data relationship between the image to be coded and similar images, or a data relationship among the similar images, and the similar images are: in the image sequence of the target video obtained according to the preset sorting manner, a first preset number of frame images before the image to be coded and/or a second preset number of frame images after the image to be coded;

and if so, generating a reconstructed image corresponding to the image to be coded, and determining the reconstructed image as a long-term reference image.
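The overall flow of the first aspect is a two-stage check followed by reconstruction. The following Python sketch shows that control flow only; the function name, the dictionary representation of an image and the three callables are hypothetical placeholders, and the concrete tests they stand for are left open here.

```python
from typing import Callable, Optional, Sequence


def select_long_term_reference(
    picture: dict,
    similar: Sequence[dict],
    has_preset_attribute: Callable[[dict], bool],
    meets_condition: Callable[[dict, Sequence[dict]], bool],
    reconstruct: Callable[[dict], dict],
) -> Optional[dict]:
    """Return the reconstructed image if it becomes a long-term reference, else None."""
    # Stage 1: cheap attribute check on the image to be coded.
    if not has_preset_attribute(picture):
        return None
    # Stage 2: check based on the first data relationship with similar images.
    if not meets_condition(picture, similar):
        return None
    # Both checks passed: the reconstructed image is the long-term reference.
    return reconstruct(picture)


chosen = select_long_term_reference(
    {"id": 7, "type": "I"},
    [],
    has_preset_attribute=lambda p: p["type"] == "I",
    meets_condition=lambda p, s: True,
    reconstruct=lambda p: {"recon_of": p["id"]},
)
```

Placing the attribute check first means the more expensive relationship check runs only for images that already look like plausible candidates.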

Optionally, the obtaining a target attribute value of a predetermined image attribute of an image to be encoded includes:

obtaining a target type value of a frame type of an image to be coded;

the judging whether the target attribute value is a preset attribute value includes:

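and judging whether the target type value is the I type.

Optionally, the obtaining a target attribute value of a predetermined image attribute of an image to be encoded includes:

obtaining a target level value of a coding level to which an image to be coded belongs;

the judging whether the target attribute value is a preset attribute value includes:

and judging whether the target level value is a target value, wherein the image corresponding to the target value does not refer to other images in the image group to which the image belongs.

Optionally, the obtaining a target attribute value of a predetermined image attribute of an image to be encoded includes:

obtaining a target type value of a frame type of an image to be coded and a target level value of a coding level to which the image belongs;

the judging whether the target attribute value is a preset attribute value includes:

and judging whether the target type value is the I type and whether the target level value is a target value, wherein the image corresponding to the target value does not refer to other images in the image group to which the image belongs.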
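The attribute check can be based on the frame type, on the coding level, or on both. A minimal Python sketch of the three variants follows; the function name, the `mode` parameter and the default target value of 0 are assumptions made for illustration.

```python
def passes_attribute_gate(frame_type: str, level: int, mode: str = "type",
                          target_level: int = 0) -> bool:
    """First-stage check on the image to be coded (hypothetical helper)."""
    type_ok = frame_type == "I"        # frame-type variant
    level_ok = level == target_level   # coding-level variant
    if mode == "type":
        return type_ok
    if mode == "level":
        return level_ok
    if mode == "both":
        return type_ok and level_ok
    raise ValueError(f"unknown mode: {mode}")
```

For example, a P frame at level 0 passes the level-only variant but fails the combined variant.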

Optionally, the obtaining a target level value of a coding level to which the image to be coded belongs includes:

determining a level value of the level next to the coding level to which a target image in the target video belongs, wherein the target image and the image to be encoded belong to the same image group, and the target image is, among the reference images of the image to be encoded, the image whose coding level has the largest level value;

and determining the determined level value as a target level value of the coding level to which the image to be coded belongs.

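Optionally, the obtaining a target level value of a coding level to which the image to be coded belongs includes:

obtaining the number of images, in the image group to which the image to be coded belongs, whose quantization parameters are smaller than that of the image to be coded;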

determining the number as a target level value of an encoding level to which the image to be encoded belongs.
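The two optional ways of deriving the target level value can be sketched in Python as follows. The function names are hypothetical, and assigning level 0 to an image that has no reference image inside its image group is an assumption, made so that the result agrees with the statement that an image at the target value does not refer to other images in its group.

```python
from typing import Sequence


def level_from_references(reference_levels: Sequence[int]) -> int:
    """Level = one more than the largest level among same-group reference images."""
    if not reference_levels:
        return 0  # assumption: no in-group reference means the lowest level
    return max(reference_levels) + 1


def level_from_qp(qp: int, group_qps: Sequence[int]) -> int:
    """Level = number of images in the group with a smaller quantization parameter."""
    return sum(1 for other in group_qps if other < qp)
```

With group quantization parameters 26, 30, 32 and 28, the image coded at 30 gets level 2 under the second rule, while the image coded at 26 gets level 0.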

Optionally, when the first data relationship is a data relationship between the image to be encoded and a similar image, the determining, according to the first data relationship, whether the image to be encoded meets a determination condition of a long-term reference image includes:

judging whether the image to be coded meets the determination condition of a long-term reference image or not according to the motion information between the image to be coded and the data block of the similar image;

or,

and judging whether the image to be coded meets the determination condition of the long-term reference image or not according to the sum of the inter-frame prediction distortion values of the image to be coded and the similar image.

Optionally, the determining, according to the motion information between the image to be encoded and the data block of the similar image, whether the image to be encoded meets the determination condition of the long-term reference image includes:

for each image included in the similar images, determining a first proportion of first-class data blocks in the image to be coded, and judging whether the first proportion is smaller than a first preset threshold, wherein a first-class data block is a data block whose motion vector modulus between the image to be coded and that image is larger than a second preset threshold;

and if all judgment results are yes, judging that the image to be coded meets the determination condition of the long-term reference image.
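The motion-based determination condition can be illustrated with a short Python sketch. It assumes that each data block contributes one motion vector `(dx, dy)` per similar image and that the modulus is the Euclidean length; the function names and the threshold values used in the example are hypothetical.

```python
import math
from typing import Sequence, Tuple

MotionVector = Tuple[float, float]


def large_motion_ratio(motion_vectors: Sequence[MotionVector],
                       mv_threshold: float) -> float:
    """Proportion of blocks whose motion vector modulus exceeds mv_threshold."""
    if not motion_vectors:
        return 0.0
    large = sum(1 for dx, dy in motion_vectors
                if math.hypot(dx, dy) > mv_threshold)
    return large / len(motion_vectors)


def meets_motion_condition(mvs_per_similar_image: Sequence[Sequence[MotionVector]],
                           ratio_threshold: float,
                           mv_threshold: float) -> bool:
    """True only if the proportion is below the threshold for every similar image."""
    return all(large_motion_ratio(mvs, mv_threshold) < ratio_threshold
               for mvs in mvs_per_similar_image)


calm = [(0, 0), (1, 0), (0, 1), (3, 4)]    # one block out of four moves far
busy = [(3, 4), (6, 8), (0, 0), (5, 12)]   # three blocks out of four move far
```

With a modulus threshold of 2 and a proportion threshold of 0.5, the image passes when compared only with `calm` and fails as soon as `busy` is among the similar images.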

Optionally, the determining, according to a sum of inter-frame prediction distortion values of the image to be encoded and a similar image, whether the image to be encoded meets a determination condition of a long-term reference image includes:

and if the sum of the inter-frame prediction distortion values of the image to be coded and each image included in the similar image is smaller than a third preset threshold value, judging that the image to be coded meets the determination condition of the long-term reference image.
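A minimal Python sketch of the distortion-based condition follows. The text does not name a distortion measure, so the sum of absolute differences (SAD) between co-located samples is used here purely as an assumed stand-in, without any motion search. The sketch also assumes one reading of the condition, namely that the distortion values are summed over all similar images before being compared with the threshold.

```python
from typing import Sequence

Image = Sequence[Sequence[int]]


def sad(a: Image, b: Image) -> int:
    """Sum of absolute differences between two equally sized sample arrays."""
    return sum(abs(x - y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def meets_distortion_condition(current: Image,
                               similar_images: Sequence[Image],
                               threshold: int) -> bool:
    """True if the total distortion against all similar images is below threshold."""
    total = sum(sad(current, other) for other in similar_images)
    return total < threshold


cur = [[10, 10], [10, 10]]
near_1 = [[10, 11], [9, 10]]    # SAD against cur is 2
near_2 = [[12, 10], [10, 13]]   # SAD against cur is 5
```

A small total distortion indicates that the image to be coded predicts its similar images well, which is what makes it a useful long-term reference.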

Optionally, when the first data relationship is a data relationship among the similar images, the determining, according to the first data relationship, whether the image to be encoded meets a determination condition of a long-term reference image includes:

for each image included in the similar images, determining a second proportion of a second type data block in the image, and judging whether the second proportion is smaller than a fourth preset threshold, wherein the second type data block is as follows: a data block with a motion vector module value between the image and the image of the previous frame of the image larger than a fifth preset threshold value; and if all judgment results are yes, judging that the image to be coded meets the determination condition of the long-term reference image.

Optionally, the preset sorting manner is as follows: and sorting according to the image coding and decoding sequence or sorting according to the image display sequence.
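How the choice of sorting manner changes the set of similar images can be seen in a small Python sketch. The frame records, the `display` and `coding` keys and the function name are hypothetical; the example uses a four-frame group in which the P frame is coded before the two B frames.

```python
from typing import Dict, List, Sequence


def similar_images(frames: Sequence[Dict], current_id: str,
                   n_before: int, n_after: int, order_key: str) -> List[str]:
    """Ids of up to n_before frames before and n_after frames after the current one."""
    ordered = sorted(frames, key=lambda f: f[order_key])
    idx = next(i for i, f in enumerate(ordered) if f["id"] == current_id)
    before = ordered[max(0, idx - n_before):idx]
    after = ordered[idx + 1:idx + 1 + n_after]
    return [f["id"] for f in before + after]


frames = [
    {"id": "I0", "display": 0, "coding": 0},
    {"id": "B1", "display": 1, "coding": 2},
    {"id": "B2", "display": 2, "coding": 3},
    {"id": "P3", "display": 3, "coding": 1},
]
by_display = similar_images(frames, "P3", 1, 1, "display")
by_coding = similar_images(frames, "P3", 1, 1, "coding")
```

In display order the only neighbor of `P3` within one frame is `B2`, whereas in coding order its neighbors are `I0` and `B1`.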

Optionally, the similar images are pre-read images.

Optionally, the method further includes:

the determined long-term reference picture is moved into a long-term reference picture buffer for reference by the unencoded picture.

Optionally, the method for selecting a long-term reference image according to the embodiment of the present invention further includes:

for each long-term reference image in the long-term reference image buffer, judging whether to continue using the long-term reference image according to a second data relationship, and if not, marking the long-term reference image as a non-long-term reference image; wherein the second data relationship is: a data relationship between first-class images and the long-term reference image, or a data relationship among the first-class images, and the first-class images comprise pre-read images and/or counted coded images.

Optionally, when the second data relationship is a data relationship between the first-class image and the long-term reference image, the determining whether to continue using the long-term reference image according to the second data relationship includes:

judging whether to continue using the long-term reference image according to the motion information between the data blocks between the first class image and the long-term reference image;

or,

and judging whether to continue using the long-term reference image or not according to the sum of the interframe prediction distortion values of the first-class image and the long-term reference image.

Optionally, when the second data relationship is a data relationship between the first-class images and the long-term reference image and the first-class images are counted coded images, the determining whether to continue using the long-term reference image according to the second data relationship includes:

and judging whether to continue using the long-term reference image according to the sum of the areas, in the counted coded images, that use the long-term reference image.

Optionally, the determining whether to continue using the long-term reference picture according to the motion information between the data blocks between the first class of picture and the long-term reference picture includes:

determining a third proportion of a third class data block in the long-term reference image aiming at each image included in the first class of images, and judging whether the third proportion is smaller than a sixth preset threshold value or not, wherein the third class data block is a data block of which the motion vector modulus between the long-term reference image and the image is larger than a seventh preset threshold value; and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

Optionally, the determining whether to continue using the long-term reference picture according to a sum of inter-frame prediction distortion values of the first class of pictures and the long-term reference picture includes:

and if the sum of the inter-frame prediction distortion values of each image included in the first type of image and the long-term reference image is smaller than an eighth preset threshold value, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

Optionally, the determining whether to continue using the long-term reference picture according to the sum of the areas of the long-term reference pictures used in the counted encoded pictures includes:

if the sum of the areas of the long-term reference image used in the counted coded image is larger than a ninth preset threshold value, judging to continue using the long-term reference image, and otherwise, judging not to continue using the long-term reference image;

or,

and if the ratio of the sum of the areas of the long-term reference images in the counted coded images to the total area of the coded images is greater than a tenth preset threshold, judging to continue using the long-term reference images, otherwise, judging not to continue using the long-term reference images.
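The two area-based variants can be sketched in Python as follows. The sketch assumes that the encoder records, for each coded block, its width, its height and the identifier of the reference picture it used; that record layout, the function names and the thresholds in the example are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# (width, height, id of the reference picture used, or None for intra blocks)
Block = Tuple[int, int, Optional[str]]


def referenced_area(blocks: Sequence[Block], ltr_id: str) -> int:
    """Total area of the coded blocks that used the given long-term reference."""
    return sum(w * h for w, h, ref in blocks if ref == ltr_id)


def keep_by_absolute_area(blocks: Sequence[Block], ltr_id: str,
                          area_threshold: int) -> bool:
    return referenced_area(blocks, ltr_id) > area_threshold


def keep_by_area_ratio(blocks: Sequence[Block], ltr_id: str,
                       ratio_threshold: float) -> bool:
    total = sum(w * h for w, h, _ in blocks)
    if total == 0:
        return False
    return referenced_area(blocks, ltr_id) / total > ratio_threshold


coded = [(16, 16, "ltr0"), (16, 16, "st1"), (8, 8, "ltr0"), (16, 16, None)]
```

Here `ltr0` was used by 320 of 832 samples, so it is kept under an absolute threshold of 300 or a ratio threshold of 0.3, and dropped under a ratio threshold of 0.5.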

Optionally, when the second data relationship is a data relationship between images of a first type, the determining whether to continue using the long-term reference image according to the second data relationship includes:

for each image included in the first-class images, determining a fourth proportion of fourth-class data blocks in the image, and judging whether the fourth proportion is smaller than an eleventh preset threshold, wherein a fourth-class data block is a data block whose motion vector modulus between the image and the previous frame image of the image is larger than a twelfth preset threshold; and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

Optionally, the method further includes:

moving the marked non-long-term reference pictures out of the long-term reference picture buffer.
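Marking and moving out can be combined into one pass over the long-term reference picture buffer. The Python sketch below assumes that the retention decision is supplied as a callable; the function name and the string picture labels are hypothetical.

```python
from typing import Callable, List, Tuple


def refresh_long_term_buffer(
    long_term: List[str],
    keep: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split the buffer into pictures that stay and pictures that are moved out."""
    kept: List[str] = []
    removed: List[str] = []
    for picture in long_term:
        if keep(picture):
            kept.append(picture)
        else:
            # Marked as a non-long-term reference, then moved out of the buffer.
            removed.append(picture)
    return kept, removed


kept, removed = refresh_long_term_buffer(
    ["ltr_a", "ltr_b", "ltr_c"],
    keep=lambda picture: picture != "ltr_b",
)
```

Moving out pictures that are no longer useful frees buffer space for newly selected long-term references.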

In a second aspect, an embodiment of the present invention provides a long-term reference image selection apparatus, including:

a target attribute value obtaining module for obtaining a target attribute value of a predetermined image attribute of an image to be encoded; the image to be coded is an image in a target video;

the first judgment module is used for judging whether the target attribute value is a preset attribute value or not;

the second judging module is used for judging whether the image to be coded meets the determination condition of the long-term reference image according to the first data relationship under the condition that the judgment result of the first judging module is yes; wherein the first data relationship is: a data relationship between the image to be coded and similar images, or a data relationship among the similar images, and the similar images are: in the image sequence of the target video obtained according to the preset sorting manner, a first preset number of frame images before the image to be coded and/or a second preset number of frame images after the image to be coded;

and the determining module is used for generating a reconstructed image corresponding to the image to be coded and determining the reconstructed image as a long-term reference image under the condition that the judgment result of the second judging module is yes.

Optionally, the target attribute value obtaining module is specifically configured to:

obtaining a target type value of a frame type of an image to be coded;

the first judging module is specifically configured to:

and judging whether the target type value is of an I type.

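Optionally, the target attribute value obtaining module is specifically configured to:

obtaining a target level value of a coding level to which an image to be coded belongs;

the first judging module is specifically configured to:

and judging whether the target level value is a target value, wherein the image corresponding to the target value does not refer to other images in the image group to which the image belongs.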

Optionally, the target attribute value obtaining module includes:

the level value determining submodule is used for determining a level value of a next level of an encoding level to which a target image belongs in the target video, wherein the target image and the image to be encoded belong to the same image group, and the target image is an image with the maximum level value of the encoding level to which the reference image of the image to be encoded belongs;

and the first target level value determining submodule is used for determining the determined level value as the target level value of the coding level to which the image to be coded belongs.

Optionally, the target attribute value obtaining module includes:

the quantity obtaining submodule is used for obtaining the quantity of the images of which the quantization parameters are smaller than those of the images to be coded in the image group to which the images to be coded belong;

and the second target level value determining submodule is used for determining the number as the target level value of the coding level to which the image to be coded belongs.

Optionally, the second determining module includes:

the first judgment submodule is used for judging whether the image to be coded meets the determination condition of the long-term reference image or not according to the motion information between the image to be coded and the data blocks of the similar images when the first data relation is about the data relation between the image to be coded and the similar images;

or,

and the second judging submodule is used for judging whether the image to be coded meets the determination condition of the long-term reference image or not according to the sum of the interframe prediction distortion values of the image to be coded and the similar image when the first data relation is about the data relation between the image to be coded and the similar image.

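Optionally, the first determining submodule is specifically configured to:

for each image included in the similar images, determining a first proportion of first-class data blocks in the image to be coded, and judging whether the first proportion is smaller than a first preset threshold, wherein a first-class data block is a data block whose motion vector modulus between the image to be coded and that image is larger than a second preset threshold;

and if all judgment results are yes, judging that the image to be coded meets the determination condition of the long-term reference image.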

Optionally, the second judgment sub-module is specifically configured to:

judging whether the sum of the inter-frame prediction distortion values of the image to be coded and each image included in the similar image is smaller than a third preset threshold value or not;

and judging that the image to be coded meets the determination condition of the long-term reference image if the judgment result is yes.

Optionally, the second determining module includes:

a third determining sub-module, configured to, when the first data relationship is a data relationship among the similar images, determine, for each image included in the similar images, a second proportion of second-class data blocks in the image, and determine whether the second proportion is smaller than a fourth preset threshold, where a second-class data block is: a data block whose motion vector modulus between the image and the previous frame image of the image is larger than a fifth preset threshold;

Optionally, the similar images are pre-read images.

Optionally, the long-term reference image selecting apparatus provided in the embodiment of the present invention further includes:

a long-term reference picture moving-in module for moving the determined long-term reference picture into a long-term reference picture buffer for reference by the uncoded picture.

a third judging module, configured to judge, according to a second data relationship, whether to continue to use the long-term reference image for each long-term reference image in the long-term reference image buffer; wherein the second data relationship is: regarding the data relation between the first kind of image and the long-term reference image or the data relation between the first kind of image, wherein the first kind of image comprises a pre-reading image and/or a statistical coded image;

and the marking module is used for marking the long-term reference image as a non-long-term reference image under the condition that the result of the third judging module is negative.

Optionally, the third determining module includes:

a fourth determining sub-module, configured to determine, for each long-term reference image in the long-term reference image buffer, whether to continue using the long-term reference image according to motion information between data blocks between the first class image and the long-term reference image when the second data relationship is a data relationship between the first class image and the long-term reference image;

or,

and a fifth determining sub-module, configured to determine, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture according to a sum of inter-frame prediction distortion values of the first class picture and the long-term reference picture when the second data relationship is a data relationship between the first class picture and the long-term reference picture.

Optionally, the third determining module includes:

a sixth determining sub-module, configured to determine, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture according to a sum of areas of the long-term reference pictures used in the counted encoded pictures, when the second data relationship is a data relationship between the first-class picture and the long-term reference picture and the first-class picture is a counted encoded picture.

Optionally, the fourth determining sub-module is specifically configured to:

determining a third proportion of a third class data block in the long-term reference image aiming at each image included in the first class of images, and judging whether the third proportion is smaller than a sixth preset threshold value or not, wherein the third class data block is a data block of which the motion vector modulus between the long-term reference image and the image is larger than a seventh preset threshold value;

and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

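Optionally, the fifth determining sub-module is specifically configured to:

for each long-term reference image in the long-term reference image buffer, if the sum of the inter-frame prediction distortion values of each image included in the first-class images and the long-term reference image is smaller than an eighth preset threshold, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.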

Optionally, the sixth determining sub-module is specifically configured to:

for each long-term reference image in the long-term reference image buffer area, if the sum of the areas of the long-term reference images used in the counted coded images is larger than a ninth preset threshold value, judging that the long-term reference image is continuously used, and otherwise, judging that the long-term reference image is not continuously used;

or,

and for each long-term reference image in the long-term reference image buffer area, if the ratio of the sum of the areas of the counted coded images using the long-term reference image to the total area of the coded images is greater than a tenth preset threshold, judging to continue using the long-term reference image, and otherwise, judging not to continue using the long-term reference image.

Optionally, the third determining module includes:

a seventh determining sub-module, configured to, when the second data relationship is a data relationship among the first-class images, determine, for each image included in the first-class images, a fourth proportion of fourth-class data blocks in the image, and determine whether the fourth proportion is smaller than an eleventh preset threshold, where a fourth-class data block is a data block whose motion vector modulus between the image and the previous frame image of the image is larger than a twelfth preset threshold;

a non-long-term reference picture moving-out module for moving the marked non-long-term reference picture out of the long-term reference picture buffer.

In the long-term reference image selection method provided by the embodiment of the invention, a target attribute value of a predetermined image attribute of an image to be coded is obtained; whether the target attribute value is a preset attribute value is judged; if the target attribute value is the preset attribute value, whether the image to be coded meets the determination condition of the long-term reference image is judged according to the first data relationship; and if so, a reconstructed image corresponding to the image to be coded is generated and determined as a long-term reference image. Because the relationship between the image to be coded and the similar images is considered during selection, a suitable image is chosen flexibly as the long-term reference image instead of directly taking the first frame image of the video to be coded. The method therefore adapts better to changes of video scene and improves video coding efficiency.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a flow chart of inter-frame prediction using long-term reference image prediction in the prior art;

FIG. 2 is a diagram of a reference picture buffer;

FIG. 3 is a flowchart of a long-term reference picture selection method according to an embodiment of the present invention;

FIG. 4 is another flowchart of a long-term reference picture selection method according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a long-term reference image selection apparatus according to an embodiment of the present invention;

FIG. 6 is another schematic structural diagram of a long-term reference image selection apparatus according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to improve video coding efficiency, embodiments of the present invention provide a long-term reference picture selection method and apparatus.

First, a method for selecting a long-term reference image according to an embodiment of the present invention is described below.

It should be noted that a long-term reference image is a decoded image that stays resident in the long-term reference image buffer during video compression and can be referenced by subsequent video images over a longer period. By obtaining temporally distant data as a prediction reference for the current image, the long-term reference image technique solves the problem that some coding units of the current image cannot find matching reference data in adjacent images, thereby improving prediction efficiency.

In addition, the long-term reference image selection method provided by the embodiment of the invention may be executed by a long-term reference image selection device. The device may reasonably be a plug-in for existing video coding software or stand-alone functional software, and may be applied to a terminal or a server.

As shown in fig. 3, a method for selecting a long-term reference image according to an embodiment of the present invention may include the following steps:

S301, obtaining a target attribute value of a predetermined image attribute of the image to be coded.

S302, judging whether the target attribute value is a preset attribute value, if so, executing S303.

In practical application, in order to select a long-term reference image required in a target video encoding process, before each frame of video image is encoded, a target attribute value of a predetermined image attribute of an image to be encoded is obtained, then whether the image to be encoded is suitable for being used as a long-term reference image is determined for the first time by determining whether the target attribute value is a preset attribute value, and if the determination result is yes, subsequent operation is executed, and if the determination result is no, whether the image to be encoded is suitable for being used as a long-term reference image is not determined any more, so that the calculation amount is reduced. The predetermined image attribute may be a frame type or an encoding level to which the image belongs.

Of course, it can be understood that, to further reduce the amount of calculation of the long-term reference image selection process, the method provided by the embodiment of the present invention may be performed only on specific images selected from the target video. For example, starting from the first frame image of the target video, one frame may be selected periodically every several frames as a specific image, and the long-term reference image selection method may then be performed on that specific image, and so on.

In addition, it should be emphasized that the image to be encoded is an image in the target video, and all the images mentioned later also refer to images in the target video. In one implementation, the obtaining a target attribute value of a predetermined image attribute of an image to be encoded may include: obtaining a target type value of a frame type of an image to be coded; accordingly, the determining whether the target attribute value is the preset attribute value may include: and judging whether the target type value is the I type. The I frame represents a key frame, and the key frame generally uses a smaller quantization parameter during encoding, has less encoding loss and higher quality, and has a longer-term reference value compared with a B frame and a P frame with more encoding loss.

In another implementation, the obtaining a target attribute value of a predetermined image attribute of an image to be encoded may include: obtaining a target level value of a coding level to which an image to be coded belongs; accordingly, the determining whether the target attribute value is the preset attribute value may include: and judging whether the target level value is a target value, wherein the image corresponding to the target value does not refer to other images in the image group. As will be understood by those skilled in the art, the picture with the level value of 0 is generally the first frame picture in the group of pictures, and has less coding loss and higher quality, and has a longer-term reference value than other pictures in the same group.

It should be noted that, in practical applications, the first implementation manner and the second implementation manner may be combined to obtain an image with a longer-term reference value. Based on the requirement, the obtaining of the target attribute value of the predetermined image attribute of the image to be encoded may include: obtaining a target type value of a frame type of an image to be coded and a target level value of a coding level to which the image belongs; accordingly, the determining whether the target attribute value is a preset attribute value may include: and judging whether the target type value is an I type or not and whether the target level value is a target value or not, wherein the image corresponding to the target value does not refer to other images in the image group to which the target value belongs.
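The attribute pre-check described above (frame type, coding level, or both) can be sketched as follows. This is only an illustrative sketch: the `FrameInfo` record, the function name, and the default target level of 0 are assumptions introduced here and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameInfo:
    """Hypothetical record of the attributes inspected in S301."""
    frame_type: str  # "I", "P" or "B"
    level: int       # coding level of the image within its image group


def passes_attribute_check(frame: FrameInfo,
                           require_i_frame: bool = True,
                           target_level: Optional[int] = 0) -> bool:
    """Cheap pre-check of S302: only frames that pass go on to S303."""
    if require_i_frame and frame.frame_type != "I":
        return False
    # target_level=None disables the level test (frame-type-only variant).
    if target_level is not None and frame.level != target_level:
        return False
    return True
```

Setting `require_i_frame=False` gives the level-only variant, `target_level=None` gives the frame-type-only variant, and the defaults give the combined variant.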

Specifically, in a specific implementation manner, the obtaining a target level value of a coding level to which an image to be coded belongs may include: determining a level value of a next level of a coding level to which a target image belongs in a target video, wherein the target image and an image to be coded belong to the same image group, and the target image is an image with the maximum level value of the coding level to which the target image belongs in a reference image of the image to be coded; and determining the determined level value as a target level value of the coding level to which the image to be coded belongs.

Here, the encoding level to which an image belongs is defined relative to its group of images: an image that does not reference any other image in the group has a level value of 0; an image that references only images with a level value of 0 in the group has a level value of 1; an image whose referenced images have a maximum level value of 1 has a level value of 2; an image whose referenced images have a maximum level value of 2 has a level value of 3; and so on.

For example, there are 8 frames of images in the image group 1, where the image 1 does not refer to the intra-group image, the image 2 only refers to the image 1, and the image 3 refers to the image 1 and the image 2, then: the level value of image 1 is 0, the level value of image 2 is 1, and the level value of image 3 is 2. It should be noted that the reference relationship in the image group can be obtained in advance.
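The reference-based level rule can be sketched as a small recursive computation. This is a minimal sketch that assumes the in-group reference relationship is available as a dictionary and contains no cycles; the function name and the data layout are hypothetical.

```python
def level_values(references):
    """Compute the coding level of every image in one image group.

    references maps an image id to the list of in-group image ids it
    references. An image with no in-group reference has level 0; any other
    image has a level one higher than the maximum level among the images
    it references.
    """
    levels = {}

    def level_of(image):
        if image not in levels:
            refs = references.get(image, [])
            levels[image] = 0 if not refs else 1 + max(level_of(r) for r in refs)
        return levels[image]

    for image in references:
        level_of(image)
    return levels


# Reference relationship from the example: image 1 references no in-group
# image, image 2 references image 1, image 3 references images 1 and 2.
print(level_values({1: [], 2: [1], 3: [1, 2]}))
```

For the three images of the example this yields level values 0, 1 and 2.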

Specifically, in another specific implementation manner, the obtaining a target level value of a coding level to which an image to be coded belongs may include: obtaining the number of images of which the quantization parameters are smaller than those of the images to be coded in the image group to which the images to be coded belong; the number is determined as a target level value of an encoding level to which the image to be encoded belongs.

It should be noted that the coding level of the image here is a concept based on the quantization relationship between the images in the group of images. Specifically, the image with the smallest quantization parameter in each group of images has a level value of 0 (generally, this is the first frame image in the group), the image with the second smallest quantization parameter has a level value of 1, and so on.

For example, there are 8 frames of images in image group 2, namely image 9 to image 16, where the quantization parameter of image 9 is the smallest and that of image 11 is the second smallest, so that the level value of image 9 is 0 and the level value of image 11 is 1. It should be noted that the quantization parameter of each image included in the image group may be preset.
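The quantization-based level rule counts how many images in the same group have a smaller quantization parameter. A minimal sketch follows; the quantization parameter values are made up for illustration and only preserve the ordering in which image 9 has the smallest value and image 11 the next smallest.

```python
def qp_level(qps, image):
    """Level value = number of images in the group with a smaller QP.

    qps maps image id to its quantization parameter. Images with an equal
    QP are not counted, so ties share the same level value.
    """
    target = qps[image]
    return sum(1 for other, qp in qps.items() if other != image and qp < target)


# Hypothetical quantization parameters for images 9 to 16.
group_qps = {9: 22, 10: 32, 11: 26, 12: 32, 13: 30, 14: 32, 15: 30, 16: 32}
print(qp_level(group_qps, 9), qp_level(group_qps, 11))
```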

S303, judging whether the image to be coded accords with the determination condition of the long-term reference image according to the first data relation, and if so, executing S304.

When the target attribute value is determined to be the preset attribute value, whether the image to be encoded meets the determination condition of the long-term reference image or not can be continuously determined according to the first data relationship, and if the determination result is yes, the step S304 is executed; it is to be understood that, when the determination result is negative, the process may not be performed, that is, the subsequent S304 is not performed any more.

The first data relationship may be a data relationship between the image to be encoded and the neighboring images or a data relationship between the neighboring images, and the neighboring images may be: in the image sequence of the target video obtained according to the preset sorting manner, a first preset number of frame images before the image to be coded and/or a second preset number of frame images after the image to be coded. Specifically, the preset sorting manner may be sorting according to the image encoding and decoding order or sorting according to the image display order.
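Selecting the neighboring images from an ordered sequence can be sketched as a simple slice. The function and parameter names are hypothetical; the sequence is assumed to be already sorted in the chosen order (coding order or display order).

```python
def neighboring_images(sequence, current, num_before=0, num_after=0):
    """Return up to num_before images preceding and num_after images
    following `current` in `sequence`, which is assumed to be ordered
    by coding order or display order."""
    idx = sequence.index(current)
    before = sequence[max(0, idx - num_before):idx]
    after = sequence[idx + 1:idx + 1 + num_after]
    return before + after


# Images 9..17 in order; take 2 images before and 3 after image 14.
print(neighboring_images(list(range(9, 18)), 14, num_before=2, num_after=3))
```

Setting `num_before=0` restricts the neighboring images to frames that follow the image to be coded, which corresponds to using pre-read frames in display order.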

As will be understood by those skilled in the art, the close images in the target video have continuity and similarity, and therefore, based on the definition of the long-term reference image, it can be determined whether the image to be encoded meets the determination condition of the long-term reference image according to the data dependency relationship between the image to be encoded and the close images.

For clarity, how to determine whether the image to be encoded meets the determination condition of the long-term reference image is described below with reference to different first data relationships:

When the first data relationship is a data relationship between the image to be encoded and the similar image, in an implementation, the determining, according to the first data relationship, whether the image to be encoded meets a determination condition of the long-term reference image may include: judging whether the image to be encoded meets the determination condition of the long-term reference image according to the motion information between the data blocks of the image to be encoded and the similar image.

It should be noted that, through the motion information between the image to be encoded and the data block of the similar image, it may be directly detected whether the motion of the data block between the image to be encoded and the similar image is severe, and it can be understood by those skilled in the art that if it is detected that the data block with severe motion exceeds a certain ratio, the long-term reference value of the image to be encoded is low.

Specifically, the determining, according to the motion information between the image to be encoded and the data block of the similar image, whether the image to be encoded meets the determination condition of the long-term reference image may include: determining a first proportion of a first class data block in an image to be coded aiming at each image included by a similar image, and judging whether the first proportion is smaller than a first preset threshold value or not, wherein the first class data block is a data block of which the motion vector modulus between the image to be coded and the image is larger than a second preset threshold value; and if all judgment results are yes, judging that the image to be coded meets the determination condition of the long-term reference image.

It should be noted that, in inter-frame predictive coding, because the scenes in neighboring frames of a moving image are correlated, the image can be divided into several data blocks, the position of each data block in a neighboring frame image can be searched out, and the relative offset in spatial position between the two can be obtained. This relative offset is what is usually called the motion vector, and it reflects the motion of the data block between the image to which it belongs and the neighboring image. A data block whose motion vector modulus is larger than the second preset threshold may equivalently be identified as a data block whose squared motion vector modulus is larger than the square of the second preset threshold. In addition, the specific method for obtaining the motion vector modulus between an image to be coded and a similar image belongs to the prior art and is not described herein again.

For example, a first preset threshold a and a second preset threshold b are set, the image to be encoded belongs to image group 2, whose images are numbered from 9 to 17, and the image to be encoded is image 14. The other images in the image group are determined as the similar images, and the motion vectors of the data blocks between the image to be encoded and each of the other images in the group are obtained by motion search. If the n data blocks whose motion vector modulus between the image to be encoded and image 10 is greater than b account for a proportion greater than a of the total number of data blocks of the image to be encoded, it can be judged that the image to be encoded does not meet the determination condition of the long-term reference image. If, for every similar image, the proportion of data blocks whose motion vector modulus between the image to be encoded and that image is greater than b is smaller than a, it can be judged that the image to be encoded meets the determination condition of the long-term reference image.
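The motion-based test can be sketched as follows, assuming the per-block motion vectors between the image to be encoded and each similar image have already been obtained by a motion search that is outside this sketch. The function names and the data layout are hypothetical; the squared modulus is compared with the squared threshold to avoid a square root.

```python
def severe_motion_ratio(motion_vectors, mv_threshold):
    """Fraction of data blocks whose motion vector modulus exceeds
    mv_threshold. motion_vectors is a list of (dx, dy), one per block."""
    if not motion_vectors:
        return 0.0
    limit = mv_threshold * mv_threshold
    severe = sum(1 for dx, dy in motion_vectors if dx * dx + dy * dy > limit)
    return severe / len(motion_vectors)


def meets_condition_by_motion(mvs_per_similar_image, ratio_threshold, mv_threshold):
    """True only if, for every similar image, the proportion of severely
    moving blocks is smaller than ratio_threshold (threshold a), with
    mv_threshold playing the role of threshold b."""
    return all(severe_motion_ratio(mvs, mv_threshold) < ratio_threshold
               for mvs in mvs_per_similar_image.values())


# Hypothetical motion vectors of four blocks against two similar images.
mvs = {10: [(0, 0), (1, 1), (8, 6), (0, 2)], 15: [(0, 1)] * 4}
print(meets_condition_by_motion(mvs, ratio_threshold=0.5, mv_threshold=5))
```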

When the first data relationship is a data relationship between the image to be encoded and the similar image, in another implementation manner, the determining whether the image to be encoded meets the determination condition of the long-term reference image according to the first data relationship may include: and judging whether the image to be coded accords with the determination condition of the long-term reference image or not according to the sum of the interframe prediction distortion values of the image to be coded and the similar image.

It can be understood by those skilled in the art that if the sum of the inter prediction distortion values of the image to be encoded and the neighboring image is too large, the long-term reference value of the image to be encoded is low. The method for specifically obtaining the sum of inter-frame prediction distortion values of an image to be encoded and a similar image belongs to the prior art and is not described herein again.

Specifically, the determining whether the image to be encoded meets the determination condition of the long-term reference image according to the sum of the inter-frame prediction distortion values of the image to be encoded and the similar image may include: and if the sum of the inter-frame prediction distortion values of the image to be coded and each image included in the similar image is smaller than a third preset threshold value, judging that the image to be coded meets the determination condition of the long-term reference image.
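The distortion-based test can be sketched in the same way. The sketch assumes that per-block inter-frame prediction distortion values (for example, values produced by the encoder's own motion estimation) are already available for each similar image; how they are obtained is not shown, and all names are hypothetical.

```python
def meets_condition_by_distortion(block_distortions_per_image, threshold):
    """True only if, for every similar image, the sum of the inter-frame
    prediction distortion values is smaller than threshold (the third
    preset threshold)."""
    return all(sum(distortions) < threshold
               for distortions in block_distortions_per_image.values())


# Hypothetical per-block distortion values against two similar images.
distortions = {13: [120, 80, 40], 15: [300, 150, 90]}
print(meets_condition_by_distortion(distortions, threshold=600))
```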

When the first data relationship is a data relationship between close images, in an implementation manner, the determining whether the image to be encoded meets the determination condition of the long-term reference image according to the first data relationship may include: determining, for each image included in the close images, a second proportion of second class data blocks in that image, and judging whether the second proportion is smaller than a fourth preset threshold, wherein a second class data block is a data block whose motion vector modulus between the image and the previous frame image of that image is larger than a fifth preset threshold; and if all judgment results are yes, judging that the image to be coded meets the determination condition of the long-term reference image.

It should be noted that if the motion amplitude between the close images of the image to be encoded is too large, the motion amplitude between the close images and the image to be encoded is very likely too large as well. This indicates that the scene of the video changes considerably in the current period, so the long-term reference value of the image to be encoded will be low.

For example, the fourth preset threshold is set as c and the fifth preset threshold as d, and the close images of the image to be encoded are images 10 to 20. The motion information of the data blocks is detected between image 20 and image 19, between image 19 and image 18, and so on, down to image 11 and image 10. If, in the detection result for images 20 and 19, the data blocks with a motion vector modulus larger than d account for a proportion smaller than c of the total number of data blocks of image 20, and in the detection result for images 19 and 18 the corresponding proportion of the data blocks of image 19 is smaller than c, and so on, then when all proportions are smaller than c it can be determined that the image to be encoded meets the determination condition of the long-term reference image. Of course, the close images need not be consecutive images.

Since the long-term reference picture is a decoded picture serving as a long-term reference for subsequent uncoded pictures, the above-mentioned close pictures may be pre-read pictures in order to more accurately determine whether the picture to be coded meets the determination condition of the long-term reference picture. The skilled person can understand that, according to the data relationship between the image to be encoded and the pre-read image, it is determined whether the image to be encoded meets the determination condition of the long-term reference image, which is more targeted, the obtained result is more accurate, and the determined long-term reference image also has a long-term reference value.

In practical application, according to specific situations and requirements, any of the above implementation manners may be combined with each other for the image to be encoded, so as to determine whether the image to be encoded meets the determination condition of the long-term reference image.

And S304, generating a reconstructed image corresponding to the image to be coded, and determining the reconstructed image as a long-term reference image.

If the execution result of S303 is yes, that is, if the image to be encoded meets the determination condition of the long-term reference image, a reconstructed image corresponding to the image to be encoded may be generated, and the reconstructed image may be determined as the long-term reference image. The specific marking method belongs to the prior art and is not described herein again.

In the long-term reference image selection method provided by the embodiment shown in fig. 3, a target attribute value of a predetermined image attribute of an image to be encoded is obtained; judging whether the target attribute value is a preset attribute value or not; under the condition that the target attribute value is a preset attribute value, judging whether the image to be coded meets the determination condition of a long-term reference image or not according to a first data relation; and if so, generating a reconstructed image corresponding to the image to be coded, and determining the reconstructed image as a long-term reference image. In the long-term reference image selection process, the relationship between the image to be coded as the image to be selected and the similar image thereof is considered, so that a proper image is flexibly selected as the long-term reference image instead of directly selecting the first frame image of the target video as the long-term reference image, the change of a video scene can be better adapted, and the video coding efficiency is improved.

Further, as shown in fig. 4, the method for selecting a long-term reference image according to an embodiment of the present invention may further include:

S305, moving the determined long-term reference picture into a long-term reference picture buffer.

The reconstructed image corresponding to the image to be coded which meets the determination condition of the long-term reference image, that is, the determined long-term reference image, can be shifted into the long-term reference image buffer for reference of the subsequent uncoded image.

It should be noted that, when the long-term reference image prediction technique is applied to inter prediction, an uncoded image may use the long-term reference image when it is not an I frame or when its coding level is smaller than an empirical value k. Images with a coding level smaller than k are more valuable, so, to reduce the complexity of the method, only images with a coding level smaller than k may be selected to perform inter prediction using the long-term reference image. If complexity is not a concern, the long-term reference image can be used for all uncoded images, which yields a good prediction effect. In practical application, the choice can be made according to specific requirements.

Furthermore, it will be understood by those skilled in the art that the closer two images are to each other, the more similar the objects they contain. Therefore, an uncoded image may use only one long-term reference image, namely the one closest to the uncoded image; however, when an object that is not present in the adjacent images appears in the uncoded image, the long-term reference image farthest from the uncoded image may be referred to.
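The choice between the closest and the farthest long-term reference image can be sketched as below. Positions are assumed to be frame indices in a common ordering, and the `prefer_farthest` flag stands in for whatever detection decides that the uncoded image contains an object absent from adjacent images; that detection is not specified here, and the names are hypothetical.

```python
def pick_long_term_reference(ref_positions, current_position, prefer_farthest=False):
    """Pick one long-term reference by distance to the uncoded image.

    ref_positions: frame indices of the long-term references in the buffer.
    Returns None when the buffer is empty.
    """
    if not ref_positions:
        return None

    def distance(position):
        return abs(current_position - position)

    if prefer_farthest:
        return max(ref_positions, key=distance)
    return min(ref_positions, key=distance)


print(pick_long_term_reference([0, 30, 58], current_position=60))
```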

S306, aiming at each long-term reference image in the long-term reference image buffer area, judging whether to continue using the long-term reference image according to the second data relation, and if not, marking the long-term reference image as a non-long-term reference image.

In order to ensure that the long-term reference pictures stored in the long-term reference picture buffer area belong to valid long-term reference pictures in the subsequent encoding process, whether the long-term reference pictures are continuously used or not can be judged according to the second data relation aiming at each long-term reference picture in the long-term reference picture buffer area, and if not, the long-term reference pictures are marked as non-long-term reference pictures.

Wherein the second data relationship may be: with respect to the data relationship between the first type of picture and the long-term reference picture or with respect to the data relationship between the first type of picture, and the first type of picture may include a pre-read picture and/or a statistically encoded picture.

It should be noted that, whether the long-term reference picture is to be used continuously may be determined according to the data dependency relationship between the long-term reference picture and the pre-read picture and/or the statistical coded picture, or the data dependency relationship between the pre-read picture and/or the statistical coded picture, and if not, the long-term reference picture is marked as a non-long-term reference picture for subsequent operations.

When the second data relationship is a data relationship between the first-class image and the long-term reference image, in an implementation, the determining whether to continue using the long-term reference image according to the second data relationship may include: and judging whether to continue using the long-term reference image according to the motion information between the data blocks between the first-class image and the long-term reference image.

Specifically, the determining whether to continue using the long-term reference picture according to the motion information between the data blocks between the first class picture and the long-term reference picture may include: determining a third proportion of a third type data block in the long-term reference image aiming at each image included in the first type of image, and judging whether the third proportion is smaller than a sixth preset threshold value, wherein the third type data block is a data block of which the motion vector modulus between the long-term reference image and the image is larger than a seventh preset threshold value; and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

When the second data relationship is a data relationship between the first-class image and the long-term reference image, in another implementation, the determining whether to continue using the long-term reference image according to the second data relationship may include: and judging whether to continue using the long-term reference image or not according to the sum of the interframe prediction distortion values of the first-class image and the long-term reference image.

Specifically, the determining whether to continue using the long-term reference picture according to the sum of the inter-frame prediction distortion values of the first class picture and the long-term reference picture may include: and if the sum of the interframe prediction distortion values of each image included in the first type of image and the long-term reference image is less than an eighth preset threshold value, judging to continue using the long-term reference image, and otherwise, judging not to continue using the long-term reference image.

It should be noted that, based on the continuity of the images and the similarity between adjacent images, for each long-term reference image in the long-term reference buffer, whether to continue using the long-term reference image may be determined according to the motion information of the data blocks between the long-term reference image and the first-class images, or according to the sum of the inter-frame prediction distortion values. It can be understood that, for both implementation manners, using encoded images as the first-class images has lower complexity, while using pre-read images gives higher accuracy; in practical application, the choice can be made according to specific requirements.
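Re-evaluating the buffer with the distortion-based rule can be sketched as follows. The `LongTermRef` record and the function name are hypothetical, and the distortion sums between each long-term reference image and each first-class image are assumed to be computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class LongTermRef:
    """Hypothetical entry of the long-term reference image buffer."""
    image_id: int
    is_long_term: bool = True


def mark_stale_references(buffer, distortion_sums, threshold):
    """Mark entries that should no longer be used as non-long-term.

    distortion_sums maps image_id to a list with one inter-frame prediction
    distortion sum per first-class image. An entry is kept only if every
    sum is below threshold (the eighth preset threshold). Entries with no
    measurements are left unchanged.
    Returns the entries that are still long-term references.
    """
    for ref in buffer:
        sums = distortion_sums.get(ref.image_id, [])
        if not all(s < threshold for s in sums):
            ref.is_long_term = False
    return [ref for ref in buffer if ref.is_long_term]


buffer = [LongTermRef(1), LongTermRef(9)]
kept = mark_stale_references(buffer, {1: [100, 900], 9: [50, 60]}, threshold=500)
print([ref.image_id for ref in kept])
```

Entries marked as non-long-term can then be moved out of the buffer by the caller.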

When the second data relationship is a data relationship between the first-class picture and the long-term reference picture and the first-class picture is a statistical coded picture, in an implementation, the determining whether to continue using the long-term reference picture according to the second data relationship may include: and judging whether to continue using the long-term reference image according to the sum of the areas of the long-term reference image used in the counted coded image.

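Specifically, the determining whether to continue using the long-term reference picture according to the sum of the areas of the long-term reference pictures used in the counted encoded pictures may include: if the sum of the areas in which the counted encoded pictures use the long-term reference picture is greater than a ninth preset threshold, judging to continue using the long-term reference picture;

or, alternatively, if the proportion of that sum of areas in the total area of the counted encoded pictures is greater than a tenth preset threshold, judging to continue using the long-term reference picture.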

It should be noted that, to improve the accuracy of the result, multiple frames of encoded images closest to the current time point may be counted. If the sum of the areas in which the counted encoded images use the long-term reference image is greater than the ninth preset threshold, or the proportion of that sum in the total area of the counted encoded images is greater than the tenth preset threshold, the current video scene is changing little. Because video images have a certain continuity, the long-term reference image then generally still has a high long-term reference value for the subsequent uncoded images, so it may be determined that the long-term reference image can continue to be used.

For example, the tenth preset threshold is set as an empirical value e%, the counted encoded pictures are the m most recently encoded frames (m is an integer), the total area of the counted encoded pictures is S, and the sum of the areas in which the counted encoded pictures use the long-term reference picture is s, so that the proportion is

r = s / S.

If r is less than e%, it can be determined that the long-term reference image is no longer used.
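The usage-ratio test can be sketched as follows. The sketch assumes that all counted frames have the same area, so the total area is the frame area multiplied by the number of counted frames, and that the area referencing the long-term image is known per frame. The names are hypothetical, and treating a ratio exactly equal to the threshold as "keep" is an assumption, since that case is not specified.

```python
def usage_ratio(used_areas, frame_area):
    """Proportion of the total area of the counted frames that used the
    long-term reference. used_areas holds one value per counted frame."""
    if not used_areas or frame_area <= 0:
        return 0.0
    return sum(used_areas) / (frame_area * len(used_areas))


def keep_by_usage(used_areas, frame_area, min_ratio):
    """Stop using the long-term reference when the ratio falls below
    min_ratio (the tenth preset threshold, as a fraction, not percent)."""
    return not usage_ratio(used_areas, frame_area) < min_ratio


# Three counted frames of area 1000 each; 200 units in total used the
# long-term reference (hypothetical numbers).
print(keep_by_usage([100, 50, 50], frame_area=1000, min_ratio=0.05))
```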

When the second data relationship is a data relationship between images of a first type, in an implementation, the determining whether to continue using the long-term reference image according to the second data relationship may include:

determining a fourth proportion of a fourth type data block in each image included in the first type of image, and judging whether the fourth proportion is smaller than an eleventh preset threshold, wherein the fourth type data block is a data block of which the motion vector modulus between the image and a previous frame image of the image is larger than the twelfth preset threshold; and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

It will be appreciated that if the motion of the data blocks between the pre-read pictures and/or the counted encoded pictures is severe, the scene of the target video is changing considerably in the current period, and the currently selected long-term reference picture is very likely to have little reference value for the subsequent uncoded pictures. It should be noted that using encoded images as the first-class images has lower complexity, while using pre-read images gives higher accuracy; in practical application, the choice can be made according to specific requirements.

In practical application, according to a long-term reference image to be judged, any of the above implementation manners can be combined according to specific situations and requirements to judge whether to continue using the long-term reference image.

The long-term reference image selection method provided by the embodiment shown in fig. 4 will be described below with reference to a specific application example.

Assume the predetermined image attribute is the frame type. First, the frame type of the image to be coded is obtained, and whether it is an I frame is judged. If so, for each obtained pre-read image, it is detected whether the proportion of data blocks with large motion amplitude between that pre-read image and its previous image, among the data blocks of the pre-read image, is smaller than a preset threshold t1; if all judgment results are yes, the image to be coded is judged to meet the determination condition of the long-term reference image. A reconstructed image corresponding to the image to be coded is then generated and determined as a long-term reference image, and the determined long-term reference image is moved into the long-term reference image buffer. A data block with large motion amplitude may be a data block whose motion vector modulus is greater than a preset threshold V.

In addition, for each long-term reference image in the long-term reference image buffer, it is judged whether the proportion of data blocks with large motion amplitude between each pre-read image and its previous image, among the data blocks of that pre-read image, is smaller than a preset threshold t2. If all judgment results are yes, or the reference area that uses the long-term reference image in the current coded image exceeds 128, it can be judged that the long-term reference image continues to be used; otherwise, it is judged that the long-term reference image is no longer used, and it can be marked as a non-long-term reference image and moved out of the buffer.
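The two decisions of this application example can be sketched together. The sketch assumes that, for each pre-read image, the proportion of blocks whose motion vector modulus against the previous image exceeds V has already been computed; the function names are hypothetical, and the unit of the value 128 is not stated in the text, so it is passed through unchanged as a plain number.

```python
def all_calm(severe_ratios, threshold):
    """True if every pre-read image has a proportion of severely moving
    blocks smaller than threshold."""
    return all(ratio < threshold for ratio in severe_ratios)


def should_select(frame_type, severe_ratios, t1):
    """Selection: the image must be an I frame and all pre-read images
    must be calm with respect to threshold t1."""
    return frame_type == "I" and all_calm(severe_ratios, t1)


def should_keep(severe_ratios, t2, reference_area, area_threshold=128):
    """Retention: keep the long-term reference if all pre-read images are
    calm with respect to t2, or if the area of the current coded image
    that references it exceeds area_threshold."""
    return all_calm(severe_ratios, t2) or reference_area > area_threshold


# Hypothetical proportions for two pre-read images.
print(should_select("I", [0.05, 0.10], t1=0.2))
print(should_keep([0.5], t2=0.2, reference_area=200))
```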

In the subsequent encoding process, for uncoded images, a proper long-term reference image can be selected for reference according to actual conditions.

Under the same quality evaluation, video of different types was encoded using the above specific example and compared with the prior art. The resulting improvements in coding efficiency are shown in Table 1 and Table 2, with average code rate savings of 0.65% and 0.51%, respectively:

TABLE 1 code rate savings at the same quality (1)

TABLE 2 code rate savings at the same quality (2)

Type of video	Y	U	V	YUV
movie	-0.26%	-0.28%	-0.17%	-0.27%
game	-0.63%	-1.39%	-1.08%	-0.75%
Average	-0.45%	-0.84%	-0.63%	-0.51%

YUV is a color encoding method in which Y represents luminance and U and V represent the chrominance components. The Y, U, V and YUV columns in the tables indicate the code rate savings at the same Y, U, V and combined YUV quality, respectively; negative values indicate code rate savings and positive values indicate code rate increases.

It should be noted that table 1 shows the code rate improvement results of video coding for different video types divided from the image size perspective; table 2 shows the code rate improvement result of video coding for different video types divided from the perspective of video picture content, where movie represents movie-like video and game represents game-like video, and it can be understood that scene changes of movie-like video are relatively rich.

Most of the 720p videos selected in table 1 are video conference scenes, and the scene change is relatively small, so the bitrate saving is large and is 1.37%. It can be seen that the technical scheme provided by the embodiment of the invention has better effect when being applied to a video coding process with less scene change.

Applying the embodiment shown in fig. 4, further, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture is determined according to the data dependency relationship between the long-term reference picture and the pre-read picture and/or the coded picture, and if not, the long-term reference picture is marked as a non-long-term reference picture for subsequent operations. In practical application, the marked non-long-term reference image can be moved out of the long-term reference image buffer area, so that the buffer area is prevented from overflowing, and a storage space is reserved for the subsequently determined long-term reference image, so that an uncoded image can access a new long-term reference image in time, and the video coding efficiency is improved.

It should be emphasized that, in the embodiments of the present invention, the "first" in "first threshold", the "second" in "second threshold", and the like serve only to distinguish different thresholds by name and carry no limiting meaning; the thresholds themselves may be set according to the actual situation and are not limited herein. Likewise, the "first type" in "first type data relationship" and the "second type" in "second type data relationship" merely distinguish different data relationships by name and carry no limiting meaning.

Corresponding to the above method embodiment, an embodiment of the present invention provides a long-term reference image selection apparatus, as shown in fig. 5, the apparatus may include:

a target attribute value obtaining module 501, configured to obtain a target attribute value of a predetermined image attribute of an image to be encoded;

a first judging module 502, configured to judge whether the target attribute value is a preset attribute value;

a second judging module 503, configured to, if the judgment result of the first judging module 502 is yes, judge whether the image to be encoded meets the determination condition of the long-term reference image according to the first data relationship;

a determining module 504, configured to generate a reconstructed image corresponding to the image to be encoded and determine the reconstructed image as a long-term reference image if the judgment result of the second judging module 503 is yes.

The image to be coded is an image in a target video. The first data relationship is a data relationship between the image to be coded and its similar images, or a data relationship among the similar images. The similar images are: in the image sequence of the target video obtained according to the preset ordering, a first preset number of frames preceding the image to be coded and/or a second preset number of frames following it.

Applying the embodiment shown in fig. 5, obtaining a target attribute value of a predetermined image attribute of an image to be encoded; judging whether the target attribute value is a preset attribute value or not; under the condition that the target attribute value is a preset attribute value, judging whether the image to be coded meets the determination condition of the long-term reference image or not according to the first data relation; and if so, generating a reconstructed image corresponding to the image to be coded, and determining the reconstructed image as a long-term reference image. In the long-term reference image selection process, the relationship between the image to be coded as the image to be selected and the similar image thereof is considered, so that a proper image is flexibly selected as the long-term reference image instead of directly selecting the first frame image of the target video as the long-term reference image, the change of a video scene can be better adapted, and the video coding efficiency is improved.
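As a minimal sketch of the flow just described, the following Python fragment follows the first implementation manner, in which the image attribute is the frame type and the preset value is the I type. The names `Frame`, `select_long_term_reference`, `meets_condition` and `reconstruct` are hypothetical and do not come from the embodiment; the judgment against the first data relationship and the reconstruction step are passed in as callables because the embodiment allows several variants of each.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Frame:
    """Hypothetical stand-in for an image to be coded."""
    index: int
    frame_type: str  # e.g. "I", "P" or "B"


def select_long_term_reference(
    frame: Frame,
    similar_frames: Sequence[Frame],
    meets_condition: Callable[[Frame, Sequence[Frame]], bool],
    reconstruct: Callable[[Frame], object],
    preset_type: str = "I",
) -> Optional[object]:
    """Return the reconstructed image chosen as long-term reference, or None."""
    # Modules 501/502: obtain the attribute value and compare it with the preset value.
    if frame.frame_type != preset_type:
        return None
    # Module 503: judge the determination condition from the first data relationship.
    if not meets_condition(frame, similar_frames):
        return None
    # Module 504: generate the reconstructed image and treat it as the long-term reference.
    return reconstruct(frame)
```

A caller would supply `meets_condition` as whichever test is in use (motion information or the sum of inter-frame prediction distortion values), so the control flow stays the same across variants.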

In a first implementation manner, the target attribute value obtaining module 501 is specifically configured to:

obtaining a target type value of a frame type of an image to be coded;

the first judging module 502 is specifically configured to:

and judging whether the target type value is of an I type.

In a second implementation manner, the target attribute value obtaining module 501 is specifically configured to:

the first judging module 502 is specifically configured to:

Specifically, the target attribute value obtaining module 501 may include:

In a third implementation manner, the target attribute value obtaining module 501 is specifically configured to:

the first judging module 502 is specifically configured to:

Specifically, the second judging module 503 may include:

or, a second judgment sub-module, configured to judge, when the first data relationship is a data relationship between the image to be encoded and the similar images, whether the image to be encoded meets the determination condition of a long-term reference image according to the sum of inter-frame prediction distortion values of the image to be encoded and the similar images.

Wherein the first judgment submodule is specifically configured to:

The second judgment sub-module is specifically configured to:

Specifically, the second judging module 503 may include:

The preset ordering may be the image coding/decoding order or the image display order; the similar images may be pre-read images.

Further, on the basis of the target attribute value obtaining module 501, the first judging module 502, the second judging module 503 and the determining module 504, as shown in fig. 6, the long-term reference image selecting apparatus according to the embodiment of the present invention may further include:

a long-term reference picture moving-in module 505 for moving the determined long-term reference picture into a long-term reference picture buffer for reference by the uncoded picture.

a third judging module 506, configured to judge, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture according to the second data relationship.

a marking module 507, configured to mark the long-term reference picture as a non-long-term reference picture if the judgment result of the third judging module 506 is no.

Wherein the second data relationship is: a data relationship between the first-type pictures and the long-term reference picture, or a data relationship among the first-type pictures, where the first-type pictures comprise pre-read pictures and/or counted coded pictures.

Applying the embodiment shown in fig. 6, further, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture is determined according to the data dependency relationship between the long-term reference picture and the pre-read picture and/or the coded picture, and if not, the long-term reference picture is marked as a non-long-term reference picture for subsequent operations. In practical application, the marked non-long-term reference image can be moved out of the long-term reference image buffer area, so that the buffer area is prevented from overflowing, and a storage space is reserved for the subsequently determined long-term reference image, so that an uncoded image can access a new long-term reference image in time, and the video coding efficiency is improved.
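The buffer maintenance described above can be sketched as follows, under the assumption that the buffer is a plain Python list and that the "continue using" judgment is supplied as a callable. The names `prune_long_term_buffer` and `keep_using` are hypothetical; the embodiment only states that pictures judged not worth keeping are marked as non-long-term and may then be moved out of the buffer.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def prune_long_term_buffer(buffer: List[T], keep_using: Callable[[T], bool]) -> List[T]:
    """Drop references judged not worth keeping; return the ones that were dropped."""
    kept: List[T] = []
    removed: List[T] = []
    for ref in buffer:
        # Third judging module: decide per long-term reference picture.
        if keep_using(ref):
            kept.append(ref)
        else:
            # Marking module: the picture becomes a non-long-term reference.
            removed.append(ref)
    # Moving marked pictures out frees space for newly determined references.
    buffer[:] = kept
    return removed
```

Updating the list in place means any encoder component holding the same buffer object sees the freed slots immediately.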

Specifically, the third judging module 506 may include:

or, a fifth determining sub-module, configured to determine, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture according to a sum of inter-frame prediction distortion values of the first class picture and the long-term reference picture when the second data relationship is a data relationship between the first class picture and the long-term reference picture.

Specifically, the third judging module 506 may include:

Wherein the fourth judgment submodule is specifically configured to:

The fifth judgment sub-module is specifically configured to:

if the sum of the inter-frame prediction distortion values of each picture included in the first-type pictures and the long-term reference picture is less than an eighth preset threshold, judging to continue using the long-term reference picture; otherwise, judging not to continue using the long-term reference picture.

The sixth judgment submodule is specifically configured to:

or, for each long-term reference picture in the long-term reference picture buffer area, if the ratio of the sum of the areas of the counted coded pictures using the long-term reference pictures to the total area of the coded pictures is greater than a tenth preset threshold, determining to continue using the long-term reference picture, otherwise, determining not to continue using the long-term reference picture.
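The two retention tests stated above can be illustrated with a short Python sketch. The function names and arguments are hypothetical, the distortion test is read here as a single sum over all first-type pictures, and returning "do not keep" when no coded area has been counted is an assumption of this sketch rather than something the embodiment specifies.

```python
from typing import Sequence


def keep_by_distortion(distortions: Sequence[float], eighth_threshold: float) -> bool:
    """Keep the reference if the summed inter-frame prediction distortion is small."""
    # Each entry: distortion of one first-type picture predicted from the reference.
    return sum(distortions) < eighth_threshold


def keep_by_area_ratio(area_using_ref: float, total_coded_area: float,
                       tenth_threshold: float) -> bool:
    """Keep the reference if enough of the counted coded area actually used it."""
    if total_coded_area <= 0:
        # Assumption: with nothing counted yet, do not report "keep".
        return False
    return area_using_ref / total_coded_area > tenth_threshold
```

For example, with a threshold of 0.25, a reference used by 300 of 1000 counted block units would be kept, while one used by 100 of 1000 would not.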

Specifically, the third judging module 506 may include:

Specifically, the long-term reference image selection apparatus provided in the embodiment of the present invention may further include:

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

Those skilled in the art will appreciate that all or part of the steps in the above method embodiments may be implemented by a program to instruct relevant hardware to perform the steps, and the program may be stored in a computer-readable storage medium, which is referred to herein as a storage medium, such as: ROM/RAM, magnetic disk, optical disk, etc.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for long-term reference picture selection, the method comprising:

judging whether the target attribute value is a preset attribute value or not; when the target attribute value of the predetermined image attribute of the image to be coded is a preset attribute value, the coding loss of the image to be coded is less than the coding loss of the image of which the target attribute value of the predetermined image attribute is other attribute values except the preset attribute value;

if so, generating a reconstructed image corresponding to the image to be coded, and determining the reconstructed image as a long-term reference image;

when the first data relationship is a data relationship between the image to be encoded and a similar image, the determining whether the image to be encoded meets a determination condition of a long-term reference image according to the first data relationship includes: judging whether the image to be coded meets the determination condition of a long-term reference image or not according to the motion information between the image to be coded and the data block of the similar image; or, judging whether the image to be coded meets the determination condition of the long-term reference image according to the sum of the interframe prediction distortion values of the image to be coded and the similar image;

and/or,

when the first data relationship is a data relationship among the similar images, the determining whether the image to be encoded meets a determination condition of a long-term reference image according to the first data relationship includes: for each image included in the similar images, determining a second proportion of second type data blocks in the image, and judging whether the second proportion is smaller than a fourth preset threshold, wherein a second type data block is a data block whose motion vector magnitude between the image and the previous frame of the image is larger than a fifth preset threshold; and if all judgment results are yes, judging that the image to be coded meets the determination condition of the long-term reference image.

2. The method according to claim 1, wherein the obtaining a target attribute value of a predetermined image attribute of the image to be encoded comprises:

obtaining a target type value of a frame type of an image to be coded;

and judging whether the target type value is of an I type.

3. The method according to claim 1, wherein the obtaining a target attribute value of a predetermined image attribute of the image to be encoded comprises:

4. The method according to claim 1, wherein the obtaining a target attribute value of a predetermined image attribute of the image to be encoded comprises:

5. The method according to claim 3, wherein the obtaining a target level value of an encoding level to which the image to be encoded belongs comprises:

6. The method according to claim 3, wherein the obtaining a target level value of an encoding level to which the image to be encoded belongs comprises:

7. The method according to claim 1, wherein the determining whether the image to be encoded meets the determination condition of the long-term reference image according to the motion information between the image to be encoded and the data blocks of the similar images comprises:

determining a first proportion of a first class data block in the image to be coded aiming at each image contained in a similar image, and judging whether the first proportion is smaller than a first preset threshold value or not, wherein the first class data block is a data block of which a motion vector module value between the image to be coded and each image contained in the similar image is larger than a second preset threshold value;

8. The method according to claim 1, wherein the determining whether the image to be encoded meets the determination condition of the long-term reference image according to the sum of the inter-prediction distortion values of the image to be encoded and the similar images comprises:

9. The method according to any one of claims 1 to 8, wherein the predetermined ordering is: and sorting according to the image coding and decoding sequence or sorting according to the image display sequence.

10. The method according to any one of claims 1-8, further comprising:

11. The method of claim 10, further comprising:

judging whether to continue using the long-term reference image or not according to a second data relation aiming at each long-term reference image in the long-term reference image buffer area, and if not, marking the long-term reference image as a non-long-term reference image; wherein the second data relationship is: regarding the data relation between the first kind of image and the long-term reference image or the data relation between the first kind of image, wherein the first kind of image comprises a pre-reading image and/or a statistical coded image; the pre-read image is a pre-read current uncoded image;

wherein, when the second data relationship is the data relationship between the first-class image and the long-term reference image, the determining whether to continue using the long-term reference image according to the second data relationship includes: judging whether to continue using the long-term reference image according to the motion information between the data blocks between the first class image and the long-term reference image; or, according to the sum of the interframe prediction distortion values of the first class image and the long-term reference image, judging whether to continue using the long-term reference image;

and/or,

when the second data relationship is a data relationship between the first-class picture and the long-term reference picture and the first-class picture is a statistical coded picture, the determining whether to continue using the long-term reference picture according to the second data relationship includes: judging whether to continue using the long-term reference image according to the sum of the areas of the long-term reference image used in the counted coded image;

when the second data relationship is a data relationship related to the first-class images, the determining whether to continue using the long-term reference image according to the second data relationship includes: determining a fourth proportion of a fourth type data block in each image included in the first type of image, and judging whether the fourth proportion is smaller than an eleventh preset threshold, wherein the fourth type data block is a data block of which a motion vector module value between the image and a previous frame of image of the image is larger than a twelfth preset threshold; and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

12. The method of claim 11, wherein determining whether to continue using the long-term reference picture according to motion information between data blocks between the first class of pictures and the long-term reference picture comprises:

determining a third proportion of a third class data block in the long-term reference image aiming at each image included in the first class of images, and judging whether the third proportion is smaller than a sixth preset threshold value or not, wherein the third class data block is a data block of which the motion vector modulus between the long-term reference image and each image included in the first class of images is larger than a seventh preset threshold value;

13. The method of claim 11, wherein the determining whether to continue using the long-term reference picture according to the sum of the inter-prediction distortion values of the first class picture and the long-term reference picture comprises:

14. The method of claim 11, wherein said determining whether to continue using the long-term reference picture according to the sum of the areas of the statistical coded pictures in which the long-term reference picture is used comprises:

or,

15. The method of claim 11, further comprising:

16. A long-term reference picture selection apparatus, the apparatus comprising:

the first judgment module is used for judging whether the target attribute value is a preset attribute value or not; when the target attribute value of the preset image attribute of the image to be coded is a preset attribute value, the coding loss of the image to be coded is less than the coding loss of the image of which the target attribute value of the preset image attribute is other attribute values except the preset attribute value;

the determining module is used for generating a reconstructed image corresponding to the image to be coded and determining the reconstructed image as a long-term reference image under the condition that the judgment result of the second judging module is yes;

the second judging module includes: the first judgment submodule is used for judging whether the image to be coded meets the determination condition of the long-term reference image or not according to the motion information between the image to be coded and the data blocks of the similar images when the first data relation is about the data relation between the image to be coded and the similar images;

or,

the second judging module includes: a second judgment submodule, configured to judge, when the first data relationship is a data relationship between the image to be encoded and the similar images, whether the image to be encoded meets a determination condition of a long-term reference image according to a sum of inter-frame prediction distortion values of the image to be encoded and the similar images;

or,

the second judging module includes: a third judgment submodule, configured to, when the first data relationship is a data relationship among the similar images, determine, for each image included in the similar images, a second proportion of second type data blocks in the image, and judge whether the second proportion is smaller than a fourth preset threshold, where a second type data block is a data block whose motion vector magnitude between the image and the previous frame of the image is larger than a fifth preset threshold; and if all judgment results are yes, judge that the image to be coded meets the determination condition of the long-term reference image.

17. The apparatus according to claim 16, wherein the target attribute value obtaining module is specifically configured to:

obtaining a target type value of a frame type of an image to be coded;

the first judging module is specifically configured to:

and judging whether the target type value is of an I type.

18. The apparatus according to claim 16, wherein the target attribute value obtaining module is specifically configured to:

the first judging module is specifically configured to:

19. The apparatus according to claim 16, wherein the target attribute value obtaining module is specifically configured to:

the first judging module is specifically configured to:

20. The apparatus of claim 18, wherein the target property value obtaining module comprises:

21. The apparatus of claim 18, wherein the target property value obtaining module comprises:

22. The apparatus according to claim 16, wherein the first determining submodule is specifically configured to: determining a first proportion of a first class data block in the image to be coded aiming at each image contained in a similar image, and judging whether the first proportion is smaller than a first preset threshold value or not, wherein the first class data block is a data block of which a motion vector module value between the image to be coded and each image contained in the similar image is larger than a second preset threshold value;

23. The apparatus according to claim 16, wherein the second determination submodule is specifically configured to:

24. The apparatus according to any one of claims 16-23, wherein the predetermined ordering is: and sorting according to the image coding and decoding sequence or sorting according to the image display sequence.

25. The apparatus of any one of claims 16-23, further comprising:

26. The apparatus of claim 25, further comprising:

a third judging module, configured to judge, according to a second data relationship, whether to continue to use the long-term reference image for each long-term reference image in the long-term reference image buffer; wherein the second data relationship is: regarding the data relation between the first kind of image and the long-term reference image or the data relation between the first kind of image, wherein the first kind of image comprises a pre-reading image and/or a statistical coded image; the pre-read image is a pre-read current uncoded image;

a marking module, configured to mark the long-term reference picture as a non-long-term reference picture if the result of the third determining module is negative;

the third determining module includes: a fourth determining sub-module, configured to determine, for each long-term reference image in the long-term reference image buffer, whether to continue using the long-term reference image according to motion information between data blocks between the first class image and the long-term reference image when the second data relationship is a data relationship between the first class image and the long-term reference image;

or,

the third determining module includes: a fifth determining sub-module, configured to determine, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture according to a sum of inter-frame prediction distortion values of the first class picture and the long-term reference picture when the second data relationship is a data relationship between the first class picture and the long-term reference picture;

or,

the third determining module includes: a sixth determining sub-module, configured to determine, for each long-term reference picture in the long-term reference picture buffer, whether to continue using the long-term reference picture according to a sum of areas of the long-term reference pictures used in the counted encoded pictures, when the second data relationship is a data relationship between the first-class picture and the long-term reference picture and the first-class picture is a counted encoded picture;

or,

the third determining module includes: a seventh determining sub-module, configured to, when the second data relationship is a data relationship related to a first type of image, determine, for each image included in the first type of image, a fourth proportion of a fourth type of data block in the image, and determine whether the fourth proportion is smaller than an eleventh preset threshold, where the fourth type of data block is a data block whose motion vector modulus between the image and a previous frame of image of the image is larger than a twelfth preset threshold; and if all judgment results are yes, judging to continue using the long-term reference image, otherwise, judging not to continue using the long-term reference image.

27. The apparatus according to claim 26, wherein the fourth determining submodule is specifically configured to: determining a third proportion of a third class data block in the long-term reference image aiming at each image included in the first class of images, and judging whether the third proportion is smaller than a sixth preset threshold value or not, wherein the third class data block is a data block of which the motion vector modulus between the long-term reference image and each image included in the first class of images is larger than a seventh preset threshold value;

28. The apparatus according to claim 26, wherein the fifth determining sub-module is specifically configured to:

29. The apparatus according to claim 26, wherein the sixth determining submodule is specifically configured to: for each long-term reference image in the long-term reference image buffer area, if the sum of the areas of the long-term reference images used in the counted coded images is larger than a ninth preset threshold value, judging that the long-term reference image is continuously used, and otherwise, judging that the long-term reference image is not continuously used;

or,

30. The apparatus of claim 26, further comprising: