CN117354526A - Image coding method, device and medium - Google Patents


Info

Publication number
CN117354526A
Authority
CN
China
Prior art keywords
frame
image
scene
motion vector
original image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210753473.3A
Other languages
Chinese (zh)
Inventor
温武桢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Maile Information Technology Co ltd
Original Assignee
Guangzhou Maile Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Maile Information Technology Co ltd filed Critical Guangzhou Maile Information Technology Co ltd
Priority to CN202210753473.3A priority Critical patent/CN117354526A/en
Publication of CN117354526A publication Critical patent/CN117354526A/en
Pending legal-status Critical Current

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/176 — Adaptive coding characterised by the coding unit, i.e. the structural or semantic portion of the video signal being the subject of the adaptive coding, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/124 — Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding: quantisation
    • H04N19/136 — Adaptive coding controlled by incoming video signal characteristics or properties
    • H04N19/146 — Adaptive coding controlled by data rate or code amount at the encoder output
    • H04N19/172 — Adaptive coding characterised by the coding unit, the unit being a picture, frame or field
    • H04N19/513 — Predictive coding involving temporal prediction: motion estimation or motion compensation, processing of motion vectors

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an image coding method, device and medium. The method comprises the following steps: obtaining macroblock attribute information corresponding to at least one frame of original image; determining the scene type of the corresponding encoded frame according to the macroblock attribute information; and determining the coding strategy for the next original frame according to the scene type of the at least one encoded frame, so that the next original frame is encoded according to that strategy. The embodiment of the invention addresses the problems of the prior art: high computational overhead, a set of recognizable scene types limited by constraints on computational complexity, and the inability to account for differences in coding-strategy emphasis caused by differences in target bit rate and the like. More scene types can be recognized, and recognition reflects actual encoding conditions, while essentially no additional computational complexity is introduced.

Description

Image coding method, device and medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image encoding method, apparatus, and medium.
Background
With the popularization of video conferencing, distance education and similar applications, demand for screen sharing keeps growing. Unlike scenes captured by a camera, screen-sharing content is highly varied: the user may be presenting slides, playing a video, or scrolling the screen up and down. Dynamically adjusting the coding strategy for each scene can yield substantial coding gains, but doing so first requires identifying the scene type of the current screen share.
Many existing schemes compare the pixel-value differences between the original YUV (or RGB) images of two (or more) consecutive frames before encoding, and classify the images into scene types based on the magnitude of those differences. However, this approach introduces extra computation time, and the larger the image resolution, the greater the overhead. Moreover, the set of recognizable scene types is very limited, and because the computation is decoupled from the encoder's own coding process, it cannot account for differences in coding-strategy emphasis caused by differences in target bit rate and the like.
Disclosure of Invention
The invention provides an image coding method, device and medium to address the problems of the prior art: high computational overhead, a limited set of recognizable scene types, and the inability to account for differences in coding-strategy emphasis caused by differences in target bit rate and the like.
According to an aspect of the present invention, there is provided an image encoding method including:
obtaining macro block attribute information corresponding to at least one frame of original image;
determining the scene type of the corresponding frame coding image according to the macro block attribute information;
and determining the coding strategy of the original image of the next frame according to the scene type of the coded image of at least one frame so as to code the original image of the next frame according to the coding strategy.
According to another aspect of the present invention, there is provided an image encoding apparatus including:
the acquisition module is used for acquiring macro block attribute information corresponding to at least one frame of original image;
the first determining module is used for determining the scene type of the corresponding frame coding image according to the macro block attribute information;
and the self-adaptive coding module is used for determining the coding strategy of the original image of the next frame according to the scene type of the coded image of at least one frame so as to code the original image of the next frame according to the coding strategy.
According to another aspect of the present invention, there is provided an image encoding apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image encoding method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute the image encoding method according to any one of the embodiments of the present invention.
According to the technical scheme provided by the embodiment of the invention, the scene type of an encoded image is determined by jointly computing over the macroblock attribute information that the encoder already produces for at least one original frame, so no extra computational complexity needs to be introduced. This addresses the prior-art problems of high computational cost and a scene-type set limited by complexity constraints. Because scene-type determination is performed inside the encoder, it can also account for differences in coding-strategy emphasis caused by differences in target bit rate and the like. On this basis, more scene types can be recognized, and recognition reflects actual encoding conditions, while essentially no additional computational complexity is introduced (that is, the computation required for scene recognition is negligible relative to the overall encoding workload).
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from these drawings without inventive effort.
FIG. 1 is a flowchart of an image encoding method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another image encoding method according to an embodiment of the present invention;
FIG. 3 is a flowchart of another image encoding method according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an image encoding device according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an image encoding apparatus according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In an embodiment, fig. 1 is a flowchart of an image encoding method according to an embodiment of the present invention, where the method may be implemented by an image encoding device, and the image encoding device may be implemented in hardware and/or software, and the image encoding device may be configured in an image encoding apparatus. The image encoding device may be an encoder or other processing unit (e.g., a computer or other device having data processing functions), for example.
As shown in fig. 1, the method includes: S110-S130.
S110, macro block attribute information corresponding to at least one frame of original image is obtained.
Here, the original image refers to a frame captured automatically during screen sharing. The original image may be, for example, a YUV image or an RGB image. In an embodiment, "at least one frame of original image" means the current original frame, or the current original frame together with the N historical original frames preceding it.
In an embodiment, the frame type of each frame of the original image may be different. Wherein the frame type refers to a type of encoding an original image. Illustratively, the frame types include: intra-coded frames and inter-coded frames. Wherein, the intra-frame coding frame does not depend on other frames and can be independently coded; inter-coded frames need to be encoded in dependence on other encoded frames. Illustratively, an IDR frame may be used to represent an intra-coded frame and a P frame may be used to represent an inter-coded frame.
Before an original frame is encoded, it is divided into a number of macroblocks. Different coding standards specify different coded macroblock dimensions; for example, under the H.264 standard the coded macroblock size is 16×16. Each macroblock is then encoded to produce the corresponding encoded image. For example, for a 1280×720 original frame, the image is divided into 3600 macroblocks of 16×16 pixels before encoding, and the 3600 macroblocks are encoded in sequence.
Macroblock attribute information refers to the attribute information of each macroblock after the corresponding original frame has been encoded; that is, it characterizes each macroblock as encoded. In one embodiment, the macroblock attribute information includes at least one of: the macroblock motion vector; the macroblock type; and the macroblock size. The macroblock motion vector characterizes the positional offset between the current macroblock and its matching block in the reference frame. The macroblock types include intra macroblocks and inter macroblocks. The macroblock size characterizes the number of bits the current macroblock occupies after encoding. Note that when the original frame is an intra-coded frame, all of its macroblocks are intra macroblocks; when the original frame is an inter-coded frame, its macroblocks may be either intra or inter macroblocks. An intra macroblock is predicted from the pixels of surrounding already-encoded macroblocks; the pixel residual between the current macroblock and the predicted macroblock is computed, and the residual is quantized and encoded. An inter macroblock performs a motion search in the reference frame to find a suitable matching block, computes the pixel residual between the current macroblock and the matching block, quantizes and encodes the residual, and records the positional offset (x, y) between the current macroblock and the matching block; this offset is the macroblock motion vector. For example, the motion vector of one macroblock in an original frame might be (0, 100).
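As a concrete illustration of the division and attributes described above, the sketch below computes the macroblock grid of a frame and models the per-macroblock attributes the encoder reports. The record fields and names are assumptions chosen for illustration, not the encoder's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class MacroblockInfo:
    mb_type: str          # "intra" or "inter"
    motion_vector: tuple  # (x, y) offset to the matching block; (0, 0) for intra
    size_bits: int        # number of bits the encoded macroblock occupies

def macroblock_count(width, height, mb=16):
    """Number of mb x mb macroblocks an original frame is divided into
    (frame dimensions are assumed to be multiples of the macroblock size)."""
    return (width // mb) * (height // mb)

# The description's example: a 1280x720 frame yields 3600 16x16 macroblocks.
assert macroblock_count(1280, 720) == 3600
```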
Of course, the macroblock attribute information may further include information such as the macroblock QP and the macroblock residual; this is not limited here.
S120, determining the scene type of the corresponding frame coding image according to the macro block attribute information.
The scene type characterizes the screen-sharing scene to which each original frame corresponds. In an embodiment, the scene type may include, but is not limited to, one of: a static scene; a dynamic scene; a screen-scrolling scene; a screen-switching scene; a fade-in/fade-out scene.
In an embodiment, the scene type of the corresponding encoded frame may be determined directly from the frame type, or from the frame type together with the macroblock attribute information. Where the macroblock attribute information includes the macroblock motion vector, macroblock type, and macroblock size, the scene type of the corresponding encoded frame may accordingly be determined from the frame type together with one or more of the macroblock motion vector, the macroblock type, and the macroblock size.
S130, determining the coding strategy of the original image of the next frame according to the scene type of the at least one frame of coded image so as to code the original image of the next frame according to the coding strategy.
In an embodiment, after the scene type of the current encoded frame is determined, it is stored. The encoding strategy for the next original frame is then determined from the scene type of the current encoded frame, and the next original frame is encoded according to that strategy.
Alternatively, the encoding strategy for the next original frame may be determined from the scene types of the N historical encoded frames up to and including the current frame, and the next original frame encoded according to that strategy.
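The mapping from scene type to encoding strategy in S130 can be sketched as a dispatch table. The parameters and values below are hypothetical illustrations: the patent states only that the strategy for the next frame is chosen from the recognized scene type, not which encoder settings each strategy adjusts (the wider motion-search range for scrolling follows the rationale given later in the description).

```python
def choose_strategy(scene_type):
    """Map a recognized scene type to illustrative encoder settings
    (search_range in pixels, qp_offset). All values are assumptions."""
    strategies = {
        "static":        {"search_range": 8,  "qp_offset": +2},  # spend few bits
        "dynamic":       {"search_range": 24, "qp_offset": 0},
        "screen-scroll": {"search_range": 64, "qp_offset": 0},   # wide motion search
        "screen-switch": {"search_range": 16, "qp_offset": -2},  # refresh cleanly
        "fade":          {"search_range": 16, "qp_offset": +1},
    }
    return strategies.get(scene_type, strategies["dynamic"])
```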
According to the above technical scheme, the scene type of an encoded image is determined by jointly computing over the macroblock attribute information that the encoder already produces for at least one original frame, so no extra computational complexity needs to be introduced. This addresses the prior-art problems of high computational cost and a scene-type set limited by complexity constraints. Because scene-type determination is performed inside the encoder, it can also account for differences in coding-strategy emphasis caused by differences in target bit rate and the like; more scene types can be recognized, and recognition reflects actual encoding conditions, without additional computational complexity.
In an embodiment, fig. 2 is a flowchart of another image encoding method according to an embodiment of the present invention, where the process of obtaining macroblock attribute information and the process of determining a scene type are further refined based on the above embodiment. As shown in fig. 2, the method includes: S210-S290.
S210, dividing the obtained at least one frame of original image according to a preset macro block division strategy to obtain a plurality of corresponding macro blocks.
The preset macroblock-division strategy means that different division strategies are used for different coding standards. For example, under the H.264 standard, the current original frame may be divided into M macroblocks of 16×16 pixels before encoding, and each macroblock is then encoded in turn.
S220, coding all macro blocks corresponding to each frame of original image in sequence to obtain macro block attribute information of the corresponding frame.
In an embodiment, all macroblocks in each original frame are encoded in turn, and the macroblock attribute information of each macroblock is obtained. Note that the attribute information of a macroblock becomes available as soon as that macroblock has been encoded; proceeding macroblock by macroblock in this way yields the attribute information of all macroblocks of each original frame.
S230, determining an abnormal motion vector in the motion vector of the macro block corresponding to the original image.
In an embodiment, when the frame type of an original frame is an inter-coded frame, the macroblock attribute information of its macroblocks is analyzed in order to determine the scene type from that information.
An abnormal motion vector is a macroblock motion vector whose magnitude in the vertical direction deviates greatly from its magnitude in the horizontal direction.
In one embodiment, determining the abnormal motion vectors among the macroblock motion vectors of an original frame includes: determining the absolute values of each macroblock motion vector's horizontal and vertical components, and identifying abnormal motion vectors based on those absolute values. Each macroblock motion vector has a horizontal component and a vertical component. In an embodiment, after the original frame is divided into macroblocks, each macroblock is encoded to obtain its motion vector. The absolute values of the horizontal and vertical components are then compared: if they differ greatly, the macroblock's motion vector is classified as abnormal; if they differ little, it is classified as non-abnormal. For example, if a macroblock's motion vector is (1, 122), the horizontal and vertical absolute values are 1 and 122; since they differ greatly, the vector can be classified as abnormal.
If a macroblock's motion vector is (53, 64), the horizontal and vertical absolute values are 53 and 64; since they differ little, the vector can be classified as non-abnormal.
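The comparison above can be sketched as a simple ratio test. The patent does not specify how large the difference must be, so the `ratio` threshold below is an assumption for illustration.

```python
def is_abnormal(mv, ratio=4.0):
    """Classify a macroblock motion vector (mv_x, mv_y) as abnormal when its
    horizontal and vertical magnitudes differ by more than a factor of `ratio`.
    The ratio value is an assumed threshold, not given in the patent."""
    ax, ay = abs(mv[0]), abs(mv[1])
    small, large = min(ax, ay), max(ax, ay)
    return large > ratio * max(small, 1)  # max(..., 1) guards against zero components

# The description's worked examples:
assert is_abnormal((1, 122))       # 122 >> 1: abnormal
assert not is_abnormal((53, 64))   # 53 and 64 are close: non-abnormal
```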
S240, performing nonlinear quantization followed by inverse quantization on the abnormal and non-abnormal motion vectors of each original frame, respectively, to obtain the corresponding dequantized abnormal motion vectors and dequantized non-abnormal motion vectors.
In this embodiment, the motion vector depends on factors such as the complexity of the current original image, the encoder's target bit rate, and the encoder's motion-search algorithm. Even in a screen-scrolling scene, where two adjacent macroblocks subjectively move by the same amount, their motion vectors are not guaranteed to be identical; a certain deviation exists. For example, if one macroblock's motion vector is (1, 121) and an adjacent macroblock's is (2, 123), the two should be grouped into the same abnormal-motion-vector class even though their values differ.
During nonlinear quantization of the abnormal or non-abnormal motion vectors, small motion vectors are quantized with a small quantization level and large motion vectors with a large quantization level, which keeps the relative deviation of the quantized vectors bounded and facilitates their classification. A smaller degree of quantization corresponds to a smaller quantization level; a larger degree, to a larger level. For example, for a motion vector of (1, 12), the quantization level may be a small value such as 2 or 3; for a motion vector of (1, 156), the level may be a larger value such as 5 or 10.
S250, classifying the abnormal motion vector according to the abnormal motion vector inverse quantization result.
For example, suppose two abnormal motion vectors are (1, 121) and (2, 123) and the quantization level is 5. Quantizing (with rounding down) gives (1/5, 121/5) = (0, 24) and (2/5, 123/5) = (0, 24). Dequantizing (multiplying the quantized result by the quantization level) gives (0×5, 24×5) = (0, 120) in both cases, so the two different motion vectors are classified into the same abnormal-motion-vector class (0, 120). The dequantized result, rather than the quantized one, is used for classification so that the class values remain on roughly the same numeric scale as the original abnormal motion vectors; if the scale were shrunk too far, magnitude-based classification of the abnormal motion vectors would lose accuracy.
And S260, classifying the non-abnormal motion vector according to the non-abnormal motion vector dequantization result.
For example, suppose two non-abnormal motion vectors are (52, 62) and (51, 61) and the quantization level is 5. Quantizing (with rounding down) gives (52/5, 62/5) = (10, 12) and (51/5, 61/5) = (10, 12). Dequantizing (multiplying the quantized result by the quantization level) gives (10×5, 12×5) = (50, 60) in both cases, so the two different non-abnormal motion vectors are classified into the same class (50, 60). As with the abnormal vectors, the dequantized result is used for classification so that the class values remain on roughly the same numeric scale as the original non-abnormal motion vectors and magnitude-based classification stays accurate.
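The quantize-then-dequantize classification of S240–S260 can be sketched as follows. The function reproduces the worked examples above; the level-selection rule is an assumption, since the patent says only that small vectors get a small level and large vectors a large one, without exact bounds.

```python
def quantize_mv(mv, level):
    """Quantize each component of a motion vector by the quantization level
    (rounding down), then dequantize by multiplying back, so that nearby
    vectors collapse onto one representative class value."""
    return tuple((c // level) * level for c in mv)

def pick_level(mv, small=2, large=5, bound=64):
    # Hypothetical nonlinear rule: small vectors -> small level, large -> large.
    return small if max(abs(c) for c in mv) < bound else large

# Worked examples from the description (quantization level 5):
assert quantize_mv((1, 121), 5) == quantize_mv((2, 123), 5) == (0, 120)
assert quantize_mv((52, 62), 5) == quantize_mv((51, 61), 5) == (50, 60)
```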
S270, determining the scene type of the corresponding frame coding image according to the macro block attribute information and/or the abnormal motion vector.
In one embodiment, S270 includes: determining, from the classified abnormal motion vectors, whether the scene type of the corresponding encoded frame is a screen-scrolling scene. In an embodiment, after the abnormal motion vectors are classified, the absolute values of the vertical and horizontal components of each class are examined. If the vertical absolute value is far greater than the horizontal one, the shared picture is moving up or down; that is, the macroblock motion vectors are predominantly vertical. If the horizontal absolute value is far greater than the vertical one, the shared picture is moving left or right; that is, the macroblock motion vectors are predominantly horizontal. It will be appreciated that the vertically or horizontally dominated abnormal motion vectors can thus be used directly to detect whether the current original frame is in a screen-scrolling scene.
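A detection sketch along these lines is shown below. The `dominance`, `min_fraction`, and `total` parameters are assumptions for illustration; the patent does not give concrete thresholds for "far greater" or for how many macroblocks must agree.

```python
def detect_scroll(abnormal_mvs, dominance=8.0, min_fraction=0.5, total=3600):
    """Report a screen-scrolling scene if enough macroblocks' abnormal motion
    vectors move almost purely vertically (or horizontally)."""
    vertical = sum(1 for x, y in abnormal_mvs if abs(y) > dominance * max(abs(x), 1))
    horizontal = sum(1 for x, y in abnormal_mvs if abs(x) > dominance * max(abs(y), 1))
    if vertical >= min_fraction * total:
        return "vertical-scroll"    # picture moving up/down
    if horizontal >= min_fraction * total:
        return "horizontal-scroll"  # picture moving left/right
    return None                     # not a scrolling scene
```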
It should be noted that the screen scrolling scene is a special case of the dynamic scene, but the two call for different coding emphases: because the picture in a screen scrolling scene moves regularly up and down or left and right, and the movement range is relatively large, a wider motion search range is required, whereas an ordinary dynamic scene is far less demanding in this respect. It is therefore necessary to separate the "screen scrolling scene" from the "dynamic scene" and adapt a more appropriate coding strategy for it alone.
In an embodiment, in the case that the scene type of the encoded image is not a screen scrolling scene, S270 includes: determining whether the scene type of the corresponding frame encoded image is a fade-in/fade-out scene according to the macroblock type and macroblock size in the macroblock attribute information. A shared-screen picture that changes from bright to dark until it is completely hidden marks the end of a shot and is called a fade-out; a picture that changes from dark to bright until it is completely clear marks the beginning of a shot and is called a fade-in. In an embodiment, in the case that the frame type of the original image is an inter-frame encoded frame and the scene type is not a screen scrolling scene, whether the scene type of the corresponding frame encoded image is a fade-in/fade-out scene may be determined directly according to the macroblock type and macroblock size in the macroblock attribute information.
In an embodiment, determining whether a scene type of a corresponding frame encoded image is a fade-in fade-out scene according to a macroblock type and a macroblock size in macroblock attribute information includes: determining the number of macro blocks with the macro block type being an intra macro block in an original image; when the ratio between the number of macro blocks of the intra macro block and the total number of macro blocks in the current frame image reaches a preset ratio, obtaining the macro block size of each macro block in the original image of the corresponding frame; when the size of the macro block is smaller than a preset threshold value, determining that the scene type of the corresponding frame coding image is a fade-in fade-out scene; and when the size of the macro block is larger than or equal to a preset threshold value, determining the scene type of the corresponding frame coding image as a screen switching scene.
In the case that the frame type of the original image is an inter-frame encoded frame, each macroblock of the original image may be either an intra macroblock or an inter macroblock. First, the number of macroblocks in the frame original image whose type is intra macroblock is determined; then the proportion of intra macroblocks to the total number of macroblocks in the frame original image is determined. If the proportion reaches the preset ratio, the macroblock size of each macroblock in the frame original image is obtained. If the macroblock size of the intra macroblocks is smaller than the preset threshold, the scene type of the frame encoded image is determined to be a fade-in/fade-out scene; if the macroblock size of the intra macroblocks is greater than or equal to the preset threshold, the scene type of the frame encoded image is determined to be a screen switching scene. For example, assume that the preset ratio is 80%, the preset threshold is 3 bytes, and the total number of macroblocks in the original image is 8100; then the number of intra macroblocks in the original image must reach at least 6480 before the macroblock size of each macroblock in the frame original image is obtained. If the macroblock size of the intra macroblocks is smaller than 3 bytes, the scene type of the frame encoded image is determined to be a fade-in/fade-out scene; if it is greater than or equal to 3 bytes, the scene type of the frame encoded image is determined to be a screen switching scene.
The total number of macroblocks in the original image is related to its resolution. For example, at a resolution of 1920×1080 with 16×16 macroblocks (256 pixels each), the total number of macroblocks in the original image is 1920×1080/256 = 8100.
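Combining the steps above, the intra-ratio and size checks can be sketched as follows. The function and type labels are hypothetical; the 80% ratio and 3-byte threshold follow the worked example, and applying the size test to all intra macroblocks is one reading of the text, which does not specify how per-macroblock sizes are aggregated.

```python
def classify_intra_heavy_frame(macroblocks, total, ratio=0.8, size_threshold=3):
    """Distinguish fade-in/fade-out from screen switching for one inter-coded frame.

    macroblocks: list of (mb_type, mb_size_bytes) pairs.
    Returns "fade", "screen_switch", or None when the intra proportion
    is below the preset ratio and no decision is made.
    """
    intra_sizes = [size for mb_type, size in macroblocks if mb_type == "intra"]
    if len(intra_sizes) / total < ratio:
        return None  # intra proportion too low; neither a fade nor a switch
    if all(size < size_threshold for size in intra_sizes):
        return "fade"          # small intra macroblocks: fade-in/fade-out scene
    return "screen_switch"     # large intra macroblocks: screen switching scene
```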
In an embodiment, in case that the scene type of the encoded image is not a screen scroll scene and a fade scene, S270 includes: determining the number of macro block motion vectors in at least two preset motion vector ranges; determining the motion amplitude of the original image of the corresponding frame according to the number of macro block motion vectors in each preset motion vector range; and determining whether the scene type of the corresponding frame coding image is a dynamic scene or not according to the motion amplitude.
The preset motion vector ranges are preconfigured value ranges of counting buckets, where each counting bucket stores the number of mv values falling within its range, and mv = max(|x|, |y|) is the larger of the absolute values of a macroblock's motion vector in the horizontal and vertical directions. Illustratively, assume that three counting buckets are created in advance with value ranges [0, 50), [50, 100) and [100, 200), respectively. If mv = 30 for the current macroblock, the count of the first bucket is incremented by one; if mv = 60, the count of the second bucket is incremented by one, and so on. After all macroblocks in the frame original image have been classified by macroblock motion vector, the number of macroblock motion vectors in each counting bucket (i.e., each preset motion vector range) is determined, and the motion amplitude of the frame original image is determined from these counts. If the range [100, 200) holds the largest number of macroblock motion vectors, the scene type of the frame encoded image is determined to be a dynamic scene; if the range [0, 50) holds the largest number, the scene type of the frame encoded image is determined to be a static scene. Static and dynamic scenes may be further subdivided according to the numerical distribution of the counting buckets, e.g., into completely static, mostly static, small-amplitude motion, large-amplitude motion, and so on.
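The counting-bucket classification can be sketched as follows; the function name, the default bucket boundaries (taken from the example above), and the return shape are illustrative assumptions.

```python
def motion_amplitude(mvs, buckets=((0, 50), (50, 100), (100, 200))):
    """Count mv = max(|x|, |y|) per macroblock into the preset ranges and
    return the per-bucket counts together with the index of the dominant
    bucket (index 0 suggests a static scene, the last index a dynamic one).
    """
    counts = [0] * len(buckets)
    for x, y in mvs:
        mv = max(abs(x), abs(y))
        for i, (lo, hi) in enumerate(buckets):
            if lo <= mv < hi:
                counts[i] += 1
                break
    return counts, counts.index(max(counts))

# Mostly small vectors: the first bucket dominates, suggesting a static scene.
print(motion_amplitude([(30, 10), (60, 5), (5, 5)]))  # ([2, 1, 0], 0)
```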
S280, determining the coding strategy of the original image of the next frame according to the scene type of the at least one frame of coded image so as to code the original image of the next frame according to the coding strategy.
In an embodiment, when at least one frame of original image is in a state of sharing stream, determining an encoding strategy of a next frame of original image according to a scene type of at least one frame of encoded image includes: determining the scene type of the current shared stream corresponding to the original image according to the scene type of the historical N-frame coded image; determining the coding strategy of the original image of the next frame according to the scene type of the current shared stream; wherein N is a positive integer greater than or equal to 1.
Here, the shared stream refers to the stream formed by the multiple frames of images of a shared scene. In an embodiment, when the screen is in a shared-stream state, the scene type of the current shared stream can be determined according to the scene types of the historically encoded N frames of images, and the coding strategy of the next frame original image can be determined according to the scene type of the current shared stream. It is understood that when at least one frame of original image is in a shared-stream state, the scene type may be identified once every 1-5 seconds.
In an embodiment, determining the scene type of the current shared stream corresponding to the original image according to the scene types of the historical N frames of encoded images includes: determining the total number of original-image frames corresponding to each scene type, and taking the scene type with the largest total frame count as the scene type of the shared stream; alternatively, the scene type of the current shared stream may be determined based on an average score value.
Specifically, the process of determining the scene type of the current shared stream according to the average score value is as follows. First, a score is assigned to each scene type of an image, for example 0 points for completely static, 10 points for mostly static, 20 points for small-amplitude motion and 40 points for large-amplitude motion. Assume the scene type of the shared stream is determined from the historical 5 encoded frames, whose types are "mostly static scene", "completely static scene", "small-amplitude motion scene", "small-amplitude motion scene" and "large-amplitude motion scene", respectively. The average score of the 5 frames is then (10+0+20+20+40)/5 = 18 points. A score range is then assigned to each shared-stream scene type, for example 0-5 points for a completely static scene, 6-15 points for a mostly static scene, 16-25 points for a small-amplitude motion scene and 26-40 points for a large-amplitude motion scene. Since the average score of the 5 frames is 18, the scene type of the current shared stream is considered to be a small-amplitude motion scene.
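The average-score computation in this example can be sketched as follows; the dictionary keys and function name are hypothetical labels for the scene types, and the scores and range boundaries follow the figures given above.

```python
# Illustrative per-frame scores for each scene type, as in the example.
SCORES = {"full_still": 0, "most_still": 10, "small_motion": 20, "large_motion": 40}

def shared_stream_scene(history):
    """Map the average score of the historical frame types to a stream type."""
    avg = sum(SCORES[s] for s in history) / len(history)
    if avg <= 5:
        return "full_still"    # 0-5 points: completely static scene
    if avg <= 15:
        return "most_still"    # 6-15 points: mostly static scene
    if avg <= 25:
        return "small_motion"  # 16-25 points: small-amplitude motion scene
    return "large_motion"      # 26-40 points: large-amplitude motion scene

history = ["most_still", "full_still", "small_motion", "small_motion", "large_motion"]
print(shared_stream_scene(history))  # small_motion (average score 18)
```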
Illustratively, if the scene type of the most recent frame (say the 10th frame) is a "static scene" while the scene types of the 9th, 8th, 7th, 6th and 5th frames are all "dynamic scenes", then the scene type of the current shared stream is "dynamic scene"; the scene type of the most recent single frame is not simply taken as the type of the current shared stream.
In an embodiment, after the scene type of the current shared stream is determined, different coding strategies may be adapted to different scene types. For example, a static scene does not require very high fluency, so a somewhat lower frame rate can be used, achieving better image quality at the same code rate; a dynamic scene can use a higher frame rate to ensure fluency; and a wider encoder motion search range can be configured for a screen scrolling scene, further compressing the code rate to improve coding quality, and so on.
In an embodiment, fig. 3 is a flowchart of still another image encoding method according to an embodiment of the present invention, and this embodiment is used as a preferred embodiment to describe a determination process of an encoding strategy. In this embodiment, the image encoding apparatus may be an encoder, that is, encode an original image, determine a scene type of the encoded image, and dynamically adjust an encoding policy according to the scene type, all performed by the encoder.
As shown in fig. 3, the method includes: S310-S3150.
S310, acquiring an image of the computer desktop to obtain an original image of the current frame.
S320, dividing the original image of the current frame to obtain a plurality of macro blocks.
S330, analyzing and encoding the macro block.
S340, obtaining macro block attribute information of each macro block.
S350, extracting an abnormal motion vector.
S360, carrying out nonlinear quantization and inverse quantization on the abnormal motion vector.
And S370, classifying the macro block motion vectors according to the vector sizes.
S380, storing abnormal motion vectors, categorized macro block motion vectors and other macro block attribute information.
S390, determining whether encoding of all macro blocks of the current frame original image has been completed; if yes, executing S3100; if not, returning to S330.
S3100, encoding of one frame of original image is completed.
S3110, analyzing scene types of the single-frame images.
S3120, storing scene types of the history N frame images.
S3130, determining whether to perform scene recognition on the current shared stream corresponding to the original image; if yes, executing S3140; if not, ending.
S3140, analyzing scene types of the historical N-frame images to obtain scene types of the current shared stream.
S3150, dynamically adjusting the coding strategy according to the scene type.
According to the technical scheme of this embodiment: first, scene recognition is performed directly inside the encoder, so the recognition result can take into account the influence of the target code rate and reference frame differences; second, all parameters (variables) required for scene recognition are provided by the encoder, and the recognition process only performs combined analysis and calculation on these parameters, so that, compared with the original scheme of computing difference values between two successive images, this scheme introduces essentially no extra computational complexity; third, based on the range of macroblock motion vectors within a single image, macroblock attribute information such as macroblock size, and the proportion of macroblocks of different types to the total number of macroblocks, more scene types can be recognized, whereas the original scheme, limited by computational complexity, can often recognize only a few; fourth, abnormal motion vector extraction is introduced and a nonlinear quantization function is adopted to separate the screen scrolling scene from the general dynamic scene category, enabling finer adjustment of the coding strategy; finally, the scene recognition result of the current shared stream, rather than the recognition result of a single image, is used as the scene type of the current shared stream, which greatly improves the fidelity of scene recognition.
For example, suppose the current screen is in an up-and-down scrolling scene. With normal mouse operation there are pauses during scrolling, and if the recognition result of a single image were taken as the scene type of the current share, the share might be identified as a static scene, which obviously does not match the actual situation. By using the scene recognition result of the shared stream as the current scene type, the state of the current stream can be analyzed in combination with the recognition results of a certain number of frames before the pause, and the current shared stream is recognized as a screen scrolling scene, which reflects the true situation.
In an embodiment, fig. 4 is a schematic structural diagram of an image encoding device according to an embodiment of the present invention. As shown in fig. 4, the image encoding apparatus includes: an acquisition module 410, a first determination module 420, and an adaptive encoding module 430.
The acquiring module 410 is configured to acquire macro block attribute information corresponding to at least one frame of original image;
a first determining module 420, configured to determine a scene type of the corresponding frame encoded image according to the macroblock attribute information;
the adaptive encoding module 430 is configured to determine an encoding policy of an original image of a next frame according to a scene type of the at least one frame of encoded image, so as to encode the original image of the next frame according to the encoding policy.
In one embodiment, the macroblock attribute information includes at least one of: a macroblock motion vector; a macroblock type; macroblock size.
In an embodiment, macro block attribute information corresponding to at least one frame of original image is obtained, which is specifically used for:
dividing the obtained at least one frame of original image according to a preset macro block division strategy to obtain a plurality of corresponding macro blocks;
and coding all macro blocks corresponding to the original image of each frame in sequence to obtain macro block attribute information of the corresponding frame.
In an embodiment, the scene type of the corresponding frame encoded image is determined according to the macroblock attribute information, specifically for:
determining an abnormal motion vector in a macro block motion vector corresponding to an original image;
and determining the scene type of the corresponding frame coding image according to the macro block attribute information and/or the abnormal motion vector.
In one embodiment, determining an abnormal motion vector in the motion vectors of the macro blocks corresponding to the original image is specifically used for:
determining absolute values of motion vectors of macro blocks corresponding to an original image in a horizontal direction and a vertical direction respectively;
the abnormal motion vector is determined based on the absolute values of the motion vectors in the horizontal direction and the vertical direction.
In an embodiment, the image encoding apparatus further includes:
the second determining module is used for respectively carrying out nonlinear quantization and inverse quantization operations on the abnormal motion vector and the non-abnormal motion vector corresponding to each frame of original image to obtain a corresponding abnormal motion vector inverse quantization result and a corresponding non-abnormal motion vector inverse quantization result;
the first classification module is used for classifying the abnormal motion vector according to the abnormal motion vector dequantization result;
and the second classification module is used for classifying the non-abnormal motion vector according to the non-abnormal motion vector dequantization result.
In an embodiment, the scene type of the corresponding frame encoded image is determined according to the macroblock attribute information and/or the abnormal motion vector, and is specifically used for:
and determining whether the scene type of the corresponding frame coding image is a screen scrolling scene or not according to the abnormal motion vector after classification.
In an embodiment, in the case that the scene type of the encoded image is not a screen scrolling scene, the scene type of the encoded image of the corresponding frame is determined according to the macroblock attribute information and/or the abnormal motion vector, specifically for:
and determining whether the scene type of the corresponding frame coding image is a fade-in fade-out scene or not according to the macroblock type and the macroblock size in the macroblock attribute information.
In an embodiment, determining whether the scene type of the corresponding frame encoded image is a fade-in fade-out scene according to the macroblock type and the macroblock size in the macroblock attribute information is specifically used for:
determining the number of macro blocks with the macro block type being an intra macro block in an original image;
when the ratio between the number of macro blocks of the intra macro block and the total number of macro blocks in the current frame image reaches a preset ratio, obtaining the macro block size of each macro block in the original image of the corresponding frame;
when the size of the macro block is smaller than a preset threshold value, determining that the scene type of the corresponding frame coding image is a fade-in fade-out scene;
and when the size of the macro block is larger than or equal to a preset threshold value, determining the scene type of the corresponding frame coding image as a screen switching scene.
In an embodiment, in the case that the scene type of the encoded image is not a screen scroll scene or a fade scene, the scene type of the encoded image of the corresponding frame is determined according to the macroblock attribute information and/or the abnormal motion vector, specifically for:
determining the number of macro block motion vectors in at least two preset motion vector ranges;
determining the motion amplitude of the original image of the corresponding frame according to the number of macro block motion vectors in each preset motion vector range;
and determining whether the scene type of the corresponding frame coding image is a dynamic scene or not according to the motion amplitude.
In an embodiment, when at least one frame of original image is in a state of sharing stream, determining an encoding strategy of the next frame of original image according to a scene type of at least one frame of encoded image, which is specifically used for:
determining the scene type of the current shared stream corresponding to the original image according to the scene type of the historical N-frame coded image;
determining the coding strategy of the original image of the next frame according to the scene type of the current shared stream; wherein N is a positive integer greater than or equal to 1.
The image coding device provided by the embodiment of the invention can execute the image coding method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
In an embodiment, fig. 5 is a schematic structural diagram of an image encoding apparatus according to an embodiment of the present invention. Fig. 5 shows the structure of an image encoding apparatus 10 that may be used to implement an embodiment of the present invention. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the image encoding apparatus 10 includes at least one processor 11, and a memory such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the image encoding apparatus 10 can also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A plurality of components in the image encoding apparatus 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the image encoding apparatus 10 to exchange information/data with other apparatuses through a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as an image encoding method, including: obtaining macro block attribute information corresponding to at least one frame of original image; determining the scene type of the corresponding frame coding image according to the macro block attribute information; and determining the coding strategy of the original image of the next frame according to the scene type of the at least one frame of coded image so as to code the original image of the next frame according to the coding strategy.
In some embodiments, the image encoding method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the image encoding device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the image encoding method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the image encoding method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an image encoding device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or a trackball) through which a user can provide input to the image encoding device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the defects of high management difficulty and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (11)

1. An image encoding method, comprising:
obtaining macro block attribute information corresponding to at least one frame of original image;
determining the scene type of the corresponding frame coding image according to the macro block attribute information;
and determining the coding strategy of the original image of the next frame according to the scene type of the coded image of at least one frame so as to code the original image of the next frame according to the coding strategy.
2. The method according to claim 1, wherein the obtaining macroblock attribute information corresponding to at least one frame of original image includes:
dividing the obtained at least one frame of original image according to a preset macro block division strategy to obtain a plurality of corresponding macro blocks;
and coding all macro blocks corresponding to the original image of each frame in sequence to obtain macro block attribute information of the corresponding frame.
3. The method according to claim 1, wherein determining the scene type of the corresponding encoded frame according to the macroblock attribute information comprises:
determining abnormal motion vectors among the macroblock motion vectors corresponding to the original image; and
determining the scene type of the corresponding encoded frame according to the macroblock attribute information and/or the abnormal motion vectors.
4. The method according to claim 3, wherein determining abnormal motion vectors among the macroblock motion vectors corresponding to the original image comprises:
determining the absolute values of the horizontal and vertical components of each macroblock motion vector corresponding to the original image; and
determining the abnormal motion vectors according to the absolute values of the horizontal and vertical components.
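A minimal sketch of the test in claim 4: a macroblock motion vector is flagged as abnormal when the absolute value of its horizontal or vertical component exceeds a threshold. The threshold value below is an assumption; the claim does not fix one.

```python
ABNORMAL_THRESHOLD = 32  # assumed value, e.g. in quarter-pel units

def is_abnormal_motion_vector(mv, threshold=ABNORMAL_THRESHOLD):
    """mv is an (mv_x, mv_y) pair; either component may trigger the flag."""
    mv_x, mv_y = mv
    return abs(mv_x) > threshold or abs(mv_y) > threshold

def split_motion_vectors(mvs, threshold=ABNORMAL_THRESHOLD):
    """Partition a frame's motion vectors into abnormal and non-abnormal."""
    abnormal = [mv for mv in mvs if is_abnormal_motion_vector(mv, threshold)]
    normal = [mv for mv in mvs if not is_abnormal_motion_vector(mv, threshold)]
    return abnormal, normal

abnormal, normal = split_motion_vectors([(2, -3), (40, 0), (0, -50), (10, 10)])
```

The two resulting groups are exactly the abnormal and non-abnormal sets that claim 5 goes on to quantize and classify.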
5. The method according to claim 3 or 4, further comprising:
performing nonlinear quantization and inverse quantization on the abnormal motion vectors and the non-abnormal motion vectors corresponding to each frame of original image, respectively, to obtain corresponding abnormal-motion-vector and non-abnormal-motion-vector inverse-quantization results;
classifying the abnormal motion vectors according to the abnormal-motion-vector inverse-quantization results; and
classifying the non-abnormal motion vectors according to the non-abnormal-motion-vector inverse-quantization results.
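A hedged sketch of claim 5's quantize-dequantize-classify path. The signed square-root quantization law, the step size, and the class boundaries are all assumed choices; the claim only requires some nonlinear quantization followed by classification of the dequantized result.

```python
import math

def quantize(v, step=2.0):
    """Nonlinear (square-root law) quantization of one MV component (assumed curve)."""
    return int(round(math.copysign(math.sqrt(abs(v)) / step, v)))

def dequantize(q, step=2.0):
    """Inverse of the square-root quantization above."""
    return math.copysign((abs(q) * step) ** 2, q)

def classify(mv, step=2.0):
    """Bucket a motion vector by the magnitude of its dequantized components."""
    mag = math.hypot(dequantize(quantize(mv[0], step), step),
                     dequantize(quantize(mv[1], step), step))
    if mag < 4:
        return "static"
    if mag < 16:
        return "slow"
    return "fast"
```

The same routine would be applied separately to the abnormal and non-abnormal groups from claim 4, giving each group its own class labels.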
6. The method according to claim 3, wherein, in the case that the scene type of the encoded image is not a screen-scrolling scene, determining the scene type of the corresponding encoded frame according to the macroblock attribute information and/or the abnormal motion vectors comprises:
determining whether the scene type of the corresponding encoded frame is a fade-in/fade-out scene according to the macroblock type and the macroblock size in the macroblock attribute information.
7. The method according to claim 6, wherein determining whether the scene type of the corresponding encoded frame is a fade-in/fade-out scene according to the macroblock type and the macroblock size comprises:
determining the number of macroblocks in the original image whose macroblock type is intra;
when the ratio of the number of intra macroblocks to the total number of macroblocks in the current frame reaches a preset ratio, obtaining the macroblock size of each macroblock in the corresponding frame of original image;
when the macroblock sizes are smaller than a preset threshold, determining that the scene type of the corresponding encoded frame is a fade-in/fade-out scene; and
when the macroblock sizes are greater than or equal to the preset threshold, determining that the scene type of the corresponding encoded frame is a screen-switching scene.
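The decision rule of claims 6-7 can be sketched as follows. The intra-ratio threshold, the size threshold, and the use of "all intra blocks small" as the size test are assumptions; the claims leave these preset values open.

```python
INTRA_RATIO_THRESHOLD = 0.8   # assumed preset ratio
SMALL_MB_THRESHOLD = 8 * 8    # assumed preset size threshold, in pixels

def classify_intra_frame(macroblocks):
    """macroblocks: list of (mb_type, mb_area) pairs, mb_type 'intra' or 'inter'.

    Returns 'fade' (fade-in/fade-out), 'scene_cut' (screen switching),
    or 'other' when the intra ratio does not reach the preset ratio.
    """
    intra_areas = [area for mb_type, area in macroblocks if mb_type == "intra"]
    if len(intra_areas) / len(macroblocks) < INTRA_RATIO_THRESHOLD:
        return "other"
    # Many small intra blocks suggest a gradual luminance change (fade);
    # large intra blocks suggest an abrupt content change (scene cut).
    if all(area < SMALL_MB_THRESHOLD for area in intra_areas):
        return "fade"
    return "scene_cut"
```

Intuitively, a fade re-codes the whole frame in fine partitions because every block changes slightly, while a hard cut re-codes it in large partitions because whole regions are replaced.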
8. The method according to claim 6 or 7, wherein, in the case that the scene type of the encoded image is neither a screen-scrolling scene nor a fade-in/fade-out scene, determining the scene type of the corresponding encoded frame according to the macroblock attribute information and/or the abnormal motion vectors comprises:
determining the number of macroblock motion vectors falling within each of at least two preset motion-vector ranges;
determining the motion amplitude of the corresponding frame of original image according to the number of macroblock motion vectors in each preset range; and
determining whether the scene type of the corresponding encoded frame is a dynamic scene according to the motion amplitude.
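A hedged sketch of claim 8: count motion vectors per preset magnitude range, derive a weighted per-frame motion amplitude, and call the frame a dynamic scene when the amplitude crosses a threshold. The bins, weights, and threshold below are all assumptions.

```python
import math

RANGES = [(0, 4), (4, 16), (16, float("inf"))]  # assumed magnitude bins
WEIGHTS = [0.0, 0.5, 1.0]                        # assumed per-bin weights

def motion_amplitude(mvs):
    """Weighted fraction of motion vectors in higher-motion bins."""
    counts = [0] * len(RANGES)
    for mv_x, mv_y in mvs:
        mag = math.hypot(mv_x, mv_y)
        for i, (lo, hi) in enumerate(RANGES):
            if lo <= mag < hi:
                counts[i] += 1
                break
    return sum(w * c for w, c in zip(WEIGHTS, counts)) / max(len(mvs), 1)

def is_dynamic_scene(mvs, threshold=0.5):
    return motion_amplitude(mvs) >= threshold
```

With these assumed weights, a frame where most blocks carry large motion vectors scores near 1.0 and is classed as dynamic, while a mostly-still frame scores near 0.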
9. The method according to claim 1, wherein, when the at least one frame of original image belongs to a shared stream, determining the encoding strategy for the next frame of original image according to the scene type of the at least one encoded frame comprises:
determining the scene type of the shared stream to which the original image currently belongs according to the scene types of the previous N encoded frames; and
determining the encoding strategy for the next frame of original image according to the scene type of the current shared stream, where N is a positive integer greater than or equal to 1.
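A minimal sketch of claim 9: take the scene type of the shared stream as the majority scene type over the last N encoded frames, then look up the next frame's encoding strategy from it. Majority voting and the strategy table are assumptions; the claim only requires some mapping from history to strategy.

```python
from collections import Counter, deque

STRATEGY = {  # hypothetical strategy table; names and values are illustrative
    "dynamic": {"gop": 30, "qp_offset": 2},
    "fade": {"gop": 60, "qp_offset": 0},
    "static": {"gop": 120, "qp_offset": -2},
}

class SharedStreamScene:
    def __init__(self, n=5):
        self.history = deque(maxlen=n)  # scene types of the last N frames

    def push(self, scene_type):
        self.history.append(scene_type)

    def next_frame_strategy(self):
        """Majority scene type over the window decides the next strategy."""
        scene, _ = Counter(self.history).most_common(1)[0]
        return STRATEGY.get(scene, STRATEGY["static"])
```

Smoothing over N frames keeps the encoder from thrashing its strategy on a single mis-classified frame, which is presumably why the claim keys the decision on history rather than on the latest frame alone.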
10. An image encoding apparatus, characterized in that the image encoding apparatus comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image encoding method of any one of claims 1-9.
11. A computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the image encoding method of any one of claims 1-9.
Application CN202210753473.3A, filed 2022-06-28 (priority 2022-06-28): Image coding method, device and medium — status Pending; publication CN117354526A (en).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210753473.3A CN117354526A (en) 2022-06-28 2022-06-28 Image coding method, device and medium


Publications (1)

Publication Number Publication Date
CN117354526A true CN117354526A (en) 2024-01-05

Family

ID=89363767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210753473.3A Pending CN117354526A (en) 2022-06-28 2022-06-28 Image coding method, device and medium

Country Status (1)

Country Link
CN (1) CN117354526A (en)

Similar Documents

Publication Publication Date Title
CN111918066B (en) Video encoding method, device, equipment and storage medium
US11206405B2 (en) Video encoding method and apparatus, video decoding method and apparatus, computer device, and storage medium
CN105472205B (en) Real-time video noise reduction method and device in encoding process
CN112073735B (en) Video information processing method and device, electronic equipment and storage medium
CN111182303A (en) Encoding method and device for shared screen, computer readable medium and electronic equipment
CN104869403B (en) A kind of shot segmentation method based on X264 compression videos
CN112714309A (en) Video quality evaluation method, device, apparatus, medium, and program product
CN107820095B (en) Long-term reference image selection method and device
KR20130130695A (en) Method and system for encoding video frames using a plurality of processors
KR101167645B1 (en) Method for detecting scene change and apparatus therof
Duarte et al. Fast affine motion estimation for VVC using machine-learning-based early search termination
CN111524110A (en) Video quality evaluation model construction method, evaluation method and device
CN112449182A (en) Video encoding method, device, equipment and storage medium
CN114913471B (en) Image processing method, device and readable storage medium
CN117354526A (en) Image coding method, device and medium
CN116668843A (en) Shooting state switching method and device, electronic equipment and storage medium
US20190238856A1 (en) Estimating video quality of experience
Grbić et al. Real-time video freezing detection for 4K UHD videos
CN111726620A (en) Encoding method and device for monitoring video background frame, electronic equipment and medium
US11538169B2 (en) Method, computer program and system for detecting changes and moving objects in a video view
CN115190309B (en) Video frame processing method, training device, video frame processing equipment and storage medium
Glavota et al. No-reference real-time video transmission artifact detection for video signals
CN112073724B (en) Video information processing method and device, electronic equipment and storage medium
CN117528141A (en) Video encoding method, video encoding device, electronic device, storage medium, and program product
CN113660487B (en) Parameter determination method and device for distributing corresponding bit number for frame image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination