CN114679585A

CN114679585A - Differential encoding mode and computer screen content encoding method

Info

Publication number: CN114679585A
Application number: CN202210087789.3A
Authority: CN
Inventors: 马茂林; 白雪松; 刘炜
Original assignee: Shanghai Cubic Digital Information Technology Co ltd
Current assignee: Shanghai Cubic Digital Information Technology Co ltd
Priority date: 2022-01-25
Filing date: 2022-01-25
Publication date: 2022-06-28

Abstract

The invention discloses a computer screen content encoding method, which determines the encoding mode to adopt according to the type of the current frame image of the computer screen content, encodes the current frame image to obtain image encoded data, and inserts encoding mode information into the header of the image encoded data. The invention also discloses a differential encoding mode, which divides word-processing-type computer screen content into N × N blocks, compares each block with the block at the corresponding position in the previous frame to form encoded data, and updates the data of the blocks that differ according to the encoded data.

Description

Differential encoding mode and computer screen content encoding method

Technical Field

The present invention relates to the field of encoding technologies, and in particular, to a differential encoding mode and a computer screen content encoding method.

Background

With the continuous development of computer and network technologies, digitized texts, images, graphics and videos gradually replace traditional analog media, so that the media is edited and spread more conveniently and widely. For example, in applications such as virtual desktop, video conferencing, remote teaching, and remote medical care, screen sharing technology is applied to transmit and display the contents displayed on the screen of a local computer to a remote terminal. In order to ensure the quality of the screen display content when the screen is shared, the computer screen content needs to be encoded by adopting a proper encoding method.

Compared with naturally shot video, computer screen content has more complex spatial and spectral features and usually contains discontinuous tones, so it differs greatly from naturally shot video. Applying conventional video coding techniques directly to computer screen content therefore reduces coding performance.

In view of the above problems, some encoding methods designed for the characteristics of computer screen content have been proposed. For example, the new AVS and MPEG video compression standards take the requirements of computer screen content encoding into account and include targeted encoding techniques. However, these techniques are usually designed for one particular kind of computer screen content and are difficult to adapt fully to complex and varied computer screen content.

Disclosure of Invention

To solve some or all of the problems in the prior art, the present invention firstly provides a differential encoding mode applied to encoding of computer screen contents of word processing type, the differential encoding mode comprising:

dividing the content of the computer screen into N × N squares, and comparing each of the N × N squares with the square at the corresponding position of the previous frame to form coded data; and

updating the data of the squares that differ according to the coded data.

Further, forming the encoded data includes:

comparing each square with the square at the corresponding position of the previous frame respectively:

if the two squares are identical, taking a first mark as the coded data of the square; otherwise

taking a second mark and the data of the square as the coded data of the square; and

sequentially arranging the coded data of the N × N squares to form the coded data of the computer screen content.
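The block comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the frame is modeled as a list of pixel rows, the constants `SKIP` and `REFRESH` stand in for the first and second marks, the function names are hypothetical, and the frame dimensions are assumed to be divisible by N.

```python
from typing import List, Optional, Tuple

SKIP = 0     # hypothetical value for the "first mark" (square unchanged)
REFRESH = 1  # hypothetical value for the "second mark" (square data follows)

Frame = List[List[int]]                     # rows of pixel values
EncodedSquare = Tuple[int, Optional[Frame]]  # (mark, square data or None)


def split_squares(frame: Frame, n: int) -> List[Frame]:
    """Split a frame into n x n squares, listed in row-major order."""
    height, width = len(frame), len(frame[0])
    if height % n or width % n:
        raise ValueError("this sketch assumes dimensions divisible by n")
    sq_h, sq_w = height // n, width // n
    squares = []
    for sy in range(n):
        for sx in range(n):
            rows = frame[sy * sq_h:(sy + 1) * sq_h]
            squares.append([row[sx * sq_w:(sx + 1) * sq_w] for row in rows])
    return squares


def encode_differential(current: Frame, previous: Frame, n: int) -> List[EncodedSquare]:
    """Compare each square with the co-located square of the previous frame."""
    encoded: List[EncodedSquare] = []
    for cur, prev in zip(split_squares(current, n), split_squares(previous, n)):
        if cur == prev:
            encoded.append((SKIP, None))    # identical: only the mark is stored
        else:
            encoded.append((REFRESH, cur))  # different: mark plus square data
    return encoded


# Usage: a 4 x 4 frame in which a single pixel changed, split into 2 x 2 squares.
previous_frame = [[0] * 4 for _ in range(4)]
current_frame = [row[:] for row in previous_frame]
current_frame[0][3] = 255
coded = encode_differential(current_frame, previous_frame, n=2)
```

Only the square containing the changed pixel carries pixel data; the other three squares are represented by their marks alone, which is where the saving comes from when few pixels change between frames.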

The invention also provides a computer screen content coding method, which comprises the following steps:

determining an adopted coding mode according to the type of a current frame image of the computer screen content;

coding the current frame image according to the coding mode to obtain image coded data; and

inserting coding mode information into the header of the image coded data.

Further, determining the coding mode to employ comprises:

judging the type of the current frame image of the computer screen content; and

manually selecting the encoding mode according to the type.

Further, determining the coding mode to employ comprises:

judging the type of the current frame image of the computer screen content by adopting a deep learning method according to the characteristics of the image; and

automatically selecting the coding mode according to the type.

Further, determining the coding mode to use includes:

encoding the current frame image of the computer screen content with different coding modes, comparing the coding effects, and selecting the mode with the best effect as the final coding mode.

Further, the encoding mode includes:

the differential encoding mode as described previously, for encoding images of the word processing type;

a generated image encoding mode for encoding an image of the generated image type; and

a natural shot image encoding mode for encoding an image of the natural shot image type.

Further, the generated image encoding mode includes encoding a current frame image of the computer screen content according to the AVS2 and/or MPEG encoding standards.

Further, the natural shot image encoding mode includes encoding a current frame image of the computer screen content according to the AVS3 or H.266 encoding standards.

The present invention also provides a computer program product comprising computer program instructions for directing the steps of the computer screen content encoding method as described above.

The present invention is based on the following insights of the inventors. In practical applications, computer screen content is varied and may change at any time, yet existing systems often encode it with a single encoding mode and do not take this diversity into account. Moreover, most existing computer screen content encoding methods target the generated image type, and their algorithms are complex. In scenarios such as video conferencing and screen sharing for remote teaching, word processing software interfaces are used relatively frequently. The inventors further found that computer screen content formed by a word processing software interface generally has the following characteristics: the images contain few colors, the update frequency is low, and the range of change is small; that is, few pixels change from one frame to the next. It is therefore feasible to transmit and display such images by storing pixel data directly, and encoding this content with existing computer screen content encoding methods wastes considerable resources. On this basis, the inventors propose a differential encoding mode that updates only the changed pixel data without performing other operations, and can thus greatly improve the encoding efficiency for word-processing-type images while preserving the display effect. In addition, the inventors provide an encoding method that first judges the type of the computer screen content and then selects a suitable encoding mode before encoding. This encoding method can ensure the encoding and/or compression quality of different types of computer screen content and improve the encoding and/or compression efficiency.

Drawings

To further clarify the above and other advantages and features of embodiments of the present invention, a more particular description of embodiments of the present invention will be rendered by reference to the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. In the drawings, the same or corresponding parts will be denoted by the same or similar reference numerals for clarity.

Fig. 1 is a flow diagram of a differential encoding mode according to an embodiment of the present invention; and

Fig. 2 is a flow chart of a method for encoding computer screen content according to an embodiment of the present invention.

Detailed Description

In the following description, the present invention is described with reference to examples. One skilled in the relevant art will recognize, however, that the embodiments may be practiced without one or more of the specific details, or with other alternative and/or additional methods, materials, or components. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention. Similarly, for purposes of explanation, specific numbers, materials and configurations are set forth in order to provide a thorough understanding of the embodiments of the invention. However, the invention is not limited to these specific details. Further, it should be understood that the embodiments shown in the figures are illustrative representations and are not necessarily drawn to scale.

Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Appearances of the phrase "in one embodiment" in various places in the specification do not necessarily all refer to the same embodiment.

It should be noted that the embodiment of the present invention describes the process steps in a specific order, however, this is only for the purpose of illustrating the specific embodiment, and does not limit the sequence of the steps. Rather, in various embodiments of the present invention, the order of the steps may be adjusted according to process adjustments.

Aiming at the diversity of computer screen contents and the characteristics of different computer screen contents, the invention provides a differential coding mode and a computer screen content coding method.

The solution of the invention is further described below with reference to the accompanying drawings of embodiments.

Fig. 2 is a flow chart of a method for encoding computer screen content according to an embodiment of the present invention. As shown in fig. 2, a method for encoding computer screen content includes:

First, at step 201, the computer screen content type is determined. In the embodiment of the invention, common computer screen content is divided into three categories according to its characteristics: a word processing type, a generated image type, and a natural shot image type. The word processing type refers to word processing software interfaces, such as Word, PDF, and text-document interfaces; the generated image type refers to images generated by other types of computer software, such as game interfaces and two-dimensional and three-dimensional animation; and the natural shot image type refers to naturally shot videos, pictures, and the like played on a computer. In one embodiment of the invention, the current computer screen content type is judged manually. In another embodiment of the present invention, the type is determined automatically by a deep learning method: by learning from sample data, the intrinsic rules and representation hierarchy of the sample data are obtained, and the type of the current computer screen content is then determined according to them. In yet another embodiment of the present invention, the type is determined by extracting image features and comparing them with preset values, where the image features may include, for example, the mean, variance, and correlation coefficient of the image;

Next, at step 202, the encoding mode is determined according to the type of computer screen content obtained in step 201. In the embodiment of the present invention, three encoding modes are designed for the three types of computer screen content described above: a differential encoding mode, a generated image encoding mode, and a natural shot image encoding mode:

for the word processing type, the differential encoding mode is employed. Computer screen content of the word processing type contains few colors, so the image can be transmitted and displayed by storing pixel data directly. Such content also has a low update frequency and a small range of change; that is, few pixels change from one frame to the next. In view of these characteristics, the present invention provides a differential encoding mode that refreshes as little image data as possible. Fig. 1 shows a flow diagram of the differential encoding mode according to an embodiment of the present invention. As shown in Fig. 1, the differential encoding mode includes:

first, in step 101, the image is divided: the image to be processed is divided into N × N squares;

next, in step 102, the images are compared: each square is compared with the square at the corresponding position of the previous frame to form encoded data. Specifically, in an embodiment of the present invention, the encoded data is formed by sequentially arranging the encoded data of the N × N squares, where the encoded data of any square falls into one of two cases:

when the square is identical to the square at the corresponding position of the previous frame, a first mark is used as the encoded data of the square; the first mark may be, for example, "skip", indicating that the data in the square does not need to be processed; and

when the square differs from the square at the corresponding position of the previous frame, a second mark and the data of the square are used as the encoded data of the square; the second mark may be, for example, "refresh", indicating that the data in the square needs to be refreshed according to the content of the encoded data; and

finally, in step 103, the image is updated: the data of the squares that differ is updated according to the encoded data obtained in step 102, which completes the encoding;

for the generated image type, mature encoding standards and techniques already exist, so an existing computer screen content encoding standard can be adopted directly; for example, the current frame image of the computer screen content is encoded according to the AVS2 and/or MPEG encoding standards; and

for the natural shot image type, mature encoding standards and techniques also exist, so an existing encoding standard can likewise be adopted directly; for example, the current frame image of the computer screen content is encoded according to the AVS3 or H.266 encoding standard.
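For the differential encoding mode, the update of step 103 can be sketched in Python as follows. This is an illustration under assumptions that the source does not specify: the encoded data is modeled as a list of `(mark, data)` pairs in row-major square order, `SKIP` and `REFRESH` are hypothetical values for the two marks, and the frame dimensions are assumed to be divisible by N.

```python
from typing import List, Optional, Tuple

SKIP = 0     # hypothetical value for the "skip" mark
REFRESH = 1  # hypothetical value for the "refresh" mark

Frame = List[List[int]]
EncodedSquare = Tuple[int, Optional[Frame]]


def apply_differential(previous: Frame, encoded: List[EncodedSquare], n: int) -> Frame:
    """Rebuild the current frame by refreshing only the squares marked REFRESH."""
    if len(encoded) != n * n:
        raise ValueError("expected one encoded entry per square")
    height, width = len(previous), len(previous[0])
    sq_h, sq_w = height // n, width // n
    frame = [row[:] for row in previous]  # start from a copy of the previous frame
    for index, (mark, data) in enumerate(encoded):
        if mark == SKIP:
            continue  # unchanged square: keep the previous pixels
        sy, sx = divmod(index, n)  # square position in the n x n grid
        for dy, square_row in enumerate(data):
            y = sy * sq_h + dy
            frame[y][sx * sq_w:(sx + 1) * sq_w] = square_row
    return frame


# Usage: refresh only the top-right square of a 4 x 4 frame (n = 2).
previous_frame = [[0] * 4 for _ in range(4)]
coded = [(SKIP, None), (REFRESH, [[0, 255], [0, 0]]), (SKIP, None), (SKIP, None)]
updated_frame = apply_differential(previous_frame, coded, n=2)
```

The previous frame is copied rather than modified in place so that it remains available as the reference for the comparison; an implementation that keeps only one buffer could update it in place instead.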

In an embodiment of the invention, the selection of the encoding mode may be performed manually or automatically. The automatic processing refers to automatically calling a program of a corresponding coding mode for processing according to the determination result in step 201.
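The feature-based judgment of step 201 combined with automatic mode selection might look like the following Python sketch. Everything concrete here is an assumption made for illustration: the source gives no thresholds or decision rules, so the preset values, the use of a distinct-value count alongside the variance, the type labels, and the function names are all hypothetical. The frame is modeled as a flat list of grayscale values.

```python
from statistics import pvariance
from typing import List

# Hypothetical preset values; the source does not state concrete thresholds.
MAX_DISTINCT_VALUES_WORD_PROCESSING = 16
MAX_VARIANCE_GENERATED_IMAGE = 2000.0

# Mapping from content type to encoding mode, following the three modes in the text.
MODE_FOR_TYPE = {
    "word_processing": "differential",
    "generated_image": "generated_image_mode",    # e.g. AVS2 and/or MPEG per the text
    "natural_shot_image": "natural_shot_image_mode",  # e.g. AVS3 or H.266 per the text
}


def classify_frame(pixels: List[int]) -> str:
    """Judge the content type by comparing simple image features with preset values."""
    distinct_values = len(set(pixels))  # few colors suggests a word processing interface
    variance = pvariance(pixels)        # population variance of the pixel values
    if distinct_values <= MAX_DISTINCT_VALUES_WORD_PROCESSING:
        return "word_processing"
    if variance <= MAX_VARIANCE_GENERATED_IMAGE:
        return "generated_image"
    return "natural_shot_image"


def select_mode(pixels: List[int]) -> str:
    """Automatically select the encoding mode from the judged content type."""
    return MODE_FOR_TYPE[classify_frame(pixels)]


# Usage: a two-tone frame, such as black text on a white page.
mode = select_mode([0, 255] * 50)
```

A table lookup keeps the type judgment separate from the mode dispatch, so the judgment could be replaced by a manual choice or a learned classifier without touching the selection step.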

In another embodiment of the present invention, step 201 may be omitted, and the encoding mode may be determined directly by an effect comparison method, specifically, the method includes:

firstly, respectively adopting three coding modes to code the content of a computer screen;

then, the coding effects of the three coding modes are compared, and the optimal one is selected as the final coding mode. The coding effect mainly refers to rate-distortion performance. Rate-distortion optimization is an evaluation method commonly used in the field of image and video compression coding; in essence, it converts the distortion after image compression and the number of bits after compression into a single value, the rate-distortion cost. The general formula for the cost is as follows:

J = D + λR

where D is a measure of distortion, usually evaluated by means of the peak signal-to-noise ratio (PSNR), R is the number of bits of the encoded image data, λ is a coefficient that converts the number of bits R into distortion, and J is the rate-distortion cost. When encoding an image or a block of an image, several coding modes are available, each producing its own distortion and encoded data; the goal of rate-distortion optimization is to find the coding mode with the minimum rate-distortion cost J and use it as the final coding mode. The choice of the optimal mode may also combine the rate-distortion cost with the coding complexity. For example, in an application scenario with strict real-time requirements, the encoding mode with the shortest encoding time may be used as the final encoding mode, whereas in an application scenario with high compression quality requirements, the value of λ may be reduced appropriately when calculating the rate-distortion cost, so that an encoding mode with less distortion tends to be selected. It should be understood that in other embodiments of the present invention, other parameters for evaluating the encoding quality may also be used to assess the encoding effect and to define what counts as optimal;
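The selection by rate-distortion cost can be sketched in Python as follows. This is a minimal illustration: the `Candidate` class and function names are hypothetical, the distortion D is assumed to be a value where lower is better (for example a mean squared error), and the numbers in the usage example are invented solely to show the effect of λ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    mode: str          # name of the coding mode that produced this result
    distortion: float  # D: assumed lower-is-better distortion measure
    bits: int          # R: number of bits of the encoded data


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.bits


def select_by_rd_cost(candidates: List[Candidate], lam: float) -> Candidate:
    """Return the candidate with the minimum rate-distortion cost J."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Usage with invented numbers: three modes applied to the same frame.
results = [
    Candidate("differential", distortion=0.0, bits=12000),
    Candidate("generated_image", distortion=4.0, bits=3000),
    Candidate("natural_shot_image", distortion=9.0, bits=2000),
]
balanced = select_by_rd_cost(results, lam=0.001)    # J = 12.0, 7.0, 11.0
quality_first = select_by_rd_cost(results, lam=0.0001)  # J = 1.2, 4.3, 9.2
```

With λ = 0.001 the generated image mode has the lowest cost, while reducing λ to 0.0001 shifts the choice to the distortion-free differential mode, matching the observation that a smaller λ favors modes with less distortion.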

Next, in step 203, the image is encoded: the current frame image is encoded according to the encoding mode determined in step 202 to obtain image encoded data; and

finally, in step 204, the encoding mode information is inserted. Encoding mode information is inserted into the header of the image encoded data so that the decoder can select the correct decoding mode.
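Step 204 can be illustrated with the following Python sketch. The source does not define a header layout, so the one-byte header, the numeric mode identifiers, and the function names below are assumptions chosen only to show the idea of prefixing the encoded data with mode information and reading it back before decoding.

```python
from typing import Tuple

# Hypothetical one-byte mode identifiers; the source does not specify a header format.
MODE_IDS = {"differential": 0, "generated_image": 1, "natural_shot_image": 2}
MODE_NAMES = {value: name for name, value in MODE_IDS.items()}


def insert_mode_header(mode: str, payload: bytes) -> bytes:
    """Prefix the image encoded data with a one-byte encoding mode identifier."""
    return bytes([MODE_IDS[mode]]) + payload


def split_mode_header(data: bytes) -> Tuple[str, bytes]:
    """Read the mode identifier back so that a matching decoder can be chosen."""
    if not data:
        raise ValueError("encoded data is empty")
    return MODE_NAMES[data[0]], data[1:]


# Usage: wrap a payload on the encoder side, then unwrap it on the decoder side.
packet = insert_mode_header("differential", b"\x01\x02")
mode, payload = split_mode_header(packet)
```

Because the mode may change from frame to frame, carrying the identifier with every frame lets the decoder switch decoding modes without any side channel.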

Before the coding of the computer screen content, the coding method provided by the invention firstly judges the type of the computer screen content and then selects different coding modes in a targeted manner. The encoding method can ensure the encoding and/or compression quality of different types of computer screen contents and improve the encoding and/or compression efficiency of the computer screen contents. Meanwhile, aiming at the computer screen content of the word processing type, the invention also provides a differential coding mode, which only updates the changed pixel data without other operations, thereby greatly improving the coding efficiency of the word processing type image on the premise of ensuring the display effect.

While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various combinations, modifications, and changes can be made thereto without departing from the spirit and scope of the invention. Thus, the breadth and scope of the present invention disclosed herein should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

1. A differential encoding mode, comprising the steps of:

dividing an image to be processed into N × N squares, and comparing each square with a square at a corresponding position of a previous frame respectively to form coded data, wherein N is a natural number greater than 1; and

2. The differential encoding mode of claim 1, wherein forming encoded data comprises:

and sequentially arranging the coded data of the N-by-N squares to form the coded data of the image to be processed.

3. A method for encoding computer screen content, comprising:

and inserting coding mode information into the head of the image coding data.

4. The computer screen content encoding method of claim 3, wherein determining the encoding mode to use comprises:

judging the type of a current frame image of the computer screen content; and

the encoding mode is manually selected according to the type.

5. The computer screen content encoding method of claim 3, wherein determining the encoding mode to use comprises:

judging the type of the current frame image of the computer screen content according to the characteristics of the image and/or by adopting a deep learning method; and

the encoding mode is automatically selected according to the type.

6. The computer screen content encoding method of claim 3, wherein determining the encoding mode to use comprises:

encoding the current frame image of the computer screen content with different coding modes, comparing the coding effects, and determining the coding mode to adopt according to the coding effect, wherein the determining comprises:

selecting the coding mode with the lowest rate-distortion cost; and/or

selecting the coding mode with the shortest coding time.

7. The computer screen content encoding method of claim 3, wherein the encoding mode comprises:

the differential encoding mode of claim 2, configured to encode a word processing type image;

a generated image encoding mode configured to encode an image of a generated image type; and

a natural shot image encoding mode configured to encode an image of a natural shot image type.

8. The computer screen content encoding method of claim 7, wherein the generated image encoding mode comprises encoding a current frame image of the computer screen content according to the AVS2 and/or MPEG encoding standards.

9. The computer screen content encoding method of claim 7, wherein the natural shot image encoding mode comprises encoding a current frame image of the computer screen content according to the AVS3 or H.266 encoding standards.

10. A computer program product comprising computer program instructions configured to perform the steps of the computer screen content encoding method according to any one of claims 3 to 9.