CN105007483A - Screen content encoding and decoding method compatible with H264 standard - Google Patents

Screen content encoding and decoding method compatible with H264 standard

Info

Publication number
CN105007483A
Authority
CN
China
Prior art keywords
image block
dictionary
coding
encoding
pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510400827.6A
Other languages
Chinese (zh)
Other versions
CN105007483B (en)
Inventor
王中元
傅佑铭
何政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Suirui cloud Technology Co. Ltd.
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN201510400827.6A
Publication of CN105007483A
Application granted
Publication of CN105007483B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a screen content encoding and decoding method compatible with the H264 standard. It introduces dictionary compression into a conventional video coding framework, adding a coding mode for text content: dictionary coding. Through joint optimization of bit rate and distortion, the most appropriate coding mode is selected for each image block; text regions generally select dictionary coding, while other regions retain the original coding modes, which improves the compression quality of the many text regions found in screen content. Meanwhile, by making use of a coding mode that H264 already defines and handling the timing of dictionary coding appropriately, compatibility with the standard is maintained. The method achieves higher compression quality while its bitstream remains compatible with the H264 standard.

Description

A screen content encoding and decoding method compatible with the H264 standard
Technical field
The invention belongs to the technical field of video encoding and decoding and relates to a screen content encoding and decoding method, in particular to one that is compatible with the H264 standard.
Technical background
In video conferencing, distance education, and remote collaborative office systems, sharing computer screen content is an important function; screen sharing provides a fast way to present and share document material remotely. A screen content image mixes text and graphics with natural images and covers increasingly diverse types such as Word/PDF documents, PPT presentations, and Web pages. Because screen images have high resolution and consume considerable network bandwidth, they must be compressed effectively.
The text and graphics in a mixed image contain a great deal of high-frequency information to which the human eye is sensitive. Traditional still-image compression standards (such as JPEG) and video compression standards (such as H264) coarsely quantize high-frequency components, relying on the human eye's insensitivity to high-frequency information in natural images; applied directly to mixed images, they often blur text and graphics. Some improved techniques that aim to preserve high-frequency information at text and graphic edges, such as direct spatial-domain quantization, palette coding, and lossless compression, require modifying the core of the standard coding framework, so they cannot remain compatible with standard decoders, which impairs the interoperability of screen content sharing.
Summary of the invention
To solve the above technical problem, the invention provides a screen content encoding and decoding method compatible with the H264 standard.
The technical solution adopted by the invention is a screen content encoding and decoding method compatible with the H264 standard, characterized in that the encoding method comprises the following steps:
Step 1: Estimate the number of coded bits per image block. Select a number of typical text screen contents to form a large training data set, perform dictionary coding frame by frame on the images in the training set, and count the total number of bits produced; then apportion this total over the total number of image blocks to obtain the bit number R of a single image block under the dictionary coding mode;
Step 2: Among the H264 standard coding modes and the dictionary coding mode characterized in step 1, choose the best coding mode for each image block using the rate-distortion optimization cost function. For an image block for which the dictionary coding mode is selected, set its mode code to I_PCM, but do not encode it immediately;
Step 3: Collect image block data by writing the data of each image block determined to be I_PCM into a common buffer. Repeat step 2 until the whole frame has been processed;
Step 4: Reorganize the pixels of each buffered image block, including the luma component and the two chroma components, in column-major order, then perform dictionary coding. Write the dictionary-coded bitstream ahead of the H264 standard-coded bitstream to form a composite bitstream.
Preferably, the typical text screen contents in step 1 include Word documents, PPT slides, web pages, and CAD drawings.
Preferably, the dictionary coding in step 1 uses the Lempel-Ziv-Markov chain algorithm (LZMA).
Preferably, choosing the best coding mode for each image block using the rate-distortion optimization cost function in step 2 is implemented as follows: compute the distortion D and the bit number R of the image block under both kinds of coding mode, then use the cost function J = D + λR to choose the mode with the smallest joint cost J as the best coding mode. Here J is the joint rate-distortion cost and λ is the Lagrange multiplier, which weights the trade-off between distortion and bit rate.
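As a minimal sketch of the mode decision just described, the following Python compares the cost J = D + λR of candidate modes and returns the cheapest one. The function names, the mode labels, and the numeric values are illustrative assumptions and do not come from the patent; the dictionary mode is given D = 0 because it is lossless.

```python
def rd_cost(distortion, bits, lam):
    """Joint rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits


def choose_mode(candidates, lam):
    """Pick the mode with the smallest J.

    candidates maps a mode label to a (D, R) pair. The labels are
    hypothetical; a real encoder would enumerate its own H264 modes.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Illustrative numbers only: a text-like block where lossy coding is costly.
text_block = {"h264_best": (120.0, 300), "dictionary": (0.0, 350)}
print(choose_mode(text_block, lam=1.0))
```

With these made-up values the lossy mode costs 120 + 300 = 420 while the dictionary mode costs 0 + 350 = 350, so the dictionary mode is selected.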
Preferably, the decoding method performed after encoding comprises the following steps:
Step 1: Extract the dictionary bitstream from the composite bitstream and perform dictionary decoding to obtain the decoded sample data of all image blocks whose mode is I_PCM;
Step 2: Scan the decoded sample data sequentially while parsing the H264 bitstream; for each parsed image block whose mode is I_PCM, write its corresponding pixel sample data into the H264 bitstream;
Step 3: Perform the standard H264 decoding process.
Compared with the standardized H264 coding technique and with current improved schemes for screen video coding, the invention has the following advantages and positive effects:
(1) Through the newly added dictionary coding mode, the invention applies lossless compression to text regions whose foreground pixels are sparsely distributed, improving the coding quality and compression efficiency of screen video;
(2) The invention signals the dictionary coding mode with the I_PCM mode, which H264 defines but which is rarely used; the bitstream can therefore be recognized by a standard H264 decoder without modifying the decoder, maintaining standard compatibility;
(3) The newly added dictionary coding mode does not require computing coding distortion and bit consumption block by block, so it adds no extra computational complexity.
Brief description of the drawings
Fig. 1: Encoding flowchart of an embodiment of the invention.
Fig. 2: Decoding flowchart of an embodiment of the invention.
Embodiment
To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the drawings and an embodiment. It should be understood that the embodiment described here serves only to illustrate and explain the invention and is not intended to limit it.
Screen content consists mainly of text and graphics. Foreground and background colors are clearly separated, the background color is uniform, and the foreground colors are comparatively few, so the data distribution in pixel color space is sparse and generally concentrated on a small number of values. The local correlation between pixels within an image block is weak, so transform coding, which removes spatial correlation, is of limited use. On the other hand, unlike natural scene video, screen content is generally free of noise and its text edges are sharp; quantization distortion blurs those edges and makes the text illegible. For these two reasons, the traditional transform-plus-quantization coding scheme is unsuitable for high-quality screen content compression. Conversely, the sparse pixel distribution of text regions in screen content is well suited to lossless dictionary compression.
Following this line of thought, the invention must solve three key problems:
(1) Existing compression standards such as H264 define many coding modes, and the invention adds a further one, the dictionary coding mode. Typically, the existing coding modes are more effective for regions with natural-video characteristics, such as embedded images and animation, while the new dictionary coding mode is generally better suited to compressing text blocks; choosing the wrong coding mode reduces overall compression performance. Assigning the appropriate coding mode to each image block accurately is therefore crucial.
(2) The core principle of dictionary coding is to search the historical data for a match to the current data and, if a match is found, to replace the original data with a (match length, match distance) pair, thereby compressing the data losslessly. The efficiency of dictionary coding is therefore closely tied to the length of the data to be encoded: the more input data encoded in one pass, the higher the efficiency, and vice versa. Extensive practical experience compressing files with tools such as WinZip and WinRAR confirms this. However, the raw data of a 16x16-pixel image block is very short, and compressing blocks one at a time would severely limit the performance of the dictionary coder. Ensuring dictionary coding efficiency through suitable data reorganization is therefore essential.
(3) The new dictionary coding mode obviously cannot be accepted by decoders conforming to standards such as H264; if the bitstream is passed directly to a standard decoder, the decoder treats it as an error. Modifying the decoder core would make the added mode compatible, but in many cases users and application developers have no access to the decoder's internals and cannot modify it, as with hardware decoders. It is therefore important to restore the bitstream that mixes dictionary coding and existing standard coding to a standard bitstream before it is fed to the decoder core, so that standard compatibility is maintained.
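The effect described in problem (2) can be illustrated with Python's standard `lzma` module. This is a sketch under stated assumptions: the synthetic two-level blocks merely stand in for text pixels, the block count of 64 is arbitrary, and raw LZMA2 is chosen so that container headers do not dominate the comparison. It compares compressing each 16x16 block on its own with compressing all blocks in one pass.

```python
import lzma

BLOCK_BYTES = 16 * 16  # one 16x16 block of 8-bit samples
RAW_LZMA2 = [{"id": lzma.FILTER_LZMA2, "preset": 6}]


def compress_raw(data):
    """Compress with raw LZMA2 (no container header or checksum)."""
    return lzma.compress(data, format=lzma.FORMAT_RAW, filters=RAW_LZMA2)


def synthetic_text_block(phase):
    """Hypothetical stand-in for a text block: dark strokes on white."""
    return bytes(0 if (i + phase) % 5 == 0 else 255 for i in range(BLOCK_BYTES))


blocks = [synthetic_text_block(p) for p in range(64)]

# Each block compressed alone: no history is shared between blocks.
separate = sum(len(compress_raw(b)) for b in blocks)
# All blocks compressed in one pass: later blocks match earlier data.
pooled = len(compress_raw(b"".join(blocks)))

print("separate bytes:", separate, "pooled bytes:", pooled)
```

On such repetitive data the pooled stream should come out smaller than the sum of the per-block streams, which is the motivation for deferring dictionary coding until a whole frame has been collected.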
To address these problems, the invention analyzes the characteristics of the H264 coding standard and of representative dictionary compression techniques and proposes the following solutions in turn.
(1) Video coding usually selects the optimal coding mode by rate-distortion optimization (RDO): the preferred mode should jointly minimize coding distortion and bit-rate consumption. The cost function to be minimized is J = D + λR, where J is the joint rate-distortion cost, D is the image distortion introduced by lossy coding, R is the number of bits produced by coding in this mode, and λ is the Lagrange multiplier, which weights the trade-off between distortion and bit rate and is generally preset from statistics or experience. The original H264 coding modes perform lossy coding, so both the distortion D and the bit number R must be measured; under the dictionary coding mode, which is lossless, the distortion is actually zero and only the bit number needs to be weighed. Lossy coding normally feeds the parameters produced by transform, quantization, and related stages into an entropy coder and sets R from the number of bits actually coded. If the dictionary coding mode took the same approach, compression efficiency would be underestimated, because a dictionary coder compresses small amounts of data poorly: the resulting R would be higher than the bits actually spent, making the mode decision inaccurate, and many blocks that should be dictionary coded might be assigned to other modes. Therefore, when weighing the bits a dictionary-coded image block needs, the invention does not actually encode the block but estimates a value for it in advance by training. Specifically, a number of typical text screen contents are selected to form a large training data set, dictionary coding is performed frame by frame on the images in the training set, the total number of bits produced is counted, and this total is apportioned over the total number of image blocks to obtain an estimated bit number for a single image block under the dictionary coding mode. This process is trained offline in advance, and in actual encoding the estimate replaces R in the RDO formula when selecting the mode.
(2) Because the compression efficiency of a dictionary coder is limited when it processes small, block-level amounts of data, the invention does not immediately encode each image block determined to be in dictionary coding mode on its own. Instead, the blocks are collected, and dictionary coding is performed in one pass once the whole frame has been processed; the compressed bitstream is then multiplexed with the standard bitstream into one complete bitstream. In addition, in dot-matrix fonts the height of a character in pixels is often greater than its width. To strengthen the repetitiveness between neighboring pixels and so improve dictionary coding efficiency, an image block with a 16x16 pixel matrix structure is therefore scanned in column-major order rather than the conventional row-major order. Reorganizing image blocks through these two measures effectively improves dictionary coding efficiency.
(3) Typically, a custom mode outside the standard can hardly be defined inside the encoder while remaining standard-compatible. The approach adopted by the invention, in which the encoder core only makes the dictionary coding mode decision and the coding itself is performed in one pass outside the encoder, creates the opportunity to maintain standard compatibility. Moreover, the H264 standard defines an I_PCM coding mode that simply encapsulates the pixel sample data directly, without any lossy or lossless compression; it is therefore hardly ever used in ordinary applications, whose purpose is compression, and can be used to signal dictionary coding. In summary, the standard-compatibility strategy adopted at the encoder and decoder is as follows. The encoder core marks each block determined to be in dictionary coding mode with the I_PCM mode but does not encode it; the data are collected and coded in one pass outside the encoder, and the resulting bitstream is multiplexed ahead of the normal H264 bitstream. Before core decoding, the decoder extracts the dictionary bitstream and performs dictionary decoding, then pre-scans the dictionary decoding result and the H264 bitstream and maps the sample data recovered by dictionary decoding, in order, onto the macroblocks whose coding mode is I_PCM; after this preprocessing, standard H264 decoding is performed. Because the result of dictionary decoding is the original sample data, the H264 decoder interprets it according to the I_PCM mode without any ambiguity. With this post-processing step at the encoder and the corresponding preprocessing step at the decoder, the dictionary coding method proposed by the invention is compatible with standard H264 decoders without modifying the decoder core.
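The multiplexing of the dictionary bitstream ahead of the H264 bitstream, and its extraction before core decoding, can be sketched as follows. The text does not specify how the two streams are delimited, so the 4-byte big-endian length prefix used here is a hypothetical framing chosen only for illustration; the function names `mux` and `demux` are likewise assumptions.

```python
import struct


def mux(dict_stream, h264_stream):
    """Place the dictionary stream ahead of the H264 stream.

    Framing (an assumption, not from the patent): a 4-byte big-endian
    length of the dictionary stream, then the two streams back to back.
    """
    return struct.pack(">I", len(dict_stream)) + dict_stream + h264_stream


def demux(composite):
    """Split a composite stream back into (dict_stream, h264_stream)."""
    if len(composite) < 4:
        raise ValueError("composite stream too short for length prefix")
    (n,) = struct.unpack_from(">I", composite, 0)
    if len(composite) < 4 + n:
        raise ValueError("composite stream truncated")
    return composite[4:4 + n], composite[4 + n:]


dict_part, h264_part = demux(mux(b"dictionary-bytes", b"\x00\x00\x00\x01h264"))
```

Because the split happens in a preprocessing step outside the decoder, the decoder core only ever sees the second part, which is what allows an unmodified decoder to be used.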
Referring to Fig. 1, in the screen content encoding and decoding method compatible with the H264 standard provided by the invention, the encoding method comprises the following steps:
Step 1: Estimate the number of coded bits per image block. Select a number of typical text screen contents (including Word documents, PPT slides, web pages, and CAD drawings) to form a large training data set, perform dictionary coding frame by frame on the images in the training set, and count the total number of bits produced; then apportion this total over the total number of image blocks to obtain the bit number R of a single image block under the dictionary coding mode;
In this embodiment, representative screen content is chosen to form the training data set, comprising 50 Word document images, 50 PPT document images, and 50 Web page images; the dictionary coder uses the LZMA (Lempel-Ziv-Markov chain algorithm) algorithm.
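The offline estimate of step 1 can be sketched with Python's standard `lzma` module. The sketch assumes that each training frame is supplied as raw sample bytes and that one image block occupies 384 bytes (a 16x16 luma block plus two 8x8 chroma blocks at 8 bits per sample); both the layout and the function name are assumptions made for illustration.

```python
import lzma

# Assumed block size: 16x16 luma + two 8x8 chroma planes, 8-bit samples.
MB_BYTES = 16 * 16 + 2 * 8 * 8


def estimate_bits_per_block(training_frames):
    """Average dictionary-coded bits per image block over a training set.

    training_frames: iterable of bytes objects, one per frame.
    """
    total_bits = 0
    total_blocks = 0
    for frame in training_frames:
        total_bits += 8 * len(lzma.compress(frame))  # whole frame at once
        total_blocks += len(frame) // MB_BYTES
    if total_blocks == 0:
        raise ValueError("training set contains no complete image block")
    return total_bits / total_blocks
```

The returned value would then stand in for R of the dictionary mode during mode selection, so that no per-block dictionary coding is needed at encoding time.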
Step 2: Among the H264 standard coding modes and the dictionary coding mode characterized in step 1, choose the best coding mode for each image block using the rate-distortion optimization cost function. For an image block for which the dictionary coding mode is selected, set its mode code to I_PCM, but do not encode it immediately;
Here, choosing the best coding mode for each image block using the rate-distortion optimization cost function is implemented as follows: compute the distortion D and the bit number R of the image block under both kinds of coding mode, then use the cost function J = D + λR to choose the mode with the smallest joint cost J as the best coding mode. J is the joint rate-distortion cost and λ is the Lagrange multiplier, which weights the trade-off between distortion and bit rate.
In this embodiment the Lagrange multiplier is determined by the empirical formula λ = 2^(qp/6 - 2), where qp is the quantization parameter.
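Assuming the empirical formula reads λ = 2^(qp/6 - 2), which is a reconstruction of the garbled expression in the extracted text, the multiplier can be computed as in this short sketch. The function name is hypothetical.

```python
def lagrange_lambda(qp):
    """Lagrange multiplier under the assumed reading 2 ** (qp / 6 - 2)."""
    return 2.0 ** (qp / 6.0 - 2.0)


for qp in (12, 24, 36):
    print(qp, lagrange_lambda(qp))
```

Under this reading the multiplier doubles every time qp increases by 6, so bits are weighted more heavily against distortion at coarser quantization.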
Step 3: Collect image block data by writing the data of each image block determined to be I_PCM into a common buffer. Repeat step 2 until the whole frame has been processed;
Step 4: Reorganize the pixels of each buffered image block, including the luma component and the two chroma components, in column-major order, then perform dictionary coding. Write the dictionary-coded bitstream ahead of the H264 standard-coded bitstream to form a composite bitstream.
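Steps 3 and 4 can be sketched as follows. The sketch assumes 8-bit 4:2:0 blocks of 384 bytes laid out as a 16x16 luma plane followed by two 8x8 chroma planes, each stored in row-major order; this layout and all function names are assumptions. Each plane is rewritten in column-major order, the reordered blocks are pooled, and LZMA is run once over the pool.

```python
import lzma


def column_major(plane, size):
    """Rewrite a size x size row-major plane in column-major order."""
    return bytes(plane[r * size + c] for c in range(size) for r in range(size))


def reorder_macroblock(mb):
    """Reorder luma and both chroma planes of one 384-byte block."""
    y, cb, cr = mb[:256], mb[256:320], mb[320:384]
    return column_major(y, 16) + column_major(cb, 8) + column_major(cr, 8)


def encode_pooled(pcm_blocks):
    """Dictionary-code all blocks flagged I_PCM in a frame in one pass."""
    pool = b"".join(reorder_macroblock(mb) for mb in pcm_blocks)
    return lzma.compress(pool)


# Usage with a synthetic block whose samples equal their row-major index.
mb = bytes(range(256)) + bytes(range(64)) + bytes(range(64))
stream = encode_pooled([mb, mb])
```

For a square plane the reordering is a transposition, so applying `reorder_macroblock` a second time restores the original layout; a decoder-side sketch could reuse the same function to undo it.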
Referring to Fig. 2, the screen content decoding method compatible with the H264 standard provided by the invention comprises the following steps:
Step 1: Extract the dictionary bitstream from the composite bitstream and perform dictionary decoding to obtain the decoded sample data of all image blocks whose mode is I_PCM;
Step 2: Scan the decoded sample data sequentially while parsing the H264 bitstream; for each parsed image block whose mode is I_PCM, write its corresponding pixel sample data into the H264 bitstream;
Step 3: Perform the standard H264 decoding process.
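The decoder-side preprocessing of steps 1 and 2 can be sketched as below. Parsing and rewriting real H264 macroblock syntax is omitted; instead the sketch assumes the per-block modes are already available as a list of labels and that every I_PCM block carries 384 bytes of samples. These simplifications, the labels, and the function name are assumptions.

```python
import lzma

MB_BYTES = 384  # assumed: 16x16 luma + two 8x8 chroma planes, 8-bit


def restore_pcm_samples(dict_stream, modes):
    """Map dictionary-decoded samples onto I_PCM blocks in scan order.

    Returns one entry per block: the raw sample bytes for I_PCM blocks,
    None for blocks coded with ordinary H264 modes.
    """
    pool = lzma.decompress(dict_stream)
    restored, pos = [], 0
    for mode in modes:
        if mode == "I_PCM":
            restored.append(pool[pos:pos + MB_BYTES])
            pos += MB_BYTES
        else:
            restored.append(None)
    if pos != len(pool):
        raise ValueError("dictionary payload does not match I_PCM block count")
    return restored
```

In a full implementation the restored bytes would be written into the I_PCM macroblocks of the H264 bitstream, after undoing any sample reordering applied at the encoder, before the stream is handed to the standard decoder.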
The invention introduces dictionary compression into the traditional video coding framework, adding a coding mode for text content: dictionary coding. Through joint optimization of bit rate and distortion, the most appropriate coding mode is selected for each image block; text regions generally select dictionary coding and other regions retain the original coding modes, which improves the compression quality of the many text regions in screen content. Meanwhile, by making appropriate use of a coding mode that H264 already defines and handling the timing of dictionary coding appropriately, compatibility with the standard technology is maintained. The invention achieves higher compression quality while its bitstream remains compatible with the H264 standard.
It should be understood that the parts not elaborated in this specification belong to the prior art.
It should be understood that the above description of the preferred embodiment is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Under the teaching of the invention, those of ordinary skill in the art may make substitutions or variations without departing from the scope protected by the claims, and all of these fall within the protection scope of the invention; the scope of protection claimed shall be determined by the appended claims.

Claims (5)

1. A screen content encoding and decoding method compatible with the H264 standard, characterized in that the encoding method comprises the following steps:
Step 1: Estimate the number of coded bits per image block. Select a number of typical text screen contents to form a large training data set, perform dictionary coding frame by frame on the images in the training set, and count the total number of bits produced; then apportion this total over the total number of image blocks to obtain the bit number R of a single image block under the dictionary coding mode;
Step 2: Among the H264 standard coding modes and the dictionary coding mode characterized in step 1, choose the best coding mode for each image block using the rate-distortion optimization cost function. For an image block for which the dictionary coding mode is selected, set its mode code to I_PCM, but do not encode it immediately;
Step 3: Collect image block data by writing the data of each image block determined to be I_PCM into a common buffer. Repeat step 2 until the whole frame has been processed;
Step 4: Reorganize the pixels of each buffered image block, including the luma component and the two chroma components, in column-major order, then perform dictionary coding. Write the dictionary-coded bitstream ahead of the H264 standard-coded bitstream to form a composite bitstream.
2. The screen content encoding method compatible with the H264 standard according to claim 1, characterized in that the typical text screen contents in step 1 include Word documents, PPT slides, web pages, and CAD drawings.
3. The screen content encoding method compatible with the H264 standard according to claim 1, characterized in that the dictionary coding in step 1 uses the Lempel-Ziv-Markov chain algorithm (LZMA).
4. The screen content encoding method compatible with the H264 standard according to claim 1, characterized in that choosing the best coding mode for each image block using the rate-distortion optimization cost function in step 2 is implemented as follows: compute the distortion D and the bit number R of the image block under both kinds of coding mode, then use the cost function J = D + λR to choose the mode with the smallest joint cost J as the best coding mode; J is the joint rate-distortion cost and λ is the Lagrange multiplier, which weights the trade-off between distortion and bit rate.
5. The screen content encoding method compatible with the H264 standard according to claim 1, characterized in that the decoding method performed after encoding comprises the following steps:
Step 1: Extract the dictionary bitstream from the composite bitstream and perform dictionary decoding to obtain the decoded sample data of all image blocks whose mode is I_PCM;
Step 2: Scan the decoded sample data sequentially while parsing the H264 bitstream; for each parsed image block whose mode is I_PCM, write its corresponding pixel sample data into the H264 bitstream;
Step 3: Perform the standard H264 decoding process.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510400827.6A CN105007483B (en) 2015-07-09 2015-07-09 A kind of screen content coding-decoding method compatible with H264 standards

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510400827.6A CN105007483B (en) 2015-07-09 2015-07-09 A kind of screen content coding-decoding method compatible with H264 standards

Publications (2)

Publication Number Publication Date
CN105007483A (en) 2015-10-28
CN105007483B (en) 2017-11-14

Family

ID=54379976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510400827.6A Active CN105007483B (en) 2015-07-09 2015-07-09 A kind of screen content coding-decoding method compatible with H264 standards

Country Status (1)

Country Link
CN (1) CN105007483B (en)

Also Published As

Publication number Publication date
CN105007483B (en) 2017-11-14

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180517

Address after: 610041 Sichuan Chengdu high tech Zone Tianren Road No. 387 3 Building 1 unit 28 level 2803

Patentee after: Chengdu Suirui cloud Technology Co. Ltd.

Address before: 430072 Wuhan University, Luojia mountain, Wuchang District, Wuhan, Hubei

Patentee before: Wuhan University

TR01 Transfer of patent right