CN101783939A

CN101783939A - Picture coding method based on human eye visual characteristic

Info

Publication number: CN101783939A
Application number: CN 200910045495
Authority: CN
Inventors: 付伟; 顾晓东; 马成才
Original assignee: Fudan University
Current assignee: Fudan University
Priority date: 2009-01-16
Filing date: 2009-01-16
Publication date: 2010-07-21
Anticipated expiration: 2029-01-16
Also published as: CN101783939B

Abstract

The invention belongs to the technical field of image processing and relates to an image coding method based on human visual characteristics. The human eye is more sensitive to distortion in the edge and smooth regions of an image than to distortion in texture regions. Exploiting this property, the method first applies a wavelet decomposition to the original image; it then classifies the wavelet coefficients by entropy and variance into visually important coefficients and ordinary coefficients; finally, it quantizes and codes the two classes separately with arithmetic coding to obtain the final code stream. The invention improves the subjective visual quality of images reconstructed after lossy compression, so that an image recovered from lossy coding at a high compression ratio looks better to a viewer. The method is therefore well suited to large-scale image storage and image data transmission.

Description

An image coding method based on human visual characteristics

Technical field

The invention belongs to the technical field of image processing and specifically relates to an image coding method based on human visual characteristics.

Background technology

With the growth in capacity and speed of communication channels and computers, image information has become an important object of computer processing. Unlike text, image information requires large storage capacity and wide transmission bandwidth, so compressing image data has become an urgent need. Image compression, also called image coding, is the technique of representing the original pixel matrix with fewer bits. Image data can be compressed precisely because the data contain redundancy. This redundancy mainly takes three forms: spatial redundancy, caused by the correlation between neighboring pixels within an image; temporal redundancy, caused by the correlation between different frames of an image sequence; and spectral redundancy, caused by the correlation between different color planes or spectral bands. The purpose of data compression is to reduce the number of bits needed to represent the data by removing these redundancies.

For a given digital image, the original representation is generally an array of spatial pixels, that is, its spatial-domain representation. In the spatial domain, neighboring pixels are strongly correlated and the redundant information is spread over a large set of pixels, which makes direct processing difficult. The most common approach is to apply a transform that maps the image from the spatial domain to a transform domain, where it can be processed simply and effectively. The wavelet transform is a time/frequency-domain signal analysis tool that emerged in the mid-1980s. In 1989 Mallat applied the wavelet transform to signal processing, proposed the concept of multiresolution analysis, and applied wavelet analysis to image coding for the first time. Since then, thanks to its good time-frequency localization, its decorrelation ability, and a multiresolution analysis capability that matches human visual characteristics, the wavelet transform has come into wide use in image compression coding and has achieved good results. In 1993 Shapiro proposed the embedded zerotree wavelet (EZW) coding algorithm based on the concept of the zerotree; with its high coding efficiency and good image restoration quality, it is regarded as one of the most efficient image compression algorithms. Building on EZW, Said and Pearlman proposed the set partitioning in hierarchical trees (SPIHT) image coding algorithm, which is a more general formulation of EZW and borrows many ideas from the zerotree. Its aim is likewise to represent the significance map as efficiently as possible through orientation trees: by partitioning the trees, it gathers as many insignificant coefficients as possible into one subset and represents that subset with a single symbol. Many improved EZW algorithms followed. However, neither the classical EZW algorithm nor these improvements take full account of human visual characteristics, which limits the quality of the restored image. According to Shannon's rate-distortion theory, if the visual characteristics of the human eye are fully exploited during coding, a lower coding bit rate can be used while the subjective quality of the image remains unchanged.

Studies show that the human eye is very sensitive to distortion in the edge regions of an image, fairly sensitive to distortion in smooth regions, and insensitive to distortion in texture regions. Building on the classical EZW algorithm, this method allocates more bits to the wavelet coefficients of edge regions and extremely smooth regions so that they are quantized more finely. Because more weight is given to the information in edge and extremely smooth regions, the reconstructed image has a better subjective visual effect.

Summary of the invention

The purpose of the invention is to provide an image coding method based on human visual characteristics that improves the subjective visual quality of images reconstructed after lossy compression, so that an image restored after lossy coding at a high compression ratio looks better to a viewer.

This purpose is achieved by the following method and steps:

First, the original image is wavelet-decomposed. The wavelet coefficients are then classified according to entropy and variance into visually important coefficients and ordinary coefficients. Finally, based on this classification, the two classes of coefficients are quantized and coded separately with arithmetic coding to obtain the final code stream.

The content of the invention is further elaborated below:

1. Apply a wavelet decomposition to the original image and classify the transformed coefficients by entropy and variance into visually important coefficients and ordinary coefficients:

Thanks to its good time-frequency localization, its decorrelation ability, and a multiresolution analysis capability that matches human visual characteristics, the wavelet transform has come into wide use in image compression coding. The wavelet transform decomposes an image into a weighted sum of wavelet basis functions and preserves the fine structure of the original image at each resolution. After the transform the image is divided into four frequency bands: horizontal, vertical, diagonal, and low-frequency; the low-frequency band can be decomposed further. The total amount of data in the wavelet image equals that of the original image, but the wavelet image has different properties: the image energy is concentrated mainly in the low-frequency band, while the horizontal, vertical, and diagonal bands carry very little energy. The horizontal, vertical, and diagonal high-frequency subbands characterize the edge, contour, and texture information of the original image in the horizontal, vertical, and diagonal directions, and the low-frequency subband is an approximation of the original image. Because most of the important visual information is compacted into a small number of coefficients, the remaining coefficients can be coarsely quantized or truncated to 0 with almost no distortion of the image. Fig. 1 shows the three-level wavelet decomposition of the Lena image. Because the human eye is very sensitive to distortion in edge regions, fairly sensitive to distortion in smooth regions, and insensitive to distortion in texture regions, the coefficients after wavelet decomposition are divided accordingly into edge-region coefficients, smooth-region coefficients, and texture-region coefficients. The entropy of a smooth region is small, whereas a region with larger entropy is either an edge region or a texture region; of these two, a texture region has the smaller variance and an edge region the larger variance. Suitable entropy and variance thresholds can therefore be chosen to distinguish the regions. The entropy and variance of the four child nodes of a wavelet coefficient indicate whether that coefficient is an edge coefficient, a smooth coefficient, or a texture coefficient. Let the four child-node coefficients of a wavelet coefficient X be X1, X2, X3, and X4. The entropy of these four child-node coefficients is defined as:

\mathrm{ENTROPY} = \sum_{i=1}^{4} \mathrm{probs}(X_i) \cdot I(i)

where I(i) = -\log_2(\mathrm{probs}(X_i)).

The variance of the four child-node coefficients is defined as:

D(X) = \sum_{i=1}^{4} (X_i - E(X))^{2}

where

E(X) = \frac{1}{4} \sum_{i=1}^{4} X_i
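The two definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how probs(Xi) is estimated, so the sketch takes it to be the relative frequency of the value Xi among the four children (which presumes the coefficients have been rounded to integers). The function names `child_entropy` and `child_variance` are hypothetical.

```python
import math
from collections import Counter

def child_entropy(children):
    """Entropy of the child coefficients, summed over i = 1..4 as in the formula.

    Assumption: probs(Xi) is the relative frequency of the value Xi
    among the children; the text does not specify the estimator.
    """
    counts = Counter(children)
    n = len(children)
    total = 0.0
    for x in children:
        p = counts[x] / n
        total += p * -math.log2(p)  # probs(Xi) * I(i)
    return total

def child_variance(children):
    """Sum of squared deviations from the mean E(X).

    The formula in the text has no 1/4 normalization, so none is applied.
    """
    mean = sum(children) / len(children)
    return sum((x - mean) ** 2 for x in children)
```

With this estimator, four identical children give an entropy of 0 and four distinct children give an entropy of 2, so a small δ separates flat neighborhoods from varied ones.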

When the entropy of the child-node coefficients is below a threshold δ (a small value), the parent coefficient is an extremely-smooth-region coefficient (smooth-region coefficient for short); otherwise it is an edge-region or texture-region coefficient. In the latter case, when the variance of the child-node coefficients exceeds a threshold Δ, the parent coefficient is an edge-region coefficient; otherwise it is a texture-region coefficient. A high-frequency coefficient is classified by the class of its parent coefficient: if the parent is an edge-region coefficient, then so is the coefficient itself. The method treats edge-region and extremely-smooth-region coefficients as visually important coefficients and texture-region coefficients as ordinary coefficients.
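A minimal Python sketch of this decision rule is given below. It assumes, as a simplification not stated in the text, that probs(Xi) is the relative frequency of each value among the four children; the names `classify_parent`, `delta`, and `big_delta` are hypothetical, and the threshold values in the usage note are arbitrary illustrations, not values from the invention.

```python
import math
from collections import Counter

def classify_parent(children, delta, big_delta):
    """Label a parent coefficient from the entropy and variance of its children.

    delta     -- entropy threshold (the delta of the text), assumed small
    big_delta -- variance threshold (the capital Delta of the text)
    """
    counts = Counter(children)
    n = len(children)
    # Assumption: probs(Xi) is the relative frequency of Xi among the children.
    entropy = sum((counts[x] / n) * -math.log2(counts[x] / n) for x in children)
    mean = sum(children) / n
    variance = sum((x - mean) ** 2 for x in children)

    if entropy < delta:
        return "smooth"   # extremely smooth region
    if variance > big_delta:
        return "edge"     # high entropy and high variance
    return "texture"      # high entropy but low variance

def is_visually_important(label):
    """Edge and smooth coefficients are visually important; texture is ordinary."""
    return label in ("smooth", "edge")
```

For example, with `delta=0.5` and `big_delta=100`, the children `[3, 3, 3, 3]` are labeled smooth, `[0, 40, -35, 2]` edge, and `[1, 2, 3, 4]` texture.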

2. Quantize and code the two classes of coefficients separately with arithmetic coding

Based on the wavelet coefficient classification above, six symbols are defined: zerotree root T, isolated zero Z, positive significant coefficient P, negative significant coefficient N, positive visually important coefficient R, and negative visually important coefficient S. Embedded coding is achieved through successive-approximation quantization. A series of ordinary thresholds T_0, T_1, ..., T_{N-1} is set to determine the significant coefficients, where T_i = T_{i-1}/2, the initial threshold is T_0 = 2^M with M = log2[max(abs(X_i))], and X_i are the wavelet coefficients. At the same time a series of visual thresholds DT_0, DT_1, ..., DT_{N-1} is set to determine the visually important coefficients, where DT_i = DT_{i-1}/2 and DT_i = T_i/2. Thus at a given quantization level T_i, for wavelet coefficients of the same magnitude, a visually important coefficient has a smaller quantization error because it is quantized more finely: the maximum quantization error of an ordinary wavelet coefficient is T_i/4, whereas that of an edge-region or smooth-region coefficient is DT_i/4 = T_i/8. In addition, visually important coefficients one level smaller are quantized in the same pass, because their quantization level is DT_i = T_i/2. Consequently, for a given target bit rate, the visually important coefficients (edge-region and extremely-smooth-region coefficients) are allocated more bits.
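The two threshold series can be generated as in the sketch below. It assumes that the square brackets in the definition of M denote the integer part (floor), as in the classical EZW algorithm; the text itself does not say so. The function name `threshold_series` is hypothetical.

```python
import math

def threshold_series(coeffs, levels):
    """Return (ordinary, visual) threshold lists of length `levels`.

    ordinary[i] = T_i  with T_0 = 2**M and T_i = T_{i-1} / 2
    visual[i]   = DT_i with DT_i = T_i / 2
    Assumption: M = floor(log2(max |X_i|)); requires a nonzero coefficient.
    """
    peak = max(abs(x) for x in coeffs)
    m = math.floor(math.log2(peak))
    t0 = 2 ** m
    ordinary = [t0 / 2 ** i for i in range(levels)]
    visual = [t / 2 for t in ordinary]
    return ordinary, visual
```

For coefficients whose largest magnitude is 100, M is 6, so the ordinary thresholds start at 64, 32, 16 and the visual thresholds at 32, 16, 8. At every level the visual threshold is half the ordinary one, which is what halves the maximum quantization error from T_i/4 to T_i/8.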

The image coding method based on human visual characteristics proposed by the invention effectively improves the subjective visual quality of images reconstructed after lossy compression. For a given image-quality requirement it represents the original image with as few bits as possible, which improves the efficiency of image transmission and reduces the storage capacity required.

Description of drawings

Fig. 1 shows the wavelet coefficient tree.

Fig. 2 shows the three-level wavelet decomposition of the Lena image.

Fig. 3 is a visual comparison of images reconstructed by the invention and by the EZW algorithm: (a) and (b) are the reconstructions of the standard Man grayscale image at a bit rate of 0.25 bpp by the EZW algorithm and by the invention, respectively.

Specific embodiments

The invention is further elaborated below with reference to a specific embodiment. The embodiment only illustrates the invention and does not limit it.

Embodiment 1

This embodiment takes the standard 512 × 512 × 8 bit Man grayscale image as an example, walks through the entire coding process, and finally presents the coding results and the subjective visual quality of the reconstructed image in the form of a table and images.

1. Main scan of the wavelet-transformed coefficients

The standard Man grayscale image (the original image) is selected as the object to be coded and is transformed with the D(9,7) biorthogonal wavelet. Each wavelet coefficient is then scanned to produce a coefficient symbol:

(1) If the coefficient amplitude is greater than the threshold T and the coefficient is positive, output symbol P;

(2) If the coefficient amplitude is greater than the threshold T and the coefficient is negative, output symbol N;

(3) If the coefficient amplitude is less than the threshold T but greater than T/2, the coefficient is positive, and the four child-node coefficients in its branch have entropy less than the threshold δ or variance greater than the threshold Δ, output symbol R;

(4) If the coefficient amplitude is less than the threshold T but greater than T/2, the coefficient is negative, and the four child-node coefficients in its branch have entropy less than the threshold δ or variance greater than the threshold Δ, output symbol S;

(5) If the coefficient satisfies none of the four conditions above and its branch contains a child-node coefficient greater than the threshold T, output symbol Z;

(6) If the coefficient and all of its descendant node coefficients are less than T, and the coefficient does not satisfy the edge-region or smooth-region condition, the coefficient is a zerotree root; output symbol T.
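The six rules above can be sketched as a single decision function. This is a hedged illustration, not the patented implementation: the function name and its parameters are hypothetical, the caller is assumed to have already computed the largest descendant amplitude and the entropy/variance test, and boundary values are treated as inclusive at the lower end (amplitude equal to T counts as significant) to agree with the quantizer input intervals [T, 2T) and [DT, 2DT). Cases the rules leave unspecified fall through to the zerotree root.

```python
def main_scan_symbol(coeff, threshold, max_descendant, important):
    """Return one of 'P', 'N', 'R', 'S', 'Z', 'T' for a wavelet coefficient.

    coeff          -- signed coefficient value
    threshold      -- current ordinary threshold T_i
    max_descendant -- largest amplitude among the coefficient's descendants
    important      -- True if the children satisfy entropy < delta
                      or variance > Delta (edge or smooth region)
    """
    amplitude = abs(coeff)
    if amplitude >= threshold:                       # rules (1) and (2)
        return "P" if coeff > 0 else "N"
    if important and threshold / 2 <= amplitude < threshold:
        return "R" if coeff > 0 else "S"             # rules (3) and (4)
    if max_descendant >= threshold:                  # rule (5): isolated zero
        return "Z"
    return "T"                                       # rule (6): zerotree root
```

For instance, with a threshold of 32, a coefficient of 20 in an edge region yields R, whereas the same coefficient in a texture region yields Z or T depending on its descendants.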

2. Auxiliary scan of the wavelet coefficients: quantize and code the coefficients marked P, N, R, and S

(1) Quantization of P and N. A quantizer is constructed before the coefficients are quantized. The input interval of the quantizer is [T_{i-1}, 2T_{i-1}), which is split at 1.5T_{i-1} into two parts, [T_{i-1}, 1.5T_{i-1}) and [1.5T_{i-1}, 2T_{i-1}); the quantization step is 0.5T_{i-1}, where i denotes the i-th coding pass. The quantizer outputs the quantization symbol "0" or "1": "0" corresponds to the quantized value (1.5 - 0.25)T_{i-1}, and "1" corresponds to the quantized value (1.5 + 0.25)T_{i-1}.

(2) Quantization of R and S. Because these symbols correspond to image edge or smooth regions, the input interval of the quantizer is [DT_{i-1}, 2DT_{i-1}), where DT_{i-1} = T_{i-1}/2. This interval is split at 1.5DT_{i-1} into two parts, [DT_{i-1}, 1.5DT_{i-1}) and [1.5DT_{i-1}, 2DT_{i-1}); the quantization step is 0.5DT_{i-1}, where i denotes the i-th coding pass. The quantizer outputs the quantization symbol "0" or "1": "0" corresponds to the quantized value (1.5 - 0.25)DT_{i-1}, and "1" corresponds to the quantized value (1.5 + 0.25)DT_{i-1}.
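Both quantizers have the same shape and differ only in the base threshold, so they can be sketched with one function. The name `refine` and its parameters are hypothetical; `base` stands for T_{i-1} when the symbol is P or N and for DT_{i-1} when it is R or S.

```python
def refine(amplitude, base):
    """One-bit quantizer over the input interval [base, 2*base).

    Returns (bit, reconstructed_amplitude):
      bit 0 -> lower half [base, 1.5*base),   value (1.5 - 0.25) * base
      bit 1 -> upper half [1.5*base, 2*base), value (1.5 + 0.25) * base
    """
    if not base <= amplitude < 2 * base:
        raise ValueError("amplitude lies outside the quantizer input interval")
    if amplitude < 1.5 * base:
        return 0, 1.25 * base
    return 1, 1.75 * base
```

With `base=32` (a P or N coefficient at T = 32) an amplitude of 50 is coded as bit 1 and reconstructed as 56. With `base=16` (an R or S coefficient at the same level, DT = 16) an amplitude of 20 is coded as bit 0 and reconstructed exactly as 20. In both cases the reconstruction error is at most a quarter of the base threshold.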

After the main scan and the auxiliary scan are complete, the output is a sequence of symbols from the set {P, N, R, S, T, Z, 0, 1}. Finally, Huffman coding is applied to this sequence to obtain the final code stream for transmission.
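The final entropy-coding step can be illustrated with a textbook Huffman coder over the eight-symbol alphabet. This is a generic sketch built from the Python standard library, not the coder used in the embodiment; the function names and the sample stream in the usage note are made up for illustration.

```python
import heapq
from collections import Counter

def huffman_code(stream):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(stream)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code});
    # the unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, low = heapq.heappop(heap)
        w1, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(stream, code):
    """Concatenate the codewords of all symbols in the stream."""
    return "".join(code[s] for s in stream)
```

For a stream such as `"TTTTTTTTZZP1"`, the frequent zerotree symbol T receives a one-bit codeword, and the whole stream is coded in 18 bits instead of the 36 bits a fixed three-bit code would need.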

3. Analysis of results

Because PSNR has certain limitations in evaluating subjective visual quality, this method uses the MSSIM and VIF values, which are closer to subjective visual evaluation, as image quality metrics for the actual subjective visual quality of the reconstructed image; larger MSSIM and VIF values indicate better subjective quality of the restored image. In Fig. 3, (a) and (b) are the images of the standard Man grayscale image restored at a bit rate of 0.25 bpp by the EZW algorithm and by the invention, respectively. The edges in (b) are sharper than those in (a): edge regions such as the wristwatch, the back of the hand, the hat, the hair, and the pattern on the clothing are clearer and give a better visual effect. Table 1 compares the MSSIM and VIF values of the reconstructed images at different bit rates. Table 1 shows that for smooth images and edge-rich images, this method yields reconstructed images of better subjective visual quality than the classical EZW algorithm. For texture-rich images, this method likewise yields reconstructed images of better visual quality at low bit rates.

Table 1. Comparison of the coding results (MSSIM/VIF values) of the method of the invention and the EZW algorithm

The experimental results show that the invention effectively improves the subjective visual quality of images reconstructed after lossy compression, so that an image restored after lossy coding at a high compression ratio looks better to a viewer.

Claims

1. An image coding method based on human visual characteristics, characterized in that the original image is first wavelet-decomposed; the wavelet coefficients are then classified according to entropy and variance into visually important coefficients and ordinary coefficients; and, based on this classification, the two classes of coefficients are quantized and coded separately with arithmetic coding to obtain the final code stream.

2. The image coding method based on human visual characteristics according to claim 1, characterized in that the wavelet coefficients obtained by wavelet decomposition of the original image are classified as follows:

After the wavelet transform the original image is divided into four frequency bands: horizontal, vertical, diagonal, and low-frequency; the wavelet coefficients are distributed in a tree structure; the horizontal, vertical, and diagonal high-frequency subbands characterize the edge, contour, and texture information of the original image in the horizontal, vertical, and diagonal directions; the low-frequency subband is an approximation of the original image; most of the important visual information is compacted into a small number of coefficients, and the remaining coefficients are coarsely quantized or truncated to 0;

The coefficients after wavelet decomposition are divided into edge-region coefficients, smooth-region coefficients, and texture-region coefficients; the entropy of a smooth region is small, whereas a region with larger entropy is either an edge region or a texture region; of these two, a texture region has the smaller variance and an edge region the larger variance; the entropy and variance of the four child nodes of a wavelet coefficient indicate whether that coefficient is an edge coefficient, a smooth coefficient, or a texture coefficient; let the four child-node coefficients of a wavelet coefficient X be X1, X2, X3, and X4; the entropy of these four child-node coefficients is defined as:

\mathrm{ENTROPY} = \sum_{i=1}^{4} \mathrm{probs}(X_i) \cdot I(i)

where I(i) = -\log_2(\mathrm{probs}(X_i));

The variance of the four child-node coefficients is defined as:

D(X) = \sum_{i=1}^{4} (X_i - E(X))^{2}

where

E(X) = \frac{1}{4} \sum_{i=1}^{4} X_i

When the entropy of the child-node coefficients is below a threshold δ (a small value), the parent coefficient is an extremely-smooth-region coefficient; otherwise it is an edge-region or texture-region coefficient. In the latter case, when the variance of the child-node coefficients exceeds a threshold Δ, the parent coefficient is an edge-region coefficient; otherwise it is a texture-region coefficient; a high-frequency coefficient is classified by the class of its parent coefficient: if the parent is an edge-region coefficient, then so is the coefficient itself; edge-region and extremely-smooth-region coefficients are visually important coefficients, and texture-region coefficients are treated as ordinary coefficients.

3. The image coding method based on human visual characteristics according to claim 1, characterized in that the two classes of coefficients are quantized and coded separately with arithmetic coding; based on the wavelet coefficient classification, six symbols are defined: zerotree root T, isolated zero Z, positive significant coefficient P, negative significant coefficient N, positive visually important coefficient R, and negative visually important coefficient S; embedded coding is achieved through successive-approximation quantization; a series of ordinary thresholds T_0, T_1, ..., T_{N-1} is set to determine the significant coefficients, where T_i = T_{i-1}/2, the initial threshold is T_0 = 2^M with M = log2[max(abs(X_i))], and X_i are the wavelet coefficients; at the same time a series of visual thresholds DT_0, DT_1, ..., DT_{N-1} is set to determine the visually important coefficients, where DT_i = DT_{i-1}/2 and DT_i = T_i/2; at a given quantization level T_i, for wavelet coefficients of the same magnitude, a visually important coefficient has a smaller quantization error because it is quantized more finely, and visually important coefficients one level smaller are quantized in the same pass.