[background technology]
With the rapid development of computer, microelectronics, information processing, communication, and laser technologies, multimedia technology, which integrates text, graphics, sound, and images, has spread quickly into the computer, communication, broadcasting, and consumer entertainment industries. In each of these fields, digital equipment and digital data transmission are used more and more. Digital signals have many advantages, but digitizing an analog signal greatly widens its frequency band. For example, once an ordinary 6 MHz television signal is digitized, its bit rate can reach 167 Mbps. This places heavy demands on storage capacity and transmission bandwidth, to the point that the uncompressed digital signal loses its practical value. Digital compression solves this problem well: the band occupied by the compressed signal is much narrower than that of the original analog signal. Digital compression coding is therefore one of the key technologies that make digital signals practical, and the compressibility of digital images and digital video is the key reason they can be transmitted and stored. Such compression comes at the cost of image or video quality; quality is sacrificed in exchange for valuable storage space or transmission bandwidth. The compression cannot be so aggressive that the visual quality of the image or video becomes unacceptable, which calls for continually improving compression efficiency at a given quality level.
In addition, compression should be standardized, which facilitates the transmission and sharing of information. To standardize compression coding, a number of international standards have been issued, such as the image compression standard JPEG (international standard ISO/IEC IS 10918, widely used in digital cameras and on the Internet) and the video compression standards MPEG-1 (ISO/IEC 11172, used in VCD), MPEG-2 (ISO/IEC 13818, used in DVD and digital television), and MPEG-4 (ISO/IEC 14496, used in streaming media). Besides standardizing compression, these standards have also steadily improved compression efficiency: MPEG-2 compresses more efficiently than MPEG-1, and MPEG-4 more efficiently than MPEG-2.
These compression standards and technologies share a common method: the use of an orthogonal transform. At present, the orthogonal transform used in JPEG, MPEG-1, MPEG-2, and MPEG-4 is the discrete cosine transform (DCT). The DCT is a tool commonly used in classical spectral analysis, and its introduction was a milestone for digital image and video compression. The image is first divided into N × N pixel blocks, and the DCT is then applied to each block in turn. Frequency information to which human vision is insensitive is then discarded, and only the most important data are retained. As a result, the compression process necessarily loses some of the fineness and smoothness of the image. Researchers have also looked for better and more efficient orthogonal transforms to replace the DCT, such as the Fourier transform, the discrete sine transform, and the Hadamard transform. None of these surpasses the DCT in performance, and some fall far short of it. There is also theoretical evidence that the DCT is "almost the optimal transform".
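The block transform described above can be sketched as follows. This is a minimal illustration, assuming the orthonormal DCT-II definition and an 8 × 8 block size; the function names (`dct_matrix`, `dct2`) are hypothetical helpers introduced here, not part of any standard codec.

```python
import math

def dct_matrix(n=8):
    """Return the n x n orthonormal DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        # Row 0 uses scale sqrt(1/n); all other rows use sqrt(2/n).
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(block):
    """Two-dimensional DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

# A perfectly flat block (assumed sample value 128) has only a DC coefficient.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))
```

For a flat block, all of the signal ends up in the single coefficient `coeffs[0][0]`, which is the energy-compaction property that makes discarding the remaining coefficients cheap.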
In recent years, with the spread of mobile phone applications, compression algorithms have needed further simplification to reduce the cost and power consumption of the relevant chips. Integer transform techniques have therefore made considerable progress, but they still have shortcomings: the computational performance of some integer transforms leaves room for improvement, and the computational efficiency of existing integer transforms is not high, especially for the inverse transform. For example, the recent international standard H.264 recommends the following integer transform D_1, which approximates the DCT:
Although the computational performance of the above matrix D_1 is somewhat improved, there is no report of its compression performance exceeding that of the DCT.
The applicant previously proposed the following transform matrix D_2:
The coding performance of matrix D_2 and that of the DCT each have their strengths: D_2 is better than the DCT at low bit rates, while the DCT is better than D_2 at high bit rates. At present, the prevailing approach to integer transforms is still to seek an integer approximation of the DCT matrix, and performance degrades to some extent after integerization. A new line of thinking is therefore needed to further improve integer transform technology.
[summary of the invention]
The technical problem to be solved by the present invention is to provide an orthogonal integer transform method for image and video compression whose compression performance is better than that of the discrete cosine transform across the board.
To achieve the above object, the technical solution proposed by the present invention is:
An orthogonal integer transform method for image and video compression, in which an image is first divided into 8 × 8 blocks M; each block M is then subjected to a two-dimensional transform using an 8 × 8 orthogonal transform matrix P for compression coding, yielding a transform coefficient matrix N; and each coefficient of N is quantized and then entropy coded, characterized in that:
the 8 × 8 orthogonal transform matrix P is:
where:
and
and x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4, z_1, z_2, z_3, z_4 are all integers.
The present invention is based on the construction of a special class of integer matrices. A new class of 8 × 8 orthogonal integer transform matrices is designed, which contains infinitely many integer matrices. Under certain constraints and an optimization based on a mathematical model, the optimal integer transform is selected from thousands of matrices. The computational efficiency of the transform reaches the theoretical optimum. Because the transform of the present invention is selected by an optimization procedure from tens of thousands of matrices, experiments show that when images and video are compressed with this orthogonal transform, its compression performance is better than that of the DCT across the board.
[embodiment]
In the field of image and video compression, image compression is the core foundation, and the orthogonal transform is the core technique. In the image compression standard JPEG (ISO/IEC IS 10918) and the video compression standards MPEG-1 (ISO/IEC 11172), MPEG-2 (ISO/IEC 13818), and MPEG-4 (ISO/IEC 14496), the orthogonal transform currently used is the 8 × 8 discrete cosine transform (DCT).
An orthogonal integer matrix is defined as follows: if all elements of A are integers and AA^T is a diagonal matrix, A is called an orthogonal integer matrix, or simply an integer matrix where no confusion can arise. Here, A^T denotes the transpose of matrix A.
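The definition above (integer entries, with the product of the matrix and its transpose being diagonal) can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, and the 4 × 4 example matrix `T4` is the widely known core transform of H.264, used here as a convenient test case rather than as a matrix taken from this document.

```python
def gram(a):
    """Return A * A^T for a matrix given as a list of integer rows."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in a] for r1 in a]

def is_orthogonal_integer_matrix(a):
    """True if every entry is an integer and A * A^T is diagonal."""
    if not all(isinstance(x, int) for row in a for x in row):
        return False
    g = gram(a)
    n = len(g)
    return all(g[i][j] == 0 for i in range(n) for j in range(n) if i != j)

# Illustrative example (assumption: not one of the matrices of this document).
T4 = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

print(is_orthogonal_integer_matrix(T4))                # rows are mutually orthogonal
print([gram(T4)[i][i] for i in range(4)])              # squared row norms on the diagonal
print(is_orthogonal_integer_matrix([[1, 1], [1, 0]]))  # rows are not orthogonal
```

The diagonal of `gram(T4)` holds the squared row norms, which is the quantity the following discussion of computational efficiency relies on.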
The simplest integer matrix is the well-known Walsh-Hadamard matrix, whose elements are all 1 or -1 and whose rows all have equal norms. The converse also holds: if all rows of an orthogonal integer matrix have equal norms and every row but one has a first-order vanishing moment (that is, the elements of the row sum to 0), then the matrix is a Walsh-Hadamard matrix. In terms of computational efficiency, the Walsh-Hadamard matrix is the best, mainly because all of its rows have the same norm. It can be verified that the rows of matrix D_1 have 3 distinct norms, and the rows of matrix D_2 have 6 distinct norms. In general, the fewer distinct row norms a matrix has, the higher its computational efficiency, especially for the inverse transform. To see this, consider the inverse transforms of matrix P_1 and matrix D_1, which are, respectively:
and
where the c_i in formulas (1) and (2) are integers. Clearly, formula (2), which corresponds to matrix D_1, is more complicated to compute than formula (1), which corresponds to P_1, because it requires 2 divisions, whereas (1) requires only one. Note that in the two-dimensional transform,
becomes 2, so that the division by 2 becomes a shift operation. The inverse transform of D_2 is more complicated still.
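The count of distinct row norms discussed above can be computed directly. The sketch below is a hedged illustration: `hadamard` uses the Sylvester construction, and `H264_8X8` is the 8 × 8 integer transform of the H.264 High profile as commonly published, typed in here on the assumption that it is the matrix this text calls D_1 (the matrix itself is not reproduced in this document). Squared norms are compared so that the arithmetic stays in integers.

```python
def hadamard(n):
    """Sylvester construction of the n x n Walsh-Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def squared_row_norms(a):
    """Squared Euclidean norm of every row of an integer matrix."""
    return [sum(x * x for x in row) for row in a]

# Assumption: commonly published H.264 High-profile 8 x 8 integer transform.
H264_8X8 = [
    [ 8,   8,   8,   8,   8,   8,   8,   8],
    [12,  10,   6,   3,  -3,  -6, -10, -12],
    [ 8,   4,  -4,  -8,  -8,  -4,   4,   8],
    [10,  -3, -12,  -6,   6,  12,   3, -10],
    [ 8,  -8,  -8,   8,   8,  -8,  -8,   8],
    [ 6, -12,   3,  10, -10,  -3,  12,  -6],
    [ 4,  -8,   8,  -4,  -4,   8,  -8,   4],
    [ 3,  -6,  10, -12,  12, -10,   6,  -3],
]

print(sorted(set(squared_row_norms(hadamard(8)))))  # one distinct value
print(sorted(set(squared_row_norms(H264_8X8))))     # three distinct values
```

Each distinct norm corresponds to a separate scaling factor that the inverse transform must undo, which is why fewer distinct norms means less work.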
The present invention proposes an 8 × 8 orthogonal integer transform matrix P, which is:
where:
and
and x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4, z_1, z_2, z_3, z_4 are all integers.
A matrix that differs from the above integer transform matrix P only in the order of its rows is an equivalent transform of P, as is a matrix whose rows differ from those of P by a factor. Both cases yield equivalent coding results, because the transformed data change only in position and sign, not in magnitude. Signs are coded separately, so changing a sign does not affect coding efficiency, and changed positions can be rearranged to match the original. The rows of a matrix are nevertheless usually put in a standard order. The number of sign changes among the elements of a row is called the order of its vanishing moment, and the rows are generally arranged in increasing order of vanishing moment. For example, in matrix P_1, the orders of the vanishing moments are 0, 1, 2, 3, 4, 5, 6, 7 in sequence.
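The row-ordering convention above (count the sign changes in each row, then sort rows by that count) can be sketched as follows. Because the matrix P_1 is not reproduced in this text, the example uses an 8 × 8 Walsh-Hadamard matrix built by the Sylvester construction as a stand-in; `sign_changes` and `order_rows` are hypothetical helper names.

```python
def hadamard(n):
    """Sylvester construction of the n x n Walsh-Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def sign_changes(row):
    """Number of sign changes along a row, ignoring zero entries."""
    signs = [1 if x > 0 else -1 for x in row if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def order_rows(matrix):
    """Arrange rows by increasing number of sign changes."""
    return sorted(matrix, key=sign_changes)

h8 = hadamard(8)
print([sign_changes(r) for r in h8])              # natural (Sylvester) order
print([sign_changes(r) for r in order_rows(h8)])  # after reordering
```

Reordering rows only permutes the transform coefficients, so, as the text notes, it leaves the coding result equivalent.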
One of the main ideas behind the design of the above orthogonal matrix is to restrict the rows of P to 2 distinct norms and then, at this minimal computational complexity, to seek the matrix with the best performance.
The reason is that the matrix class P contains infinitely many integer matrices, all of which reach the theoretical optimum in computational performance (at most two distinct row norms), but not all of which compress well. How to select a good matrix is currently a difficult problem in image and video compression. One feasible approach is to build a mathematical model that eliminates most of the matrices, leaving a small number of better candidates, and then to determine the optimal matrix by experiment.
If the absolute values of x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4, z_1, z_2, z_3, z_4 are restricted to at most 20, there are more than 4000 matrices P; if they are restricted to at most 30, there are more than 20000. In image and video compression, the transform maps the spatial domain to the frequency domain, and the frequency coefficients are then coded. The smaller the frequency coefficients, the more favorable they are for coding. Using this principle, we build a signal-based model. Specifically, for each matrix A in the class P, denote its orthonormalized matrix by B. Take a typical 512 × 512 test image I, divide it into 4096 data blocks I_i of size 8 × 8, and then carry out the following steps:
Step 1: compute Q_i = [q_{j,k}] = B I_i B^T
Step 2: compute
Step 3: compute
In this way, for image I, we can solve
In general, formula (3) depends on the image I. However, if the image is reasonably typical, the solutions of (3) are consistent; that is, for different images, the optimal solution of (3) falls within a small set. In this way, we obtain the following matrices:
For several typical test images, such as Lena, Barbara, and Goldhill, we traversed the more than 4000 integer solutions. The optimal solution generally concentrates on matrices P_1 and P_2: for some images P_1 is optimal and P_2 is second best, while for others P_2 is optimal and P_1 is second best. In particular, for the Lena image we traversed the more than 20000 matrices, and the optimal solution was always P_1. We therefore use P_1 as the concrete example for compression testing.
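Step 1 of the selection model above (orthonormalize a candidate matrix A into B, split the test image into 8 × 8 blocks, and compute B I_i B^T for each block) can be sketched as follows. Steps 2 and 3 are not reproduced in this text, so the sketch stops at Step 1. Everything concrete here is an assumption for illustration: the stand-in candidate is an 8 × 8 Walsh-Hadamard matrix rather than a member of the class P, the image is synthetic, and the helper names are hypothetical.

```python
import math

def hadamard(n):
    """Sylvester construction of the n x n Walsh-Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def orthonormalize_rows(a):
    """Divide each row of an orthogonal integer matrix by its Euclidean norm."""
    return [[x / math.sqrt(sum(v * v for v in row)) for x in row] for row in a]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def split_blocks(image, n=8):
    """Cut an image (list of rows) into n x n blocks, row-major."""
    height, width = len(image), len(image[0])
    return [[row[c:c + n] for row in image[r:r + n]]
            for r in range(0, height, n) for c in range(0, width, n)]

def step1(b, block):
    """Step 1 of the model: Q_i = B * I_i * B^T."""
    return matmul(matmul(b, block), transpose(b))

# Synthetic 512 x 512 "image" (assumption; a real test would load Lena etc.).
image = [[(r + c) % 256 for c in range(512)] for r in range(512)]
blocks = split_blocks(image)
B = orthonormalize_rows(hadamard(8))  # stand-in for a candidate from class P
Q0 = step1(B, blocks[0])

energy_in = sum(v * v for row in blocks[0] for v in row)
energy_out = sum(v * v for row in Q0 for v in row)
print(len(blocks), round(energy_in), round(energy_out))
```

Because B is orthonormal, the transform preserves the block's energy; candidates therefore differ only in how that energy is distributed among the coefficients, which is what a selection criterion based on coefficient size would measure.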
For a transform matrix, the character of the second row is especially important: the smoother it is, the better, and it almost determines the performance of the transform. Figs. 1, 2, 3, 4, and 5 show the corresponding waveforms for P_1, P_2, D_1, D_2, and the DCT, in which the degree of smoothness is plain to see. The waveforms also show that P_1 and P_2 resemble the DCT more closely.
A video data block M to be compressed (an 8 × 8 matrix) is subjected to a two-dimensional transform using the orthogonal transform P. The resulting transform coefficient matrix N is N = PMP^T, after which the subsequent compression coding steps are carried out. Here, P^T is the transpose of P.
For a data block obtained after entropy decoding, inverse quantization, and so on,
the two-dimensional inverse transform using the 8 × 8 orthogonal integer transform matrix P recovers the video data block as:
where P^T denotes the transpose of P.
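The forward transform N = PMP^T and its inversion can be illustrated with exact arithmetic. The inverse formula of this document is not reproduced in the text, so the sketch below relies only on a general fact: if PP^T = D is diagonal, then M = P^T D^{-1} N D^{-1} P. The 4 × 4 matrix used as a stand-in for P and the sample block are assumptions chosen to keep the example short; the same functions work unchanged for an 8 × 8 matrix.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def forward(p, m):
    """Two-dimensional forward transform: N = P * M * P^T (all integers)."""
    return matmul(matmul(p, m), transpose(p))

def inverse(p, n):
    """Exact inverse for an orthogonal integer matrix P with P * P^T diagonal.

    Each coefficient n[j][k] is divided by the product of the squared norms
    of rows j and k of P, then P^T (...) P is applied.
    """
    d = [sum(x * x for x in row) for row in p]  # diagonal of P * P^T
    size = len(n)
    scaled = [[Fraction(n[j][k], d[j] * d[k]) for k in range(size)]
              for j in range(size)]
    return matmul(matmul(transpose(p), scaled), p)

# Stand-in orthogonal integer matrix (assumption, not the P of this document).
P = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]
M = [
    [52, 55, 61,  66],
    [63, 59, 55,  90],
    [62, 59, 68, 113],
    [63, 58, 71, 122],
]

N = forward(P, M)
recovered = inverse(P, N)
print(recovered == M)
```

In a practical codec the divisions by the row norms are usually folded into quantization and dequantization rather than performed as a separate step; that design choice is why the number of distinct row norms matters.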
Embodiment:
We test the video compression performance of P_1 with XVID (version 1.1.0, downloadable from the Internet), an open-source MPEG-4 verification model. To simplify testing, we use its baseline mode with an "IPPIPPIPP" frame structure, and entropy coding uses Huffman coding. In XVID, we change only two things to test the compression coding performance of P_1: (1) the orthonormalized form of matrix P_1 replaces the DCT; (2) the two quantization matrices are replaced by the following two matrices P_3 and P_4:
where P_3 is the inter-frame quantization matrix and P_4 is the intra-frame quantization matrix. This forms a new video compression coding scheme. Comparing it with the original XVID scheme shows how P_1 performs relative to the DCT. The Huffman code tables are identical in the two schemes.
Five commonly used video clips, Foreman, Akiyo, Mother and Daughter, Coast Guard, and Mobile, are used in the tests. Each clip is 300 frames long and 43.5 MB in size, with each frame occupying 352 × 288 × 1.5 bytes in YUV format, and is played at 25 frames per second.
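The clip size quoted above follows from the frame dimensions, as the short calculation below shows. Reading the factor 1.5 as bytes per pixel and interpreting "MB" as 2^20 bytes are assumptions made here to reproduce the stated 43.5 figure; the uncompressed bit rate is derived from the same numbers and is not stated in the text.

```python
WIDTH, HEIGHT = 352, 288   # frame dimensions from the text
BYTES_PER_PIXEL = 1.5      # assumption: the 1.5 factor is bytes per pixel
FRAMES = 300               # clip length from the text
FPS = 25                   # playback rate from the text

frame_bytes = int(WIDTH * HEIGHT * BYTES_PER_PIXEL)
clip_bytes = frame_bytes * FRAMES
clip_mib = clip_bytes / 2**20             # size in units of 2^20 bytes
raw_mbps = frame_bytes * 8 * FPS / 1e6    # uncompressed bit rate in Mbit/s

print(frame_bytes, clip_bytes, round(clip_mib, 1), round(raw_mbps, 2))
```

The uncompressed rate of roughly 30 Mbit/s gives a sense of the compression ratio required at the tested bit rates.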
Test results:
We encoded the 5 video sequences at 4 bit rates, then, after decoding, computed the peak signal-to-noise ratio (PSNR) of the Y, U, and V components of every frame and averaged the results. All experimental results are listed in Tables 1 to 5. For nearly all videos at all bit rates, the new scheme outperforms the original XVID scheme, and the advantage is more pronounced at low bit rates. These test results show that the compression coding performance of P_1 is a clear improvement over that of the DCT.
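The quality measure used above can be sketched as follows, assuming the usual definition PSNR = 10 log10(peak^2 / MSE) with a peak value of 255 for 8-bit samples. The text does not say exactly how the Y, U, and V values are combined, so this sketch computes PSNR for one component plane at a time and averages over frames; the function names and the tiny sample data are hypothetical.

```python
import math

def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally long sequences of sample values."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical planes
    return 10 * math.log10(peak * peak / mse)

def mean_psnr(frame_pairs, peak=255):
    """Average per-frame PSNR over a list of (original, decoded) planes."""
    values = [psnr(o, d, peak) for o, d in frame_pairs]
    return sum(values) / len(values)

# Two tiny made-up "frames" of one component plane (illustrative data only).
pairs = [
    ([100, 100, 100, 100], [101, 99, 100, 100]),
    ([50, 60, 70, 80], [52, 60, 70, 78]),
]
print(round(mean_psnr(pairs), 2))
```

Averaging the per-frame values, rather than pooling the squared errors of all frames first, matches the order of operations described in the text (compute per frame, then average).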
Table 1: Experimental results - Foreman
Table 2: Experimental results - Akiyo
Table 3: Experimental results - Coast Guard
Table 4: Experimental results - Mobile
Table 5: Experimental results - Mother and Daughter
The embodiment described above presents only some preferred implementations of the present invention. Although the description is relatively specific and detailed, it should not be construed as limiting the scope of the claims. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and all of these fall within the scope of protection of the present invention. The scope of protection of this patent is therefore defined by the appended claims.