CN1658673A

CN1658673A - Video compression coding-decoding method

Info

Publication number: CN1658673A
Application number: CN200510038537.8A
Authority: CN
Inventors: 马国强; 徐苏珊; 吴金勇; 徐健键
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2005-03-23
Filing date: 2005-03-23
Publication date: 2005-08-24

Abstract

A video compression encoding and decoding method in which the video signal is encoded by the following steps. Discrete cosine transform (DCT), followed by transform and quantization. A channel buffer is placed before the coded bit stream enters the channel, and the buffer must have a control mechanism. Motion estimation: the positional offset is described by a motion vector, which represents the displacement in the horizontal and vertical directions; during motion estimation, a P frame uses the most recently decoded preceding I frame or P frame as its reference picture, which is called forward prediction. Motion compensation: using the motion vector computed by motion estimation, the macroblock in the reference picture is moved to the corresponding position in the horizontal and vertical directions, which produces the prediction of the picture being compressed. Sub-pixel search and calculation are also performed. After the sampled signal has undergone the DCT, quantization, storage and motion search are all completed in the frequency domain, so the video encoder performs all of its calculations in the frequency domain. The compression ratio is high and the amount of computation is small.

Description

Video compression coding-decoding method

One, technical field

The present invention relates to a video compression encoding algorithm and to the AVCS video conferencing system built on that algorithm.

Two, background technology

At present, comparable video conferencing terminals abroad generally adopt coding techniques such as H.263 and H.264. Products based on H.263 have lower computational complexity and are easy to implement on low-cost hardware, so their production cost is low; however, their video compression ratio is also low, so they occupy more bandwidth and raise network operating costs. Products based on H.264 achieve a high compression ratio and occupy fewer network resources, but at the price of a high computational cost, which makes them dependent on rather expensive hardware platforms.

In 1995, after completing the H.263 standard, the Video Coding Experts Group (VCEG) of the International Telecommunication Union (ITU) began developing a new low-bit-rate video communication standard, named H.26L. In 2001, the ISO Moving Picture Experts Group (MPEG) recognized the potential of H.26L and formed the Joint Video Team (JVT) in cooperation with VCEG. The result of this team's work was Advanced Video Coding (AVC), issued in the second quarter of 2003. In the ITU-T series of standards, AVC is known as H.264. Since the end of 2003, because it can deliver the original picture quality with half the bandwidth, H.264 has quickly won over video conferencing users who had long suffered from the high cost of leased-line bandwidth. In the domestic market, video conferencing vendors including Dingshitong, Zhongtai, ZTE, Kedacom and TANDBERG have released, or are about to release, new products that support the H.264 standard. The greatest advantage of H.264 over H.263 is that it is a very low bit rate coding scheme. In theory, for the same reconstructed picture quality, H.264 needs half the bit rate of H.263. In other words, the same video encoded and decoded with H.264 at 384 kbps has the same picture quality as with H.263 at 768 kbps. This makes high-quality pictures at low bandwidth possible for users whose bandwidth is scarce. H.264 was designed from the outset with layered coding and transmission under different network resources in mind. It also has stronger error resilience and, in networks of unstable quality, delivers better video quality than H.263. As video communication applications gradually move from government and enterprise private networks to public networks, the error resilience of H.264 will play a key role. Another notable difference between H.264 and H.261 or H.263 is that it supports finer fractional-pixel motion vectors for motion-compensated prediction: compared with the half-pixel prediction of H.263, H.264 can predict at quarter-pixel precision, which raises the quality of the encoded video. These benefits are not free. The cost of H.264 is a computational complexity much higher than that of H.263: under the same conditions, H.264 decoding is about 2 times as complex as H.263 decoding, and H.264 encoding about 3 times as complex as H.263 encoding. This increase in complexity restricts how H.264 can be implemented. A simple example is the latest product of a well-known international video terminal vendor, which supports a bit rate of 2 Mbps under H.263 but only 512 kbps under H.264.

Of course, as a new coding standard, H.264 has its limits in application. It was designed to obtain better picture quality at low bandwidth, but practical tests show that at high bit rates the picture quality of H.264 is not significantly different from that of H.263. The available network bandwidth is therefore a factor users must consider when choosing an H.264 product: if the video conference runs on a private network, where 1 Mbps of bandwidth can usually be guaranteed, there is no need to invest more in H.264. Because the H.264 standard was released only a year ago, most terminal vendors that advertise H.264 support mainly support its Baseline profile. The increased codec complexity of H.264 challenges the video processing capability of terminal vendors: existing platforms either cannot perform H.264 encoding and decoding at all or cannot support it at high bit rates. Moreover, H.264 implementations differ among the main terminal vendors, so terminals of different brands have difficulty connecting over H.264 and interoperability is hard to guarantee. These factors are major obstacles to the rapid adoption of H.264.

Even so, H.264 is technically advanced. As an emerging codec standard, its high coding efficiency helps improve resource utilization and saves heavy investment in network bandwidth. In 2003 broadband penetration in China kept rising, and demand for video communication over low-bandwidth links such as DSL will grow steadily; there is reason to believe that H.264 will play a key role in popularizing video communication.

The birth of H.264: in video communication and storage applications, video coding standards occupy the technical core. Video coding has long had two families of standards: the MPEG series led by ISO/IEC (such as MPEG-1, MPEG-2 and MPEG-4), and the H.26x series led by ITU-T (such as H.261 and H.263). The MPEG series is widely used for video storage, video on demand and forwarding; the VCD video format, for example, is based on MPEG-1. Likewise, thanks to the recommendation of the ITU, the H.26x series is widely used in video communication and has been adopted by many operators and equipment suppliers.

Patent applications on video coding methods include the following. CN 200410012857.1, an integer transform matrix selection method for video coding and a related integer transform method, concerns the integer transform used for image data compression in video codecs. Addressing the 8×8 integer DCT-like transform to be adopted by the first audio/video coding standard (AVS) then being drafted in China, it proposes a method for selecting the basis of an integer transform that jointly evaluates the decorrelation efficiency and energy compaction efficiency of the basis together with its dynamic range and computational complexity. Using this method it proposes two well-performing 8×8 integer transform bases, (5, 6, 4, 1) and (4, 5, 3, 1), and derives fast integer transform algorithms based on them.

CN03157077.1 discloses a bi-directional prediction method for video coding. During bi-directional predictive coding at the encoder, a given candidate forward motion vector is first obtained for each image block of the current B frame; a candidate backward motion vector is then calculated from it, and a candidate bi-directional prediction reference block is obtained by bi-directional prediction; matching is computed within a given search range and/or a given matching threshold; finally, the best matching block is chosen to determine the final forward motion vector, backward motion vector and block residual of the block. Combined with forward and backward predictive coding, this realizes a new predictive coding type applicable to the AVS standard being drafted.

CN200310116090.2 proposes a method for determining the reference image block in direct coding mode. It preserves accurate motion vectors while allowing a division-free implementation, which improves the precision of motion vector calculation, reflects object motion in the video more faithfully and yields more accurate motion vector prediction. Combined with forward and backward predictive coding, it realizes a new predictive coding type that keeps the efficiency of direct-mode coding, is convenient to implement in hardware, achieves results similar to conventional B-frame coding and can be used in the AVS standard being drafted. Application 98123036.9 covers a video coding and decoding (CODEC) method with an error resilient mode, a computer-readable medium containing the video CODEC method program, and a video CODEC apparatus. The method provides greater resilience against channel errors, so that communication is less affected by errors. For each macroblock of the error-resilient-mode video data, it partitions a header data bit region, a motion vector data bit region and a discrete cosine transform data bit region, applies variable-length coding to the partitioned bit regions, applies reversible variable-length coding to bit regions selected from the variable-length-coded regions according to their recovery priority, and inserts markers into the variable-length-coded or reversible-variable-length-coded bit regions. None of these existing methods, however, focuses on the problem of computational load.

Three, summary of the invention

The objective of the invention is to provide an independently designed video compression encoding algorithm system that balances network overhead against the amount of computation, offering a high compression ratio together with a low computational load: a compression ratio close to that of H.264 at a computational load close to that of H.263.

The video compression coding-decoding method is characterized in that the video signal is encoded by the following procedures. Discrete cosine transform (DCT): the DCT is a spatial transform carried out block by block and produces blocks of DCT coefficients; for typical images it concentrates the energy of a block in a few low-frequency DCT coefficients. Transform and quantization: quantization is applied to the DCT coefficients and consists of dividing each DCT coefficient by a quantization step. The 64 coefficients of a DCT block are quantized with different precisions, so that as much of the relevant DCT spatial-frequency information as possible is retained without the precision exceeding what is needed. Low-frequency coefficients matter more to visual perception and are therefore quantized more finely; high-frequency coefficients matter less and are quantized more coarsely, so that most high-frequency coefficients in a DCT block become zero after quantization.
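As an illustrative sketch only, and not the implementation of the invention, the block DCT and frequency-dependent quantization described above can be written in Python. The step rule `quant_step`, with its `base` and `slope` values, is a hypothetical example, and the direct four-loop DCT is chosen for clarity rather than speed.

```python
import math

N = 8  # block size of the DCT described in the text


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct form, for clarity)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quant_step(u, v, base=8.0, slope=4.0):
    """Hypothetical step rule: fine steps at low frequency, coarse at high."""
    return base + slope * (u + v)


def quantize(coeffs):
    """Divide each coefficient by its step and round to the nearest level."""
    return [[int(round(coeffs[u][v] / quant_step(u, v))) for v in range(N)]
            for u in range(N)]


# A smooth test block: a gentle gradient, typical of natural image content.
block = [[100 + 2 * x + 3 * y for x in range(N)] for y in range(N)]
levels = quantize(dct_2d(block))
nonzero = sum(1 for row in levels for q in row if q != 0)
```

On this smooth test block only a handful of low-frequency levels remain nonzero after quantization, which is the behaviour the paragraph above describes.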

Before the coded bit stream enters the channel, a channel buffer must be provided. The buffer is written by the entropy coder at a variable bit rate and read out into the channel at the nominal constant bit rate of the transmission system. The size (capacity) of the buffer is fixed in advance, but the instantaneous output bit rate of the encoder is often well above or below the bandwidth of the transmission system, which can cause the buffer to overflow or underflow. The buffer must therefore have a control mechanism that adjusts the bit rate of the encoder by feedback to the compression algorithm, so that the rates at which data are written to and read from the buffer tend to balance. The buffer controls the compression algorithm through the quantization step of the quantizer: when the instantaneous output rate of the encoder is too high and the buffer is about to overflow, the quantization step is increased to reduce the coded data rate, at the cost of a corresponding increase in image loss; when the instantaneous output rate is too low and the buffer is about to underflow, the quantization step is reduced to raise the coded data rate.
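The feedback described above can be sketched as follows. This is a minimal illustration under stated assumptions: the thresholds `low` and `high`, the unit change of the step, the step range of 1 to 31 and the toy encoder model in which the bits per frame fall as the step grows are all hypothetical and are not taken from the invention.

```python
def next_quant_step(step, fullness, capacity, low=0.25, high=0.75,
                    min_step=1, max_step=31):
    """Choose the next quantization step from the buffer fullness.

    The thresholds and the +/-1 update are illustrative assumptions.
    """
    ratio = fullness / capacity
    if ratio > high:        # close to overflow: quantize more coarsely
        step += 1
    elif ratio < low:       # close to underflow: quantize more finely
        step -= 1
    return max(min_step, min(max_step, step))


def simulate(bits_for_step, channel_bits_per_frame, capacity, frames, step=8):
    """Run the feedback loop and return (step, fullness) after each frame."""
    fullness = capacity // 2
    history = []
    for _ in range(frames):
        fullness += bits_for_step(step)         # encoder writes at a variable rate
        fullness -= channel_bits_per_frame      # channel reads at a constant rate
        # Clamping stands in for an overflow or underflow event in this sketch.
        fullness = max(0, min(capacity, fullness))
        step = next_quant_step(step, fullness, capacity)
        history.append((step, fullness))
    return history


# Toy encoder model (assumption): bits per frame inversely proportional to the step.
history = simulate(lambda q: 160_000 // q, 10_000, 100_000, 50)
```

A practical controller would react more smoothly than this one-step rule; the sketch only shows the direction of the adjustment in each case.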

Motion estimation: motion estimation is used in inter-frame coding to produce an estimate of the picture being compressed from a reference picture. It is performed macroblock by macroblock and calculates the positional offset between a macroblock of the picture being compressed and the corresponding position in the reference picture. This offset is described by a motion vector, which represents the displacement in the horizontal and vertical directions. P frames and B frames use different reference pictures for motion estimation. A P frame uses the most recently decoded preceding I frame or P frame as its reference, which is called forward prediction. A B frame uses two pictures as prediction references, which is called bi-directional prediction: one reference precedes the coded frame in display order (forward prediction) and the other follows it in display order (backward prediction). The reference frames of a B frame are always I frames or P frames.

Motion compensation: using the motion vector calculated by motion estimation, the macroblock in the reference picture is moved to the corresponding position in the horizontal and vertical directions, which produces the prediction of the picture being compressed. Motion in the vast majority of natural scenes is orderly, so the difference between the predicted picture generated by motion compensation and the picture being compressed is very small.
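A minimal sketch of integer-pixel block matching followed by motion compensation is given below. The sum of absolute differences (SAD) criterion, the 4×4 block size and the search range of ±2 pixels are assumptions chosen for illustration; this is a conventional spatial-domain search, not the frequency-domain search of the invention.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def estimate_motion(cur, ref, cx, cy, size=4, search=2):
    """Full search over integer displacements; returns the (dx, dy) of lowest SAD."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - size and 0 <= ry <= h - size:
                cost = sad(cur, ref, cx, cy, rx, ry, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best


def compensate(ref, cx, cy, mv, size=4):
    """Fetch the reference block displaced by the motion vector."""
    dx, dy = mv
    return [[ref[cy + dy + j][cx + dx + i] for i in range(size)]
            for j in range(size)]


# Synthetic data: the current picture is the reference moved right 2 and down 1.
ref = [[3 * x + 10 * y for x in range(12)] for y in range(12)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(12)]
       for y in range(12)]

mv = estimate_motion(cur, ref, 4, 4)
pred = compensate(ref, 4, 4, mv)
residual = [[cur[4 + j][4 + i] - pred[j][i] for i in range(4)] for j in range(4)]
```

Because the content moved right and down, the matching block lies up and to the left in the reference picture, and the residual left for coding after compensation is zero in this synthetic case.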

The feature of the present invention is that a frequency-domain motion estimation algorithm is adopted and motion search is carried out in the frequency domain: after the sampled signal undergoes the DCT, quantization, storage and motion search are all completed in the frequency domain, so the video encoder performs all of its calculations in the frequency domain.

The present invention is also based on the following. In run-length coding, only nonzero coefficients are encoded. The code for a nonzero coefficient has two parts: the first gives the number of consecutive zero coefficients preceding the nonzero coefficient (called the run), and the second is the nonzero coefficient itself. This is where zig-zag scanning shows its advantage: in most cases it produces long runs of consecutive zeros, so run-length coding is efficient. When all the remaining DCT coefficients at the end of the one-dimensional sequence are zero, a single end-of-block (EOB) marker completes the coding of the 8×8 transform block, and the resulting compression is considerable.
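The run-length scheme described above can be sketched as follows. The `(run, level)` tuple layout and the string used as the EOB marker are representation choices made for this example only.

```python
EOB = "EOB"  # end-of-block marker (the representation is an assumption)


def run_length_encode(coeffs):
    """Encode a 1-D coefficient sequence as (run, level) pairs followed by EOB."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1                  # count zeros preceding the next nonzero value
        else:
            pairs.append((run, c))
            run = 0
    pairs.append(EOB)                 # trailing zeros are implied by the marker
    return pairs


def run_length_decode(pairs, length=64):
    """Rebuild the coefficient sequence, padding with zeros after EOB."""
    out = []
    for p in pairs:
        if p == EOB:
            break
        run, level = p
        out.extend([0] * run)
        out.append(level)
    return out + [0] * (length - len(out))


sequence = [12, 0, 0, -3, 5, 0, 1] + [0] * 57   # 64 scanned coefficients
encoded = run_length_encode(sequence)
decoded = run_length_decode(encoded)
```

Here 64 coefficients are represented by four pairs and one marker, and decoding restores the original sequence exactly.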

Subjective assessment of digital picture quality: the conditions of a subjective assessment include the composition of the assessment panel, the viewing distance, the test pictures, the ambient illumination and the background tone. The panel consists of a certain number of observers, with professionals and laypersons each making up a certain proportion. The viewing distance is 3 to 6 times the diagonal of the display. The test pictures consist of several image sequences with a certain amount of detail and motion. A subjective assessment reflects the mean of the ratings that many people give to the picture quality.

Zig-zag scanning and run-length coding: the DCT produces an 8×8 two-dimensional array, which must be converted into a one-dimensional arrangement for transmission. There are two ways of converting from two dimensions to one, also called scan modes: zig-zag scanning and alternate scanning, of which zig-zag scanning is the more common. After quantization, most nonzero DCT coefficients are concentrated in the upper-left corner of the 8×8 matrix, which is the low-frequency region. After zig-zag scanning these nonzero coefficients are gathered at the front of the one-dimensional array, followed by long strings of coefficients quantized to zero, which creates favourable conditions for run-length coding. Entropy coding: entropy coding gives an efficient discrete representation of the quantized DCT coefficients; it is applied before transmission to produce the digital bit stream to be transmitted. Entropy coding exploits the statistical properties of the coded signal to lower the mean bit rate. The run and the nonzero coefficient can be entropy coded either independently or jointly. Huffman coding is widely used for entropy coding: once the probabilities of all coded symbols have been determined, a code table is built in which frequent, high-probability symbols are assigned fewer bits and infrequent, low-probability symbols more bits, so that the average length of the whole code stream tends towards the minimum.
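The zig-zag conversion from two dimensions to one can be sketched as follows. The traversal shown is the conventional zig-zag order starting at the upper-left (DC) coefficient; the function names are chosen for this example.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col selects one anti-diagonal
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd diagonals are walked with the row increasing, even ones with it decreasing.
        order.extend(cells if s % 2 else reversed(cells))
    return order


def zigzag_scan(block):
    """Flatten a square block into a 1-D list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Quantized block with its nonzero levels in the upper-left (low-frequency) corner.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, -3, 5, 1
scanned = zigzag_scan(block)
```

The nonzero levels end up in the first five positions of `scanned` and the remaining 59 entries are zero, which is the arrangement that makes run-length coding effective.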

A characteristic of the present invention is that it improves the performance of the motion search step itself. In conventional video coding schemes the encoder must convert repeatedly between the spatial domain and the frequency domain: motion search uses spatial-domain algorithms, whereas residual coding must be done in the frequency domain so that the energy of the coefficients is concentrated in the low-frequency region and quantization is convenient. These frequent conversions between the spatial and frequency domains consume considerable resources.

A further characteristic of the present invention is that it provides an advanced and complete video conferencing system. The technical core of video conferencing is the video codec algorithm system. The present invention investigates this field in depth, proposes an innovative frequency-domain sub-pixel motion search algorithm and establishes an efficient and stable video coding algorithm system. This system not only has high coding efficiency; its computational complexity is also far lower than that of comparable algorithms, so it is easy to implement on low-cost hardware platforms. In the sub-pixel search step of motion search, the present invention proposes an original search algorithm that can reduce the computational complexity to below 10% while keeping the search results sufficiently accurate.

In addition, the system of the present invention provides the following functions:

A novel video coding algorithm that performs sub-pixel motion search in the frequency domain.

Remote electronic whiteboard, remote slide show and data sharing.

A 12-inch LCD touch screen on which arbitrary graphics can be drawn and text exchanged.

Mobile phone access, with conference data sent to mobile terminals in a timely manner.

A built-in web server providing a user interface for modifying coding parameters.

A built-in disk video recorder for recording very long video sequences.

A USB interface for convenient data exchange and for connecting a USB digital camera.

A built-in, highly sensitive motion detection algorithm, so that the system can double as a security monitor.

Four, description of drawings

Fig. 1 is a block diagram of the video compression encoding algorithm.

Fig. 2 shows the leaky bucket model.

Fig. 3 shows the impulse response when an object moves, where Fig. 3(a) shows the impulse response when the object translates to the right by s and Fig. 3(b) shows the impulse response when the object translates to the left by s-1.

Fig. 4 shows the spatial positions of sub-pixels.

Fig. 5 compares the computational performance on the standard test sequences.

Fig. 6 shows the video coding flow based entirely on the frequency domain.

Fig. 7 shows the distribution of the pixels of a 4×4 block and its surrounding pixels.

Fig. 8 shows the intra 4×4 prediction modes.

Fig. 9 is a flow chart of fast intra 4×4 prediction mode selection.

Fig. 10 shows the sub-sampling of a 4×4 sub-block.

Fig. 11 shows the current block and its adjacent blocks.

Fig. 12 is a schematic diagram of the current 4×4 sub-block and the 4×4 sub-block at the same position in the previous frame.

Fig. 13 is a block diagram of the system software.

Fig. 14 is a block diagram of the system hardware.

Five, embodiment

1 video compression encoding algorithm

Fig. 1 is a video compression encoding algorithm block diagram of the present invention.

Each algoritic module in the block diagram is described below:

A. motion search (estimation)

Motion search (also called motion estimation) is one of the core techniques of video compression coding and is also a module that consumes a large share of the computing resources in video coding. It has two levels, integer-pixel search and sub-pixel search. The video coding scheme of the present invention uses a conventional hybrid search algorithm for integer-pixel search and an innovative search technique for sub-pixel search, which is described in detail later.

B. infra-frame prediction

In a video stream, each picture can be coded either as an I frame (intra-predicted frame) or as a P frame (inter-predicted frame). A P frame does not use the information in its own picture directly as the source of coded data; instead, motion search is performed in previously coded pictures to find motion information, which serves as the basis for inter prediction, and the difference between the two pictures is then coded. This greatly reduces the number of bits needed to describe the picture and thus achieves compression. An I frame is coded without reference to any earlier picture; it uses the pixels of its own already coded part to predict the pixel values of the part not yet coded. I frames are coded less efficiently than P frames, but they are an essential component of the video stream because they provide resynchronization. If a frame suffers packet loss in transmission, the subsequent P frames predicted from it cannot be decoded correctly; an I frame, however, is self-contained and refers to no earlier picture, so the stream is resynchronized at that point and the error is confined to a limited range. Because I frames are so important, the intra prediction algorithm for I frames is a research focus of any video coding scheme. The present invention later proposes a novel intra prediction algorithm that provides efficient and stable intra prediction at limited computational cost.

C. rate-distortion optimization

Rate-distortion optimization chooses the best alternative among the coding modes. Video coding involves many decisions about coding modes and parameters, for example what value a motion vector should take in inter prediction and what the search precision should be; the choice of these parameters and modes depends on the rate-distortion optimization algorithm. The algorithm evaluates each candidate coding mode or parameter and then picks the best one according to a certain rule. This rule generally weighs two performance measures at once: the coding efficiency (that is, the compression performance) and the signal-to-noise ratio after compression. The relationship between these two measures is nonlinear. To speed up calculation and reduce the computational cost of the system, the video compression coding scheme of the present invention uses a Lagrangian function as a linear approximation. The formula below is the Lagrangian function of this scheme, where D_{REC} is the distortion, R_{REC} is the bit rate after prediction, λ is the Lagrange multiplier, S_k and Q are the coding mode and parameters to be selected, and J_{MODE} is the total cost; the coding mode and parameters that minimize J_{MODE} are the optimal choice.

J_{MODE}(S_k, I_k, \lambda) = D_{REC}(S_k, Q) + \lambda R_{REC}(S_k, Q)
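The selection rule can be illustrated with a short sketch that evaluates the cost J = D + λR for each candidate and keeps the minimum. The candidate names and their distortion and rate figures are invented for this example and do not come from the invention.

```python
def lagrangian_cost(distortion, rate, lam):
    """Total cost J = D + lambda * R for one candidate."""
    return distortion + lam * rate


def best_mode(candidates, lam):
    """Return the name of the candidate with the lowest Lagrangian cost."""
    return min(candidates,
               key=lambda name: lagrangian_cost(candidates[name][0],
                                                candidates[name][1], lam))


# Hypothetical candidates: name -> (distortion, rate in bits).
candidates = {
    "intra_4x4": (100.0, 300.0),
    "inter_16x16": (180.0, 90.0),
    "skip": (400.0, 2.0),
}

choice_low = best_mode(candidates, 0.1)    # distortion dominates the cost
choice_mid = best_mode(candidates, 1.0)
choice_high = best_mode(candidates, 5.0)   # rate dominates the cost
```

A small λ favours the candidate with the lowest distortion, and a large λ favours the candidate that spends the fewest bits, so a single multiplier sets the trade-off between the two measures.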

D. Rate Control

Rate control monitors the state of the channel and decides how the bit rate is allocated. This module uses the leaky bucket model shown in Fig. 2 to track the transmission state of the channel.

E. storage management

Memory management covers the logical and physical management of memory and is responsible for the reference frame queue. Coding a P frame requires motion search with reference to previously encoded or decoded pictures, so a reference frame queue that stores the reference frame data must be maintained in both encoding and decoding. The encoder and the decoder use the same logical memory model and each maintains its reference frame queue independently; only the minimum information needed for synchronization is transmitted.
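One possible shape for such a queue is sketched below. The class name, the capacity of two frames and the rule that an I frame clears earlier references are assumptions made for this illustration; the text does not specify them.

```python
from collections import deque


class ReferenceFrameQueue:
    """Fixed-capacity queue of reference frames (hypothetical helper)."""

    def __init__(self, capacity=2):
        # A deque with maxlen drops the oldest frame automatically when full.
        self._frames = deque(maxlen=capacity)

    def push(self, frame_id, frame_type, data):
        if frame_type == "I":
            # Assumption: an I frame resynchronizes the stream, so older
            # references are discarded.
            self._frames.clear()
        self._frames.append((frame_id, frame_type, data))

    def latest(self):
        """Most recent reference, used for forward prediction of a P frame."""
        return self._frames[-1]

    def ids(self):
        return [f[0] for f in self._frames]


# Encoder and decoder apply the same rules to the same frame sequence.
encoder_refs = ReferenceFrameQueue()
decoder_refs = ReferenceFrameQueue()
for frame_id, frame_type in [(0, "I"), (1, "P"), (2, "P"), (3, "I"), (4, "P")]:
    data = "frame-%d" % frame_id      # placeholder for reconstructed pixels
    encoder_refs.push(frame_id, frame_type, data)
    decoder_refs.push(frame_id, frame_type, data)
```

Because both sides follow the same logical model, their queues stay identical without the queue contents being transmitted.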

F. entropy coding

All methods of compressing a video sequence revolve around three aspects: removing temporal redundancy, removing spatial redundancy and removing statistical redundancy. Inter prediction and intra prediction address temporal and spatial redundancy respectively, and the methods that remove statistical redundancy are called entropy coding. The video coding algorithm system of the present invention uses the mature Huffman algorithm for entropy coding.
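A compact sketch of building a Huffman code table from symbol frequencies follows. It uses the standard construction with a priority queue; the `(run, level)` symbols and their counts are invented for this example, and the code table actually used by the invention is not given in the text.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from observed frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)      # the two least frequent subtrees
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical (run, level) symbols with skewed frequencies.
symbols = [(0, 1)] * 8 + [(1, 1)] * 4 + [(0, 2)] * 2 + [(3, 1)]
code = huffman_code(symbols)
total_bits = sum(len(code[s]) for s in symbols)
```

The most frequent symbol receives the shortest code word, so the 15 symbols here take 25 bits instead of the 30 bits a fixed 2-bit code would need.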

G. conversion and quantification

The residual data undergo a time-frequency transform and are quantized in the frequency domain.

1.1 sub-pixel motion search

Motion search (also called motion estimation) is one of the core techniques of video compression coding. After analog-to-digital conversion a video signal has an enormous data volume and cannot be stored or transmitted directly. However, relative to the high sampling frequency, the natural objects in a video picture change slowly, so the original video information is highly redundant in both the temporal and the spatial domain. The basic principle of motion search is to search adjacent pictures of the video sequence, find the motion information and motion vectors, and replace the raw information of the corresponding picture with data that characterize object motion, which largely removes temporal redundancy and achieves data compression.

The precision of modern motion search algorithms is no longer limited to integer pixels. Experiments show that sub-pixel precision of half a pixel or finer significantly lowers the bit rate after coding. Under low-noise conditions, each doubling of the search precision improves compression by about 0.5 bit/sample, and the average bit rate after coding falls by 24.41% to 36.92%. When the search precision reaches 1/8 pixel or finer, however, the gain in compression ratio is no longer significant because of increased noise. The current mainstream video coding standards all use sub-pixel search to improve coding performance: H.263 and MPEG-2 introduced half-pixel motion search, while MPEG-4 and the recently completed H.264 use motion search with 1/4-pixel precision.

Among existing sub-pixel search algorithms, the most widely used are spatial-domain full search and its various fast variants. These algorithms look for the matching block within a search window in units of pixel blocks, using the sum of squared differences or the sum of absolute differences as the decision criterion. The search requires repeated interpolation filtering and repeated evaluation of the cost function, so the computational complexity is very high. Experiments show that once sub-pixel precision is introduced, the computational cost of motion search is often more than double that of integer-pixel search alone. Moreover, the accuracy of the match depends on the precision of the interpolation algorithm, which affects coding efficiency to some extent. The present invention proposes a novel search algorithm that predicts and searches for motion vectors in the frequency domain using phase correlation. During sub-pixel search it requires almost no interpolation and no cost function evaluation, which greatly reduces the computational cost incurred by spatial-domain search and makes it suitable for embedded platforms that need video content services.

1.1.1 frequency domain phase place and object space translation

It is well known that in the Fourier transform a change of phase corresponds to a translation of the object in the time or spatial domain:

F\{x(s-\tau)\} = e^{-j\omega\tau} F\{x(s)\} \qquad (1)

In formula (1), F{·} denotes the Fourier transform of a discrete signal and s denotes spatial position (in the time domain it is replaced by t; only the spatial domain is treated below). By this property of the Fourier transform, motion information in the spatial domain can easily be resolved in the frequency domain. If a video coding scheme used the Fourier transform, searching for motion information in the frequency domain would be convenient and accurate. However, the energy compaction of the Fourier transform is poor and it cannot remove spatial redundancy effectively, which prevents its use in practical video coding algorithms. The transform generally adopted by current video coding standards is the DCT, whose energy compaction is close to that of the Karhunen-Loève transform: it concentrates most of the energy in the DC and low-frequency components, so that after low-pass filtering the picture quality is preserved at a high compression ratio. The present invention therefore uses the DCT for the time-frequency transform, and the spatial translation is calculated below from the phase in the DCT domain. Because of the particular nature of the DCT, the DCT domain no longer has the simple correspondence that holds for the Fourier transform.
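The shift property of formula (1) can be checked numerically with a short sketch. It uses a direct discrete Fourier transform and a circular shift, so that the property holds exactly for a finite sequence; the test signal and the use of frequency index k = 1 alone are choices made for this example, which also assumes that the spectrum is nonzero at that index.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def estimate_shift(x1, x2):
    """Estimate the circular shift of x2 relative to x1 from the phase at k = 1."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)
    # A shift by tau multiplies bin k by exp(-2j*pi*k*tau/n); read tau off at k = 1.
    phase = cmath.phase(X2[1] / X1[1])
    return round(-phase * n / (2 * cmath.pi)) % n


x1 = [0, 1, 4, 9, 5, 2, 0, 0]
tau = 3
x2 = [x1[(t - tau) % len(x1)] for t in range(len(x1))]   # x1 moved right by tau
shift = estimate_shift(x1, x2)
```

The displacement is read directly from a phase difference, without any block matching, which is the convenience the paragraph above attributes to the Fourier domain.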

Suppose a one-dimensional discrete signal \{x_{1}(n) | n \in [0, N-1]\} (N is the size of the search window) is shifted to the right by m to form the signal \{x_{2}(n) | n \in [0, N-1]\}:

x_{2}(n) = \begin{cases} x_{1}(n - m), & n \geq m \\ 0, & n < m \end{cases} \quad (2)

Following [2], the DCT and DST are defined as follows:

X_{2}^{C}(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_{2}(n) \cos\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1] \quad (3)

X_{2}^{S}(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_{2}(n) \sin\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1] \quad (4)

Z_{1}^{C}(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_{1}(n) \cos\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1] \quad (5)

Z_{1}^{S}(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_{1}(n) \sin\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1] \quad (6)

In the formulas above,

C(k) = \begin{cases} \frac{1}{\sqrt{2}}, & k = 0 \text{ or } k = N \\ 1, & k \in [1, N-1] \end{cases} \quad (7)

It is easy to prove that these four transforms satisfy the following equation:

\begin{bmatrix} X_{2}^{C}(k) \\ X_{2}^{S}(k) \end{bmatrix} = \begin{bmatrix} Z_{1}^{C}(k) & -Z_{1}^{S}(k) \\ Z_{1}^{S}(k) & Z_{1}^{C}(k) \end{bmatrix} \begin{bmatrix} g_{m}^{C}(k) \\ g_{m}^{S}(k) \end{bmatrix} \quad (8)

where

g_{m}^{S}(k) = \sin\left(\frac{k\pi}{N}(m + 0.5)\right), \quad g_{m}^{C}(k) = \cos\left(\frac{k\pi}{N}(m + 0.5)\right).

These two frequency-domain variables contain the translation m. Given the signals x_{1}(n) and x_{2}(n), if a fast algorithm can solve for g_{m}^{C} and g_{m}^{S} and extract m from them, motion search can be carried out in the DCT domain.

Equation (8) can be rewritten as

\vec{X}(k) = Z(k) \vec{\Omega}(k).

It can be shown that Z(k) is orthogonal up to a scale factor \lambda = 1 / ((Z_{1}^{C}(k))^{2} + (Z_{1}^{S}(k))^{2}):

\lambda Z^{T}(k) Z(k) = I_{2} \quad (9)

where I_{2} is the 2 × 2 identity matrix. The equation can therefore be solved as

\vec{\Omega}(k) = \lambda Z^{T}(k) \vec{X}(k) \quad (10)

which yields g_{m}^{C} and g_{m}^{S}.

By the orthogonality of the sine and cosine functions, the following relations hold [4]:

\frac{2}{N} \sum_{k=1}^{N} C^{2}(k) \sin\left(\frac{k\pi}{N}(m + 0.5)\right) \sin\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) - \delta(m + n + 1) \quad (11)

\frac{2}{N} \sum_{k=0}^{N-1} C^{2}(k) \cos\left(\frac{k\pi}{N}(m + 0.5)\right) \cos\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) + \delta(m + n + 1) \quad (12)

where \delta(n) is the discrete impulse function.
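The orthogonality relations (11) and (12) can be verified numerically. The sketch below assumes the 2/N normalization that also appears in formulas (13) and (14), and restricts m and n to the search window [0, N-1], where the term \delta(m + n + 1) vanishes. The function names are illustrative.

```python
import math

def c2(k, n_size):
    """C(k)^2 from formula (7): 1/2 for k = 0 or k = N, otherwise 1."""
    return 0.5 if k in (0, n_size) else 1.0

def sine_sum(m, n, n_size):
    """(2/N) * sum over k = 1..N of C^2(k) sin(.. m ..) sin(.. n ..)."""
    w = math.pi / n_size
    return 2.0 / n_size * sum(
        c2(k, n_size) * math.sin(k * w * (m + 0.5)) * math.sin(k * w * (n + 0.5))
        for k in range(1, n_size + 1))

def cosine_sum(m, n, n_size):
    """(2/N) * sum over k = 0..N-1 of C^2(k) cos(.. m ..) cos(.. n ..)."""
    w = math.pi / n_size
    return 2.0 / n_size * sum(
        c2(k, n_size) * math.cos(k * w * (m + 0.5)) * math.cos(k * w * (n + 0.5))
        for k in range(0, n_size))
```

Both sums behave as a discrete impulse: they equal 1 when m = n and 0 otherwise, which is what allows the translation m to be read off as the position of a peak.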

From formulas (8) and (10) to (12) we obtain:

\frac{2}{N} \sum_{k=1}^{N} C^{2}(k) g_{m}^{S}(k) \sin\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) - \delta(m + n + 1) \quad (13)

\frac{2}{N} \sum_{k=0}^{N-1} C^{2}(k) g_{m}^{C}(k) \cos\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) + \delta(m + n + 1) \quad (14)

Consider formula (13). When m > 0 and m lies within the search window [0, N], a positive impulse is found at n = m and a negative impulse at n = -m-1. When m < 0 and m lies within the negative mirror image of the search window [-N, 0), a positive impulse is found at n = m and a negative impulse at n = -m-1, which falls inside the search window. As shown in Figure 3, the gray area is the search window. A positive impulse found inside the search window means that the object has moved to the right, with displacement s; a negative impulse found inside the search window means that the object has moved to the left, with displacement s-1. Figure 3(a) shows the impulse response when the object moves to the right by s, and Figure 3(b) the impulse response when the object moves to the left by s-1. Figure 4 shows the spatial positions of the sub-pixels.

In the actual computation, the expression

\frac{2}{N} \sum_{k=1}^{N} C^{2}(k) g_{m}^{S}(k) \sin\left(\frac{k\pi}{N}(n + 0.5)\right)

can be replaced by a simplified form to reduce computational complexity.

1.1.2 Flow of the frequency-domain sub-pixel search algorithm

On the basis of the derivation above, the flow of the frequency-domain sub-pixel search algorithm is as follows:

1) Set the search window size to N. In the x direction, extract the one-dimensional signal x_{1}(n) that starts at the integer pixel F in the reference picture, and the signal x_{2}(n) at the corresponding position in the current picture.

2) Compute the four DCT/DST transform coefficients of x_{1}(n) and x_{2}(n) according to formulas (3) to (6).

3) Compute g_{m}^{S} on the interval [1, N]. From formulas (3) to (6) and (8):

g_{m}^{S}(k) = \begin{cases} 1, & k = N \\ \frac{Z_{1}^{C}(k) \cdot X_{2}^{S}(k) - Z_{1}^{S}(k) \cdot X_{2}^{C}(k)}{(Z_{1}^{C}(k))^{2} + (Z_{1}^{S}(k))^{2}}, & k \in [1, N) \end{cases} \quad (15)

4) From formula (13), obtain the translation direction d_{x} and the displacement s_{x} in the x direction.

5) Repeat the above steps in the y direction to obtain d_{y} and s_{y}.

6) Look up the parameters m_{x} and m_{y} in Table 1 to determine the match point in Figure 4 and the half-pixel motion vector.

Table 1: m and the motion vector

m_x	m_y	Match point	Motion vector
> 0	> 0	3	(0.5, 0.5)
> 0	< 0	8	(0.5, -0.5)
> 0	= 0	5	(0.5, 0)
< 0	> 0	1	(-0.5, 0.5)
< 0	< 0	6	(-0.5, -0.5)
< 0	= 0	4	(-0.5, 0)
= 0	> 0	2	(0, 0.5)
= 0	< 0	7	(0, -0.5)
= 0	= 0	F	(0, 0)

7) If a motion vector with 1/4-pixel accuracy is required, interpolate with a bilinear filter at the motion vector obtained in step 6), and repeat steps 1) to 6) on the resulting pixel block.
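Steps 2) to 4) and the Table 1 lookup can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the moving content stays inside the search window after the shift, the k = N term of g_m^S is set to 1 as in formula (15), and all function names are hypothetical. For odd m the k = N convention leaves a small ripple of magnitude 2/N, which does not move the peak.

```python
import math

def transforms(x, shifted):
    """DCT/DST coefficients for k = 0..N: half-sample grid (formulas 3-4)
    when shifted is True, integer grid (formulas 5-6) otherwise."""
    n_size = len(x)
    off = 0.5 if shifted else 0.0
    cos_c, sin_c = [], []
    for k in range(n_size + 1):
        ck = 1 / math.sqrt(2) if k in (0, n_size) else 1.0
        w = k * math.pi / n_size
        cos_c.append(2 / n_size * ck * sum(v * math.cos(w * (n + off)) for n, v in enumerate(x)))
        sin_c.append(2 / n_size * ck * sum(v * math.sin(w * (n + off)) for n, v in enumerate(x)))
    return cos_c, sin_c

def pseudo_phase_sine(x1, x2):
    """g_m^S(k) for k = 1..N from formula (15); index 0 is unused."""
    n_size = len(x1)
    z_c, z_s = transforms(x1, shifted=False)
    x_c, x_s = transforms(x2, shifted=True)
    g = [0.0] * (n_size + 1)
    for k in range(1, n_size):
        den = z_c[k] ** 2 + z_s[k] ** 2
        # Guard against an empty frequency bin (assumption: treat it as 0).
        g[k] = (z_c[k] * x_s[k] - z_s[k] * x_c[k]) / den if den > 1e-12 else 0.0
    g[n_size] = 1.0  # value given by formula (15) for k = N
    return g

def impulse_response(g):
    """Left-hand side of formula (13) for n = 0..N-1."""
    n_size = len(g) - 1
    out = []
    for n in range(n_size):
        acc = 0.0
        for k in range(1, n_size + 1):
            ck2 = 0.5 if k == n_size else 1.0
            acc += ck2 * g[k] * math.sin(k * math.pi / n_size * (n + 0.5))
        out.append(2 / n_size * acc)
    return out

def find_shift(x1, x2):
    """Return (n, value) of the strongest impulse inside the search window."""
    d = impulse_response(pseudo_phase_sine(x1, x2))
    n = max(range(len(d)), key=lambda i: abs(d[i]))
    return n, d[n]

def half_pel_vector(m_x, m_y):
    """Table 1 lookup: each component is +0.5, -0.5 or 0 by the sign of m."""
    sign = lambda m: (m > 0) - (m < 0)
    return 0.5 * sign(m_x), 0.5 * sign(m_y)
```

For example, with x1 = [10, 3, 2, 1] followed by twelve zeros and x2 equal to x1 shifted right by 3 samples, find_shift(x1, x2) locates a positive impulse at n = 3. No interpolation and no block-matching cost function is evaluated, which is the point of the frequency-domain approach.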

Figure 5 compares the computational complexity of the algorithm described here with that of the full-search algorithm in sub-pixel search for each standard test sequence. Because the picture content and the computing environment differ between test sequences, the computational complexity of the full-search algorithm on each test sequence is normalized to 1 and used as the baseline for comparison.

DCT stands for discrete cosine transform (Discrete Cosine Transform). It converts a set of intensity data into frequency data so that the variation of intensity can be analyzed. If the high-frequency data are modified slightly and then transformed back to the original form, the result differs somewhat from the original data, but the human eye does not easily notice the difference. During compression the original image data are divided into 8 × 8 data unit matrices. For example, the first matrix of luminance values is:

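\begin{bmatrix} y_{00} & y_{01} & y_{02} & y_{03} & y_{04} & y_{05} & y_{06} & y_{07} \\ y_{10} & y_{11} & y_{12} & y_{13} & y_{14} & y_{15} & y_{16} & y_{17} \\ y_{20} & y_{21} & y_{22} & y_{23} & y_{24} & y_{25} & y_{26} & y_{27} \\ y_{30} & y_{31} & y_{32} & y_{33} & y_{34} & y_{35} & y_{36} & y_{37} \\ y_{40} & y_{41} & y_{42} & y_{43} & y_{44} & y_{45} & y_{46} & y_{47} \\ y_{50} & y_{51} & y_{52} & y_{53} & y_{54} & y_{55} & y_{56} & y_{57} \\ y_{60} & y_{61} & y_{62} & y_{63} & y_{64} & y_{65} & y_{66} & y_{67} \\ y_{70} & y_{71} & y_{72} & y_{73} & y_{74} & y_{75} & y_{76} & y_{77} \end{bmatrix}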

JPEG treats the luminance matrices together with the chrominance Cb matrix and the chrominance Cr matrix as one basic unit, called an MCU. Each MCU may contain at most 10 matrices. For example, when the row and column sampling ratios are both 4:2:2, each MCU contains four luminance matrices, one Cb matrix and one Cr matrix.

After the image data are divided into 8 × 8 matrices, 128 must be subtracted from each value, and the values are then substituted one by one into the DCT formula to carry out the transform. The subtraction is needed because the DCT formula accepts values in the range -128 to +127.

The DCT formula is:

F(u, v) = \frac{1}{4} c(u) c(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos\frac{(2x + 1) u \pi}{16} \cos\frac{(2y + 1) v \pi}{16}

Here x, y are the coordinates of a value in the image data matrix, f(x, y) is the value at that position, u, v are the coordinates of a value in the matrix after the DCT, and F(u, v) is the value at that position after the DCT.

c(u) = 1/\sqrt{2} for u = 0 and c(u) = 1 for u > 0; c(v) is defined in the same way.

The matrix data after the DCT are frequency coefficients. The coefficient F(0, 0), which has the largest value, is called the DC coefficient; the remaining 63 frequency coefficients are mostly positive or negative floating-point numbers close to 0 and are collectively called AC coefficients.
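The level shift and the split into DC and AC coefficients can be illustrated with a direct, unoptimized 8 × 8 transform. The sketch assumes the standard JPEG forward DCT with c(0) = 1/sqrt(2) and c(u) = 1 for u > 0; the function name is illustrative, and a real encoder would use a fast factorization instead of the four nested loops.

```python
import math

def dct_8x8(block):
    """2-D DCT of an 8x8 block of samples in 0..255 (128 is subtracted first)."""
    f = [[v - 128 for v in row] for row in block]
    c = lambda u: 1 / math.sqrt(2) if u == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            acc = 0.0
            for x in range(8):
                for y in range(8):
                    acc += (f[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / 16)
                            * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * acc
    return out
```

For a flat block in which every sample is 138, the level-shifted value is 10, so the DC coefficient is 0.25 × 0.5 × 64 × 10 = 80 and all 63 AC coefficients are zero up to rounding.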

1.1.3 Summary

For video coding, the benefit of performing motion search in the frequency domain is not limited to improving the motion search step itself. In a conventional video coding scheme the encoder must convert repeatedly between the spatial and frequency domains: motion search uses spatial-domain algorithms, whereas residual coding is performed in the frequency domain so that the coefficient energy is concentrated at low frequencies and is easy to quantize. Frequent conversion between the two domains consumes considerable resources. With a frequency-domain motion estimation algorithm, the video encoder completes all computation in the frequency domain; the coding flow is shown in Figure 6. Compared with a coding flow that searches for motion vectors in the spatial domain, in Figure 6 the quantization, storage and motion search that follow the DCT of the sampled signal are all completed in the frequency domain. This not only removes the inverse DCT step of the spatial-domain coding flow but also reduces the storage required, which helps to optimize both the encoder and the decoder.

1.2 Fast selection algorithm for intra prediction modes

1.2.1 Intra coding prediction modes

If the current picture has no strong temporal correlation with the preceding input pictures, it is generally coded as an I frame using intra coding. In earlier video coding standards, I frames were coded directly without prediction: the macroblock data were transformed, quantized and then coded for transmission, so the amount of data of a coded I frame was very large. To improve coding efficiency, the video coding system of the present invention makes full use of the spatial redundancy between the pixels of a picture and defines 16 × 16 and 4 × 4 prediction units. Figure 7 shows the distribution of the pixels in a 4 × 4 block and its surrounding pixels.

In the intra prediction module of the present invention, if the coding mode of the current macroblock is intra coding, the predicted value of the macroblock comes from adjacent macroblocks that have already been coded and reconstructed. The luminance component can use a 16 × 16 macroblock or a 4 × 4 block as the basic unit of intra prediction coding. With 16 × 16 macroblocks as the coding unit, 4 prediction modes are available; with 4 × 4 blocks as the coding unit, 9 prediction modes are available. The two chrominance components use an 8 × 8 block as the basic unit of intra prediction coding, with 4 prediction modes available, and both chrominance components must select the same mode. Because the 4 × 4 block is the finest unit, most of the computational complexity lies in this unit.

The distribution of the pixels in a 4 × 4 block and its surrounding pixels is shown in Figure 7, where the lowercase letters a to p denote the 16 pixels inside the block and the uppercase letters A to M denote the pixels around the block. Intra 4 × 4 prediction uses 9 modes. Mode 2 is DC prediction; the directions of the remaining prediction modes are shown in Figure 8 (intra 4 × 4 prediction modes). For example, if mode 1 is selected for prediction in the horizontal direction, the predicted values in the block come from pixels I, J, K and L.
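The two modes named above can be sketched as follows. The horizontal mode follows the text directly. For the DC mode the text gives no formula, so the sketch assumes the rounded mean of the four pixels above (A to D) and the four pixels to the left (I to L), as in H.264; the function names are illustrative.

```python
def predict_horizontal(left):
    """Mode 1: every row is filled with its left neighbour (I, J, K, L)."""
    return [[left[r]] * 4 for r in range(4)]

def predict_dc(top, left):
    """Mode 2 (assumed H.264-style): rounded mean of A..D and I..L."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

def sad(block, pred):
    """Sum of absolute differences between a 4x4 block and its prediction."""
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))
```

A block whose rows are constant is predicted exactly by the horizontal mode, so its residual is zero, while the DC mode leaves a residual for the same block.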

1.2.2 Fast intra prediction coding mode selection algorithm

The intra prediction mode selection algorithm proposed by the present invention uses the edge direction histogram, a context model and the prediction mode of the co-located block in the previous frame to quickly select candidate prediction modes, pre-codes the block with the pre-selected modes, and then uses a Lagrangian cost function to choose the optimal prediction mode. To reduce the amount of computation further, the original data are sub-sampled before the edge direction vectors are computed. Taking intra 4 × 4 as an example, the flow of fast intra prediction mode selection is shown in Figure 9 (flow chart of fast intra 4 × 4 prediction mode selection); the parts of the flow are described in turn below.

1.2.2.1 Pixel sub-sampling

The input raw pixel data are sub-sampled 2:1. The number of pixels after sampling is 1/2 of the original number, and the time spent computing the edge direction vectors for the sampled pixels is about 1/2 of the original. The sub-sampling pattern used here is shown in Figure 10 (sub-sampling of a 4 × 4 block), in which the filled circles are the retained sample pixels.
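A 2:1 sub-sampling of a 4 × 4 block can be sketched as below. The actual pattern is defined by Figure 10, which is not reproduced in the text, so the checkerboard pattern used here is only an assumption chosen for illustration, and the function name is hypothetical.

```python
def subsample_checkerboard(block):
    """Keep the pixels whose (row + column) is even.

    Returns a list of (row, column, value) tuples so that the positions of
    the retained pixels in the original picture remain known.
    """
    return [(r, c, v)
            for r, row in enumerate(block)
            for c, v in enumerate(row)
            if (r + c) % 2 == 0]
```

For a 4 × 4 block this keeps 8 of the 16 pixels, so the edge direction vectors are computed for half as many positions.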

1.2.2.2 Mode selection based on edge direction

Natural images are continuous and correlated in space: each pixel of the picture is spatially correlated along the 8 prediction directions. This property can be used to reduce spatial redundancy. If the direction with the strongest correlation can be found and intra prediction along it is used to code the pixel values, intra coding achieves its best result. The Sobel operator [3-5] is used here to compute the edge direction vectors of the sub-sampled pixels. The Sobel operators are

\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}

and

\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},

which are used to compute the components of the edge vector: the first gives the vertical component dy and the second the horizontal component dx, consistent with formula (1).

For a sub-sampled pixel p_{i,j}, the corresponding edge vector is

\vec{D}_{i,j} = \{dx_{i,j}, dy_{i,j}\},

where dx_{i,j} and dy_{i,j} are the horizontal and vertical components of the edge vector. They are computed by formula (1), in which p_{i-1,j+1} and the like are the neighbors of pixel p_{i,j} in the original picture.

dx_{i,j} = p_{i-1,j+1} + 2 p_{i,j+1} + p_{i+1,j+1} - p_{i-1,j-1} - 2 p_{i,j-1} - p_{i+1,j-1}

dy_{i,j} = p_{i+1,j-1} + 2 p_{i+1,j} + p_{i+1,j+1} - p_{i-1,j-1} - 2 p_{i-1,j} - p_{i-1,j+1} \quad (1)

For ease of computation, the magnitude of the edge direction vector is defined as:

Amp(\vec{D}_{i,j}) = |dx_{i,j}| + |dy_{i,j}| \quad (2)
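Formulas (1) and (2) translate directly into code. The sketch below assumes that the picture is stored as a list of rows, so that i is the row index and j the column index, and that the pixel is not on the picture border; the function names are illustrative.

```python
def edge_vector(p, i, j):
    """Edge vector (dx, dy) of pixel p[i][j] according to formula (1)."""
    dx = (p[i - 1][j + 1] + 2 * p[i][j + 1] + p[i + 1][j + 1]
          - p[i - 1][j - 1] - 2 * p[i][j - 1] - p[i + 1][j - 1])
    dy = (p[i + 1][j - 1] + 2 * p[i + 1][j] + p[i + 1][j + 1]
          - p[i - 1][j - 1] - 2 * p[i - 1][j] - p[i - 1][j + 1])
    return dx, dy

def amplitude(dx, dy):
    """Magnitude of the edge vector according to formula (2)."""
    return abs(dx) + abs(dy)
```

For a picture with a vertical edge, such as three rows of [0, 0, 10], the centre pixel gives dx = 40 and dy = 0, so the magnitude is 40.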

The direction of the edge direction vector is denoted Ang(\vec{D}_{i,j}) and is expressed in degrees.

Adding up the magnitudes of the vectors that have the same direction within a block gives the edge direction histogram. The histogram for intra 4 × 4 is built according to formula (3), and the direction with the largest magnitude in the histogram is chosen as the candidate prediction direction.

Histo(k) = \sum_{(m, n) \in SET(k)} Amp(\vec{D}_{m,n}),

SET(k) \in \{(i, j) | Ang(\vec{D}_{i,j}) \in a_{u}\},

where

a_{0} = (-103.3°, -76.6°]

a_{1} = (-13.3°, 13.3°]

a_{3} = (35.8°, 54.2°]

a_{4} = (-54.2°, -35.8°]

a_{5} = (-76.7°, -54.2°]

a_{6} = (-35.8°, -13.3°]

a_{7} = (54.2°, 76.7°]

a_{8} = (13.3°, 35.8°] \quad (3)
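The histogram of formula (3) can be sketched as follows. Several points are assumptions rather than statements of the text: the angle is taken as the arctangent of dy/dx in degrees and folded into the range covered by the intervals; the shared boundaries -76.7 and 76.7 are used so that the eight intervals tile 180 degrees without gaps; and all names are illustrative. Mode 2 (DC) has no direction and therefore no bin.

```python
import math

# Angle interval (low, high] in degrees for each directional mode index.
# Assumption: -76.7 and 76.7 are used as shared interval boundaries.
BINS = {0: (-103.3, -76.7), 1: (-13.3, 13.3), 3: (35.8, 54.2),
        4: (-54.2, -35.8), 5: (-76.7, -54.2), 6: (-35.8, -13.3),
        7: (54.2, 76.7), 8: (13.3, 35.8)}

def angle_deg(dx, dy):
    """Direction of (dx, dy) in degrees, folded into (-103.3, 76.7]."""
    ang = math.degrees(math.atan2(dy, dx))
    while ang > 76.7:
        ang -= 180.0
    while ang <= -103.3:
        ang += 180.0
    return ang

def edge_histogram(vectors):
    """Sum |dx| + |dy| of all edge vectors falling into each interval."""
    histo = {k: 0 for k in BINS}
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a flat pixel has no direction
        ang = angle_deg(dx, dy)
        for k, (low, high) in BINS.items():
            if low < ang <= high:
                histo[k] += abs(dx) + abs(dy)
                break
    return histo

def candidate_mode(histo):
    """Index of the interval with the largest accumulated magnitude."""
    return max(histo, key=histo.get)
```

For the edge vectors (40, 0), (10, 0) and (0, 5), the first two fall into a_1 with a total magnitude of 50 and the third into a_0 with magnitude 5, so index 1 is selected as the candidate.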

1.2.2.3 Mode selection based on the context model

Adjacent blocks within a picture are spatially correlated, so the coding modes of neighboring blocks can be used to predict the coding mode of the current block. As shown in Figure 11 (current block and adjacent blocks), C denotes the current 4 × 4 block, and A and B denote the 4 × 4 block above the current block and the 4 × 4 block to its left. The maximum of the prediction modes of A and B is taken as a candidate prediction mode for the current block.

1.2.2.4 Mode selection based on the co-located block in the previous frame

Consider the coding mode of the 4 × 4 block in the previous frame at the position corresponding to the current block. If that block used an intra coding mode, its coding mode is selected as a candidate coding mode for the current 4 × 4 block, as shown in Figure 12 (current 4 × 4 block and co-located 4 × 4 block in the previous frame).
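The three sources of candidates described in sections 1.2.2.2 to 1.2.2.4 can be combined as in the sketch below. The function name and the convention of passing None when the co-located block of the previous frame was not intra coded are assumptions made for illustration.

```python
def candidate_modes(edge_mode, mode_above, mode_left, prev_frame_mode=None):
    """Collect candidate intra 4x4 prediction modes for the current block.

    edge_mode       -- mode chosen from the edge direction histogram
    mode_above      -- prediction mode of block A (above)
    mode_left       -- prediction mode of block B (left)
    prev_frame_mode -- mode of the co-located block in the previous frame,
                       or None if that block was not intra coded
    """
    candidates = [edge_mode, max(mode_above, mode_left)]
    if prev_frame_mode is not None:
        candidates.append(prev_frame_mode)
    # Remove duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(candidates))
```

Only the modes in the returned list are pre-coded, instead of all 9 intra 4 × 4 modes.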

1.2.2.5 Pre-coding and performance comparison

Pre-coding uses the pixels surrounding the current block to predictively code the current block with each of the candidate prediction modes selected above in turn, and then selects the optimal prediction mode with a Lagrangian cost function:

J(s, c, IMODE | QP, \lambda_{MODE}) = SSD(s, c, IMODE | QP) + \lambda_{MODE} \cdot R(s, c, IMODE | QP) \quad (4)

where IMODE is one of the candidate intra prediction directions, SSD is the sum of squared differences between the original pixel values s of the intra 4 × 4 block and the reconstructed pixel values c, and R(s, c, IMODE | QP) is the size of the bitstream needed to code the block in mode IMODE, using variable-length Huffman coding. In video coding the peak signal-to-noise ratio (PSNR) is used for quality evaluation; formula (5) gives the PSNR:

PSNR = 10 \log_{10}\left(\frac{255^{2}}{MSE}\right) \quad (5)
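Formulas (4) and (5) can be sketched as follows. The sketch assumes that each candidate mode has already been pre-coded, so that its reconstructed pixels and its bit count are known; the data layout (a dictionary from mode to a pair) and the function names are illustrative.

```python
import math

def ssd(orig, recon):
    """Sum of squared differences between two pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(orig, recon))

def best_mode(orig, trials, lam):
    """Mode with the smallest cost J = SSD + lam * R (formula 4).

    trials maps each candidate mode to (reconstructed pixels, bits used).
    """
    return min(trials,
               key=lambda mode: ssd(orig, trials[mode][0]) + lam * trials[mode][1])

def psnr(orig, recon):
    """Peak signal-to-noise ratio in dB for 8-bit samples (formula 5)."""
    mse = ssd(orig, recon) / len(orig)
    return float("inf") if mse == 0 else 10 * math.log10(255 ** 2 / mse)
```

The multiplier controls the trade-off: with a large value the cheaper but less accurate mode wins, and with a small value the more accurate mode wins.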

1.2.3 Experimental results

The test sequences used in the experiment are Mobile, Tempete, Bus and Paris at QCIF size, and only the luminance component is tested. The results are shown in Table 2.

Table 2: Change in coding performance for different test sequences

Test sequence	Change in encoding time of the first I frame (%)	Change in average bit rate per frame (%)	Change in average encoding time per frame (%)	Change in PSNR (dB)
Mobile	-70.25	0.12	-33.56	-0.016
Tempete	-69.78	0.26	-32.14	-0.014
Bus	-69.58	0.39	-24.34	-0.024
Paris	-71.03	0.42	-31.76	-0.021

2 System software block diagram (Figure 13)

In the software architecture of the system, the core modules are the video encoder and decoder. These two parts form the main body of the software framework and are also the main innovation of the present invention. The video conferencing system designed in the present invention uses the RTP/RTCP protocols to transport video and voice data. RTP packetizes and sends the media data; RTCP handles communication between the senders and receivers of the video and audio streams and carries feedback information and time synchronization information.
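The packetizing role of RTP can be illustrated with the 12-byte fixed header defined by the RTP specification. This sketch is not taken from the described system: the payload type 96 (a dynamic type) and the function name are assumptions, and padding, header extensions and CSRC entries are left out.

```python
import struct

def rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Prepend a 12-byte fixed RTP header (version 2) to a media payload."""
    byte0 = 2 << 6  # version 2, no padding, no extension, CSRC count 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload
```

The sequence number lets the receiver detect loss and reordering, and the timestamp supports playout and synchronization; the feedback itself travels in separate RTCP packets.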

3 System hardware block diagram (Figure 14). The system adopts an embedded design.

In summary, video conferencing is a fast-growing market, but because the industry standards are not yet fully unified, Western countries cannot monopolize the core technology, and China faces a great opportunity to make substantial progress in this field. Some domestically produced video conferencing network equipment, such as MCUs and gatekeepers, is already technically advanced or even leading worldwide, whereas for video conferencing terminal products China still lacks competitive products and the market is almost entirely occupied by foreign products. The AVCS-II video conferencing system of the Applied Physics Research Institute of Nanjing University can to some extent be regarded as a new attempt and breakthrough by China in video conferencing terminal products, especially in video codec technology, and is expected to open up the domestic and overseas video conferencing markets. The frequency-domain sub-pixel motion search algorithm proposed by the present invention is a technical innovation; experiments and actual use by users show that the algorithm has high accuracy and very low computational complexity and can match the optimal motion vector quickly. In addition to its distinctive video coding system, the system designed in the present invention provides a rich set of video conferencing tools, giving users a complete platform for video and data interaction.

Claims

1. A video compression encoding and decoding method, in which the video signal is encoded by the following procedures: discrete cosine transform (DCT): the DCT is a spatial transform that generates blocks of DCT coefficients block by block, and for typical pictures it concentrates the energy of a block in a few low-frequency DCT coefficients; transform and quantization: quantization is applied to the DCT coefficients, and the quantization process divides the DCT coefficients by a quantization step; among the DCT coefficients, the low-frequency coefficients are more important to visual perception and are therefore assigned finer quantization, whereas the high-frequency coefficients are less important to visual perception and are assigned coarser quantization; channel buffer: before the coded bitstream enters the channel, a channel buffer must be provided; data are written into the buffer from the entropy coder at a variable bit rate and read out at the nominal constant bit rate of the transmission system and sent to the channel; the buffer must have a control mechanism that, through feedback control of the compression algorithm, adjusts the bit rate of the encoder so that the write rate and the read rate of the buffer tend toward balance; motion estimation: used in inter-frame coding to produce an estimate of the picture to be compressed from a reference picture; motion estimation is performed macroblock by macroblock and computes the positional offset between the macroblocks at corresponding positions of the picture to be compressed and the reference picture; this offset is described by a motion vector, which represents the displacement in the horizontal and vertical directions; in motion estimation, a P picture uses the most recently decoded preceding I frame or P frame as the reference picture, which is called forward prediction; a B picture uses two pictures as prediction references, which is called bidirectional prediction, where one reference frame precedes the coded frame in display order (forward prediction) and the other follows the coded frame in display order (backward prediction), and the reference frames of a B frame are in all cases I frames or P frames; motion compensation: using the motion vector computed by motion estimation, the macroblock in the reference picture is moved to the corresponding position in the horizontal and vertical directions, which generates the prediction of the picture to be compressed; and a sub-pixel search is performed;

characterized in that: motion search is carried out in the frequency domain using a frequency-domain motion estimation algorithm; the quantization, storage and motion search that follow the DCT of the sampled signal are all completed in the frequency domain, so that the video encoder completes all computation in the frequency domain.

2. The video compression encoding and decoding method according to claim 1, characterized in that the flow of the frequency-domain sub-pixel search algorithm is as follows:

3) Compute g_{m}^{S} on the interval [1, N]. From formulas (3) to (6) and (8):

g_{m}^{S}(k) = \begin{cases} 1, & k = N \\ \frac{Z_{1}^{C}(k) \cdot X_{2}^{S}(k) - Z_{1}^{S}(k) \cdot X_{2}^{C}(k)}{(Z_{1}^{C}(k))^{2} + (Z_{1}^{S}(k))^{2}}, & k \in [1, N) \end{cases} \quad (15)

m and the motion vector:

m_x	m_y	Match point	Motion vector
> 0	> 0	3	(0.5, 0.5)
> 0	< 0	8	(0.5, -0.5)
> 0	= 0	5	(0.5, 0)
< 0	> 0	1	(-0.5, 0.5)
< 0	< 0	6	(-0.5, -0.5)
< 0	= 0	4	(-0.5, 0)
= 0	> 0	2	(0, 0.5)
= 0	< 0	7	(0, -0.5)
= 0	= 0	F	(0, 0)

3. The video compression encoding and decoding method according to claim 1, characterized in that the intra prediction mode selection algorithm uses the edge direction histogram, a context model and the prediction mode of the co-located block in the previous frame to quickly select candidate prediction modes, pre-codes the block with the pre-selected modes, and then uses a Lagrangian cost function to choose the optimal prediction mode; before the edge direction vectors are computed, the original data are sub-sampled; pixel sub-sampling: the input raw pixel data are sub-sampled 2:1, the number of pixels after sampling is 1/2 of the original number, and the time spent computing the edge direction vectors for the sampled pixels is about 1/2 of the original; mode selection based on edge direction:

natural images are continuous and correlated in space, and each pixel of the picture is spatially correlated along the 8 prediction directions; the Sobel operator [3-5] is used to compute the edge direction vectors of the sub-sampled pixels; the Sobel operators are

\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}

and

\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},

which are used to compute the components of the edge vector, the first giving the vertical component dy and the second the horizontal component dx; for a sub-sampled pixel p_{i,j}, the corresponding edge vector is

\vec{D}_{i,j} = \{dx_{i,j}, dy_{i,j}\},

where dx_{i,j} and dy_{i,j} are the horizontal and vertical components of the edge vector; they are computed by formula (1), in which p_{i-1,j+1} and the like are the neighbors of pixel p_{i,j} in the original picture:

dx_{i,j} = p_{i-1,j+1} + 2 p_{i,j+1} + p_{i+1,j+1} - p_{i-1,j-1} - 2 p_{i,j-1} - p_{i+1,j-1}

dy_{i,j} = p_{i+1,j-1} + 2 p_{i+1,j} + p_{i+1,j+1} - p_{i-1,j-1} - 2 p_{i-1,j} - p_{i-1,j+1} \quad (1)

Amp(\vec{D}_{i,j}) = |dx_{i,j}| + |dy_{i,j}| \quad (2)

the direction of the edge direction vector is denoted Ang(\vec{D}_{i,j}) and is expressed in degrees;

adding up the magnitudes of the vectors that have the same direction within a block gives the edge direction histogram; the histogram for intra 4 × 4 is built according to formula (3), and the direction with the largest magnitude in the histogram is chosen as the candidate prediction direction;

Histo(k) = \sum_{(m, n) \in SET(k)} Amp(\vec{D}_{m,n}),

SET(k) \in \{(i, j) | Ang(\vec{D}_{i,j}) \in a_{u}\},

where

a_{0} = (-103.3°, -76.6°]

a_{1} = (-13.3°, 13.3°]

a_{3} = (35.8°, 54.2°]

a_{4} = (-54.2°, -35.8°]

a_{5} = (-76.7°, -54.2°]

a_{6} = (-35.8°, -13.3°]

a_{7} = (54.2°, 76.7°]

a_{8} = (13.3°, 35.8°] \quad (3)

according to the coding mode of the 4 × 4 block in the previous frame at the position corresponding to the current block, if that block used an intra coding mode, its coding mode is selected as a candidate coding mode for the current 4 × 4 block;

J(s, c, IMODE | QP, \lambda_{MODE}) = SSD(s, c, IMODE | QP) + \lambda_{MODE} \cdot R(s, c, IMODE | QP) \quad (4)

where IMODE is one of the candidate intra prediction directions, SSD is the sum of squared differences between the original pixel values s of the intra 4 × 4 block and the reconstructed pixel values c, and R(s, c, IMODE | QP) is the size of the bitstream needed to code the block in mode IMODE, using variable-length Huffman coding.