CN101247525A

CN101247525A - Method for improving image intraframe coding velocity

Info

Publication number: CN101247525A
Application number: CN 200810102517
Authority: CN
Inventors: 邓中亮; 段大高
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2008-03-24
Filing date: 2008-03-24
Publication date: 2008-08-20
Anticipated expiration: 2028-03-24
Also published as: CN101247525B

Abstract

The present invention discloses a method for increasing intra-frame image coding speed based on the H.264 standard. The method screens the prediction modes along the texture direction according to the texture direction of the image and then performs the RDO calculation, thereby greatly reducing the RDO computation time. The method effectively increases image coding speed while maintaining good image quality.

Description

A method for improving image intra-frame coding speed

Technical field

The present invention relates to video compression coding technology, and in particular to a method for improving image intra-frame coding speed based on H.264.

Background technology

Video and image coding methods are many and varied, and their applications touch every field. So that terminals made by different vendors can exchange information with one another, or receive information from a common signal source, several international organizations began working on the standardization of video/image coding in the late 1980s. At the same time, major enterprises took a keen interest and directly pushed the standardization research forward. In 1988 the International Telephone and Telegraph Consultative Committee (CCITT, renamed ITU-T, International Telecommunication Union Telecommunication Standardization Sector, in 1992) formulated the first video coding standard, H.261, which became a milestone in the history of video coding. ISO MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group) subsequently formulated a series of video coding standards built on H.261 as the core, each addressing different application environments and requirements. ITU-T concentrated on real-time video communication and issued the H.26x series (H.261, H.262, H.263, H.264 and so on), while ISO MPEG mainly targeted video storage media, television broadcasting and multimedia communication and formulated the MPEGx series (MPEG-1, MPEG-2, MPEG-4 and so on). Figure 1 shows the development of video coding standards; the evolution of the international standards is briefly introduced below.

In 1985 CCITT organized its 15th expert group to further study the standardization of video conferencing, and in 1988 it formulated the H.261 standard for video telephone and conference applications at 64 kbit/s [30]. The H.261 standard defines intra-frame (I frame, Intra frame) coding and inter-frame (P frame, Predictive frame) coding, and adopts techniques such as inter prediction, the DCT transform and Huffman coding. For flexibility, H.261 strictly specifies only the parts relevant to compatibility, such as the bitstream syntax, stream multiplexing and the decoding process. It places no restrictions on parts that strongly affect reconstructed image quality but do not affect compatibility, such as adaptive control of the quantization level, motion estimation and rate control, which leaves developers, manufacturers and users a great deal of freedom. The successful release of H.261 greatly encouraged major enterprises, international standards organizations and research institutions, and set off a wave of research on and application of video compression coding.

ISO MPEG began to formulate the MPEG-1 standard in 1991. Its main goal was a standard for coding moving pictures, the associated audio and their combination, suitable for digital storage, media storage and retrieval; it became an international standard in November 1993. Taking H.261 as its basic framework, MPEG-1 introduced bi-directionally predicted frames (B frame, Bi-directional prediction frame), half-pixel-accuracy motion estimation and the group of pictures (GOP, Group Of Pictures) concept, enabling functions such as random access, fast forward and backward search, and reverse playback.

ISO MPEG, jointly with ITU-T VCEG, then started drafting MPEG-2 [32] (called H.262 in the ITU-T standard series), which was finalized as a standard in 1994. MPEG-2 extends MPEG-1 in important ways. It provides dedicated "frame-based coding" and "field-based coding" methods for interlaced standard-quality television pictures. It establishes a dual stream structure, the program stream and the transport stream: the operating environment of a transport stream may suffer serious errors, whereas that of a program stream rarely does. According to the complexity of the coding techniques, it introduced the concepts of profile and level for the first time, neatly solving the interchangeability and universality of bitstreams. It also added the concept of scalability, which allows video signals of different quality grades or different spatial and temporal resolutions to be obtained from a single coded data stream; scalability covers the spatial domain, the signal-to-noise ratio, the temporal domain and so on. MPEG-2 is an extremely successful standard and has become widely integrated into people's life and work.

With the development and spread of Internet technology, network bandwidth increasingly became the bottleneck hindering the use of video. To alleviate this problem, ITU-T VCEG proposed the H.263 coding standard for low-bit-rate applications in 1995. It is based on H.261, absorbs effective techniques from recommendations such as MPEG-1/2, and provides four optional coding algorithms, namely the unrestricted motion vector mode, syntax-based arithmetic coding, advanced prediction and the PB-frame mode, which further improve coding efficiency. H.263 also extends the picture formats, supporting Sub-QCIF, QCIF, CIF, 4CIF and 16CIF. Moreover, the standard does not fix the number of frames per second, so the maximum bit rate can be limited by reducing the frame rate. On the basis of H.263, ITU-T VCEG proposed two work plans. One was the so-called short-term plan, which added new optional features to H.263 to further improve compression efficiency and extend functionality; the other was the so-called long-term plan, to develop a new international standard suited to low-bit-rate video communication. The short-term plan produced the successive versions H.263+ and H.263++. The standard expected from the long-term plan was called H.263L, and was renamed H.26L in 1998.

ISO MPEG proposed the MPEG-4 standard in 1998. It combines technologies and functions from fields such as digital television, interactive graphics and the Internet, and further extends and supplements H.263, MPEG-1 and MPEG-2. MPEG-4 bit rates cover a very wide range, from as low as 5 kbit/s to above 2 Mbit/s. It introduced a coding concept entirely different from earlier image coding standards: drawing on object-based coding, its coding scheme is built on an object model of arbitrary shape and, compared with traditional coding standards, adds shape information to the description of an image. MPEG-4 is no longer a fixed standardized algorithm but an extensible set of coding tools from which various algorithms can be constructed. Without decoding, it supports content-based processing and editing of the bitstream, the composition of synthetic and natural images and sound, content-based random access and so on; it is also robust across different application environments and supports content-based scalable coding. Meanwhile, ISO MPEG also formulated standards such as MPEG-7 and MPEG-21, which provide a standardized description for all kinds of multimedia information and an efficient, transparent and interoperable multimedia framework.

To further improve video coding efficiency, ISO and ITU-T formally established the Joint Video Team (JVT) in December 2001 and began work on the H.264/MPEG-4 Part 10 (AVC, Advanced Video Coding) standard (referred to uniformly as H.264 in this document), which was formally adopted as an international standard in May 2003. It is based on H.26L and extends the application range from low to high bit rates. Besides inheriting the strengths of earlier standards, H.264 introduces many new techniques, so that at the same decoded quality its coding efficiency is nearly 50% higher than that of H.263 and MPEG-4.

It is also worth noting that in June 2002 the Chinese Ministry of Information Industry began to formulate an audio/video coding standard with independent intellectual property rights (AVS, Audio Video coding Standard). AVS aims to establish common technical standards for the coding, decoding, processing and representation of digital audio and video, providing efficient and economical codec technology for digital audio/video equipment and systems. It mainly targets major information-industry applications such as HDTV, HD-DVD, wireless broadband multimedia communication and Internet broadband streaming media.

Like earlier video coding standards, H.264 adopts the MC-DCT structure, that is, a hybrid structure of motion compensation plus transform coding; its coding framework is shown in Figure 2. H.264 coding mainly consists of intra prediction, inter prediction (motion estimation and compensation), integer transform, quantization and entropy coding.

H.264 uses two coding modes, intra and inter. The supported video source formats include YUV 4:2:0, 4:2:2 and 4:4:4, and both progressive and interlaced video sequences are supported; for an interlaced frame, H.264 supports coding the odd and even fields independently as well as coding them together. An I frame (intra-coded frame) is coded in intra mode. P frames (forward-predicted frames) and B frames (bi-directionally predicted frames) are coded in inter mode, although intra coding can still be selected at the macroblock level. Coding proceeds in units of non-overlapping macroblocks (MB, Macro-Block); a macroblock is normally defined as a 16 × 16 block of pixels.

For an I frame, intra prediction is performed first; the prediction residual (the difference between the original and predicted values) is then integer-transformed and quantized, and the quantized coefficients are variable-length or arithmetic coded to produce the compressed bitstream. At the same time the image is reconstructed through inverse quantization and inverse transform to serve as a reference when coding subsequent frames. For a P frame, multi-mode, multi-reference-frame, high-precision motion estimation and intra prediction are performed first, and the inter or intra coding mode and the corresponding block partition are selected by rate-distortion optimization (RDO, Rate-Distortion Optimization); the residual is then transformed, quantized and entropy coded to produce the compressed bitstream, and the image is reconstructed through inverse quantization and inverse transform. A B frame is handled like a P frame: bi-directional prediction is first used for multi-mode, multi-reference-frame motion estimation and intra prediction, the best coding mode is selected by rate-distortion optimization, and the residual is then transformed, quantized and entropy coded to produce the compressed bitstream. H.264 also defines SI and SP frames.

To improve the network adaptability of the coded video, H.264 separates the video coding layer (VCL, Video Coding Layer) from the network abstraction layer (NAL, Network Abstraction Layer), as shown in Figure 3. The VCL performs efficient compression of the video, and the NAL packages and transmits the data in the manner appropriate to the network.

H.264 achieves a higher compression ratio, better image quality and good network adaptability, so its applications are very broad, including video telephony (fixed or mobile), real-time video conferencing, video surveillance, Internet video transmission and multimedia storage. To suit this wide range of uses, H.264 specifies only the bitstream, the syntax elements and the decoding process and places no restrictions on the encoder, which makes encoder implementations very flexible.

H.264 intra prediction

In H.264 a 4 × 4 block has a set of 9 optional prediction modes, {mode 0, mode 1, ..., mode 8}, where mode 0 is the vertical prediction direction, mode 1 the horizontal prediction direction, and so on. A 16 × 16 block has a set of 4 optional prediction modes, {mode 0, mode 1, ..., mode 3}, corresponding to vertical, horizontal, DC and plane prediction. The prediction modes of the 8 × 8 chroma block are the same as those of the 16 × 16 block. Every prediction mode except DC has a corresponding prediction direction and prediction weights; DC prediction uses the mean of the adjacent boundary pixels, that is, every pixel of the predicted block equals the mean of the adjacent boundary pixels.
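As an illustration of the DC mode described above, the following is a minimal Python sketch, not the reference encoder's code. The function name `dc_predict` and its arguments are hypothetical; the rounding rule and the mid-gray fallback of 128 when no neighbour is available are assumptions that follow common 8-bit practice rather than anything stated in this document.

```python
def dc_predict(top, left, n=4):
    """Fill an n x n block with the rounded mean of the boundary pixels.

    top / left: lists of reconstructed pixels directly above and to the
    left of the block (hypothetical names). Either may be None when that
    neighbour is unavailable.
    """
    boundary = list(top or []) + list(left or [])
    if not boundary:
        # Assumption: mid-gray fallback for 8-bit samples.
        dc = 128
    else:
        # Integer mean rounded to nearest.
        dc = (sum(boundary) + len(boundary) // 2) // len(boundary)
    return [[dc] * n for _ in range(n)]


block = dc_predict([10, 10, 10, 10], [20, 20, 20, 20])
```

Every pixel of `block` carries the same value, which is why the DC mode has no prediction direction.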

The infra-frame prediction basic process of macro block is as follows:

(1) Determine which neighbouring macroblocks of the current macroblock are available, including whether the macroblocks above, to the left and above-right are available.

(2) Divide the luma macroblock into sixteen 4 × 4 blocks, apply the 9 prediction modes, and compute the rate-distortion cost of each 4 × 4 block in turn. The rate-distortion cost is defined by the Lagrangian function:

J(s, c, mode | QP, \lambda_{mode}) = SSD(s, c, mode | QP) + \lambda_{mode} \cdot R(s, c, mode | QP)

where s is the original block signal, c is the reconstructed block signal, QP is the quantization parameter of the macroblock, \lambda_{mode} is the Lagrange multiplier, \lambda_{mode} = 0.85 \times 2^{QP/3}, and R(s, c, mode | QP) is the bit rate under the given mode and QP, including the bits for the header, the prediction mode and all DCT coefficients. SSD(·) is the sum of squared errors between the original 4 × 4 block and the reconstructed block:

SSD(s, c, mode | QP) = \sum_{y=0}^{3} \sum_{x=0}^{3} \left( s(x, y) - c(x, y) \right)^{2}

Select the intra prediction mode of every 4 × 4 block in turn, and obtain the sum of the rate-distortion costs of all 4 × 4 blocks.

(3) Using the 16 × 16 block mode, select the best of the 4 prediction modes by computing the SATD (Sum of Absolute Transformed Differences) under every mode and choosing the mode with the smallest value as the best 16 × 16 prediction mode. The SATD serves as the rate-distortion (RD) cost of the 16 × 16 block: the residual is divided into sixteen 4 × 4 blocks, each is Hadamard transformed, and the SATD is half the sum of the absolute values of the transform coefficients, as shown below, where DiffT(i, j) are the Hadamard transform coefficients.

SATD = \frac{1}{2} \sum_{i, j} \left| DiffT(i, j) \right|

(4) Compare the rate-distortion costs obtained in (2) and (3) and select the best prediction mode as the coding mode of the macroblock. The chroma block is handled in the same way as the 16 × 16 block.
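The two cost measures used in steps (2) and (3) can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are hypothetical, the multiplier is read as λ_mode = 0.85 · 2^(QP/3), the bit count is passed in as a number rather than produced by an entropy coder, and the Hadamard matrix ordering is one common choice (the ordering does not affect the SATD value).

```python
# 4 x 4 Hadamard matrix (symmetric, entries +1 / -1).
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def ssd(s, c):
    """Sum of squared differences between original block s and reconstruction c."""
    return sum((a - b) ** 2 for rs, rc in zip(s, c) for a, b in zip(rs, rc))


def lagrange_multiplier(qp):
    """Assumed reading of the multiplier: 0.85 * 2^(QP / 3)."""
    return 0.85 * 2 ** (qp / 3)


def rd_cost(s, c, rate_bits, qp):
    """Lagrangian cost J = SSD + lambda_mode * R."""
    return ssd(s, c) + lagrange_multiplier(qp) * rate_bits


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def satd(residual):
    """Half the sum of absolute Hadamard coefficients over all 4 x 4 sub-blocks.

    residual: square 2-D list whose side length is a multiple of 4.
    """
    size = len(residual)
    h4_t = [list(col) for col in zip(*H4)]
    total = 0
    for by in range(0, size, 4):
        for bx in range(0, size, 4):
            block = [row[bx:bx + 4] for row in residual[by:by + 4]]
            coeffs = _matmul(_matmul(H4, block), h4_t)
            total += sum(abs(v) for row in coeffs for v in row)
    return total / 2
```

SATD avoids the full transform, quantization and entropy-coding pass that the Lagrangian cost needs to obtain R, which is why it is the cheaper measure used for the 16 × 16 decision.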

The luma and chroma blocks of one macroblock have N8 × (N4 × 16 + N16) mode combinations in total [7], where N8, N4 and N16 are the numbers of prediction modes for the chroma block, the 4 × 4 luma block and the 16 × 16 luma block respectively. In other words, finding the best RDO mode for one macroblock takes 4 × (9 × 16 + 4) = 592 RDO calculations. The computational complexity of intra mode selection is therefore very high, which limits H.264 coding speed.
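The count of 592 follows directly from the formula above. A small Python check, with the hypothetical helper name `rdo_combinations`:

```python
def rdo_combinations(n8=4, n4=9, n16=4, blocks_4x4=16):
    """Number of RDO evaluations per macroblock: N8 * (N4 * 16 + N16)."""
    return n8 * (n4 * blocks_4x4 + n16)


full_search = rdo_combinations()  # 4 * (9 * 16 + 4)
```

Because N4 is multiplied by 16, shrinking the 4 × 4 candidate set has by far the largest effect on the total.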

To reduce the complexity of intra prediction, many researchers have studied the problem extensively. For example, Feng Pan (F. Pan, X. Lin. Fast Mode Decision for Intra Prediction. ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-G013.doc, JVT 7th Meeting, Pattaya II, Thailand, 7-14 March 2003; called the Feng Pan method below) proposed a fast intra mode selection algorithm whose basic idea is to run edge detection with the Sobel operator, obtain the edge direction of objects within the image block, and then determine the corresponding candidate prediction modes from the edge direction. Changsung Kim (C. Kim. Feature-based intra-prediction mode decision for H.264. IEEE, 2004, pp. 769-772) proposed a feature-based fast intra prediction algorithm that uses SAD and SATD as two characteristic factors and obtains the final coding mode through complex calculation and comparison. These methods improve intra coding speed to some extent but also introduce complicated preliminary computation and are therefore limited. Among them, the algorithm proposed by Feng Pan is generally regarded as the best and has been adopted by the JVT.

Summary of the invention

The object of the invention is to improve image coding speed while maintaining good image quality.

To achieve this object, the inventive concept is to raise coding speed by reducing the number of candidate modes in the intra prediction mode set. The method screens out the prediction modes along the texture direction according to the texture direction of the image, and then performs the RDO calculation.

The texture direction of the image is obtained as follows:

First, the adjacent boundary pixels are extended to two rows; then, based on the 9 prediction directions defined in the standard, 4 texture directions are defined: 0°, 45°, 90° and 135°;

Next, the mean gray-level difference of the two adjacent rows of pixels along each texture direction is computed, denoted D₀, D₄₅, D₉₀ and D₁₃₅ respectively, and the minimum is found:

D_{min} = \min(D_{0}, D_{45}, D_{90}, D_{135})

The direction corresponding to the minimum value D_min is then taken as the texture direction of the edge.

Specifically, the values of D₀, D₄₅, D₉₀ and D₁₃₅ can be calculated by the following formulas:

D_{0} = \frac{1}{N} \sum_{i=0}^{N-1} \left| I(x_{0} - 2, y_{0} + i) - I(x_{0} - 1, y_{0} + i) \right|

D_{45} = \frac{1}{3 \times N} \left( \sum_{i=0}^{2N-1} \left| I(x_{0} + i - 1, y_{0} - 2) - I(x_{0} + i, y_{0} - 1) \right| + \sum_{i=0}^{N-1} \left| I(x_{0} - 2, y_{0} + i - 1) - I(x_{0} - 1, y_{0} + i) \right| \right)

D_{90} = \frac{1}{N} \sum_{i=0}^{N-1} \left| I(x_{0} + i, y_{0} - 2) - I(x_{0} + i, y_{0} - 1) \right|

D_{135} = \frac{1}{3 \times N} \left( \sum_{i=0}^{2N-1} \left| I(x_{0} + i + 1, y_{0} - 2) - I(x_{0} + i - 1, y_{0} - 1) \right| + \sum_{i=0}^{N-1} \left| I(x_{0} - 2, y_{0} + i) - I(x_{0} - 1, y_{0} + i + 1) \right| \right)

where I(x, y) is the gray value of the pixel at coordinate (x, y), (x₀, y₀) is the coordinate of the top-left pixel of the prediction block, and N is the size of the coding block. In the present invention N is preferably 4, 8 or 16. When N is 8 or 16, the direction corresponding to the minimum D_min among D₀, D₄₅ and D₉₀ is taken as the texture direction of the edge.
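The four difference measures and the choice of the minimum can be sketched in Python as below. This is an illustrative sketch, not the patented implementation: the image is assumed to be a 2-D list indexed as `img[y][x]`, the function names are hypothetical, and the explicit bounds check is an addition so that out-of-range neighbours raise an error instead of silently wrapping around.

```python
def texture_differences(img, x0, y0, n):
    """Return {0: D0, 45: D45, 90: D90, 135: D135} for the n x n block at (x0, y0).

    img[y][x] holds gray values. The formulas read two boundary rows above
    the block and two boundary columns to its left.
    """
    height, width = len(img), len(img[0])
    if x0 < 2 or y0 < 2 or x0 + 2 * n >= width or y0 + n >= height:
        raise ValueError("block too close to the image border for this sketch")

    def I(x, y):
        return img[y][x]

    d0 = sum(abs(I(x0 - 2, y0 + i) - I(x0 - 1, y0 + i)) for i in range(n)) / n
    d90 = sum(abs(I(x0 + i, y0 - 2) - I(x0 + i, y0 - 1)) for i in range(n)) / n
    d45 = (sum(abs(I(x0 + i - 1, y0 - 2) - I(x0 + i, y0 - 1)) for i in range(2 * n))
           + sum(abs(I(x0 - 2, y0 + i - 1) - I(x0 - 1, y0 + i)) for i in range(n))) / (3 * n)
    d135 = (sum(abs(I(x0 + i + 1, y0 - 2) - I(x0 + i - 1, y0 - 1)) for i in range(2 * n))
            + sum(abs(I(x0 - 2, y0 + i) - I(x0 - 1, y0 + i + 1)) for i in range(n))) / (3 * n)
    return {0: d0, 45: d45, 90: d90, 135: d135}


def texture_direction(img, x0, y0, n):
    """Angle whose difference is smallest; 135 degrees is considered only for n == 4."""
    d = texture_differences(img, x0, y0, n)
    angles = (0, 45, 90, 135) if n == 4 else (0, 45, 90)
    return min(angles, key=lambda a: d[a])


# Synthetic example: gray value depends only on x, so pixels are constant
# down each column and D90 is zero.
vertical = [[10 * x for x in range(16)] for _ in range(12)]
direction = texture_direction(vertical, 4, 4, 4)
```

Only boundary pixels outside the block are read, so the estimate costs a few dozen subtractions per block and needs no gradient operator over the block interior.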

Screening the prediction modes along the texture direction can be carried out as follows:

Let the candidate set of the 9 prediction modes of a 4 × 4 block be F_{4×4} = {mode 0, mode 1, ..., mode 8}, the candidate set of the 4 prediction modes of a 16 × 16 block be F_{16×16} = {mode 0, mode 1, ..., mode 3}, and the candidate set of the 4 prediction modes of an 8 × 8 block be F_{8×8} = {mode 0, mode 1, ..., mode 3}. The prediction modes along the texture direction are then screened out as follows:

(1) Determine which neighbouring macroblocks of the current macroblock are available, including the macroblocks above, to the left and above-right;

(2) If the macroblocks above and to the left both use the 16 × 16 prediction mode, evaluate the intra-frame complexity of the current macroblock; if the complexity of the current macroblock is below a specified threshold, jump to step (4). The complexity of the current macroblock can be calculated by the following formula:

X_{I} = \sum_{y=0}^{M-1} \sum_{x=0}^{M-1} \left| I(x, y) - \frac{1}{M \times M} \sum_{y=0}^{M-1} \sum_{x=0}^{M-1} I(x, y) \right|

where M is the macroblock size.

(3) 4 × 4 block prediction mode selection:

The macroblock is divided into 4 × 4 blocks, and the prediction mode candidate set is chosen by the following strategy:

1) if D_min = D₉₀, the candidate set is F_{4×4} = {mode 0, mode 7, mode 5, mode 2};

2) if D_min = D₀, the candidate set is F_{4×4} = {mode 1, mode 8, mode 6, mode 2};

3) if D_min = D₄₅, the candidate set is F_{4×4} = {mode 4, mode 5, mode 6, mode 2};

4) if D_min = D₁₃₅, the candidate set is F_{4×4} = {mode 3, mode 7, mode 8, mode 2};

After the above calculation, the candidate prediction modes are reduced from the original 9 to 4. The encoder performs the RDO calculation with the prediction modes in the candidate set, finds the best mode, and computes the sum of the rate-distortion costs of all blocks;

(4) 16 × 16 luma block and 8 × 8 chroma block prediction mode selection:

The macroblock is predicted as a 16 × 16 block, and the prediction mode candidate set is chosen by the following strategy:

1) if D'_min = D₉₀, the candidate set is F_{16×16} = {mode 0, mode 2};

2) if D'_min = D₀, the candidate set is F_{16×16} = {mode 1, mode 2};

3) if D'_min = D₄₅, the candidate set is F_{16×16} = {mode 3, mode 2};

The 8 × 8 chroma block uses the same prediction modes as the 16 × 16 block. After the above calculation, the prediction modes are reduced from the original 4 to 2. The encoder performs the RDO calculation with the prediction modes in the candidate set and finds the best mode;

(5) Compare the rate-distortion costs of (3) and (4) and select the minimum-cost mode as the final coding mode.
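The candidate-set strategy in steps (3) and (4) amounts to a table lookup followed by a reduced search. A minimal Python sketch under stated assumptions: the dictionary and function names are hypothetical, directions are keyed by their angle in degrees, and the cost function is supplied by the caller (in an encoder it would be the RDO cost).

```python
# Candidate prediction modes keyed by the texture direction (degrees)
# whose difference measure is the minimum. Mode 2 (DC) is always kept.
CANDIDATES_4X4 = {
    90: (0, 7, 5, 2),
    0: (1, 8, 6, 2),
    45: (4, 5, 6, 2),
    135: (3, 7, 8, 2),
}
CANDIDATES_16X16 = {
    90: (0, 2),
    0: (1, 2),
    45: (3, 2),
}


def candidate_modes(direction, block_size):
    """Return the reduced mode set for a 4 x 4 block, or for 16 x 16 / 8 x 8 chroma."""
    table = CANDIDATES_4X4 if block_size == 4 else CANDIDATES_16X16
    if direction not in table:
        raise ValueError(f"direction {direction} is not defined for block size {block_size}")
    return table[direction]


def best_mode(candidates, cost_fn):
    """Pick the candidate with the smallest cost; cost_fn maps a mode number to a cost."""
    return min(candidates, key=cost_fn)


# Usage with made-up costs, purely for illustration.
costs = {0: 120.0, 7: 95.5, 5: 101.0, 2: 99.0}
chosen = best_mode(candidate_modes(90, 4), costs.__getitem__)
```

With these tables the encoder evaluates 4 modes instead of 9 per 4 × 4 block and 2 instead of 4 per 16 × 16 block.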

The above method reduces the RDO computation and effectively improves intra-frame coding speed, with very little change in image quality and bit rate.

Description of drawings

Fig. 1 shows the development of the international standards;

Fig. 2 is a schematic diagram of the H.264 coding framework;

Fig. 3 is a schematic diagram of the H.264 layered design;

Fig. 4 is the first frame of the Foreman (QCIF) sequence;

Fig. 5 is the 69th macroblock (a 16 × 16 macroblock) in Fig. 4;

Fig. 6 is the 12th 4 × 4 block of Fig. 5;

Fig. 7 is a schematic diagram of the adjacent boundary of a 4 × 4 block extended to two rows;

Fig. 8 shows the defined texture directions;

Fig. 9 is the PSNR comparison for the Foreman sequence;

Fig. 10 is the PSNR comparison for the Stefan sequence;

Fig. 11 is the PSNR comparison for the Carphone sequence;

Fig. 12 is the PSNR comparison for the Tempete sequence.

Embodiment

The present invention is further described below with reference to the accompanying drawings. It should be understood that the following embodiments only illustrate the invention and do not limit it; modifications or substitutions made without departing from the spirit and essence of the invention all fall within its scope.

Embodiment 1

1. Texture direction estimation

As is well known, natural images have strong spatial correlation, and the texture orientation of adjacent macroblocks is very similar; for 4 × 4 blocks the correlation is even stronger. Fig. 4 shows the first frame of the Foreman (QCIF) sequence, Fig. 5 is the 69th macroblock of that image, and the white box in Fig. 6 marks the 12th 4 × 4 block of the 69th macroblock. As the figures show, the texture orientations of adjacent macroblocks (or blocks) are very similar. Texture direction estimation is carried out below for the 4 × 4 luma block, the 16 × 16 luma block and the 8 × 8 chroma block respectively.

(1) 4 × 4 block texture direction estimation

First, the adjacent boundary pixels are extended to two rows, as shown in Fig. 7. Then, based on the 9 prediction directions defined in the standard, 4 texture directions are defined: 0°, 45°, 90° and 135°, as shown in Fig. 8.

Next, the mean gray-level difference of the two adjacent rows of pixels along each texture direction is computed, denoted D₀, D₄₅, D₉₀ and D₁₃₅ respectively, as follows:

D_{0} = \frac{1}{N} \sum_{i=0}^{N-1} \left| I(x_{0} - 2, y_{0} + i) - I(x_{0} - 1, y_{0} + i) \right|

D_{45} = \frac{1}{3 \times N} \left( \sum_{i=0}^{2N-1} \left| I(x_{0} + i - 1, y_{0} - 2) - I(x_{0} + i, y_{0} - 1) \right| + \sum_{i=0}^{N-1} \left| I(x_{0} - 2, y_{0} + i - 1) - I(x_{0} - 1, y_{0} + i) \right| \right)

D_{90} = \frac{1}{N} \sum_{i=0}^{N-1} \left| I(x_{0} + i, y_{0} - 2) - I(x_{0} + i, y_{0} - 1) \right|

D_{135} = \frac{1}{3 \times N} \left( \sum_{i=0}^{2N-1} \left| I(x_{0} + i + 1, y_{0} - 2) - I(x_{0} + i - 1, y_{0} - 1) \right| + \sum_{i=0}^{N-1} \left| I(x_{0} - 2, y_{0} + i) - I(x_{0} - 1, y_{0} + i + 1) \right| \right)

where I(x, y) is the gray value of the pixel at coordinate (x, y), (x₀, y₀) is the coordinate of the top-left pixel of the prediction block, and N is the size of the coding block; for a 4 × 4 block, N = 4. The direction corresponding to the minimum of D₀, D₄₅, D₉₀ and D₁₃₅ is then taken as the texture direction of the edge, with:

D_{min} = \min(D_{0}, D_{45}, D_{90}, D_{135})

(2) 16 × 16 luma block and 8 × 8 chroma block texture direction estimation

For the 16 × 16 luma block and the 8 × 8 chroma block there are only the vertical, horizontal and plane prediction directions, plus the DC prediction mode. Therefore only 3 texture directions need to be defined: 0°, 45° and 90°. Using the same method as for 4 × 4 blocks, the respective minimum value (D'_min) is obtained for each, and the corresponding direction is taken as the texture direction of the edge.

The above calculation gives a preliminary estimate of the texture direction of the coding block.

2. Intra mode selection algorithm

Let the candidate set of the 9 prediction modes of a 4 × 4 block be F_{4×4} = {mode 0, mode 1, ..., mode 8}, the candidate set of the 4 prediction modes of a 16 × 16 block be F_{16×16} = {mode 0, mode 1, ..., mode 3}, and the candidate set of the 4 prediction modes of an 8 × 8 block be F_{8×8} = {mode 0, mode 1, ..., mode 3}.

The algorithm is described as follows:

(2) If the macroblocks above and to the left both use the 16 × 16 prediction mode, compute the intra-frame complexity of the current macroblock:

X_{I} = \sum_{y=0}^{M-1} \sum_{x=0}^{M-1} \left| I(x, y) - \frac{1}{M \times M} \sum_{y=0}^{M-1} \sum_{x=0}^{M-1} I(x, y) \right|

where M is the macroblock size, equal to 16 here. If X_I is less than a threshold T (obtained experimentally; 256 in this example), jump directly to (4) and predict using the 16 × 16 macroblock.

(3) 4 × 4 block prediction mode selection:

After the above calculation, the candidate prediction modes are reduced from the original 9 to 4. The encoder performs the RDO calculation with the prediction modes in the candidate set, finds the best mode, and computes the sum of the rate-distortion costs of all blocks.

1) if D'_min = D₉₀, the candidate set is F_{16×16} = {mode 0, mode 2};

2) if D'_min = D₀, the candidate set is F_{16×16} = {mode 1, mode 2};

3) if D'_min = D₄₅, the candidate set is F_{16×16} = {mode 3, mode 2}; the chroma block is selected in the same way. After the above calculation, the prediction modes are reduced from the original 4 to 2. The encoder performs the RDO calculation with the prediction modes in the candidate set and finds the best mode.
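The complexity test of step (2), with M = 16 and T = 256 as stated in this embodiment, can be sketched in Python as follows. The function names and the boolean neighbour flags are hypothetical, introduced only for illustration.

```python
THRESHOLD_T = 256  # value used in this embodiment, obtained experimentally


def intra_complexity(block):
    """X_I: sum of absolute deviations of every pixel from the block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def can_skip_4x4(block, top_is_16x16, left_is_16x16, threshold=THRESHOLD_T):
    """True when the 4 x 4 search may be skipped in favour of 16 x 16 prediction.

    Both neighbouring macroblocks must have used 16 x 16 prediction and the
    current macroblock must be smooth (X_I below the threshold).
    """
    return top_is_16x16 and left_is_16x16 and intra_complexity(block) < threshold


flat = [[100] * 16 for _ in range(16)]
skip = can_skip_4x4(flat, True, True)
```

A flat macroblock has X_I = 0 and skips the 4 × 4 search entirely, which is where the largest saving comes from in smooth image regions.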

Experimental results and analysis

To verify the effectiveness of the algorithm, the experiments use the H.264 reference model JM7.5 (H. S. Malvar and A. Hallapuro. Low-complexity transform and quantization in H.264/AVC. IEEE Trans. CSVT, vol. 13(7), pp. 598-602, 2003) as the platform and implement the method in it. The experimental settings are: CAVLC entropy coding, 2 reference frames, a search range of 32, the Hadamard transform, rate-distortion optimization (RDO) enabled, IPPP and all-I sequence structures, and quantization parameters 28 and 32. Several standard video sequences were tested.

The experimental results of this method are compared with those of the full-mode algorithm and the Feng Pan algorithm; the full-mode algorithm is the exhaustive search over all prediction modes. Because the running speed of an algorithm depends heavily on the hardware platform, the three algorithms are tested on the same hardware platform and the results are compared as ratios, which removes the influence of the hardware environment. Specifically, the change in bit rate (B_CHG), the change in coding time (T_CHG) and the change in image quality (PSNR_CHG) of the test sequences are compared, computed as follows:

B\_CHG = \frac{Bits_{ours} - Bits_{all}}{Bits_{all}} \times 100\%

T\_CHG = \frac{Time_{all} - Time_{ours}}{Time_{all}} \times 100\%

PSNR\_CHG = PSNR_{ours} - PSNR_{all}

where Bits_ours and Bits_all are the numbers of bits produced by the method of the invention and by the full-mode algorithm, Time_ours and Time_all are the corresponding coding times, and PSNR_ours and PSNR_all are the corresponding image qualities. The values for the Feng Pan algorithm are computed in the same way.
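The three comparison metrics are simple ratios and differences. A minimal Python sketch, with hypothetical function names and made-up input numbers used purely to show the sign conventions:

```python
def b_chg(bits_ours, bits_all):
    """Bit-rate change in percent; positive means the fast method spends more bits."""
    return (bits_ours - bits_all) / bits_all * 100


def t_chg(time_ours, time_all):
    """Coding-time change in percent; positive means the fast method is faster."""
    return (time_all - time_ours) / time_all * 100


def psnr_chg(psnr_ours, psnr_all):
    """PSNR change in dB; negative means a quality loss."""
    return psnr_ours - psnr_all


# Illustrative inputs only, not measurements from the experiments.
example = (b_chg(104, 100), t_chg(60.0, 100.0), psnr_chg(36.4, 36.5))
```

Note that T_CHG is defined with the full-mode time first, so a time saving appears as a positive number, whereas B_CHG is positive for a bit-rate increase.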

1. IPPP structure, 120 frames coded, frame rate = 30 f/s, GOP = 15; the coding results compare as follows:

Table 1 QCIF standard sequence experimental result (QP=28)

Table 2 CIF standard sequence experimental result (QP=32)

As Tables 1 and 2 show, for the IPPP structure this method is on average 21.5% faster than the full-mode method in coding speed, with a PSNR drop of only 0.026 dB and an average bit-rate increase of about 0.593%. Compared with the algorithm of document [7], the bit-rate increase and quality loss relative to the full-mode algorithm are comparable, but this method is on average 2% faster. The main reason is that estimating edges with the Sobel operator in the Feng Pan algorithm is computationally expensive. Fig. 9 and Fig. 10 compare the luma PSNR of the Foreman (QCIF) and Stefan (CIF) sequences respectively; the dashed line is the image quality under the full-mode algorithm and the solid line is the image quality under this method. The quality loss in P frames is caused by the quality loss in the I frames used for prediction.

2. All-I frame structure, 120 frames coded, frame rate = 30 f/s; the coding results compare as follows:

Table 3 QCIF standard sequence experimental result (QP=28)

Table 4 CIF standard sequence experimental result (QP=32)

Sequence	B_CHG [7] (%)	T_CHG [7] (%)	PSNR_CHG [7] (dB)	B_CHG this method (%)	T_CHG this method (%)	PSNR_CHG this method (dB)
Tempete	3.87	38.26	-0.129	3.54	43.66	-0.125
Stefan	4.12	37.64	-0.098	4.77	39.48	-0.084
Mobile	4.36	33.63	-0.135	4.51	37.89	-0.139
Average	4.07	38.18	-0.117	4.18	42.36	-0.120

As Table 3 and Table 4 show, for the all-I-frame structure the advantage of this method is more pronounced: coding speed improves by 43% on average over the full-mode method, while picture quality drops by only 0.12 dB and the bit rate increases by about 4.0% on average. This method also outperforms the algorithm of document [7]: the bit-rate increase and picture-quality loss of the two are similar, but this method gains nearly a further 4% in coding speed on average. Figures 11 and 12 below compare the luma PSNR of the Carphone (QCIF) and Tempete (CIF) sequences respectively; the dotted line is the picture quality under the full-mode algorithm and the solid line is the picture quality under this method.

The above experimental results show that this method effectively improves the intra-frame coding speed of images while changing picture quality and bit rate very little.

Claims

1. A method for improving image intra-frame coding speed based on the H.264 standard, characterized in that the prediction modes along the texture direction are screened according to the texture direction of the image, and the RDO calculation is then performed.

2, the method for claim 1 is characterized in that, the grain direction of described image obtains by the following method:

D _min＝min(D ₀，D ₄₅，D ₉₀，D ₁₃₅)

3. The method of claim 2, characterized in that the values of D_{0}, D_{45}, D_{90} and D_{135} are calculated as follows:

D_{0} = \frac{1}{N} \sum_{i = 0}^{N - 1} \left| I(x_{0} - 2, y_{0} + i) - I(x_{0} - 1, y_{0} + i) \right|

D_{45} = \frac{1}{3 \times N} \left( \sum_{i = 0}^{2N - 1} \left| I(x_{0} + i - 1, y_{0} - 2) - I(x_{0} + i, y_{0} - 1) \right| + \sum_{i = 0}^{N - 1} \left| I(x_{0} - 2, y_{0} + i - 1) - I(x_{0} - 1, y_{0} + i) \right| \right)

D_{90} = \frac{1}{N} \sum_{i = 0}^{N - 1} \left| I(x_{0} + i, y_{0} - 2) - I(x_{0} + i, y_{0} - 1) \right|

D_{135} = \frac{1}{3 \times N} \left( \sum_{i = 0}^{2N - 1} \left| I(x_{0} + i + 1, y_{0} - 2) - I(x_{0} + i - 1, y_{0} - 1) \right| + \sum_{i = 0}^{N - 1} \left| I(x_{0} - 2, y_{0} + i) - I(x_{0} - 1, y_{0} + i + 1) \right| \right)

where I(x, y) is the gray value of the pixel at coordinates (x, y), (x_{0}, y_{0}) is the coordinate of the top-left pixel of the prediction block, and N is the size of the coding block.
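The four direction measures and the choice of the minimum can be illustrated with a short Python sketch. It follows the formulas of claim 3 term by term, but the function names, the `img[y][x]` indexing and the tie-breaking order are assumptions made for illustration. The sketch also assumes that all neighbouring pixels referenced by the formulas exist (x_0 >= 2, y_0 >= 2, and enough pixels to the right of and below the block); boundary handling is not covered here.

```python
def direction_measures(img, x0, y0, n):
    """Return {angle: D} for the N x N block whose top-left pixel is (x0, y0).

    img is a list of rows, so I(x, y) in the formulas is img[y][x].
    """
    def I(x, y):
        return img[y][x]

    # D_0: differences between the two pixel columns left of the block.
    d0 = sum(abs(I(x0 - 2, y0 + i) - I(x0 - 1, y0 + i)) for i in range(n)) / n

    # D_90: differences between the two pixel rows above the block.
    d90 = sum(abs(I(x0 + i, y0 - 2) - I(x0 + i, y0 - 1)) for i in range(n)) / n

    # D_45: diagonal differences over 2N pixels above and N pixels to the left.
    d45 = (
        sum(abs(I(x0 + i - 1, y0 - 2) - I(x0 + i, y0 - 1)) for i in range(2 * n))
        + sum(abs(I(x0 - 2, y0 + i - 1) - I(x0 - 1, y0 + i)) for i in range(n))
    ) / (3 * n)

    # D_135: the opposite diagonal, with the offsets given in the formula.
    d135 = (
        sum(abs(I(x0 + i + 1, y0 - 2) - I(x0 + i - 1, y0 - 1)) for i in range(2 * n))
        + sum(abs(I(x0 - 2, y0 + i) - I(x0 - 1, y0 + i + 1)) for i in range(n))
    ) / (3 * n)

    return {0: d0, 45: d45, 90: d90, 135: d135}


def texture_direction(img, x0, y0, n):
    """Return the angle whose measure equals D_min = min(D_0, D_45, D_90, D_135)."""
    d = direction_measures(img, x0, y0, n)
    # Assumption: on a tie, the first angle in the order 0, 45, 90, 135 wins.
    return min(d, key=d.get)
```

On a synthetic image whose gray value depends only on x (vertical stripes), the two rows above the block are identical, so D_90 is zero and the sketch reports 90; an image that depends only on y gives 0 in the same way.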

4. The method of claim 3, characterized in that N is 4, 8 or 16.

5. The method of claim 4, characterized in that, when N is 8 or 16, the direction corresponding to the minimum value D_{min} among D_{0}, D_{45} and D_{90} is taken as the texture direction of the edge.

6. The method of any one of claims 1 to 5, characterized in that, letting the candidate set of the 9 prediction modes for 4×4 blocks be F_{4×4} = {mode 0, mode 1, ..., mode 8}, the candidate set of the 4 prediction modes for 16×16 blocks be F_{16×16} = {mode 0, mode 1, ..., mode 3}, and the candidate set of the 4 prediction modes for 8×8 blocks be F_{8×8} = {mode 0, mode 1, ..., mode 3}, the prediction modes along the texture direction are screened as follows:

(2) if the macroblocks above and to the left both use 16×16 prediction modes, the intra-frame complexity of the current macroblock is evaluated; if the complexity of the current macroblock is less than the prescribed threshold, jump to step (4);

(3) 4×4 block prediction mode selection:

The encoder performs the RDO calculation with the prediction modes in the candidate set to obtain the optimal mode, and calculates the sum of the rate-distortion costs of all blocks;

1) if D'_{min} = D_{90}, the candidate set is F_{16×16} = {mode 0, mode 2};

2) if D'_{min} = D_{0}, the candidate set is F_{16×16} = {mode 1, mode 2};

The 8×8 chroma blocks use the same prediction mode as the 16×16 block; the encoder performs the RDO calculation with the prediction modes in the candidate set to obtain the optimal mode;
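The screening of 16×16 prediction modes before the RDO calculation might be sketched as below. Only the two cases stated above (D'_min = D_90 and D'_min = D_0) come from the text; falling back to the full candidate set in the remaining case, the tie-breaking order, the function names and the `rd_cost` callback are all assumptions made for illustration.

```python
# H.264 Intra 16x16 prediction mode numbers.
MODE_VERTICAL, MODE_HORIZONTAL, MODE_DC, MODE_PLANE = 0, 1, 2, 3


def candidate_set_16x16(d0, d45, d90):
    """Return the screened candidate set F_16x16 for the given direction measures."""
    d_min = min(d0, d45, d90)
    if d_min == d90:
        # Texture runs vertically: keep vertical and DC prediction.
        return [MODE_VERTICAL, MODE_DC]
    if d_min == d0:
        # Texture runs horizontally: keep horizontal and DC prediction.
        return [MODE_HORIZONTAL, MODE_DC]
    # Assumption: this case is not given in the text, so nothing is screened out.
    return [MODE_VERTICAL, MODE_HORIZONTAL, MODE_DC, MODE_PLANE]


def best_mode(candidates, rd_cost):
    """Run the (hypothetical) RDO cost function on the candidates only."""
    return min(candidates, key=rd_cost)
```

With a two-element candidate set the encoder evaluates the rate-distortion cost for two 16×16 modes instead of four, which is where the time saving comes from.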

7. The method of claim 6, characterized in that the complexity of the current macroblock is calculated by the following formula:

X_{I} = \sum_{y = 0}^{M - 1} \sum_{x = 0}^{M - 1} \left| I(x, y) - \frac{1}{M \times M} \sum_{y = 0}^{M - 1} \sum_{x = 0}^{M - 1} I(x, y) \right|

where M is the macroblock size.
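The complexity measure is the sum of absolute deviations of the macroblock's pixels from their mean. A minimal Python sketch follows; the function name and the list-of-rows representation are assumptions, and the threshold against which X_I is compared is not specified here.

```python
def macroblock_complexity(block):
    """Return X_I for a square M x M block given as a list of M rows."""
    m = len(block)
    # Mean gray value of the macroblock (the inner double sum divided by M*M).
    mean = sum(sum(row) for row in block) / (m * m)
    # Sum of absolute deviations from the mean over all pixels.
    return sum(abs(pixel - mean) for row in block for pixel in row)
```

A perfectly flat macroblock gives X_I = 0, so smooth regions fall below any positive threshold and can go straight to 16×16 mode selection.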