CN101583028A

CN101583028A - Video compression coding search algorithm

Info

Publication number: CN101583028A
Application number: CN 200810067220
Authority: CN
Inventors: 马国强
Original assignee: SHENZHEN RONGHE VISION TECHNOLOGY Co Ltd
Current assignee: Shenzhen Rongchuang Tianxia Technology Development Co., Ltd.
Priority date: 2008-05-14
Filing date: 2008-05-14
Publication date: 2009-11-18

Abstract

The invention relates to a video coding method, in particular to an algorithm that greatly reduces the computation of the sub-pixel motion search part in software and hardware implementations while maintaining video coding efficiency. A fast motion search algorithm based on the frequency domain detects object motion information by analyzing phase in the frequency domain. Compared with spatial-domain search algorithms, this analysis is closer to the nature of object motion and has very low computational complexity. The two frequency-domain variables g_m^C and g_m^S contain the translation information m; extracting m from them realizes motion search in the DCT domain. The computational complexity of the frequency-domain sub-pixel motion search algorithm is between 5% and 20% of that of the spatial-domain sub-pixel full search algorithm, which makes it particularly suitable for embedded platforms or handheld terminals.

Description

A video compression coding search algorithm

[technical field]

The present invention relates to a video coding method, and specifically to an algorithm that significantly reduces the computation of the sub-pixel motion search part in software and hardware implementations while maintaining video coding efficiency.

[background technology]

In existing sub-pixel search algorithms for video compression coding, the widely used techniques are the spatial-domain full search algorithm and its various fast variants. These algorithms search the search window for a matching block in units of pixel blocks, using the sum of squared differences or the sum of absolute differences as the decision criterion. The search requires repeated interpolation filtering and repeated evaluation of the cost function, so the computational complexity is very high. Experiments show that once sub-pixel accuracy is introduced, the computational cost of the motion search is often more than double that of the original integer-pixel search. Moreover, the matching accuracy depends on the precision of the interpolation algorithm, which affects coding efficiency to some extent. In conventional video coding, the picture being coded first undergoes a spatial-to-frequency transform; quantization and entropy coding are completed in the frequency domain; a frequency-to-spatial transform then returns to the spatial domain for motion search and storage. The high computational complexity of such full search algorithms in sub-pixel search costs time, lowers efficiency, and also affects the precision of video coding.

[summary of the invention]

To address the high complexity and low precision of existing search algorithms, the present invention proposes a novel search algorithm that predicts and searches motion vectors directly in the frequency domain using phase correlation. Because the sub-pixel search does not need to return to the spatial domain, no interpolation is needed and no cost function has to be evaluated, which greatly cuts the computational cost that motion search brings to video coding. The algorithm is suitable for embedded platforms that need video content services.

The DCT is the transform generally adopted by current video coding standards. The flow of the sub-pixel motion search algorithm in the DCT domain is as follows:

1) Determine the search window size N; along the x direction, extract the one-dimensional signal x_1(n) that starts at integer pixel F of the reference picture, and the signal x_2(n) at the corresponding position in the current picture;

2) Compute the four discrete DCT/DST transform coefficients of x_1(n) and x_2(n) according to the DCT and DST transform formulas;

3) Compute g_m^S on the interval [1, N]; from the DCT and DST transform formulas and the equation they satisfy:

g_m^S(k) = \begin{cases} 1, & k = N \\ \dfrac{Z_1^C(k) \cdot X_2^S(k) - Z_1^S(k) \cdot X_2^C(k)}{(Z_1^C(k))^2 + (Z_1^S(k))^2}, & k \in [1, N) \end{cases}

4) According to the g_m equations, analyze the sign of the delta response to obtain the sign of the displacement m_x;

5) Repeat the above steps in the y direction to obtain the sign of the displacement m_y;

6) Look up Table 1 with the parameters m_x and m_y to determine the match point in the sub-pixel spatial position map, and at the same time determine the half-pixel motion vector.

Table 1: m and motion vector

m_x	m_y	Match point	Motion vector
> 0	> 0	3	(0.5, 0.5)
> 0	< 0	8	(0.5, -0.5)
> 0	= 0	5	(0.5, 0)
< 0	> 0	1	(-0.5, 0.5)
< 0	< 0	6	(-0.5, -0.5)
< 0	= 0	4	(-0.5, 0)
= 0	> 0	2	(0, 0.5)
= 0	< 0	7	(0, -0.5)
= 0	= 0	F	(0, 0)
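The lookup in Table 1 depends only on the signs of m_x and m_y, so it can be written as a small dictionary. The following is a minimal Python sketch; the names `sign`, `TABLE_1` and `half_pel_vector` are illustrative assumptions and do not come from the patent.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

# Table 1 keyed on (sign of m_x, sign of m_y).
# Each value is (match point label, half-pixel motion vector).
TABLE_1 = {
    (1, 1): ("3", (0.5, 0.5)),
    (1, -1): ("8", (0.5, -0.5)),
    (1, 0): ("5", (0.5, 0.0)),
    (-1, 1): ("1", (-0.5, 0.5)),
    (-1, -1): ("6", (-0.5, -0.5)),
    (-1, 0): ("4", (-0.5, 0.0)),
    (0, 1): ("2", (0.0, 0.5)),
    (0, -1): ("7", (0.0, -0.5)),
    (0, 0): ("F", (0.0, 0.0)),
}

def half_pel_vector(m_x, m_y):
    """Map the displacement signs to (match point, motion vector)."""
    return TABLE_1[(sign(m_x), sign(m_y))]
```

Only the signs are consulted, so any positive m_x with any negative m_y selects match point 8.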

The g_m equations referred to above are:

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^S \sin\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) - \delta(m + n + 1)

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^C \cos\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) + \delta(m + n + 1)

The DCT and DST transform formulas referred to above are defined as follows:

X_2^C(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_2(n) \cos\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1]

X_2^S(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_2(n) \sin\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1]

Z_1^C(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_1(n) \cos\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1]

Z_1^S(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_1(n) \sin\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1]

In the formulas above,

C(k) = \begin{cases} \frac{1}{\sqrt{2}}, & k \in \{0, N\} \\ 1, & k \in [1, N-1] \end{cases}
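The four transforms can be evaluated directly from their definitions. The sketch below is a plain, unoptimized Python rendering of those sums; the function names `c` and `dct_dst_coefficients` are hypothetical, and a real encoder would presumably use a fast transform rather than these O(N^2) loops.

```python
import math

def c(k, n_win):
    """Scaling factor C(k): 1/sqrt(2) at k = 0 or k = N, otherwise 1."""
    return 1 / math.sqrt(2) if k in (0, n_win) else 1.0

def dct_dst_coefficients(x1, x2):
    """Return the lists (Z1C, Z1S, X2C, X2S), each indexed by k = 0..N-1."""
    n_win = len(x1)
    z1c, z1s, x2c, x2s = [], [], [], []
    for k in range(n_win):
        w = k * math.pi / n_win
        scale = 2 / n_win * c(k, n_win)
        # The reference signal x1 uses the phase (k*pi/N) * n.
        z1c.append(scale * sum(v * math.cos(w * n) for n, v in enumerate(x1)))
        z1s.append(scale * sum(v * math.sin(w * n) for n, v in enumerate(x1)))
        # The current signal x2 uses the half-sample phase (k*pi/N) * (n + 0.5).
        x2c.append(scale * sum(v * math.cos(w * (n + 0.5)) for n, v in enumerate(x2)))
        x2s.append(scale * sum(v * math.sin(w * (n + 0.5)) for n, v in enumerate(x2)))
    return z1c, z1s, x2c, x2s
```

Note that the reference signal and the current signal use different phase offsets (n versus n + 0.5), exactly as in the definitions.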

The equation that these transforms satisfy is:

\begin{bmatrix} X_2^C(k) \\ X_2^S(k) \end{bmatrix} = \begin{bmatrix} Z_1^C(k) & -Z_1^S(k) \\ Z_1^S(k) & Z_1^C(k) \end{bmatrix} \begin{bmatrix} g_m^C(k) \\ g_m^S(k) \end{bmatrix}

where

g_m^S = \sin\left(\frac{k\pi}{N}(m + 0.5)\right),

g_m^C = \cos\left(\frac{k\pi}{N}(m + 0.5)\right).

This equation can be rewritten as

\vec{X}(k) = Z(k) \vec{\Omega}(k),

It can be proved that Z(k) is an orthogonal matrix satisfying \lambda Z^T(k) Z(k) = I_2, where I_2 is the 2 × 2 identity matrix. The equation can therefore be solved as:

\vec{\Omega}(k) = \lambda Z^T(k) \vec{X}(k),

which yields g_m^C and g_m^S.

In actual computation, approximate replacements can be applied to two of the expressions (the expressions themselves are not reproduced in this text) to further reduce the amount of calculation.

If a motion vector of 1/4-pixel accuracy is required, interpolate with a bilinear filter according to the motion vector obtained in step 6), then repeat steps 1)-6) on the resulting pixel block.

The algorithm and formulas above show that the two frequency-domain variables g_m^C and g_m^S contain the translation information m; extracting m from them realizes motion search in the DCT domain. The frequency-domain fast motion search algorithm detects object motion information by analyzing phase in the frequency domain. Compared with spatial-domain search algorithms, this analysis is closer to the nature of object motion and has very low computational complexity. The computational complexity of the proposed frequency-domain sub-pixel motion search algorithm is approximately 5% to 20% of that of the spatial-domain sub-pixel full search algorithm, which makes it particularly suitable for embedded platforms or handheld terminals.

[description of drawings]

Fig. 1 is a schematic diagram of the delta response when the object translates to the right by m;

Fig. 2 is a schematic diagram of the delta response when the object translates to the left by m;

Fig. 3 is a comparison of computational complexity on each standard test sequence;

Fig. 4 is a schematic diagram of the sub-pixel spatial positions.

[embodiment]

The DCT is the transform generally adopted by current video coding standards. Its energy compaction is close to that of the Karhunen-Loève transform: it concentrates most of the energy in the DC and low-frequency components, so that after low-pass filtering, picture quality can be maintained at a high compression ratio. Accordingly, the present invention computes the spatial translation from the phase in the DCT domain. Because of the particular nature of the DCT, the DCT domain no longer has the simple shift-to-phase correspondence of the Fourier transform.

Consider a one-dimensional discrete signal { x_1(n) | n ∈ [0, N-1] }, where N is the size of the search window. Shifting it to the right by m forms the signal { x_2(n) | n ∈ [0, N-1] }:

x_2(n) = \begin{cases} x_1(n - m), & n \geq m \\ 0, & n < m \end{cases} \qquad (2)

The DCT and DST are defined as follows:

X_2^C(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_2(n) \cos\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1] \qquad (3)

X_2^S(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_2(n) \sin\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1] \qquad (4)

Z_1^C(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_1(n) \cos\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1] \qquad (5)

Z_1^S(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_1(n) \sin\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1] \qquad (6)

In the formulas above,

C(k) = \begin{cases} \frac{1}{\sqrt{2}}, & k \in \{0, N\} \\ 1, & k \in [1, N-1] \end{cases} \qquad (7)

It is easy to prove that these four transforms satisfy the following equation:

\begin{bmatrix} X_2^C(k) \\ X_2^S(k) \end{bmatrix} = \begin{bmatrix} Z_1^C(k) & -Z_1^S(k) \\ Z_1^S(k) & Z_1^C(k) \end{bmatrix} \begin{bmatrix} g_m^C(k) \\ g_m^S(k) \end{bmatrix} \qquad (8)

where

g_m^S = \sin\left(\frac{k\pi}{N}(m + 0.5)\right),

g_m^C = \cos\left(\frac{k\pi}{N}(m + 0.5)\right).
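The identity in (8) can be checked by direct substitution. The derivation below is a sketch under one added assumption that the text does not state: the shifted content stays inside the window, i.e. x_1(j) = 0 for j > N-1-m, so that the substitution j = n - m covers every nonzero sample.

```latex
\begin{aligned}
X_2^C(k) &= \frac{2}{N} C(k) \sum_{n=m}^{N-1} x_1(n-m)
            \cos\left(\frac{k\pi}{N}(n+0.5)\right) \\
         &= \frac{2}{N} C(k) \sum_{j=0}^{N-1} x_1(j)
            \cos\left(\frac{k\pi}{N} j + \frac{k\pi}{N}(m+0.5)\right) \\
         &= Z_1^C(k)\, g_m^C(k) - Z_1^S(k)\, g_m^S(k), \\
X_2^S(k) &= \frac{2}{N} C(k) \sum_{j=0}^{N-1} x_1(j)
            \sin\left(\frac{k\pi}{N} j + \frac{k\pi}{N}(m+0.5)\right) \\
         &= Z_1^S(k)\, g_m^C(k) + Z_1^C(k)\, g_m^S(k).
\end{aligned}
```

Both results use only the angle-addition formulas for cosine and sine, applied term by term inside the sums.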

Equation (8) can be rewritten as

\vec{X}(k) = Z(k) \vec{\Omega}(k).

It can be proved that Z(k) is an orthogonal matrix, and that:

\lambda Z^T(k) Z(k) = I_2 \qquad (9)

where I_2 is the 2 × 2 identity matrix. The equation can therefore be solved as:

\vec{\Omega}(k) = \lambda Z^T(k) \vec{X}(k) \qquad (10)

which yields g_m^C and g_m^S.
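Equation (10) reduces to a 2 × 2 solve per frequency index k. The sketch below assumes λ = 1 / ((Z_1^C)^2 + (Z_1^S)^2), a value the text does not state but which follows from requiring λ Z^T Z = I_2 when Z is the matrix with rows (Z_1^C, -Z_1^S) and (Z_1^S, Z_1^C); the function name `solve_g` is hypothetical.

```python
def solve_g(z1c, z1s, x2c, x2s):
    """Solve [x2c, x2s]^T = Z [g_c, g_s]^T with Z = [[z1c, -z1s], [z1s, z1c]]."""
    energy = z1c * z1c + z1s * z1s
    if energy == 0.0:
        raise ValueError("Z(k) is singular: both reference coefficients are zero")
    lam = 1.0 / energy  # assumed value of lambda, see the note above
    # Omega = lambda * Z^T * X, written out component by component.
    g_c = lam * (z1c * x2c + z1s * x2s)
    g_s = lam * (z1c * x2s - z1s * x2c)
    return g_c, g_s
```

For example, with Z_1^C = 2, Z_1^S = -1 and (g_c, g_s) = (0.6, 0.8), the forward product gives (X_2^C, X_2^S) = (2, 1), and `solve_g(2.0, -1.0, 2.0, 1.0)` recovers (0.6, 0.8) up to rounding.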

By the orthogonality of the sine and cosine functions, the following relations hold:

\sum_{k=1}^{N} C^2(k) \sin\left(\frac{k\pi}{N}(m + 0.5)\right) \sin\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) - \delta(m + n + 1) \qquad (11)

\sum_{k=0}^{N-1} C^2(k) \cos\left(\frac{k\pi}{N}(m + 0.5)\right) \cos\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) + \delta(m + n + 1) \qquad (12)

where \delta(n) is the discrete impulse function.

From equations (8) and (10)-(12), the equations for g_m follow:

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^S \sin\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) - \delta(m + n + 1) \qquad (13)

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^C \cos\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) + \delta(m + n + 1) \qquad (14)

These two frequency-domain variables contain the translation information m. If a fast algorithm can solve for g_m^C and g_m^S and extract m from them, motion search in the DCT domain can be realized.

Analyzing equations (13) and (14): when m > 0 and m lies within the search window [0, N], a positive delta response is found at n = m and a negative delta response at n = -m-1. When m < 0 and m lies within the negative mirror of the search window, [-N, 0), the positive delta response at n = m falls in the mirror window, and the negative delta response at n = -m-1 falls inside the search window. As shown in Fig. 1(a) and Fig. 1(b), the gray area is the search window. If a positive delta response is found in the search window [0, N], the object has translated to the right, m > 0; if a negative delta response is found in the search window [0, N], the object has translated to the left, m < 0. Table 1 below shows that knowing only the sign of m, without solving for its exact value, is enough to determine the sub-pixel motion vector, which greatly reduces the amount of calculation.
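The sign behaviour of (13) can be checked numerically. The sketch below evaluates the left-hand side of (13) inside the search window for an exact g_m^S; it uses integer m purely for illustration, and the function names `delta_response` and `exact_g_s` are hypothetical.

```python
import math

def delta_response(g_s):
    """Evaluate the left-hand side of (13) for n = 0..N-1.

    g_s[k - 1] holds g_m^S(k) for k = 1..N.
    """
    n_win = len(g_s)
    out = []
    for n in range(n_win):
        acc = 0.0
        for k in range(1, n_win + 1):
            c_sq = 0.5 if k == n_win else 1.0  # C(k)^2 from (7)
            acc += c_sq * g_s[k - 1] * math.sin(k * math.pi / n_win * (n + 0.5))
        out.append(2 / n_win * acc)
    return out

def exact_g_s(m, n_win):
    """g_m^S(k) = sin((k*pi/N) * (m + 0.5)) for k = 1..N."""
    return [math.sin(k * math.pi / n_win * (m + 0.5)) for k in range(1, n_win + 1)]

right = delta_response(exact_g_s(2, 8))   # object moved right by 2
left = delta_response(exact_g_s(-3, 8))   # object moved left by 3
```

With N = 8, `right` has a single positive peak at n = 2 (that is, n = m), while `left` has a single negative peak at n = 2 (that is, n = -m-1 for m = -3). Inside the window, the sign of the peak is therefore enough to tell the direction of motion.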

In actual computation, approximate replacements can be applied to two of the expressions (the expressions themselves are not reproduced in this text) to further reduce the amount of calculation.

On the basis of the equations established above, the flow of the sub-pixel motion search algorithm in the DCT domain is as follows:

Step 1: Determine the search window size N; along the x direction, extract the one-dimensional signal x_1(n) that starts at integer pixel F of the reference picture, and the signal x_2(n) at the corresponding position in the current picture;

Step 2: Compute the four discrete DCT/DST transform coefficients of x_1(n) and x_2(n) according to equations (3)-(6);

Step 3: Compute g_m^S on the interval [1, N]; from equations (3)-(6) and (8):

g_m^S(k) = \begin{cases} 1, & k = N \\ \dfrac{Z_1^C(k) \cdot X_2^S(k) - Z_1^S(k) \cdot X_2^C(k)}{(Z_1^C(k))^2 + (Z_1^S(k))^2}, & k \in [1, N) \end{cases} \qquad (15)

Step 4: From the sign of the delta response in equation (13), obtain the sign of the displacement m_x;

Step 5: In the same way as the previous step, obtain the sign of the displacement m_y in the y direction;

Step 6: Look up Table 1 with the parameters m_x and m_y to determine the match point in the sub-pixel spatial position diagram, and at the same time determine the half-pixel motion vector.

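Table 1: m and motion vector

m_x	m_y	Match point	Motion vector
> 0	> 0	3	(0.5, 0.5)
> 0	< 0	8	(0.5, -0.5)
> 0	= 0	5	(0.5, 0)
< 0	> 0	1	(-0.5, 0.5)
< 0	< 0	6	(-0.5, -0.5)
< 0	= 0	4	(-0.5, 0)
= 0	> 0	2	(0, 0.5)
= 0	< 0	7	(0, -0.5)
= 0	= 0	F	(0, 0)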
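The one-dimensional part of the flow (computing the transform coefficients, forming g_m^S by (15), and reading the sign of the delta response in (13)) can be sketched end to end as follows. This is an illustration under stated assumptions, not the patent's implementation: it uses an integer shift whose content stays inside the window, for which (8) holds exactly; the text does not spell out how fractional displacements are handled, and the names `transforms` and `estimate_shift` are hypothetical.

```python
import math

def c(k, n_win):
    """Scaling factor C(k) from (7)."""
    return 1 / math.sqrt(2) if k in (0, n_win) else 1.0

def transforms(x1, x2, k):
    """Return (Z1C, Z1S, X2C, X2S) at frequency index k, per (3)-(6)."""
    n_win = len(x1)
    w = k * math.pi / n_win
    s = 2 / n_win * c(k, n_win)
    z1c = s * sum(v * math.cos(w * n) for n, v in enumerate(x1))
    z1s = s * sum(v * math.sin(w * n) for n, v in enumerate(x1))
    x2c = s * sum(v * math.cos(w * (n + 0.5)) for n, v in enumerate(x2))
    x2s = s * sum(v * math.sin(w * (n + 0.5)) for n, v in enumerate(x2))
    return z1c, z1s, x2c, x2s

def estimate_shift(x1, x2):
    """Estimate the signed shift m of x2 relative to x1 from the delta response."""
    n_win = len(x1)
    g_s = {n_win: 1.0}  # k = N branch of (15)
    for k in range(1, n_win):
        z1c, z1s, x2c, x2s = transforms(x1, x2, k)
        energy = z1c * z1c + z1s * z1s
        # Guard against an empty frequency bin (an assumption of this sketch).
        g_s[k] = (z1c * x2s - z1s * x2c) / energy if energy > 1e-12 else 0.0
    # Left-hand side of (13), evaluated inside the window n = 0..N-1.
    resp = [
        2 / n_win * sum(
            c(k, n_win) ** 2 * g_s[k] * math.sin(k * math.pi / n_win * (n + 0.5))
            for k in range(1, n_win + 1)
        )
        for n in range(n_win)
    ]
    peak = max(range(n_win), key=lambda n: abs(resp[n]))
    # Positive peak at n means m = n; negative peak at n means m = -n - 1.
    return peak if resp[peak] > 0 else -(peak + 1)

x1 = [0, 3, 1, 0, 0, 0, 0, 0]
x2 = [0, 0, 0, 3, 1, 0, 0, 0]  # x1 moved right by two samples
```

Here `estimate_shift(x1, x2)` yields a positive value and `estimate_shift(x2, x1)` a negative one. Only the sign of the result is needed for the Table 1 lookup, once for the x direction and once for the y direction.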

The experiment was carried out on an H.264 encoding platform. The test video encoding software was JM8.6 (released by the JVT), and the coding parameters were: quantization parameter = 26, one reference frame. The standard test sequences were Foreman, Container, News, Silent and Carphone, each of QCIF size and 100 frames long. Fig. 3 compares the computational complexity of the proposed algorithm with that of the full search algorithm in sub-pixel search. Because the picture content of each test sequence differs, the computational complexity differs as well; for simplicity, the computational complexity of the full search algorithm on each test sequence is set to 1 as the baseline for comparison. As Fig. 3 shows, the computational complexity of the proposed frequency-domain sub-pixel motion search algorithm is approximately 5% to 20% of that of the spatial-domain sub-pixel full search algorithm.

The coding performance of the algorithm is shown in the table below. Compared with the full search algorithm, the peak signal-to-noise ratio (PSNR) of pictures coded with this algorithm drops by 0.04 to 0.65 dB, depending on the test sequence.

The following table shows the change in coding performance for the different test sequences:

Test sequence	Spatial-domain full search PSNR (dB)	Frequency-domain sub-pixel motion search PSNR (dB)	Change in PSNR (dB)
Foreman	36.77	36.16	-0.61
Container	37.33	37.29	-0.04
News	38.16	37.74	-0.42
Silent	37.28	37.19	-0.09
Carphone	38.38	37.73	-0.65

The frequency-domain fast motion search algorithm detects object motion information by analyzing phase in the frequency domain; compared with spatial-domain search algorithms, this analysis is closer to the nature of object motion. Because of its very low computational complexity, it is particularly suitable for embedded platforms or handheld terminals.

The above embodiments describe the present invention by way of example only. Those skilled in the art can design various embodiments according to different practical needs without departing from the scope and spirit protected by the present invention.

Claims

1. A video compression coding search algorithm, whose flow is as follows:

1) Determine the search window size N; along the x direction, extract the one-dimensional signal x_1(n) that starts at integer pixel F of the reference picture, and the signal x_2(n) at the corresponding position in the current picture;

2) Compute the four discrete DCT/DST transform coefficients of x_1(n) and x_2(n) according to the DCT and DST transform formulas;

3) Compute g_m^S on the interval [1, N]; from the DCT and DST transform formulas and the equation they satisfy:

g_m^S(k) = \begin{cases} 1, & k = N \\ \dfrac{Z_1^C(k) \cdot X_2^S(k) - Z_1^S(k) \cdot X_2^C(k)}{(Z_1^C(k))^2 + (Z_1^S(k))^2}, & k \in [1, N) \end{cases}

4) According to the g_m equations, analyze the sign of the delta response to obtain the sign of the displacement m_x;

5) Repeat the above steps in the y direction to obtain the sign of the displacement m_y;

6) Look up Table 1 with the parameters m_x and m_y to determine the match point in the sub-pixel spatial position map, and at the same time determine the half-pixel motion vector.

Table 1: m and motion vector

m_x	m_y	Match point	Motion vector
> 0	> 0	3	(0.5, 0.5)
> 0	< 0	8	(0.5, -0.5)
> 0	= 0	5	(0.5, 0)
< 0	> 0	1	(-0.5, 0.5)
< 0	< 0	6	(-0.5, -0.5)
< 0	= 0	4	(-0.5, 0)
= 0	> 0	2	(0, 0.5)
= 0	< 0	7	(0, -0.5)
= 0	= 0	F	(0, 0)

2. The video compression coding search algorithm according to claim 1, characterized in that the g_m equations in step 4) are:

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^S \sin\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) - \delta(m + n + 1)

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^C \cos\left(\frac{k\pi}{N}(n + 0.5)\right) = \delta(m - n) + \delta(m + n + 1).

3. The video compression coding search algorithm according to claim 1, characterized in that the DCT and DST transform formulas in steps 2) and 3) are:

X_2^C(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_2(n) \cos\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1]

X_2^S(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_2(n) \sin\left(\frac{k\pi}{N}(n + 0.5)\right), \quad k \in [0, N-1]

Z_1^C(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_1(n) \cos\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1]

Z_1^S(k) = \frac{2}{N} C(k) \sum_{n=0}^{N-1} x_1(n) \sin\left(\frac{k\pi}{N} n\right), \quad k \in [0, N-1]

In the formulas above,

C(k) = \begin{cases} \frac{1}{\sqrt{2}}, & k \in \{0, N\} \\ 1, & k \in [1, N-1] \end{cases}.

4. The video compression coding search algorithm according to claim 1, characterized in that the equation satisfied in step 3) is:

\begin{bmatrix} X_2^C(k) \\ X_2^S(k) \end{bmatrix} = \begin{bmatrix} Z_1^C(k) & -Z_1^S(k) \\ Z_1^S(k) & Z_1^C(k) \end{bmatrix} \begin{bmatrix} g_m^C(k) \\ g_m^S(k) \end{bmatrix},

where

g_m^S = \sin\left(\frac{k\pi}{N}(m + 0.5)\right),

g_m^C = \cos\left(\frac{k\pi}{N}(m + 0.5)\right).

5. The video compression coding search algorithm according to claim 4, characterized in that the satisfied equation can be rewritten as

\vec{X}(k) = Z(k) \vec{\Omega}(k)

and solved as

\vec{\Omega}(k) = \lambda Z^T(k) \vec{X}(k),

which yields g_m^C and g_m^S.

6. The video compression coding search algorithm according to claim 2, characterized in that approximate replacement is applied to

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^S \sin\left(\frac{k\pi}{N}(n + 0.5)\right)

and to

\frac{2}{N} \sum_{k=1}^{N} C^2(k) g_m^C \cos\left(\frac{k\pi}{N}(n + 0.5)\right)

to further reduce the amount of calculation (the counterpart expressions are not reproduced in this text).

7. The video compression coding search algorithm according to claim 1, characterized in that, when a motion vector of 1/4-pixel accuracy is required, interpolation is performed with a bilinear filter according to the motion vector obtained in step 6), and steps 1)-6) are repeated on the resulting pixel block.