CN100502511C - Method for organizing interpolation image memory for fractional pixel precision predication - Google Patents

Method for organizing interpolation image memory for fractional pixel precision predication

Info

Publication number
CN100502511C
Authority
CN
China
Prior art keywords
pixel
subclass
image
interpolation
integer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 200410076759
Other languages
Chinese (zh)
Other versions
CN1750659A (en)
Inventor
罗忠
王静
宋彬
常义林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN 200410076759 priority Critical patent/CN100502511C/en
Publication of CN1750659A publication Critical patent/CN1750659A/en
Application granted granted Critical
Publication of CN100502511C publication Critical patent/CN100502511C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

This invention provides a memory organization method for storing an interpolated image efficiently according to pixel attributes. In fractional-pixel-precision motion prediction, the pixels are divided into an integer-position subset, a 1/2-position subset, a 1/4-position subset, and so on up to a 1/2^n-position subset, and are finally divided into 2^(2n) sub-images, each the same size as the original image; the sub-images are then formed into one contiguous memory area and stored in memory. The invention also provides a method that uses the single-instruction multiple-data (SIMD) acceleration technology of a digital signal processor (DSP) to perform the interpolation filtering that generates the interpolated image, and a method for quickly calculating the cost function (SAD, sum of absolute differences) in motion estimation.

Description

Interpolation image memory organization method for fractional-pixel-precision motion prediction
Technical field
The present invention relates to fractional-pixel-precision motion prediction algorithms in video compression coding and, more particularly, to an interpolation image memory organization method for fractional-pixel-precision motion prediction in video compression coding, a method based on this memory organization for generating each fractional pixel of the 4x interpolated image, and a method based on these two methods for quickly calculating the prediction error measure SAD.
Background
The mainstream video compression coding standards currently used in industry, whether the international H.263, H.263+, H.264 and MPEG-4 standards or the domestic AVS (Advanced Audio-Video System, China's advanced audio and video coding system), are all based on a common framework: image partitioning into blocks + motion prediction + DCT of the residual image (or an integer transform or Hadamard transform) + quantization + entropy coding. The correlation between successive frames of a moving image is exploited: a region in the frame after motion is predicted from the corresponding region in a preceding frame, and the resulting residual image is quantized and entropy coded. This makes full use of the statistical correlation between frames to remove redundancy and thereby compress the data. Motion prediction is therefore the core of the video compression coding standards built on this common framework, and it is the main factor affecting overall compression efficiency.
The general process of motion prediction is as follows: for a given region (such as a macroblock, MB) in the current video frame, the reference frame (the previous frame, or one of the preceding k frames when multiple reference frames are used) is searched for the matching region that is optimal under some error criterion. The change in geometric position of the predicted region relative to the reference region can be represented by a two-dimensional vector, called the motion vector or displacement vector. The process of searching for the optimal motion vector under a particular error criterion is called motion estimation and is one part of motion prediction as a whole. The efficiency of motion prediction depends on the residual image between the predicted region and the predicting region: the smaller the residual, the higher the efficiency. The efficiency of motion prediction is therefore determined by the precision of motion estimation, and that precision depends directly on the motion vector.
The images in digital video are the result of discretely sampling and digitizing analog video in time and space: sampling in time forms the discrete frames, and sampling in space forms the pixels within each frame. Within a frame, the pixels are obtained by sampling the spatially continuous analog image at a certain sampling interval, so the distance between two adjacent pixels is exactly the sampling interval. To represent motion vectors more accurately, the concept of fractional sample position motion prediction is introduced. With fractional sample positions, fractional-position pixels are imagined to lie between two adjacent integer pixels, such as 1/2 pixels (1/2 of a sampling interval from an integer pixel), 1/4 pixels (1/4 of a sampling interval from an integer pixel), 1/8 pixels (1/8 of a sampling interval from an integer pixel), and so on. Practice has shown that fractional sample position motion prediction improves video compression coding efficiency considerably: with 1/4-pixel-precision motion prediction, the PSNR (Peak Signal-to-Noise Ratio) of the compressed video can typically improve by 2 dB. At present, H.263 and H.263+ use 1/2-pixel-precision motion prediction, H.264 uses 1/4-pixel-precision motion prediction, and the domestic AVS standard uses 1/4-pixel-precision motion prediction.
The general process of fractional sample position motion prediction is as follows. First, an integer-pixel motion estimation algorithm, such as full search, three-step search, new three-step search, or four-step search, is used to obtain the optimal integer-pixel motion vector. Next, 1/2-pixel motion estimation is carried out around this integer-pixel motion vector position to find the optimal 1/2-pixel motion vector position. If 1/4-pixel motion estimation is required, it is carried out around this optimal 1/2-pixel position, taken as the center. Likewise, once the optimal 1/4-pixel motion vector position has been obtained, 1/8-pixel motion estimation can be carried out.
Take as an example the standard implementation of 1/4-pixel-precision motion prediction specified in H.264/AVC, shown in Fig. 1. In the figure, the shaded circle is the current macroblock position, the unshaded circles are integer pixel positions, the triangles are 1/2 pixel positions, and the small dots are 1/4 pixel positions; the arrows indicate the path followed during the motion search. The motion vector is the vector formed by the difference in position, in the x and y directions, between the current macroblock and its reference macroblock, for example [2, -3]. In the H.264/AVC standard, the search procedure of 1/4-pixel-precision motion prediction can be divided into three steps:
1) using some motion estimation method, find the best integer-pixel match position;
2) among the best integer-pixel match position and the 8 surrounding 1/2 pixel positions, find the best 1/2-pixel match position;
3) among the best 1/2-pixel match position and the 8 surrounding 1/4 pixel positions, find the best 1/4-pixel match position.
In the above process, the match search at integer pixel positions uses the locally decoded reconstructed image of some preceding frame as the reference image. The match search at 1/2 and 1/4 pixel positions instead uses the interpolated version of that locally decoded reconstructed image as the reference image; the width and height of this reference image are both 4 times those of the original image.
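The three-step refinement above can be sketched as follows. This is a minimal illustration, not code from the patent: the function names and the `cost` callback (which stands in for a SAD evaluation at a candidate motion vector given in quarter-pixel units) are assumptions, and the integer-pixel search itself is taken as already done.

```python
def refine(cost, best, step):
    """Compare `best` with its 8 neighbours at distance `step` (quarter-pixel units)."""
    bx, by = best
    candidates = [(bx + dx * step, by + dy * step)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return min(candidates, key=cost)


def quarter_pel_search(cost, integer_best):
    """integer_best is in integer-pixel units; the result is in quarter-pixel units."""
    start = (integer_best[0] * 4, integer_best[1] * 4)
    half_best = refine(cost, start, 2)   # step 2: the 8 surrounding 1/2-pixel positions
    return refine(cost, half_best, 1)    # step 3: the 8 surrounding 1/4-pixel positions
```

Each refinement stage evaluates at most 9 candidates, so the fractional part of the search costs a small, fixed number of block comparisons once the integer-pixel match is known.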
The structure of the 4x reference image is shown in Fig. 2. Its pixels fall into the following classes:
1) Integer pixels: pixels whose row and column coordinates are both integer multiples of the sampling interval, such as the shaded circle pixels A, B, C, D, E, F, G, H, I, J, K, L, etc. in Fig. 2.
2) 1/2 pixels: pixels for which at least one of the row and column coordinates has the form (k+1/2)d or (k-1/2)d, and neither coordinate has the form (k+1/4)d or (k-1/4)d, where k is an integer and d is the sampling interval. As shown in Fig. 2, the 1/2 pixels are further divided into two subclasses:
A. Full 1/2 pixels: both the row and the column coordinate have the form (k+1/2)d or (k-1/2)d, such as pixels j, gg, hh, etc.
B. Partial 1/2 pixels: only one of the row and column coordinates has the form (k+1/2)d or (k-1/2)d, for example pixels b, h, m, s, aa, bb, cc, dd, ee, ff, etc.
3) 1/4 pixels: pixels for which at least one of the row and column coordinates has the form (k+1/4)d or (k-1/4)d, where k is an integer and d is the sampling interval, such as the unshaded circle pixels a, c, d, e, f, g, i, k, etc. in Fig. 2.
The 4x reference image is generated by a multi-stage interpolation process, in the following steps:
1) Generate the 1/2 pixels from the integer pixels by interpolation. The interpolation filter is a 6-tap FIR (Finite Impulse Response) filter with weight vector w = [1, -5, 20, 20, -5, 1]^T. The process is as follows:
A. Generate the partial 1/2 pixels from the integer pixels by interpolation. Taking pixels b and h as examples:
b1 = (E - 5*F + 20*G + 20*H - 5*I + J)    (generate the intermediate value b1)
b = Clip((b1 + 16) >> 5)    (offset, normalize, clip)
Here, offsetting means adding a number (the offset may be positive or negative); normalization means dividing a variable by a positive number so that, over the variable's whole value range, the absolute value of the quotient never exceeds 1; clipping means forcing a variable that exceeds a given range back into that range. For example, if the range of a variable x is [0, 18] and x = 20, then x is outside the range and is clipped to x = 18.
Because the filter weights sum to 32, normalization here is a division by 32, implemented as a right shift by 5 bits. The function Clip brings any value outside the range [0, 255] back into [0, 255]. In the same way, h is obtained as:
h1 = (A - 5*C + 20*G + 20*M - 5*R + T)
h = Clip((h1 + 16) >> 5)
B. Generate the full 1/2 pixels from the partial 1/2 pixels by interpolation, using the same 6-tap FIR filter as above. Taking pixel j as an example:
j1 = (bb - 5*gg + 20*h1 + 20*m1 - 5*kk + cc)    (generate the intermediate value j1)
j = Clip((j1 + 512) >> 10)
After these two sub-steps, all 1/2 pixels have been generated.
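The half-pixel formulas above can be written as a few small helpers. This is a sketch under stated assumptions: the function names are illustrative, samples are 8-bit values, and the six inputs are passed in filter order (for b, the samples E, F, G, H, I, J).

```python
WEIGHTS = (1, -5, 20, 20, -5, 1)  # 6-tap FIR weights; they sum to 32


def clip255(value):
    """Clip a value into the 8-bit range [0, 255]."""
    return max(0, min(255, value))


def six_tap(samples):
    """Unnormalised filter output (the intermediate value b1, h1 or j1)."""
    return sum(w * s for w, s in zip(WEIGHTS, samples))


def half_pel(samples):
    """Partial 1/2 pixel between samples[2] and samples[3]: offset 16, >> 5, clip."""
    return clip255((six_tap(samples) + 16) >> 5)


def full_half_pel(intermediates):
    """Full 1/2 pixel from six unnormalised intermediates: offset 512, >> 10, clip."""
    return clip255((six_tap(intermediates) + 512) >> 10)
```

The second stage shifts by 10 rather than 5 because its inputs are already scaled by 32, so the combined gain of the two filter passes is 32 * 32 = 1024.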
2) Generate the 1/4 pixels from the integer pixels and 1/2 pixels by interpolation. Every 1/4 pixel lies between two integer pixels or 1/2 pixels (in the horizontal, vertical, or diagonal direction), so it can be obtained as the arithmetic mean of the two adjacent integer pixels or 1/2 pixels. The formulas are as follows:
a=(D+b+1)>>1
c=(E+b+1)>>1
d=(D+h+1)>>1
n=(H+h+1)>>1
The above are averages in the horizontal and vertical directions. For averaging in the diagonal direction, the calculation is as follows:
e=(b+h+1)>>1
g=(b+m+1)>>1
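All of the 1/4-pixel formulas above share one operation, a rounded average of two neighbouring samples. A minimal sketch (the function name is an assumption):

```python
def quarter_pel(p, q):
    """Rounded arithmetic mean of two neighbouring integer or 1/2 pixels."""
    return (p + q + 1) >> 1
```

Because both inputs are already in [0, 255], the result stays in [0, 255] and no clipping step is needed at this stage.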
In the prior art, the 4x interpolated image is stored contiguously in memory in natural order, that is, in the pattern shown in Fig. 2. However, natural order is not the most reasonable storage pattern for the 4x interpolated image. When the 4x image is generated by interpolation, the integer pixels come first (they already exist), then the 1/2 pixels are generated, and finally the 1/4 pixels. If acceleration based on the SIMD (Single Instruction Multiple Data) instructions of a DSP (Digital Signal Processor) is used while generating the 1/2 pixels, the SIMD instructions need to read the integer pixels as whole blocks; likewise, generating the 1/4 pixels requires reading the 1/2 pixels and integer pixels as whole blocks. With the image memory organized in natural order, pixels of one class cannot be read as a whole block, because the storage does not classify the pixels. For example, storing the 4x interpolated image row by row from its top-left pixel gives, along the first row, the order: integer, 1/4, 1/2, 1/4, integer, 1/4, 1/2, 1/4, ...; when the first row is finished the second row follows, and so on to the last row. As a result, in any region of memory the pixels are not arranged contiguously by class: no two integer pixels are adjacent and no two 1/2 pixels are adjacent, let alone a whole block of pixels of the same class. To read all the integer pixels, memory must therefore be accessed at a fixed stride (skipping 3 values between reads), which is very inefficient.
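To make the strided access concrete, the following sketch lists where the integer pixels fall when a 4x interpolated image is stored row by row. The function name and the one-value-per-pixel layout are assumptions for illustration.

```python
def integer_pixel_offsets(width, height):
    """Offsets of the integer pixels in a row-major (4*height) x (4*width) image."""
    row_stride = 4 * width  # values per row of the interpolated image
    return [4 * y * row_stride + 4 * x
            for y in range(height) for x in range(width)]
```

For a 2 x 2 original image the offsets are 0, 4, 32 and 36: consecutive integer pixels in a row are 4 values apart, and consecutive rows of integer pixels are 4 full rows apart, so no block load can fetch them together.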
In addition, once the 4x interpolated image has been generated and motion estimation is carried out, integer-pixel-precision motion prediction needs to compare the predicted macroblock with the integer-position subset of the 4x image and compute the SAD (Sum of Absolute Differences). Likewise, 1/2-pixel-precision motion prediction only needs to compare against part of the 1/2-position subset of the 4x interpolated image, and 1/4-pixel-precision motion prediction only needs to compare against part of the 1/4-position subset. If SIMD-class DSP acceleration is used, each SAD computation therefore only needs to read, as a whole block, pixels of one class that share a common attribute. In the natural-order memory organization of the image, however, pixels sharing an attribute are not stored contiguously, which makes whole-block reads inconvenient.
Summary of the invention
In view of the above defect of the prior art, namely that in existing video compression coding techniques pixels of each class cannot be read as whole blocks because the interpolated image is stored in memory in natural order, the present invention provides a new image memory organization method that allows 1/2-pixel and 1/4-pixel-precision motion prediction to make full use of SIMD-class DSP acceleration and thereby improves computational efficiency.
To solve the above technical problem, the invention provides an interpolation image memory organization method for fractional-pixel-precision motion prediction. When performing 1/2^n-pixel-precision motion prediction (where n is a natural number), the memory of the interpolated image is organized in the following steps:
(1) in the 2^n-times interpolated image to be generated, divide the pixels into an integer-position subset, a 1/2^1-position subset, a 1/2^2-position subset, ..., and a 1/2^n-position subset, which contain respectively all integer pixels, all 1/2 pixels, all 1/4 pixels, ..., and all 1/2^n pixels;
(2) form all the integer pixels of the integer-position subset into one integer-pixel sub-image of the same size as the original image;
classify the 1/2 pixels by the vertical, horizontal, and diagonal positional relationship between each 1/2 pixel and its adjacent integer pixel, so that all 1/2 pixels of the 1/2-position subset are further divided into 3 smaller subsets, which correspondingly form 3 1/2-pixel sub-images of the same size as the original image;
classify the 1/4 pixels by the vertical, horizontal, and diagonal positional relationship and the distance relationship between each 1/4 pixel and its adjacent integer pixels and 1/2 pixels, so that all 1/4 pixels of the 1/4-position subset are further divided into 12 smaller subsets, which correspondingly form 12 1/4-pixel sub-images of the same size as the original image;
and so on: for n greater than or equal to 3, classify the 1/2^n pixels by the vertical, horizontal, and diagonal positional relationship and the distance relationship between each 1/2^n pixel and its adjacent integer pixels, 1/2 pixels, 1/4 pixels, ..., and 1/2^(n-1) pixels, so that all 1/2^n pixels of the 1/2^n-position subset are further divided into (2^(2n) - 2^(2(n-1))) smaller subsets, which correspondingly form (2^(2n) - 2^(2(n-1))) 1/2^n-pixel sub-images of the same size as the original image;
(3) form the sub-images into one contiguous memory area and store it in memory;
(4) zero-initialize all of the stored sub-images except the integer-pixel sub-image.
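As a consistency check on the sub-image counts in step (2) (a worked sum added for illustration, with N_k denoting the number of sub-images at precision level 1/2^k): each 2^n x 2^n cell of the interpolated image holds 4^n positions, of which 4^(n-1) already belong to coarser precision levels.

```latex
\begin{aligned}
N_0 &= 1, \qquad N_k = 2^{2k} - 2^{2(k-1)} \quad (k = 1, \dots, n),\\
N_1 &= 4 - 1 = 3, \qquad N_2 = 16 - 4 = 12, \qquad N_3 = 64 - 16 = 48,\\
\sum_{k=0}^{n} N_k &= 1 + \sum_{k=1}^{n}\left(4^{k} - 4^{k-1}\right) = 4^{n} = 2^{2n}.
\end{aligned}
```

The sum telescopes, so the total number of sub-images is 2^(2n): 4 for 1/2-pixel precision and 16 for 1/4-pixel precision.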
In step (3) of the interpolation image memory organization method of the present invention, any of the following three splicing methods may be used to form the sub-images into one contiguous memory area stored in memory:
A. 2^(2n) x 1 splicing, i.e. vertical-strip splicing;
B. 2^n x 2^n splicing, i.e. square splicing;
C. 1 x 2^(2n) splicing, i.e. horizontal-strip splicing.
For 1/4-pixel-precision motion prediction, the present invention also provides a method that, based on the above interpolation image memory organization method, uses the single-instruction multiple-data acceleration provided by a digital signal processor to generate each fractional pixel of the 4x interpolated image, comprising the following steps:
(1) from the integer-pixel sub-image SP_0, generate the 1/2-pixel sub-images SP_4 and SP_8 by interpolation filtering;
(2) from the 1/2-pixel sub-image SP_4, generate the 1/2-pixel sub-image SP_12 by interpolation filtering;
(3) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8, SP_12, generate the 1/4-pixel sub-images SP_1, SP_5, SP_9, SP_13 by interpolation filtering in the horizontal direction;
(4) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8, SP_12, generate the 1/4-pixel sub-images SP_2, SP_6, SP_10, SP_14 by interpolation filtering in the vertical direction;
(5) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8, SP_12, generate the 1/4-pixel sub-images SP_3, SP_7, SP_11, SP_15 by interpolation filtering along the +45° and -45° diagonal directions.
For 1/4-pixel-precision motion prediction, the present invention also provides a method that, based on the above method of generating each fractional pixel of the 4x interpolated image, uses the single-instruction multiple-data acceleration provided by a digital signal processor to calculate the prediction error measure SAD quickly. During motion estimation, the SAD between the current macroblock MB_0 and the reference macroblock MB_r at some position in the reference frame is calculated in the following steps:
(1) for integer-pixel-precision motion prediction, according to the position of MB_r, read the data of MB_r as a whole block from sub-image SP_0, then calculate the sum of absolute differences;
(2) for 1/2-pixel-precision motion prediction, according to the position of MB_r, read the data of MB_r as a whole block from one of the sub-images SP_4, SP_8, SP_12, then calculate the sum of absolute differences;
(3) for 1/4-pixel-precision motion prediction, according to the position of MB_r, read the data of MB_r as a whole block from one of the sub-images SP_1, SP_2, SP_3, SP_5, SP_6, SP_7, SP_9, SP_10, SP_11, SP_13, SP_14, SP_15, then calculate the sum of absolute differences.
The method of the present invention overcomes the inconvenient reads caused by storing the existing 4x interpolated image in memory in natural order, so that 1/2-pixel and 1/4-pixel-precision motion prediction can make full use of SIMD-class DSP acceleration to improve efficiency. The pixels of the 4x interpolated image are divided into subsets according to their attributes, and each subset is further divided into several sub-images, so that each sub-image can be read as a whole block of data by SIMD-class instructions. With the method of the present invention, the interpolation that generates the 4x interpolated image and the motion estimation in 1/2-pixel and 1/4-pixel-precision motion prediction can be accelerated effectively under international standards such as H.263/H.263+, H.264 and MPEG-4 and under the AVS 1.0 national standard, especially when a SIMD-class DSP acceleration mechanism is used. With other conditions unchanged, the frame rate of video encoding and decoding can therefore be increased, improving the performance of video communication equipment such as video conferencing or video telephony; alternatively, the same performance can be reached with a DSP of lower processing capability, reducing the cost of the product.
Description of drawings
The invention is further described below with reference to the drawings and embodiments. In the drawings:
Fig. 1 shows the search procedure of 1/4-pixel-precision motion prediction in the existing H.264/AVC standard;
Fig. 2 shows the relative geometric positions of the integer pixels, 1/2 pixels, and 1/4 pixels in the 4x interpolated image;
Fig. 3 shows the classification by attribute, and the numbering of each class, of the pixels in the 4x interpolated image according to the present invention;
Fig. 4a, Fig. 4b, and Fig. 4c show, respectively, the original image, the 4x interpolated image under the conventional storage method, and the 4x interpolated image under the storage method of the present invention;
Fig. 5a, Fig. 5b, and Fig. 5c show, respectively, the splicing storage methods for the 16 sub-images of the 4x interpolated image P_4x4 obtained by the present invention.
Embodiment
The present invention is described below taking 1/4-pixel-precision motion prediction as an example.
For 1/4-pixel-precision motion prediction, the key of the method is a way of reorganizing and splicing the content of the 4x interpolated reference image (the interpolated image whose width and height are both 4 times those of the original image) so that each class of pixels is stored contiguously in memory. With the method of the present invention, forming the 4x interpolated reference image by interpolation and carrying out 1/2-pixel and 1/4-pixel motion estimation can make full use of the proximity in memory of successively accessed pixel data, which improves computational efficiency significantly. The performance gain is especially pronounced when the SIMD acceleration provided by a DSP, or by MMX, SSE, etc. on Intel CPUs, is used, because this spatial proximity of data suits SIMD very well. The method is aimed primarily at an efficient implementation of the 1/4-pixel-precision motion prediction required by H.264/AVC, but its principle applies equally to the 1/2-pixel-precision motion prediction of H.263/H.263+, the 1/4-pixel-precision motion prediction of MPEG-4, and the 1/4-pixel-precision motion prediction of the AVS standard.
In the 4x interpolated image to be generated, the pixels are divided into three subsets, as follows:
Integer-position subset S_IP = {all integer pixels}; the pixels of this subset are shown as large solid circles in Fig. 3;
1/2-position subset S_HP = {all 1/2 pixels}; the pixels of this subset are shown as solid squares in Fig. 3;
1/4-position subset S_QP = {all 1/4 pixels}; the pixels of this subset are shown as open squares in Fig. 3.
As can be seen from Fig. 3, for the integer pixel A (numbered 0) in the upper right corner, the dashed frame contains 3 corresponding 1/2 pixels (numbered 4, 8, 12) and 12 corresponding 1/4 pixels (numbered 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15). Accordingly, in the present invention all pixels of the integer-position subset S_IP form one integer-pixel sub-image SP_0, which is the same size as the original image.
Next, the 1/2 pixels are classified by the vertical, horizontal, and diagonal positional relationship between each 1/2 pixel and its adjacent integer pixel, so that all 1/2 pixels of the 1/2-position subset S_HP are further divided into 3 subsets. Each subset forms one 1/2-pixel sub-image of the same size as the original image, giving 3 1/2-pixel sub-images in total: SP_4, SP_8, SP_12.
The 1/4 pixels are then classified by the vertical, horizontal, and diagonal positional relationship and the distance relationship between each 1/4 pixel and its adjacent integer pixels and 1/2 pixels (for 1/4 pixels the classification must take distance into account), so that all 1/4 pixels of the 1/4-position subset S_QP are further divided into 12 subsets. Each subset forms one 1/4-pixel sub-image of the same size as the original image, giving 12 1/4-pixel sub-images in total: SP_1, SP_2, SP_3, SP_5, SP_6, SP_7, SP_9, SP_10, SP_11, SP_13, SP_14, SP_15.
Therefore, 4 times of interpolation images finally can be expressed as:
$$
P_{4\times 4} =
\begin{bmatrix}
SP_0 & SP_1 & SP_4 & SP_5 \\
SP_2 & SP_3 & SP_6 & SP_7 \\
SP_8 & SP_9 & SP_{12} & SP_{13} \\
SP_{10} & SP_{11} & SP_{14} & SP_{15}
\end{bmatrix}
$$
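Read as a map from the position inside each 4 x 4 cell to a sub-image number, this matrix is enough to split a natural-order interpolated image into its 16 sub-images. The sketch below assumes the image is a list of rows with dimensions that are multiples of 4; the names `PHASE_TO_SP` and `split_into_subimages` are illustrative and not from the patent.

```python
# Sub-image number at each (row phase, column phase) of a 4x4 cell, taken from P_4x4.
PHASE_TO_SP = (
    (0, 1, 4, 5),
    (2, 3, 6, 7),
    (8, 9, 12, 13),
    (10, 11, 14, 15),
)


def split_into_subimages(interp):
    """Split a natural-order (4H x 4W) image into a dict {k: H x W sub-image SP_k}."""
    subs = {}
    for ry in range(4):
        for rx in range(4):
            # Every 4th row starting at ry, every 4th column starting at rx.
            subs[PHASE_TO_SP[ry][rx]] = [row[rx::4] for row in interp[ry::4]]
    return subs
```

In an encoder built around this organization the sub-images would be produced directly by the interpolation filters rather than by de-interleaving; the split is shown only to make the numbering concrete.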
After P_4x4 has been reorganized by pixel class as above, 16 sub-images of the same size as the original image are obtained. These sub-images can be stored in memory in many ways. The simplest is to store the 16 images separately, but this leads to jumping reads when the SAD values for the various positions are computed. For example, in the 1/2-pixel search, computing the SAD for the 1/2 pixel position directly above the best integer pixel position requires reading from sub-image 8, whereas computing the SAD for the upper-left and upper-right 1/2 pixel positions requires reading from sub-image 12. If the 16 images are independent and stored separately, sub-image 8 and sub-image 12 may lie far apart in memory, and alternating accesses to these two locations will inevitably slow down access. A better approach is therefore spliced storage, in which the 16 images are spliced into one large image and stored in memory. To this end, the present invention proposes three splicing methods, described below.
Memory splicing is mainly a way of improving read/write efficiency on DSPs that require memory reads and writes to be aligned to byte boundaries that are integer powers of 2. If a data block being read or written is not aligned to a boundary of a certain number of bytes (such as 32 or 64 bytes), some extra data (usually zero padding) must be read or written to make up the alignment, which naturally reduces efficiency.
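The padding cost of a misaligned row can be estimated with a small helper. This is a generic alignment calculation, not a procedure given in the text; the function name and the default of 32 bytes are assumptions.

```python
def padded_row_bytes(row_bytes, alignment=32):
    """Round row_bytes up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (row_bytes + alignment - 1) & ~(alignment - 1)
```

For example, a 176-byte row (one QCIF luma row at one byte per pixel) pads to 192 bytes under 32-byte alignment, while a row that is already a multiple of the alignment is left unchanged.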
For 1/2-pixel and 1/4-pixel-precision motion prediction, the present invention provides three splicing methods that satisfy boundary alignment. Fig. 5a, Fig. 5b, and Fig. 5c show the three memory splicing methods applicable to 1/4-pixel-precision motion prediction; each block in these figures represents one sub-image. They are:
A. 16 x 1 splicing, i.e. the 16 sub-images are spliced into a vertical strip;
B. 4 x 4 splicing, i.e. the 16 sub-images are spliced into a square;
C. 1 x 16 splicing, i.e. the 16 sub-images are spliced into a horizontal strip.
For 1/2-pixel-precision motion prediction there are likewise three corresponding splicing strategies:
A. 4 x 1 splicing (vertical strip);
B. 2 x 2 splicing (square);
C. 1 x 4 splicing (horizontal strip).
For 1/8 pixels and larger n, these three splicing strategies all remain applicable, although further strategies may also exist. For general n, the three splicing strategies are therefore:
A. 2^(2n) x 1 splicing, i.e. vertical-strip splicing;
B. 2^n x 2^n splicing, i.e. square splicing;
C. 1 x 2^(2n) splicing, i.e. horizontal-strip splicing.
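The three layouts can be expressed as a grid of tiles plus a rule for where sub-image k starts inside the spliced image. In this sketch the function names and layout labels are hypothetical, and placing the tiles in row-major order of k is an assumption; the text does not fix the order of the tiles within the spliced image.

```python
def tile_grid(n, layout):
    """(tile rows, tile columns) of the spliced image for 1/2^n-pixel precision."""
    count = 4 ** n  # 2^(2n) sub-images in total
    if layout == "vertical":
        return count, 1
    if layout == "square":
        return 2 ** n, 2 ** n
    if layout == "horizontal":
        return 1, count
    raise ValueError("unknown layout: %r" % (layout,))


def tile_origin(k, n, layout, width, height):
    """Top-left (row, column) of sub-image k; tiles assumed row-major in k."""
    _, cols = tile_grid(n, layout)
    return (k // cols) * height, (k % cols) * width
```

With 176 x 144 sub-images and n = 2, sub-image 12 starts at row 432, column 0 in the square layout and at row 1728, column 0 in the vertical-strip layout.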
Finally, all of the stored sub-images except the integer-pixel sub-image are zero-initialized.
When the 4x interpolated image P_4x4 is generated from the original image, the structure of P_4x4 makes it convenient to use SIMD-class DSP acceleration instructions: the convolution operations of the filter and the shift operations can be carried out on whole blocks, with the sub-image as the unit. (The numbering shown in Fig. 3 is adopted for consistency with existing H.264 and other standards.)
In motion estimation, the SAD computation can be accelerated with SIMD instructions by taking from the appropriate sub-image a sub-matrix of the same size as the predicted macroblock.
Based on the above interpolation image memory organization method, the SIMD-class acceleration provided by a DSP can be used to generate each fractional pixel of the 4x interpolated image in the following steps:
(1) from the integer-pixel sub-image SP_0, generate the 1/2-pixel sub-images SP_4 and SP_8 by interpolation filtering;
(2) from the 1/2-pixel sub-image SP_4, generate the 1/2-pixel sub-image SP_12 by interpolation filtering;
(3) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8, SP_12, generate the 1/4-pixel sub-images SP_1, SP_5, SP_9, SP_13 by interpolation filtering in the horizontal direction;
(4) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8, SP_12, generate the 1/4-pixel sub-images SP_2, SP_6, SP_10, SP_14 by interpolation filtering in the vertical direction;
(5) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8, SP_12, generate the 1/4-pixel sub-images SP_3, SP_7, SP_11, SP_15 by interpolation filtering along the +45° and -45° diagonal directions.
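Steps (1) and (2) can be sketched plane by plane in plain Python, without SIMD. Several points are assumptions made for illustration: the function names, treating SP_4 as the horizontally filtered plane and SP_8 as the vertically filtered one, clamping at the image border, and computing SP_12 from the unnormalised intermediates of SP_4 (as in the j1 formula of the background section) rather than from its clipped values.

```python
WEIGHTS = (1, -5, 20, 20, -5, 1)


def clip255(value):
    return max(0, min(255, value))


def filter_rows(plane):
    """Unnormalised horizontal 6-tap output per pixel; border samples are clamped."""
    out = []
    for row in plane:
        n = len(row)
        out.append([sum(w * row[min(max(x + t - 2, 0), n - 1)]
                        for t, w in enumerate(WEIGHTS))
                    for x in range(n)])
    return out


def transpose(plane):
    return [list(col) for col in zip(*plane)]


def half_pel_planes(sp0):
    """Return (SP_4, SP_8, SP_12) computed from the integer-pixel plane SP_0."""
    b1 = filter_rows(sp0)                         # horizontal intermediates
    h1 = transpose(filter_rows(transpose(sp0)))   # vertical intermediates
    j1 = transpose(filter_rows(transpose(b1)))    # vertical filter over b1
    sp4 = [[clip255((v + 16) >> 5) for v in row] for row in b1]
    sp8 = [[clip255((v + 16) >> 5) for v in row] for row in h1]
    sp12 = [[clip255((v + 512) >> 10) for v in row] for row in j1]
    return sp4, sp8, sp12
```

Each output plane is produced by one pass over a contiguous input plane, which is the access pattern that lets a SIMD implementation load and filter whole rows at a time.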
Based on the above interpolation image memory organization method and the method of generating each fractional pixel of the 4x interpolated image, the SIMD-class acceleration provided by a DSP can be used to calculate, during motion estimation, the prediction error measure SAD between the current macroblock MB_0 and the reference macroblock MB_r at some position in the reference frame, in the following steps:
(1) for integer-pixel-precision motion prediction, according to the position of MB_r, read the data of MB_r as a whole block from sub-image SP_0, then calculate the SAD;
(2) for 1/2-pixel-precision motion prediction, according to the position of MB_r, read the data of MB_r as a whole block from one of the sub-images SP_4, SP_8, SP_12, then calculate the SAD;
(3) for 1/4-pixel-precision motion prediction, according to the position of MB_r, read the data of MB_r as a whole block from one of the sub-images SP_1, SP_2, SP_3, SP_5, SP_6, SP_7, SP_9, SP_10, SP_11, SP_13, SP_14, SP_15, then calculate the SAD.
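The three cases above differ only in which sub-image is read. A minimal sketch, assuming the top-left corner of MB_r is given in quarter-pixel coordinates and that sub-images are lists of rows; the names `locate` and `sad` and the phase table are illustrative, with the table taken from the P_4x4 arrangement.

```python
PHASE_TO_SP = (
    (0, 1, 4, 5),
    (2, 3, 6, 7),
    (8, 9, 12, 13),
    (10, 11, 14, 15),
)


def locate(qx, qy):
    """Quarter-pixel position -> (sub-image number, integer column, integer row)."""
    return PHASE_TO_SP[qy & 3][qx & 3], qx >> 2, qy >> 2


def sad(current, sub_image, x0, y0):
    """Sum of absolute differences against the block at (x0, y0) of one sub-image."""
    rows = sub_image[y0:y0 + len(current)]
    return sum(abs(c - r)
               for cur_row, ref_row in zip(current, rows)
               for c, r in zip(cur_row, ref_row[x0:x0 + len(cur_row)]))
```

Every candidate block is a contiguous rectangle inside a single sub-image, so the inner loop touches consecutive values in each row; this is the property a SIMD SAD instruction relies on.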
The invention effectively accelerates both the generation of the 4x interpolated image and the motion estimation computation used in 1/2-pixel and 1/4-pixel precision motion prediction in international standards such as H.263/H.263+, H.264 and MPEG-4 and in the AVS 1.0 national standard, especially when a SIMD-type DSP acceleration mechanism is used. With other conditions unchanged, it can therefore raise the frame rate of video encoding and decoding and improve the performance of video communication equipment such as video conferencing or video telephony equipment. Alternatively, it can reduce product cost by reaching the same performance with a DSP of lower processing capability. Either approach improves the market competitiveness of the product. The following experimental data illustrate the effect of the invention:
Experiment 1: The method of the invention, combined with MMX and SSE2, was used to optimize and accelerate the 1/4-pixel interpolation process for the classic test sequences Claire, News and Foreman. The results are shown in Table 1:
Table 1
[Table 1 appears as an image in the original: Figure C200410076759D00191]
Experiment 2: The method of the invention was used to accelerate the entire 1/4-pixel precision motion prediction process for the classic test sequences Claire, News and Foreman. The results are shown in Table 2, where each value is the number of frames that can be encoded per second:
Table 2
[Table 2 appears as an image in the original: Figure C200410076759D00192]
The method of the invention is directly applicable to video encoders and decoders that follow the H.263/H.263+, H.264 and MPEG-4 international standards and the AVS national standard, where it can be used to implement 1/2-pixel and 1/4-pixel precision motion prediction. It is also suitable for any other video encoder or decoder that adopts 1/2-pixel and 1/4-pixel precision motion prediction.
The method of the invention is also applicable to 1/8-pixel precision motion prediction. Any 1/8-pixel precision motion prediction must first perform 1/2-pixel and 1/4-pixel precision motion prediction, so these are a necessary component and prerequisite of 1/8-pixel precision motion prediction. The method therefore also applies to the implementation of 1/8-pixel precision motion prediction in any other video encoder or decoder that adopts it, whether or not that encoder or decoder follows a particular standard.
Appendix: Abbreviations and key terms used in this patent
Abbreviation: full term
AVC: Audio-Video Coding
AVS: Advanced Audio-Video System (Chinese national advanced audio and video coding standard)
dB: decibel
DSP: Digital Signal Processor (digital signal processing chip)
DV: Displacement Vector
MB: Macroblock
MV: Motion Vector
MPEG: Moving Picture Experts Group (international standards organization)
PSNR: Peak Signal-to-Noise Ratio
SIMD: Single Instruction Multiple Data
MMX: MultiMedia Extension
SAD: Sum of Absolute Differences
SSE: Streaming SIMD Extensions (single-instruction-multiple-data extension instruction set)

Claims (8)

1. An interpolation image memory organization method for fractional-pixel precision motion prediction, characterized in that, when performing 1/2^n-pixel precision motion prediction, where n is a natural number, the memory of the interpolated image is organized in the following steps:
(1) according to the 2^n-times interpolated image to be generated, dividing its pixels into an integer-position subset, a 1/2^1-position subset, a 1/2^2-position subset, ..., and a 1/2^n-position subset, the subsets containing respectively all integer pixels, all 1/2 pixels, all 1/4 pixels, ..., and all 1/2^n pixels;
(2) forming all integer pixels in the integer-position subset into one integer-pixel sub-image of the same size as the original image;
classifying all 1/2 pixels in the 1/2-position subset by the vertical, horizontal and diagonal positional relationship between each 1/2 pixel and its adjacent integer pixels, thereby dividing them into 3 smaller subsets that correspondingly form 3 1/2-pixel sub-images of the same size as the original image;
classifying all 1/4 pixels in the 1/4-position subset by the vertical, horizontal and diagonal positional relationship and the distance relationship between each 1/4 pixel and its adjacent integer pixels and 1/2 pixels, thereby dividing them into 12 smaller subsets that correspondingly form 12 1/4-pixel sub-images of the same size as the original image;
and so on, when n is greater than or equal to 3, classifying all 1/2^n pixels in the 1/2^n-position subset by the vertical, horizontal and diagonal positional relationship and the distance relationship between each 1/2^n pixel and its adjacent integer pixels, 1/2 pixels, 1/4 pixels, ..., and 1/2^(n-1) pixels, thereby dividing them into (2^(2n) - 2^(2(n-1))) smaller subsets that correspondingly form (2^(2n) - 2^(2(n-1))) 1/2^n-pixel sub-images of the same size as the original image;
(3) assembling the sub-images into one contiguous memory region and storing it in memory;
(4) zero-initializing each of the stored sub-images except the integer-pixel sub-image.
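As a quick check of the sub-image counts stated in claim 1, the following Python sketch evaluates the expression 2^(2k) - 2^(2(k-1)) for each precision level k. The function name `subimage_counts` is hypothetical and used only for illustration.

```python
def subimage_counts(n):
    """Number of sub-images per precision level for 1/2^n-pixel prediction:
    index 0 is the integer level, index k is the 1/2^k level."""
    return [1] + [2 ** (2 * k) - 2 ** (2 * (k - 1)) for k in range(1, n + 1)]

counts = subimage_counts(3)   # [1, 3, 12, 48]
total = sum(counts)           # 64, i.e. 2^(2n) sub-images for n = 3
```

The per-level counts always sum to 2^(2n), which matches the 4 sub-images for n = 1, 16 for n = 2 and 64 for n = 3.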
2. The interpolation image memory organization method according to claim 1, characterized in that n equals 1 and, when performing 1/2-pixel precision motion prediction, the memory of the interpolated image is organized in the following steps:
(1) according to the 2x interpolated image to be generated, dividing its pixels into an integer-position subset and a 1/2-position subset, the integer-position subset containing all integer pixels and the 1/2-position subset containing all 1/2 pixels;
(2) forming all integer pixels in the integer-position subset into one integer-pixel sub-image of the same size as the original image;
classifying all 1/2 pixels in the 1/2-position subset by the vertical, horizontal and diagonal positional relationship between each 1/2 pixel and its adjacent integer pixels, thereby dividing them into 3 smaller subsets that correspondingly form 3 1/2-pixel sub-images of the same size as the original image;
(3) assembling the sub-images into one contiguous memory region and storing it in memory;
(4) zero-initializing the 3 stored 1/2-pixel sub-images, leaving the integer-pixel sub-image unchanged.
3. The interpolation image memory organization method according to claim 1, characterized in that n equals 2 and, when performing 1/4-pixel precision motion prediction, the memory of the interpolated image is organized in the following steps:
(1) according to the 4x interpolated image to be generated, dividing its pixels into an integer-position subset S_IP, a 1/2-position subset S_HP and a 1/4-position subset S_QP, the integer-position subset S_IP containing all integer pixels, the 1/2-position subset S_HP containing all 1/2 pixels, and the 1/4-position subset S_QP containing all 1/4 pixels;
(2) forming all integer pixels in the integer-position subset S_IP into one integer-pixel sub-image SP_0 of the same size as the original image;
classifying all 1/2 pixels in the 1/2-position subset S_HP by the vertical, horizontal and diagonal positional relationship between each 1/2 pixel and its adjacent integer pixels, thereby dividing them into 3 smaller subsets that correspondingly form 3 sub-images SP_4, SP_8 and SP_12 of the same size as the original image;
classifying all 1/4 pixels in the 1/4-position subset S_QP by the vertical, horizontal and diagonal positional relationship and the distance relationship between each 1/4 pixel and its adjacent integer pixels and 1/2 pixels, thereby dividing them into 12 smaller subsets that correspondingly form 12 sub-images SP_1, SP_2, SP_3, SP_5, SP_6, SP_7, SP_9, SP_10, SP_11, SP_13, SP_14 and SP_15 of the same size as the original image;
(3) assembling the sub-images into one contiguous memory region and storing it in memory;
(4) zero-initializing the 3 stored 1/2-pixel sub-images and 12 stored 1/4-pixel sub-images, leaving the integer-pixel sub-image unchanged.
4. The interpolation image memory organization method according to claim 1, characterized in that n equals 3 and, when performing 1/8-pixel precision motion prediction, the memory of the interpolated image is organized in the following steps:
(1) according to the 8x interpolated image to be generated, dividing its pixels into an integer-position subset, a 1/2-position subset, a 1/4-position subset and a 1/8-position subset, the subsets containing respectively all integer pixels, all 1/2 pixels, all 1/4 pixels and all 1/8 pixels;
(2) forming all integer pixels in the integer-position subset into one integer-pixel sub-image of the same size as the original image;
classifying all 1/2 pixels in the 1/2-position subset by the vertical, horizontal and diagonal positional relationship between each 1/2 pixel and its adjacent integer pixels, thereby dividing them into 3 smaller subsets that correspondingly form 3 1/2-pixel sub-images of the same size as the original image;
classifying all 1/4 pixels in the 1/4-position subset by the vertical, horizontal and diagonal positional relationship and the distance relationship between each 1/4 pixel and its adjacent integer pixels and 1/2 pixels, thereby dividing them into 12 smaller subsets that correspondingly form 12 1/4-pixel sub-images of the same size as the original image;
classifying all 1/8 pixels in the 1/8-position subset by the vertical, horizontal and diagonal positional relationship and the distance relationship between each 1/8 pixel and its adjacent integer pixels, 1/2 pixels and 1/4 pixels, thereby dividing them into 48 subsets that correspondingly form 48 1/8-pixel sub-images of the same size as the original image;
(3) assembling the sub-images into one contiguous memory region and storing it in memory;
(4) zero-initializing the 3 stored 1/2-pixel sub-images, 12 stored 1/4-pixel sub-images and 48 stored 1/8-pixel sub-images, leaving the integer-pixel sub-image unchanged.
5. The interpolation image memory organization method according to any one of claims 1 to 4, characterized in that, in step (3), the sub-images are assembled into one contiguous memory region and stored in memory using any one of the following three splicing methods:
A. 2^(2n) x 1 splicing, i.e. vertical strip splicing;
B. 2^n x 2^n splicing, i.e. square splicing;
C. 1 x 2^(2n) splicing, i.e. horizontal strip splicing.
6. The interpolation image memory organization method according to claim 5, characterized in that n = 2 and, in step (3), the sub-images are assembled into one contiguous memory region and stored in memory using any one of the following three splicing methods:
A. 16 x 1 splicing, i.e. vertical strip splicing;
B. 4 x 4 splicing, i.e. square splicing;
C. 1 x 16 splicing, i.e. horizontal strip splicing.
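The three splicing layouts of claims 5 and 6 can be illustrated by computing where sub-image k starts inside the contiguous region. The Python sketch below is a hypothetical illustration: it assumes that "16 x 1" means 16 tiles down by 1 tile across, that tiles are placed in row-major order of their index k, and that the region is addressed as one large row-major image. The function names and the QCIF dimensions (176 x 144) are chosen only for the example.

```python
def tile_origin(k, tiles_down, tiles_across, width, height):
    """(row, column) of the top-left sample of sub-image k inside the
    spliced region, for tiles of `width` x `height` samples."""
    if not 0 <= k < tiles_down * tiles_across:
        raise ValueError("sub-image index out of range")
    tile_row, tile_col = divmod(k, tiles_across)
    return tile_row * height, tile_col * width

def linear_address(k, r, c, tiles_down, tiles_across, width, height):
    """Offset (in samples) of sample (r, c) of sub-image k from the start
    of the contiguous region."""
    row0, col0 = tile_origin(k, tiles_down, tiles_across, width, height)
    stride = tiles_across * width   # samples per row of the spliced region
    return (row0 + r) * stride + (col0 + c)

W, H = 176, 144                                  # QCIF luma plane, as an example
vertical = tile_origin(4, 16, 1, W, H)           # (576, 0)
square = tile_origin(5, 4, 4, W, H)              # (144, 176)
horizontal = tile_origin(5, 1, 16, W, H)         # (0, 880)
```

Under these assumptions the vertical strip layout keeps every sub-image fully contiguous, since the offset of sample (r, c) in sub-image k reduces to k * W * H + r * W + c, whereas the square and horizontal layouts keep only each row of a sub-image contiguous.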
7. A method for generating the fractional pixels of a 4x interpolated image using the interpolation image memory organization method according to claim 6 and the single-instruction-multiple-data acceleration techniques provided by a digital signal processing chip, characterized by comprising the following steps:
(1) from the integer-pixel sub-image SP_0, generating the 1/2-pixel sub-images SP_4 and SP_8 by interpolation filtering;
(2) from the 1/2-pixel sub-image SP_4, generating the 1/2-pixel sub-image SP_12 by interpolation filtering;
(3) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8 and SP_12, generating the 1/4-pixel sub-images SP_1, SP_5, SP_9 and SP_13 by interpolation filtering in the horizontal direction;
(4) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8 and SP_12, generating the 1/4-pixel sub-images SP_2, SP_6, SP_10 and SP_14 by interpolation filtering in the vertical direction;
(5) from the integer-pixel sub-image SP_0 and the 1/2-pixel sub-images SP_4, SP_8 and SP_12, generating the 1/4-pixel sub-images SP_3, SP_7, SP_11 and SP_15 by interpolation filtering along the +45° and -45° diagonals.
8, the method for each fraction pixel of 4 times of interpolation images of the generation according to claim 7 and single-instruction multiple-data class speed technology of utilizing digital signal processing chip to provide is calculated the method for predicated error index S AD fast, it is characterized in that, calculate current macro MB according to the following steps 0In motion estimation process, the reference macroblock MB of certain position in the reference frame rBetween predicated error index S AD:
(1) when putting in order the pixel precision motion prediction, according to MB rThe position, from subimage SP 0Middle monoblock reads MB rData, calculate then absolute difference and;
(2) when carrying out 1/2 pixel precision motion prediction, according to MB rThe position, from subimage SP 4, SP 8, SP 12In certain monoblock read MB rData, calculate then absolute difference and;
(3) when carrying out 1/4 pixel precision motion prediction, according to MB rThe position, from subimage SP 1, SP 2, SP 3, SP 5, SP 6, SP 7, SP 9, SP 10, SP 11, SP 13, SP 14, SP 15In certain monoblock read MB rData, calculate then absolute difference and.
CN 200410076759 2004-09-14 2004-09-14 Method for organizing interpolation image memory for fractional pixel precision predication Active CN100502511C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200410076759 CN100502511C (en) 2004-09-14 2004-09-14 Method for organizing interpolation image memory for fractional pixel precision predication

Publications (2)

Publication Number Publication Date
CN1750659A CN1750659A (en) 2006-03-22
CN100502511C true CN100502511C (en) 2009-06-17

Family

ID=36605883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200410076759 Active CN100502511C (en) 2004-09-14 2004-09-14 Method for organizing interpolation image memory for fractional pixel precision predication

Country Status (1)

Country Link
CN (1) CN100502511C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant