Video data compression method and device for optimizing a search algorithm
Technical field
The present invention relates to motion estimation in video compression algorithms, and in particular to a video data compression method and device that optimize search algorithms using search patterns.
Background art
Most current video compression algorithms are based on motion estimation: the displaced positions of pixels in the current frame are predicted from the pixels of the previous frame, and the resulting motion vectors, together with the error between the predicted and actual values, are processed and transmitted as the compressed data. Practical algorithms generally use three kinds of video frames: I, P and B. An I frame is compressed directly without motion compensation; a P frame uses forward motion estimation; a B frame, which usually appears at higher compression levels, is a bidirectionally predicted frame. Motion estimation carries the largest computational load in a video compression algorithm and is its most time-consuming module.
The core idea of motion estimation is to find the position that matches the corresponding pixel block of the previous frame with as few operations and as fast a convergence as possible. Commonly used search algorithms include the three-step search, the gradient method and the diamond search.
The three-step search (TSS) is a search algorithm for motion estimation in inter-frame coding. To reduce the computational complexity of the search and improve the accuracy of motion compensation, TSS matches the target block using a nine-point square pattern. Its underlying observation is that the shape and size of the search pattern strongly affect the speed and accuracy of a motion estimation algorithm: when searching for the best matching point, a small search pattern may become trapped in a local optimum, whereas a large search pattern may fail to find the optimum point.
Fig. 1 is a schematic diagram of the three-step search pattern in the prior art. The three-step search proceeds as follows. In the first step, with the target block as the center, 8 surrounding points are taken by expanding outward at a given step size, and they are compared. In the second step, the step size is halved, 8 points are taken around the best block found so far as the new center, and they are compared. These steps are repeated until the step size is less than 1, at which point the optimum point is obtained.
The diamond search (DS) method is named after the shape of its search pattern. It is simple, robust and efficient, and is among the best-performing fast search algorithms available. Based on the basic regularities of motion vectors in video images, the DS algorithm uses two search patterns of different shape and size. One is the large diamond search pattern (LDSP), shown schematically in Fig. 2 for the prior art, which contains 9 candidate positions. The other is the small diamond search pattern (SDSP), shown schematically in Fig. 3 for the prior art, which contains 5 candidate positions.
The DS search proceeds as follows. Initially, the large diamond search pattern is applied repeatedly until the best matching block falls at the center of the large diamond. Because the LDSP has a large step size, its search range is wide; it achieves coarse positioning and keeps the search from being trapped in a local minimum. Once coarse positioning is complete, the optimum point can be assumed to lie within the diamond-shaped region enclosed by the 8 points around the LDSP center. The small diamond search pattern is then used to locate the best matching block precisely without large fluctuations, which improves the accuracy of motion estimation.
In both the three-step search and the DS search, the positions of the matching points to be read keep changing, and these positions are not contiguous in memory. A system cache, however, holds data at contiguous addresses. Consequently, when the system caches data, the cache miss rate rises, and the direct accesses to memory that follow each miss increase the access latency.
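The non-contiguity described above can be made concrete with a small address calculation. The sketch below is illustrative only: it assumes a row-major frame with one byte per pixel, and the names `pixel_offset` and `cache_line_index`, the 720-byte stride and the 64-byte cache line are hypothetical choices that do not come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of pixel (x, y) in a row-major frame with one byte per pixel;
 * `stride` is the distance in bytes between consecutive rows (assumed layout). */
size_t pixel_offset(int x, int y, size_t stride)
{
    assert(x >= 0 && y >= 0);
    return (size_t)y * stride + (size_t)x;
}

/* Index of the cache line that holds the given byte offset. */
size_t cache_line_index(size_t offset, size_t line_size)
{
    assert(line_size > 0);
    return offset / line_size;
}
```

Under these assumptions, a block starting at (72,36) begins at byte 25992, while the candidate four rows away at (72,40) begins at byte 28872, which is 2880 bytes and 45 cache lines further on. Even the 8 rows of a single 8 × 8 block are one stride apart, so a sequentially filled cache is unlikely to hold the next candidate by chance.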
Summary of the invention
The technical problem to be solved by the present invention is to shorten the additional waiting time that a processor or hardware logic incurs because of cache misses when executing a search algorithm.
According to a first aspect of the present invention, there is provided a video data compression method for optimizing a search algorithm that uses search patterns, the method comprising: determining, according to the search pattern and the search step size, the offset address of each remaining search block in the search pattern relative to the current search block; during the search, calculating the start address of each remaining search block from the start address of the current search block and the respective offset addresses; and, before the matching operation is performed, prefetching the video data of the current search block and the video data of each remaining search block into the cache in turn.
In the first aspect, preferably, the search algorithm that uses search patterns includes a search algorithm that uses a single search pattern.
Preferably, the search algorithm that uses a single search pattern includes the three-step search algorithm.
Preferably, the search algorithm that uses search patterns includes a search algorithm that uses multi-level search patterns.
Preferably, the search algorithm that uses multi-level search patterns includes the diamond search algorithm.
Preferably, the current search block is located at the center of the pattern.
According to a second aspect of the present invention, there is provided a video data compression device for optimizing a search algorithm that uses search patterns, the device comprising: offset address determining means for determining, according to the search pattern and the search step size, the offset address of each remaining search block in the search pattern relative to the current search block; start address calculating means for calculating, during the search, the start address of each remaining search block from the start address of the current search block and the respective offset addresses; and block data prefetch means for prefetching, before the matching operation is performed, the video data of the current search block and the video data of each remaining search block into the cache in turn.
In the second aspect, preferably, the search algorithm that uses search patterns includes a search algorithm that uses a single search pattern.
Preferably, the search algorithm that uses a single search pattern includes the three-step search algorithm.
Preferably, the search algorithm that uses search patterns includes a search algorithm that uses multi-level search patterns.
Preferably, the search algorithm that uses multi-level search patterns includes the diamond search algorithm.
Preferably, the current search block is located at the center of the pattern.
Although the search algorithm reads blocks at non-contiguous positions, according to the present invention the addresses of the data blocks are determined by calculation, so cache prefetching can be used to read each block from memory ahead of time. This raises the cache hit rate and greatly shortens the additional waiting time of the processor or hardware logic.
Description of drawings
Fig. 1 is a schematic diagram of the three-step search pattern in the prior art;
Fig. 2 is a schematic diagram of the large diamond search pattern in the prior art;
Fig. 3 is a schematic diagram of the small diamond search pattern in the prior art;
Fig. 4 is a flow chart of the video data compression method for optimizing the three-step search algorithm according to the present invention;
Fig. 5 is a flow chart of the video data compression method for optimizing the diamond search algorithm according to the present invention.
For a better understanding of the present invention, it is further described below with reference to the drawings and specific embodiments.
Embodiment
According to the present invention, the video data compression method for optimizing the three-step search algorithm proceeds as follows. First, the offset address determining means determines, according to the search pattern and search step size of the three-step search, the offset address of each remaining search block in the search pattern relative to the current search block. An offset address lookup table can be set up so that, given the search step size and the pattern index, the offset address of each remaining search block relative to the current search block can be obtained. Preferably, the current search block is located at the center of the pattern. Referring to Fig. 1, for the three-step search pattern, with the current search block as the center and proceeding clockwise from the search block on its left, the offset addresses of the search blocks are as listed in the table below, where n is the search step size and the offset address of the current search block is (0,0).
Index | Offset address
0 | (0,0)
1 | (-n,0)
2 | (-n,n)
3 | (0,n)
4 | (n,n)
5 | (n,0)
6 | (n,-n)
7 | (0,-n)
8 | (-n,-n)
Next, during the search, the start address calculating means calculates the start address of each remaining search block from the start address of the current search block and the respective offset addresses. For example, if the start address of the current search block is (72,36) and the step size is 4, the start addresses of the corresponding search blocks are (72,36), (72-4,36), (72-4,36+4), and so on.
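The offset table and the start address calculation can be sketched in C as follows. This is a minimal illustration, assuming addresses are (x, y) pixel coordinates as in the worked example; the `Point` type and the names `TSS_UNIT` and `tss_candidates` are hypothetical and not part of the described device.

```c
#include <assert.h>

typedef struct { int x, y; } Point;

/* Unit offsets of the nine-point pattern in table order: the center first,
 * then clockwise from the block on its left. Multiplying by the step size n
 * gives the offset addresses (0,0), (-n,0), (-n,n), ... of the table. */
static const Point TSS_UNIT[9] = {
    {0, 0}, {-1, 0}, {-1, 1}, {0, 1}, {1, 1}, {1, 0}, {1, -1}, {0, -1}, {-1, -1}
};

/* Fill out[0..8] with the start addresses of the search blocks for the
 * current search block at `center` and step size n. */
void tss_candidates(Point center, int n, Point out[9])
{
    assert(n >= 1);
    for (int i = 0; i < 9; ++i) {
        out[i].x = center.x + n * TSS_UNIT[i].x;
        out[i].y = center.y + n * TSS_UNIT[i].y;
    }
}
```

For the current block at (72,36) with step size 4, this yields (72,36), (68,36), (68,40) for the first three entries, matching the example in the text. Storing unit offsets and scaling by n keeps one table valid for every step size.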
Finally, before the matching operation is performed, the block data prefetch means prefetches the video data of the current search block and the video data of each remaining search block into the cache in turn. Common matching criteria for the matching operation include the sum of absolute differences (SAD), the mean square error (MSE) and the normalized cross-correlation function (NCCF); of these, SAD requires no multiplication and is simple to implement. Once the start address of a block has been determined, the video data of the block can be read into the cache, because each block is a pixel block of size 8 × 8. Embedded platforms such as DSP and ARM platforms provide instructions dedicated to cache handling. Using such a special instruction of the processor or DSP, for example the PLD instruction on ARM, the video data can be read into the cache. In this way, cache preloading brings the data into the cache before the computation, which improves system performance.
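The prefetch of an 8 × 8 block and the SAD criterion might look like the sketch below. It is an assumption-laden illustration rather than the described implementation: it uses the GCC/Clang builtin `__builtin_prefetch` as a portable stand-in for a dedicated instruction (compilers typically lower it to a prefetch instruction such as PLD on ARM targets that provide one), and the function names `prefetch_block` and `sad8x8` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 8  /* block size in pixels, as stated in the text */

/* Ask the cache to load the 8 rows of the block starting at `blk`.
 * This is only a hint; it is a no-op on compilers without the builtin. */
void prefetch_block(const uint8_t *blk, size_t stride)
{
#if defined(__GNUC__) || defined(__clang__)
    for (int r = 0; r < BLOCK; ++r)
        __builtin_prefetch(blk + (size_t)r * stride, 0 /* read */, 3 /* keep cached */);
#else
    (void)blk;
    (void)stride;
#endif
}

/* Sum of absolute differences between two 8x8 blocks; needs no multiplication
 * on the pixel values themselves. */
unsigned sad8x8(const uint8_t *a, size_t stride_a,
                const uint8_t *b, size_t stride_b)
{
    assert(a != NULL && b != NULL);
    unsigned sum = 0;
    for (int r = 0; r < BLOCK; ++r)
        for (int c = 0; c < BLOCK; ++c)
            sum += (unsigned)abs((int)a[(size_t)r * stride_a + c] -
                                 (int)b[(size_t)r * stride_b + c]);
    return sum;
}
```

Each row is prefetched separately because the rows of a block are one stride apart in memory and generally fall in different cache lines.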
Before the first step of the three-step search, the offset address lookup table is read and the data at the specified addresses are read into the cache. Each time an optimum point has been calculated, the video data at the addresses given by the new step size are prefetched into the cache, and this is repeated. Fig. 4 is a flow chart of the video data compression method for optimizing the three-step search algorithm according to the present invention. The concrete steps are as follows:
1) set up a TSS pattern offset address lookup table;
2) read the block data in the reference frame;
3) with the corresponding block address as the center, obtain the start address of each block in turn from the offset address lookup table;
4) while the system calculates the SAD, read block data into the cache, compare with the previous result and keep the smaller value; repeat 8 times;
5) if the SAD value of the center block is the smallest, the search ends; otherwise, move the center to the block with the smallest SAD value, halve the step size and repeat from step 3) until the step size is less than 1, at which point the optimum value is obtained.
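Steps 1) to 5) can be sketched as one C routine. This is a hedged illustration under several assumptions not stated in the text: both frames are row-major with one byte per pixel and the same width, candidates that fall outside the frame are skipped, and `__builtin_prefetch` stands in for a dedicated prefetch instruction. All identifiers are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 8

typedef struct { int x, y; } Point;

/* Step 1: offset table in unit steps (center, then clockwise from the left). */
static const Point TSS_UNIT[9] = {
    {0, 0}, {-1, 0}, {-1, 1}, {0, 1}, {1, 1}, {1, 0}, {1, -1}, {0, -1}, {-1, -1}
};

static const uint8_t *block_ptr(const uint8_t *frame, int w, Point p)
{
    return frame + (size_t)p.y * (size_t)w + (size_t)p.x;
}

static int in_frame(Point p, int w, int h)
{
    return p.x >= 0 && p.y >= 0 && p.x + BLOCK <= w && p.y + BLOCK <= h;
}

static void prefetch_block(const uint8_t *blk, size_t stride)
{
#if defined(__GNUC__) || defined(__clang__)
    for (int r = 0; r < BLOCK; ++r)
        __builtin_prefetch(blk + (size_t)r * stride, 0, 3);
#else
    (void)blk;
    (void)stride;
#endif
}

static unsigned sad8x8(const uint8_t *a, const uint8_t *b, size_t stride)
{
    unsigned sum = 0;
    for (int r = 0; r < BLOCK; ++r)
        for (int c = 0; c < BLOCK; ++c)
            sum += (unsigned)abs((int)a[(size_t)r * stride + c] -
                                 (int)b[(size_t)r * stride + c]);
    return sum;
}

/* Returns the position in `ref` that best matches the block of `cur` at `blk`. */
Point tss_search(const uint8_t *cur, const uint8_t *ref, int w, int h,
                 Point blk, int step)
{
    assert(step >= 1 && in_frame(blk, w, h));
    const uint8_t *target = block_ptr(cur, w, blk);
    Point center = blk;
    while (step >= 1) {
        Point cand[9];
        for (int i = 0; i < 9; ++i) {               /* step 3: start addresses */
            cand[i].x = center.x + step * TSS_UNIT[i].x;
            cand[i].y = center.y + step * TSS_UNIT[i].y;
        }
        prefetch_block(block_ptr(ref, w, cand[0]), (size_t)w);
        unsigned best_sad = UINT_MAX;
        int best = 0;
        for (int i = 0; i < 9; ++i) {               /* step 4 */
            /* Prefetch the next block while the SAD of block i is computed. */
            if (i + 1 < 9 && in_frame(cand[i + 1], w, h))
                prefetch_block(block_ptr(ref, w, cand[i + 1]), (size_t)w);
            if (!in_frame(cand[i], w, h))
                continue;
            unsigned s = sad8x8(target, block_ptr(ref, w, cand[i]), (size_t)w);
            if (s < best_sad) { best_sad = s; best = i; }
        }
        if (best == 0)                              /* step 5: center is best */
            break;
        center = cand[best];
        step /= 2;
    }
    return center;
}
```

The center is evaluated first and replaced only by a strictly smaller SAD, so ties resolve in favor of the center and the search ends as soon as no neighbor improves on it.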
According to the present invention, the video data compression method for optimizing the diamond search algorithm proceeds as follows. First, the offset address determining means determines, according to the two patterns of the DS algorithm, the offset address of each remaining search block in each pattern relative to the current search block. Two offset address lookup tables can be set up so that, given the search step size and the pattern index, the offset address of each search block relative to the current search block can be obtained. Preferably, the current search block is located at the center of the pattern. Referring to Figs. 2 and 3, with the current search block as the center and proceeding clockwise from the search block on its left, the offset addresses of the search blocks in the large and small diamond search patterns are as listed in the LDSP table and the SDSP table respectively, where the offset address of the current search block is (0,0), n is the current search step size of the large pattern, and the search step size of the small pattern is 1.
LDSP
Index | Offset address
0 | (0,0)
1 | (-n,0)
2 | (-n/2,n/2)
3 | (0,n)
4 | (n/2,n/2)
5 | (n,0)
6 | (n/2,-n/2)
7 | (0,-n)
8 | (-n/2,-n/2)
SDSP
Index | Offset address
0 | (0,0)
1 | (-1,0)
2 | (0,1)
3 | (1,0)
4 | (0,-1)
Next, during the search, the start address calculating means calculates the start address of each remaining search block from the start address of the current search block and the respective offset addresses. For example, if the start address of the current search block is (72,36) and the step size is 2, the start addresses of the search blocks in the large pattern are (72,36), (72-2,36), (72-2/2,36+2/2), and so on.
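The two diamond tables and the corresponding address calculation can be sketched as follows. This is an illustrative fragment; it assumes that n is even so that n/2 is an integer, and the names `LDSP_HALF`, `SDSP`, `ldsp_candidates` and `sdsp_candidates` are hypothetical.

```c
#include <assert.h>

typedef struct { int x, y; } Point;

/* LDSP offsets expressed in units of n/2, in table order. */
static const Point LDSP_HALF[9] = {
    {0, 0}, {-2, 0}, {-1, 1}, {0, 2}, {1, 1}, {2, 0}, {1, -1}, {0, -2}, {-1, -1}
};

/* SDSP offsets; the step size of the small pattern is fixed at 1. */
static const Point SDSP[5] = {
    {0, 0}, {-1, 0}, {0, 1}, {1, 0}, {0, -1}
};

/* Start addresses of the 9 large-diamond blocks for step size n (n even). */
void ldsp_candidates(Point center, int n, Point out[9])
{
    assert(n >= 2 && n % 2 == 0);
    int half = n / 2;
    for (int i = 0; i < 9; ++i) {
        out[i].x = center.x + half * LDSP_HALF[i].x;
        out[i].y = center.y + half * LDSP_HALF[i].y;
    }
}

/* Start addresses of the 5 small-diamond blocks. */
void sdsp_candidates(Point center, Point out[5])
{
    for (int i = 0; i < 5; ++i) {
        out[i].x = center.x + SDSP[i].x;
        out[i].y = center.y + SDSP[i].y;
    }
}
```

With the current block at (72,36) and n = 2, the first large-pattern entries are (72,36), (70,36) and (71,37), in agreement with the worked example.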
Finally, before the matching operation is performed, the block data prefetch means prefetches the video data of the current search block and the video data of each remaining search block into the cache in turn. As before, common matching criteria for the matching operation include the sum of absolute differences (SAD), the mean square error (MSE) and the normalized cross-correlation function (NCCF). A special instruction of the processor or DSP, such as the PLD instruction on ARM, can again be used to read the video data into the cache, so that cache preloading brings the data into the cache before the computation and improves system performance.
Fig. 5 is a flow chart of the video data compression method for optimizing the diamond search algorithm according to the present invention. The concrete steps are as follows:
1) read the block data of a reference frame;
2) according to the start address of the first block, read the data of the first block into the cache, and determine the start addresses of the remaining blocks from the LDSP table according to the current step size;
3) while the system performs the SAD calculation, read the second block into the cache, compare and keep the better value; repeat 8 times;
4) if the SAD value of the center block is the smallest, enter the small-pattern matching stage; otherwise, move the center to the block with the smallest SAD value, adjust the step size and repeat from step 2);
5) determine the start address of the first block from the SDSP table and read the video data of the first block into the cache;
6) while the system performs the SAD calculation for the current block, read the second block into the cache, compare and keep the better value; repeat 4 times;
7) if the SAD value of the center is the smallest, the search for the current block ends; otherwise, move the center to the block with the smallest SAD value and repeat from step 5) until the optimum point is obtained.
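Steps 1) to 7) can be sketched as a two-stage loop. This is a hedged illustration, not the described implementation: it assumes the large pattern keeps a fixed step size n = 2 (the text leaves the step adjustment in step 4 unspecified), row-major frames of equal width with one byte per pixel, skipping of out-of-frame candidates, and `__builtin_prefetch` as a stand-in for a dedicated prefetch instruction. All identifiers are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK 8

typedef struct { int x, y; } Point;

/* LDSP with n = 2 (assumed fixed step) and SDSP, both in table order. */
static const Point LDSP[9] = {
    {0, 0}, {-2, 0}, {-1, 1}, {0, 2}, {1, 1}, {2, 0}, {1, -1}, {0, -2}, {-1, -1}
};
static const Point SDSP[5] = { {0, 0}, {-1, 0}, {0, 1}, {1, 0}, {0, -1} };

static const uint8_t *block_ptr(const uint8_t *frame, int w, Point p)
{
    return frame + (size_t)p.y * (size_t)w + (size_t)p.x;
}

static int in_frame(Point p, int w, int h)
{
    return p.x >= 0 && p.y >= 0 && p.x + BLOCK <= w && p.y + BLOCK <= h;
}

static void prefetch_block(const uint8_t *blk, size_t stride)
{
#if defined(__GNUC__) || defined(__clang__)
    for (int r = 0; r < BLOCK; ++r)
        __builtin_prefetch(blk + (size_t)r * stride, 0, 3);
#else
    (void)blk;
    (void)stride;
#endif
}

static unsigned sad8x8(const uint8_t *a, const uint8_t *b, size_t stride)
{
    unsigned sum = 0;
    for (int r = 0; r < BLOCK; ++r)
        for (int c = 0; c < BLOCK; ++c)
            sum += (unsigned)abs((int)a[(size_t)r * stride + c] -
                                 (int)b[(size_t)r * stride + c]);
    return sum;
}

/* Evaluate one pattern around `center`. Returns the table index of the best
 * candidate (0 means the center) and stores its position in *best_pos. */
static int best_in_pattern(const uint8_t *target, const uint8_t *ref, int w, int h,
                           Point center, const Point *pat, int count, Point *best_pos)
{
    unsigned best_sad = UINT_MAX;
    int best = 0;
    *best_pos = center;
    prefetch_block(block_ptr(ref, w, center), (size_t)w);
    for (int i = 0; i < count; ++i) {
        if (i + 1 < count) {   /* prefetch the next block before this SAD */
            Point next = { center.x + pat[i + 1].x, center.y + pat[i + 1].y };
            if (in_frame(next, w, h))
                prefetch_block(block_ptr(ref, w, next), (size_t)w);
        }
        Point p = { center.x + pat[i].x, center.y + pat[i].y };
        if (!in_frame(p, w, h))
            continue;
        unsigned s = sad8x8(target, block_ptr(ref, w, p), (size_t)w);
        if (s < best_sad) { best_sad = s; best = i; *best_pos = p; }
    }
    return best;
}

/* Returns the position in `ref` that best matches the block of `cur` at `blk`. */
Point diamond_search(const uint8_t *cur, const uint8_t *ref, int w, int h, Point blk)
{
    assert(in_frame(blk, w, h));
    const uint8_t *target = block_ptr(cur, w, blk);
    Point center = blk, next;
    while (best_in_pattern(target, ref, w, h, center, LDSP, 9, &next) != 0)
        center = next;          /* steps 2-4: coarse positioning */
    while (best_in_pattern(target, ref, w, h, center, SDSP, 5, &next) != 0)
        center = next;          /* steps 5-7: fine positioning */
    return center;
}
```

Because the center is replaced only by a candidate with a strictly smaller SAD, the SAD decreases on every move and both loops terminate.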
In the processes and concrete steps described above for optimizing the three-step search algorithm and the diamond search algorithm, placing the current search block at the center of the search pattern is only an example; the current search block may also be located at any other position in the search pattern.
The foregoing has described the invention by way of example, taking the optimization of the three-step search, which uses a single search pattern, and of the DS algorithm, which uses two-level search patterns. The invention can also optimize other search algorithms that use a single search pattern and other search algorithms that use multi-level search patterns, such as the four-step search and the two-dimensional logarithmic search, as will be apparent to those skilled in the art.
Obviously, the invention described here may be varied in many ways, and such variations are not to be regarded as departing from the spirit and scope of the invention. Accordingly, all modifications that are apparent to those skilled in the art are included within the scope of the claims.