CN102413329B - Motion estimation realizing method of configurable speed in video compression - Google Patents


Info

Publication number
CN102413329B
Authority
CN
China
Prior art keywords
cost
piece
complete
register
motion estimation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110371098.8A
Other languages
Chinese (zh)
Other versions
CN102413329A (en)
Inventor
余宁梅
贾文华
顾梅花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Qing Ji Polytron Technologies Inc
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology
Priority to CN201110371098.8A
Publication of CN102413329A
Application granted
Publication of CN102413329B

Abstract

The invention discloses a configurable-speed motion estimation implementation method for video compression. The method comprises the following steps: configuring the number of parallel PE units according to user requirements; calculating the costs of the basic blocks inside the PE units; deriving the costs of the blocks under each partition mode from the cost correlation between blocks of different sizes; and, after all reference data have been read row by row, comparing the final costs obtained by all PEs and taking the minimum cost to determine the optimal motion information MV. The method effectively reduces the number of memory accesses, and its coding speed fully meets the requirements of real-time high-definition video coding.

Description

A configurable-speed motion estimation implementation method for video compression
Technical field
The invention belongs to the technical field of video compression and transmission, and specifically relates to a configurable-speed motion estimation implementation method for video compression.
Background art
The coding formats most commonly used for HD video include MPEG-2-TS, MPEG-4, VC-1 and H.264/AVC. These standards share good network compatibility, efficient coding quality and ease of hardware implementation, and are therefore widely used in video compression. In the hardware architecture of a video encoder, the inter-frame motion estimation module accounts for 50% to 90% of the computational complexity and memory bandwidth consumption, so its performance directly determines the performance of the encoder.
The main process of inter-frame coding is as follows. The original image is first divided into blocks, and motion estimation is carried out block by block. To improve precision, these blocks are usually subdivided further and the matching search is performed with blocks of different sizes. Mainstream coding standards currently divide the original image into 16 × 16 macroblocks (MB) and then partition each macroblock in 7 modes, 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8 and 4 × 4, giving 41 current blocks in total. Under these partition modes, the motion information of the already-coded blocks adjacent to the current block, namely their motion vectors (MV), is used to obtain a predicted block in the reference frame. Centered on this block and extended outward by m pixels, the search window for motion estimation is obtained; it contains k pixels, where k = (m*2+16) * (m*2+16). The 41 sub-blocks of the 7 partition modes are then matched within this search region, and the motion vector MV is determined by comparing their costs.
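The two counts quoted above (41 current blocks and k pixels in the search window) can be checked with a few lines of arithmetic. The following is a minimal sketch for illustration only; the names PARTITION_MODES, blocks_per_mode and search_window_pixels are hypothetical, and the (width, height) reading of each mode is an assumption.

```python
# Partition modes of a 16x16 macroblock, assumed to be (width, height) pairs.
PARTITION_MODES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def blocks_per_mode(mb_size=16):
    """Number of sub-blocks each partition mode yields for one macroblock."""
    return {(w, h): (mb_size // w) * (mb_size // h) for (w, h) in PARTITION_MODES}


def search_window_pixels(m, mb_size=16):
    """Pixels in the search window when the predicted block is extended by m pixels."""
    side = 2 * m + mb_size
    return side * side


counts = blocks_per_mode()
total_blocks = sum(counts.values())  # 1 + 2 + 2 + 4 + 8 + 8 + 16
window = search_window_pixels(32)    # side = 2*32 + 16 = 80
print(total_blocks, window)
```

With m = 32 the window side is 80 pixels, which matches the 80*80 on-chip memory used in the embodiment.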
The key techniques in the hardware implementation of motion estimation are high data utilization, low distortion, and the cost relationship between blocks of different sizes.
Data reuse in hardware motion estimation effectively reduces the number of memory accesses, and thereby hardware resource consumption and system power consumption. Hardware implementations with a high degree of data reuse have become a research focus. The existing literature classifies the degree of data reuse into levels. Level A reuses the overlapping reference pixels of adjacent reference blocks of one current block. Level B reuses the overlapping reference pixels of adjacent reference strips of one current block. Level C reuses the overlapping region of the search windows of adjacent current blocks. Level D reuses the pixels of the whole search window across consecutive current blocks. Level A needs the smallest storage area but the most memory accesses, whereas Level D needs the fewest memory accesses but the most on-chip storage. Depending on requirements, different reuse levels are adopted to balance storage space against memory accesses. Under current memory bandwidth limits Level C data reuse is the most efficient, so most designs adopt Level C.
The search algorithm is another key to motion estimation; it is mainly of two kinds, full search and fast search. Full search traverses all positions in the reference window one by one; it has the highest fidelity but, comparatively, the largest hardware consumption. Many fast search algorithms exist, but all of them trade distortion for speed, so the search algorithm with the least distortion should be chosen as far as the system allows.
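For illustration, a full search can be sketched in software as an exhaustive scan of every candidate position in the reference window, keeping the position with the smallest sum of absolute differences. This is a minimal software sketch and not the hardware architecture of the invention; the function names sad and full_search and the list-of-lists pixel layout are assumptions.

```python
def sad(cur, ref, top, left):
    """Sum of absolute differences between block cur and the window of ref at (top, left)."""
    n = len(cur)
    return sum(
        abs(cur[y][x] - ref[top + y][left + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref):
    """Visit every candidate position in ref and return (cost, top, left) of the best match."""
    n = len(cur)
    best = None
    for top in range(len(ref) - n + 1):
        for left in range(len(ref[0]) - n + 1):
            cost = sad(cur, ref, top, left)
            # Keep the first position with the strictly smallest cost.
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best
```

Every position is evaluated, so the result is the minimum-cost position of the window, at the price of one SAD evaluation per position.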
Variable-block-size motion estimation improves precision but also brings very high computational complexity. Smaller partition sizes minimize coding distortion, and the computational complexity they bring can be reduced effectively by obtaining the costs of larger blocks as combinations of the costs of smaller blocks.
Summary of the invention
The object of the invention is to provide a configurable-speed motion estimation implementation method for video compression that effectively reduces the number of memory accesses and whose coding speed fully meets the requirements of real-time HD video coding.
The technical solution adopted by the invention is a configurable-speed motion estimation implementation method for video compression, characterized by the following steps:
Step 1: configure the number of parallel PE units according to user requirements;
Step 2: calculate the basic block costs inside the PE units;
Step 3: based on the cost correlation between blocks of different sizes, derive the costs of the blocks under each partition mode;
Step 4: after all reference data have been read row by row, compare the final costs obtained by the PEs and take the minimum cost to determine the motion vector MV.
The specific method of step 2 is:
Step 2.1: read the reference data row by row from the on-chip memory and compute the absolute difference between each pixel of that row and each row of the current macroblock MB, where the on-chip memory size is (m*2+a) * (m*2+a) pixels, the macroblock size is a*a pixels, and (-m, +m) is the search range;
Step 2.2: sum the a*a absolute differences, which belong to a^2/b b*b blocks, into partial costs, where b is the size of the smallest partition block;
Step 2.3: according to the traversal position, determine the validity of each partial cost, thereby obtaining the partial costs together with their valid signals.
The specific method of step 3 is:
Step 3.1: set up registers for storing the b*b block costs and allocate their storage space;
Step 3.2: set a counter in each storage space to generate the signal full, which indicates whether accumulation of the partial costs is complete;
Step 3.3: check the valid signals of the partial costs obtained in step 2.3 and accumulate each valid partial cost into the allocated storage space corresponding to its position;
Step 3.4: check whether full is asserted; if so, send out the complete cost together with the position information derived from the register label;
Step 3.5: return to step 3.2 until the complete costs of the b*b sub-blocks of an a*a block have been obtained; using the cost correlation between different blocks, combine them to obtain all the costs of each partition mode;
Meanwhile, compare the complete cost just sent out with the complete cost of the previous position, store the cost information of the position with the smaller distortion, and also store the current cost for use by the other matching modes.
The configurable-speed motion estimation implementation method of the invention makes full use of the spatial correlation of the reference data and reduces the number of memory accesses without reducing coding precision. When the number of PE units configured equals the number of horizontal positions, the utilization of the data read from the on-chip memory reaches 100% and the re-read rate drops to 0. The method makes several parameters configurable, including the search range, the number of I/Os, the coding speed and the hardware consumption, so it meets the different needs of different users, and its coding speed fully meets the requirements of real-time HD video coding.
Brief description of the drawings
Fig. 1 is a schematic diagram of the macroblock partition modes in the invention;
Fig. 2 is a schematic diagram of block matching in motion estimation;
Fig. 3 is a schematic diagram of the spatial correlation of the reference data in the invention;
Fig. 4 is a schematic diagram of how the costs of blocks of different sizes combine in the invention;
Fig. 5 shows the relationship between the absolute-difference valid signals and the data in the invention;
Fig. 6 is a schematic diagram of the accumulation of partial costs in the registers in the invention.
Embodiment
As shown in Fig. 2, motion estimation predicts the position of the current macroblock in the reference frame from the motion vectors of already-coded macroblocks, then traverses the search range centered on the predicted position and determines the motion vector by evaluating the residual cost.
This embodiment uses the following configuration: search range (-32, +32), 5 PEs, macroblock size 16*16, smallest partition 4*4, and 7 partition modes (16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, 4 × 4) giving 41 sub-blocks in total. The steps of the configurable-speed motion estimation implementation method of the invention are as follows:
Step 1: according to user requirements, the number of parallel PE units is configured as 5.
The number of PEs is determined by the user's requirements on I/O resources, real-time processing speed, hardware consumption and so on; the horizontal positions are processed serially by the configured PEs.
Step 2: calculate the basic block costs inside the PE units.
Step 2.1: read the reference data row by row from the on-chip memory and compute the absolute difference between each pixel of that row and each row of the current macroblock MB, where the on-chip memory size is 80*80 pixels, the macroblock size is 16*16 pixels, and (-32, +32) is the search range. There are 65*65 positions to traverse, namely P_x1y1, P_x1y2, P_x1y3, ..., P_x1y65, P_x2y1, P_x2y2, ..., P_x65y64, P_x65y65. With 5 PEs configured, as shown in Fig. 3, the first PE (PE1) processes P_x1yi (i = 1, 2, 3, ..., 65) and P_x6yi (i = 1, 2, 3, ..., 65), while in parallel the second PE (PE2) processes P_x2yi (i = 1, 2, 3, ..., 65) and P_x7yi (i = 1, 2, 3, ..., 65).
When the first row of reference data is read from the on-chip memory, PE1 takes the 16 pixels of the first row of the first position and computes their absolute differences against each of the 16 rows of pixels of the current macroblock, obtaining 256 absolute-difference results.
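The assignment of positions to PEs described in step 2.1 can be sketched as an interleaved distribution. The sketch below assumes that the pattern given for PE1 (x1, x6) and PE2 (x2, x7) continues with a stride equal to the number of PEs; the function names columns_for_pe and passes_needed are hypothetical.

```python
def columns_for_pe(pe_index, num_pe, num_positions=65):
    """1-based x indices handled by PE number pe_index (1-based) out of num_pe PEs."""
    return list(range(pe_index, num_positions + 1, num_pe))


def passes_needed(num_pe, num_positions=65):
    """Number of serial passes required to cover all x indices (ceiling division)."""
    return -(-num_positions // num_pe)


print(columns_for_pe(1, 5)[:3])  # PE1 starts with x1, x6, x11
print(columns_for_pe(2, 5)[:3])  # PE2 starts with x2, x7, x12
print(passes_needed(5))          # serial passes with 5 PEs
```

Under this assumption 5 PEs cover the 65 indices in 13 passes, and 65 PEs cover them in a single pass.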
Step 2.2: sum the 16*16 absolute differences, which belong to 64 4*4 blocks, into partial costs, where b is the size of the smallest partition block (b = 4 in this embodiment).
As Fig. 1 and Fig. 4 show, once the cost values of the 4*4 blocks are known, the cost values of the other partition modes can be obtained by merging them. The 256 absolute differences are therefore summed with the 4*4 block they belong to as the basic unit, giving the partial costs of 64 4*1 blocks.
The cost is calculated as:
$$J(s, c(m)) = \mathrm{SAD}(s, c(m)),$$
$$\mathrm{SAD}(s, c(m)) = \sum_{x=1}^{M} \sum_{y=1}^{N} \left| s(x, y) - c(x - m_x, y - m_y) \right|,$$
where J is the cost function, s is the original data currently being coded, and c is the coded and reconstructed reference-frame data used for motion compensation. M and N are the limits of the double summation, namely the numbers of rows and columns of the summed matrix; for the partial cost of a 4*1 block, M = 4 and N = 1.
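Steps 2.1 and 2.2 can be illustrated for a single reference row as follows. This is a minimal software sketch under the assumption that each row of 16 absolute differences is split into 4 consecutive groups of b = 4 pixels; the function name partial_costs and the list-of-lists layout are hypothetical.

```python
def partial_costs(ref_row, cur_mb, b=4):
    """Partial costs of one reference row against every row of the current macroblock.

    Returns a list with one entry per current-macroblock row; each entry holds the
    sums of absolute differences over consecutive groups of b pixels.
    """
    result = []
    for cur_row in cur_mb:
        diffs = [abs(c - r) for c, r in zip(cur_row, ref_row)]
        result.append([sum(diffs[i:i + b]) for i in range(0, len(diffs), b)])
    return result


# Toy data: current row r holds the constant value r, the reference row is all zeros.
cur_mb = [[row] * 16 for row in range(16)]
ref_row = [0] * 16
costs = partial_costs(ref_row, cur_mb)
print(sum(len(row) for row in costs))  # number of partial costs for this reference row
```

For a 16*16 macroblock and b = 4, one reference row yields 16 × 4 = 64 partial costs, in agreement with the count given in step 2.2.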
Step 2.3: according to the traversal position, determine the validity of each partial cost, thereby obtaining the partial costs together with their valid signals. Specifically:
From the traversal order of the full search, the first row of data in the reference window is correlated only with P_x1yi, the second row with P_x1yi and P_x2yi, and so on, so that the 16th row is correlated with P_x1yi, P_x2yi, ..., P_x16yi. This determines whether each result of step 2.2 is valid, which gives the roughly parallelogram-shaped pattern of partial-cost validity shown in Fig. 5. It follows that, starting from the 16th row of reference data read, all 64 partial costs produced are valid, each belonging to a different position.
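The validity pattern of step 2.3 can be sketched with a simple index rule: the partial cost of reference row r against current-macroblock row c belongs to the position with index r - c + 1 along the reading direction, and it is valid only if that index lies between 1 and 65. This rule, the function name valid_rows and the decline in validity after row 65 are inferences from the description above and are not stated explicitly in the text.

```python
def valid_rows(ref_row, mb_size=16, num_positions=65):
    """(current_row, position) pairs whose partial cost is valid for reference row ref_row.

    All indices are 1-based. A pair is valid when the implied position index
    ref_row - current_row + 1 falls inside the range 1..num_positions.
    """
    valid = []
    for cur_row in range(1, mb_size + 1):
        position = ref_row - cur_row + 1
        if 1 <= position <= num_positions:
            valid.append((cur_row, position))
    return valid


for r in (1, 2, 16, 65, 80):
    print(r, len(valid_rows(r)))
```

The number of valid current rows grows by one per reference row until row 16, after which all 16 rows (64 partial costs) are valid.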
Step 3: based on the cost correlation between blocks of different sizes, derive the costs of the blocks under each partition mode. This step uses a "retain" principle to calculate the optimal motion information MV.
Step 3.1: set up 16 registers for storing the 4*4 block costs and allocate their storage space;
Step 3.2: set a counter in each storage space to generate the signal full, which indicates whether accumulation of the partial costs is complete;
Step 3.3: check the valid signals of the partial costs obtained in step 2.3 and accumulate each valid partial cost into the allocated storage space corresponding to its position. Specifically:
Fig. 6 shows the accumulation record of the cost registers. The first column is the label of the cost register, and the row following each label lists the partial costs it receives. The capital letters A to P in the figure denote the 16 rows of the current macroblock, and the numbers denote the row number of the reference pixels read from the on-chip memory. Taking register 1 as an example, in the first cycle A1 (the partial cost of the first row of the current macroblock against the first row of reference data) is stored; in the second cycle B2 is accumulated into register 1, followed by C3 and D4. At this point the accumulated value A1+B2+C3+D4 in register 1 is the complete cost of the 4*4 block in the first row of blocks for position P_x1y1, so the signal full (which indicates that the register is full) is set. In the fifth cycle E5 is written into register 1, overwriting its contents; once E5+F6+G7+H8 has been accumulated, full is set again, and the value sent out is the complete cost of the 4*4 block in the second row of blocks for P_x1y1. On this principle register 1 sends out a complete 4*4 cost every 4 cycles, completes the traversal of one position in 16 cycles, and can then start receiving the data of the 17th position. The other 15 registers work on the same principle.
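The behaviour of a single cost register can be modelled in software as an accumulator with a counter. The sketch below is an illustrative model and not the register-transfer design of the invention; the class name CostRegister and its fields are hypothetical, and the partial costs fed in are arbitrary test values.

```python
class CostRegister:
    """Accumulates partial costs and raises full after rows_per_block of them."""

    def __init__(self, label, rows_per_block=4):
        self.label = label                  # register number, used to derive the position
        self.rows_per_block = rows_per_block
        self.count = 0                      # counter of partial costs accumulated so far
        self.value = 0

    def accumulate(self, partial_cost):
        """Add one valid partial cost and return (full, complete_cost_or_None)."""
        if self.count == 0:
            self.value = partial_cost       # first cost of a block overwrites the old value
        else:
            self.value += partial_cost
        self.count += 1
        if self.count == self.rows_per_block:
            self.count = 0                  # ready for the next 4*4 block
            return True, self.value
        return False, None


reg = CostRegister(label=1)
outputs = [reg.accumulate(v) for v in range(1, 17)]  # 16 cycles, one position
complete = [cost for full, cost in outputs if full]
print(complete)
```

Over 16 cycles the model emits four complete costs, one every 4 cycles, mirroring the A1..D4, E5..H8 grouping described for register 1.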
Step 3.4: check whether full is asserted; if so, send out the complete cost together with the position information derived from the register label.
Step 3.5: return to step 3.2 until the complete costs of the 16 4*4 sub-blocks of the 16*16 block have been obtained; using the cost correlation between different blocks, combine them to obtain all the costs of the 7 partition modes;
Meanwhile, compare the complete cost just sent out with the complete cost of the previous position, store the cost information of the position with the smaller distortion, and also store the current cost for use by the other matching modes, namely the 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4 and 4 × 8 modes.
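The combination of 4*4 costs into the costs of the larger partitions can be sketched as plain summation, since the SAD of a block at a given position is the sum of the SADs of the 4*4 sub-blocks it contains. The sketch assumes the mode names are width x height and that the 16 costs are arranged in a 4 by 4 grid in raster order; the function name merge_costs is hypothetical.

```python
def merge_costs(c4):
    """Derive the costs of all 7 partition modes from a 4x4 grid of 4*4-block costs.

    c4[row][col] is the cost of the 4*4 block at that grid position. Mode names are
    assumed to be width x height; each mode maps to (height, width) in 4*4 units.
    """
    modes = {
        "4x4": (1, 1), "8x4": (1, 2), "4x8": (2, 1), "8x8": (2, 2),
        "16x8": (2, 4), "8x16": (4, 2), "16x16": (4, 4),
    }

    def block_sum(top, left, h, w):
        return sum(c4[y][x] for y in range(top, top + h) for x in range(left, left + w))

    return {
        name: [block_sum(t, l, h, w) for t in range(0, 4, h) for l in range(0, 4, w)]
        for name, (h, w) in modes.items()
    }


c4 = [[1] * 4 for _ in range(4)]   # toy costs: every 4*4 block costs 1
costs = merge_costs(c4)
print(sum(len(v) for v in costs.values()))  # total number of block costs produced
```

With every 4*4 cost set to 1, each merged cost simply counts the 4*4 blocks it covers, and the modes together produce the 41 block costs mentioned in the description.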
Step 4: after all reference data have been read row by row, compare the final costs obtained by the PEs and take the minimum costs of the 41 sub-blocks of the 7 partition modes to determine the optimal motion information MV, thereby completing motion estimation.
With the following parameters: search range (-32, +32), 65 PEs, macroblock size 16*16, smallest partition 4*4 and 7 partition modes (16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, 4 × 4) giving 41 sub-blocks, the method needs 80 cycles to complete the matching of the current macroblock; with 5 PEs it needs 1040 cycles. Using the SMIC 0.13 μm CMOS technology library, the achievable processing speeds are 1920 × 1080 @ 462 fps and 1920 × 1080 @ 36 fps respectively, which meets the demand of real-time HD video coding.
Taking 30 frames per second, the rate of current HD video coding, as the minimum standard to be met, 5 PE units are configured; with the SMIC (Semiconductor Manufacturing International Corporation) 0.13 μm CMOS technology library, the circuit performance parameters are as listed in the following table:
Parameter            Value
Search range         65×65
Block sizes          4×4, 4×8, 8×4, 8×8, 16×8, 8×16, 16×16
Process              SMIC 0.13 μm CMOS
Gate count           150K
On-chip SRAM         80×80×8 bits
Frequency            300 MHz
Cycles per MB        1040
Processing speed     1920×1080 @ 36 fps
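The cycle counts and frame rates of this embodiment can be reproduced approximately with the arithmetic below. It is a rough sketch under stated assumptions: one pass over the search window takes 80 cycles (one per row of the 80*80 on-chip memory), the number of passes is the 65 horizontal positions divided by the number of PEs and rounded up, and a 1920 × 1080 frame is counted as 1920*1080/256 = 8100 macroblocks. The function names cycles_per_mb and frames_per_second are hypothetical.

```python
def cycles_per_mb(num_pe, num_positions=65, window_rows=80):
    """Cycles to match one macroblock: passes over the window times rows per pass."""
    passes = -(-num_positions // num_pe)  # ceiling division
    return passes * window_rows


def frames_per_second(num_pe, clock_hz=300e6, width=1920, height=1080, mb_size=16):
    """Approximate frame rate at the given clock, ignoring all other pipeline overhead."""
    mbs_per_frame = (width * height) / (mb_size * mb_size)
    return clock_hz / (cycles_per_mb(num_pe) * mbs_per_frame)


for n in (5, 65):
    print(n, cycles_per_mb(n), frames_per_second(n))
```

Under these assumptions the sketch gives 1040 cycles and roughly 35.6 fps for 5 PEs, and 80 cycles and roughly 463 fps for 65 PEs, close to the 36 fps and 462 fps quoted in the description.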
The following table compares the results of the method of the invention with those of the prior art:
[Comparison table: image not reproduced in this text]
It can be seen that the coding speed of the method of the invention fully meets the requirements of real-time HD video coding.

Claims (1)

1. A configurable-speed motion estimation implementation method for video compression, characterized in that the specific steps are as follows:
Step 1: configure the number of parallel PE units according to user requirements;
Step 2: calculate the basic block costs inside the PE units, the specific method being:
Step 2.1: read the reference data row by row from the on-chip memory and compute the absolute difference between each pixel of that row and each row of the current macroblock MB, where the on-chip memory size is (m*2+a) * (m*2+a) pixels, the macroblock size is a*a pixels, and (-m, +m) is the search range;
Step 2.2: sum the a*a absolute differences, which belong to a^2/b b*b blocks, into partial costs, where b is the size of the smallest partition block;
Step 2.3: according to the traversal position, determine the validity of each partial cost, thereby obtaining the partial costs together with their valid signals;
Step 3: based on the cost correlation between blocks of different sizes, derive the costs of the blocks under each partition mode, the specific method being:
Step 3.1: set up registers for storing the b*b block costs and allocate their storage space;
Step 3.2: set a counter in each storage space to generate the signal full, which indicates whether accumulation of the partial costs is complete;
Step 3.3: check the valid signals of the partial costs obtained in step 2.3 and accumulate each valid partial cost into the allocated storage space corresponding to its position;
Step 3.4: check whether full is asserted; if so, send out the complete cost together with the position information derived from the register label, the register label being the sequence number of the register set up in step 3.1;
Step 3.5: return to step 3.2 until the complete costs of the b*b sub-blocks of an a*a block have been obtained; using the cost correlation between different blocks, combine them to obtain all the costs of each partition mode;
Meanwhile, compare the complete cost just sent out with the complete cost of the previous position, store the cost information of the position with the smaller distortion, and also store the current cost for use by the other matching modes;
Step 4: after all reference data have been read row by row, compare the final costs obtained by the PEs and take the minimum cost to determine the motion vector MV.
CN201110371098.8A 2011-11-21 2011-11-21 Motion estimation realizing method of configurable speed in video compression Active CN102413329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110371098.8A CN102413329B (en) 2011-11-21 2011-11-21 Motion estimation realizing method of configurable speed in video compression

Publications (2)

Publication Number Publication Date
CN102413329A CN102413329A (en) 2012-04-11
CN102413329B true CN102413329B (en) 2014-06-04

Family

ID=45915138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110371098.8A Active CN102413329B (en) 2011-11-21 2011-11-21 Motion estimation realizing method of configurable speed in video compression

Country Status (1)

Country Link
CN (1) CN102413329B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301732B (en) * 2014-10-13 2017-05-17 哈尔滨工业大学深圳研究生院 video coding motion estimation unit hardware circuit

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101778281A (en) * 2010-01-13 2010-07-14 中国移动通信集团广东有限公司中山分公司 Method for estimating H.264-based fast motion on basis of structural similarity
CN102113326A (en) * 2008-08-04 2011-06-29 杜比实验室特许公司 Overlapped block disparity estimation and compensation architecture

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060126739A1 (en) * 2004-12-15 2006-06-15 Stoner Michael D SIMD optimization for H.264 variable block size motion estimation algorithm
US8358695B2 (en) * 2006-04-26 2013-01-22 Altera Corporation Methods and apparatus for providing a scalable motion estimation/compensation assist function within an array processor

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102113326A (en) * 2008-08-04 2011-06-29 杜比实验室特许公司 Overlapped block disparity estimation and compensation architecture
CN101778281A (en) * 2010-01-13 2010-07-14 中国移动通信集团广东有限公司中山分公司 Method for estimating H.264-based fast motion on basis of structural similarity

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
H.264中运动估计算法的一种硬件实现架构;白向晖等;《电视技术》;20041117(第11期);第17-19页 *
白向晖等.H.264中运动估计算法的一种硬件实现架构.《电视技术》.2004,(第11期),

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160829

Address after: 7, building 15, 510665 software Road, Guangzhou, Guangdong, Tianhe District

Patentee after: Guangzhou Qing Ji Polytron Technologies Inc

Address before: 710048 Shaanxi city of Xi'an Province Jinhua Road No. 5

Patentee before: Xi'an University of Technology