CN1481169A - A real-time 1/4 interpolation method based on multi-stage pipeline structure - Google Patents

A real-time 1/4 interpolation method based on multi-stage pipeline structure

Publication number
CN1481169A
Authority
CN
China
Prior art keywords
interpolation
pipeline structure
data
stage pipeline
motion estimation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA031525024A
Other languages
Chinese (zh)
Other versions
CN1186939C (en)
Inventor
黄晁
王荣刚
李锦涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN 03152502 (CN1186939C)
Publication of CN1481169A
Application granted
Publication of CN1186939C
Anticipated expiration
Expired - Fee Related (current legal status)

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A real-time 1/4 interpolation method based on a multi-stage pipeline structure. The 1/4 interpolation process specified in the MPEG4 protocol is combined with the motion estimation process; that is, 1/4 interpolation is performed only on the blocks needed during motion estimation. The interpolation process is organized into a 5-stage pipeline: data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation, and data output. This both reduces the memory occupied during interpolation and greatly speeds interpolation up. The method includes the following steps: interpolating in real time the blocks needed during codec motion estimation; combining the interpolation and motion estimation processes in pipeline order; eliminating repeated data reads by setting up an intermediate buffer; organizing the interpolation process into a 5-stage pipeline structure; and designing the 1/2 interpolation process as a 4-stage pipeline structure. The method can be applied to the design of video encoders based on the MPEG4 protocol.

Figure 03152502

Description

A real-time 1/4 interpolation method based on a multi-stage pipeline structure
Technical field
The present invention relates to the field of video encoding and decoding technology, and in particular to a real-time 1/4 interpolation method, based on a multi-stage pipeline structure, for codecs that use temporal prediction between video frames.
Background art
In video coding, images are usually divided into intra-frame images and inter-frame images. Inter-frame images are coded with motion estimation, mainly because the pixels of adjacent image blocks are strongly correlated in time. The main idea is to find, in the reference picture, the block that best matches the block being coded, and to use it as that block's predicted value (the prediction block) for coding. The better the coded block and the prediction block match, the higher the coding efficiency. Improving the match requires improving the precision of motion estimation. MPEG1 uses integer-pixel precision, MPEG2 and H.263 use half-pixel precision, and MPEG4 uses 1/4-pixel precision, which improves coding efficiency. However, 1/4-precision motion estimation requires 1/4 interpolation of the reference picture, and this interpolation is computationally complex: interpolating one integer pixel requires 6th-order linear interpolation and bilinear interpolation over the 6 × 6 integer pixels around it. As shown in Fig. 1, an image block becomes 16 times its original size after 1/4 interpolation. Each integer pixel in the original block becomes 16 points after 1/4 interpolation, as shown in Fig. 2. Interpolating point A into 16 points takes the 6 × 6 integer pixels around A as input. The detailed process is:
(1) Apply 6th-order linear interpolation to each row of the 6 × 6 data to obtain points 1 and 5.
(2) Apply 6th-order linear interpolation along the columns containing points A, 1 and 4 to obtain points 2, 3 and 4.
(3) Apply row-wise linear interpolation along the rows containing points A, 2 and C to obtain points a, b, h, i, o and p.
(4) Apply column-wise linear interpolation along the columns containing points A, a, 1 and b to obtain points c, d, e, f, j, k, m and l.
(5) Low-pass filter point m using points A, B, C and D.
After these five steps, the 16 points around A after 1/4 interpolation are obtained: A, a, 1, b, c, d, e, f, 2, h, 3, i, j, k, l, m. A careful analysis of the interpolation process reveals the following problems:
(1) If the whole frame is interpolated before motion estimation is performed, the data volume of each frame expands to that of 16 frames, and a large storage space is needed to hold the interpolation results.
(2) Running interpolation and motion estimation fully serially increases the inter-frame coding time.
(3) Adjacent integer pixels, such as points A, B, C and D, share a large amount of repeated interpolation input data.
Summary of the invention
The object of the present invention is to provide a real-time 1/4 interpolation method based on a multi-stage pipeline structure. By combining the interpolation process with the motion estimation process and organizing the interpolation process into a multi-stage pipeline, the method reduces the time and memory consumed by 1/4 interpolation during video encoding and decoding, and thereby improves codec efficiency. Reducing storage occupation and raising the speed of 1/4 interpolation play an important role in speeding up video encoding and decoding.
The technical scheme is as follows:
A real-time 1/4 interpolation method based on a multi-stage pipeline structure. The method carries out the 1/4 interpolation process specified in the MPEG4 protocol together with the motion estimation process in pipeline order, and organizes the interpolation process into a 5-stage fully pipelined structure: data read-in, row 1/2 interpolation, column 1/2 interpolation, row-and-column 1/4 interpolation, and data output. This both saves memory during interpolation and greatly accelerates it. The method comprises the following steps:
(1) the current block used in motion estimation is interpolated in real time;
(2) motion estimation and 1/4 interpolation are organized into a 2-stage pipeline structure;
(3) repeated data reads are eliminated by setting up an intermediate buffer;
(4) the 1/4 interpolation process is organized into a 5-stage pipeline structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation, and data output;
(5) the 1/2 interpolation process is designed as a 4-stage pipeline structure;
(6) the design is scalable: depending on the application, interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units.
In the method, motion estimation and 1/4 interpolation are carried out together, and the current block used in motion estimation is interpolated in real time, which saves memory.
In the method, motion estimation and 1/4 interpolation are organized into a 2-stage pipeline structure, which reduces the share of interpolation time in the whole coding process.
In the method, an intermediate buffer holds part of the input data and eliminates repeated reads of the input data shared by adjacent integer pixels during interpolation, so that each integer pixel relevant to the interpolated data block is read only once.
In the method, the 1/4 interpolation process is organized into a 5-stage pipeline structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation, and data output, which speeds up 1/4 interpolation.
In the method, the 6th-order linear interpolation used by the 1/2 interpolation process is designed as a 4-stage pipeline structure. Once the pipeline has started, one result is computed every beat (clock cycle), which greatly increases interpolation speed.
In the method, the number of interpolation arithmetic units is configurable: depending on the application, interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units.
Description of drawings
Fig. 1 is a schematic diagram of the 1/4 interpolation process for a data block;
Fig. 2 is a schematic diagram of the interpolation process for point A;
Fig. 3 is a schematic diagram of the repeated input data of points A and B (the white region in the figure is the repeated region);
Fig. 4 is the overall structure diagram of the 1/4 interpolator (where clip: min(max(0, s7), 255)).
The method comprises the following features:
(1) The current block used in motion estimation is interpolated in real time.
The H.264 protocol defines seven block shapes: 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8 and 4 × 4. Video encoding and decoding proceed macroblock by macroblock. For every block shape, motion estimation of one macroblock uses only 48 × 48 interpolation results (decoding needs only 16 × 16), and these results are used once and never reused. With real-time interpolation, therefore, only the data about to be used needs to be interpolated before motion estimation of a macroblock, and the results are placed in a 48 × 48 buffer (16 × 16 for decoding). When motion estimation moves to the next macroblock, the new interpolation results overwrite the interpolated data of the previous macroblock. However large the coded frame, the whole motion estimation process thus needs only 48 × 48 bytes of storage (16 × 16 bytes for decoding) to hold interpolation results.
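The storage saving can be illustrated with a short calculation. This is only a sketch under stated assumptions: one byte per interpolated sample, and a 720 × 576 frame (the standard-definition size quoted elsewhere in this description) as the example frame. The function names are hypothetical.

```python
# Sketch: storage for 1/4-interpolated data, assuming one byte per sample.
# The 48x48 and 16x16 buffer sizes come from the description; the frame
# size is only an example.

def full_frame_bytes(width: int, height: int, scale: int = 4) -> int:
    """Bytes needed if the whole reference frame is interpolated up front.

    1/4 interpolation multiplies each dimension by 4, so the data volume
    grows by a factor of 16.
    """
    return (width * scale) * (height * scale)


def per_macroblock_bytes(side: int) -> int:
    """Bytes needed if only the current macroblock's results are kept."""
    return side * side


frame = full_frame_bytes(720, 576)
encode_buffer = per_macroblock_bytes(48)
decode_buffer = per_macroblock_bytes(16)

print(frame)          # 6635520
print(encode_buffer)  # 2304
print(decode_buffer)  # 256
```

Under these assumptions the per-macroblock buffer is smaller than the fully interpolated frame by a factor of 2880 for encoding, and its size does not grow with the frame size.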
(2) Motion estimation and interpolation are organized in pipeline order.
To increase the parallelism between motion estimation and 1/4 interpolation, the present invention organizes the two processes in pipeline order, as shown in Fig. 3. As soon as the interpolator has computed one row of interpolation results, motion estimation starts on that row while the interpolator computes the next row. The time taken by 1/4 interpolation is thus largely absorbed ("dissolved") into the motion estimation process.
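The effect of overlapping the two processes can be estimated with a simple timing model. The sketch below is illustrative only: the per-row costs `t_interp` and `t_me` are hypothetical numbers, not figures from the patent, and the model assumes each row takes a fixed number of beats in each stage.

```python
# Sketch: total beats for serial versus 2-stage pipelined processing.
# All timing values are hypothetical.

def serial_beats(rows: int, t_interp: int, t_me: int) -> int:
    """Each row is interpolated and then estimated, with no overlap."""
    return rows * (t_interp + t_me)


def pipelined_beats(rows: int, t_interp: int, t_me: int) -> int:
    """Motion estimation on row i overlaps interpolation of row i + 1.

    The first row must be interpolated before estimation can start; after
    that the slower stage sets the pace, and the last row still needs one
    estimation pass.
    """
    return t_interp + (rows - 1) * max(t_interp, t_me) + t_me


rows, t_interp, t_me = 48, 10, 12  # hypothetical values
print(serial_beats(rows, t_interp, t_me))     # 1056
print(pipelined_beats(rows, t_interp, t_me))  # 586
```

With these example numbers the pipelined total is close to the motion estimation time alone (48 × 12 = 576 beats), which is the sense in which the interpolation time is absorbed.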
(3) Repeated data reads are eliminated by setting up an intermediate buffer.
Interpolating a point A needs the 6 × 6 data block around it as input, as shown in Fig. 2, and interpolating point B likewise needs the 6 × 6 data block around B. Because A and B are adjacent, their inputs overlap heavily: as shown in Fig. 3, the last 5 columns of A's input are the first 5 columns of B's input. For the same reason, the interpolation input of point C in Fig. 2 shares 5 rows of 6 columns with the input of A. To reduce repeated input, the present invention uses an intermediate buffer that first caches the first 6 rows of the interpolation input for the whole data block, and then interpolates each point in the block in turn. When the integer pixels of the first row are interpolated, the data comes directly from the internal memory and the external memory need not be accessed again. A further row buffer holds the next row being read in (the 7th row), so that interpolating the first row and reading the 7th row proceed simultaneously. By the time the first row of integer pixels has been interpolated, the 7th row has been read into the buffer; rows 2 to 6 together with the newly read 7th row then form the interpolation input for the second row of integer pixels. This repeats until the whole data block has been 1/4-interpolated, and the results are stored in the 48 × 48 intermediate buffer for use by motion estimation. The characteristic of this structure is that the external memory is accessed few times: each datum needs to be read only once.
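The reduction in external reads can be counted with a small model. This is a sketch, not the patented circuit: it assumes a block of `h` × `w` integer pixels whose input region is (`h` + 5) × (`w` + 5) samples, and it models the buffer as a sliding window of 6 rows. The function names are hypothetical, and the prefetch of the next row is represented only by the fact that each row is fetched once.

```python
from collections import deque

TAPS = 6  # each integer pixel needs a 6 x 6 input window


def naive_reads(h: int, w: int) -> int:
    """External reads if every pixel fetches its own 6 x 6 window."""
    return h * w * TAPS * TAPS


def buffered_reads(h: int, w: int) -> int:
    """External reads with a sliding buffer of 6 input rows.

    Each input row is fetched once; when 6 rows are buffered, one row of
    integer pixels can be interpolated from internal memory.
    """
    in_rows, in_cols = h + TAPS - 1, w + TAPS - 1
    window = deque(maxlen=TAPS)  # oldest row drops out automatically
    reads = 0
    interpolated = 0
    for row in range(in_rows):
        window.append(row)       # fetch one new row from external memory
        reads += in_cols
        if len(window) == TAPS:  # rows row-5 .. row are available
            interpolated += w
    assert interpolated == h * w
    return reads


print(naive_reads(16, 16))     # 9216
print(buffered_reads(16, 16))  # 441
```

For a 16 × 16 block the buffered scheme reads each of the 21 × 21 input samples exactly once, compared with 36 reads per pixel in the naive scheme.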
(4) The interpolator is designed as a 5-stage pipeline structure.
The interpolator takes a 6 × 6 data block as its input unit and is divided into a five-stage pipeline: data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation, and output. As shown in Fig. 3, on each beat (clock cycle) one 6 × 6 data block is fed to the interpolator to interpolate one point A; after passing through the five stages, the 16 interpolation results for that point are obtained. Once the pipeline is full, 16 interpolation results are output every beat.
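The latency and throughput of such a pipeline can be traced with a toy simulation. The sketch below only tracks which input block occupies which stage on each beat; it does no arithmetic, and the stage names and the `run_pipeline` helper are illustrative assumptions rather than part of the patent.

```python
STAGES = ["read_in", "row_half", "col_half", "bilinear", "output"]


def run_pipeline(n_blocks: int) -> dict:
    """Return, for each block index, the beat on which it reaches 'output'.

    One new 6 x 6 block enters the first stage per beat, and every block
    already in flight moves one stage forward.
    """
    pipe = [None] * len(STAGES)
    finished = {}
    beat = 0
    next_block = 0
    while len(finished) < n_blocks:
        beat += 1
        entering = next_block if next_block < n_blocks else None
        pipe = [entering] + pipe[:-1]  # shift all stages by one position
        next_block += 1
        if pipe[-1] is not None:
            finished[pipe[-1]] = beat
    return finished


print(run_pipeline(3))  # {0: 5, 1: 6, 2: 7}
```

The first block reaches the output stage on beat 5, and each later block follows one beat behind, so `n` blocks take `n + 4` beats in this model.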
(5) The 1/2 interpolation process is designed as a 4-stage pipeline structure.
The 6th-order linear interpolation used in the 1/2 interpolation process is computationally complex. For example, let six consecutive samples in a row be (in0, in1, in2, in3, in4, in5). Applying 6th-order linear interpolation at in2 computes: out = min(max((in0 - 5*in1 + 20*in2 + 20*in3 - 5*in4 + in5 + 16)/32, 0), 255). Completing this operation in a single beat (clock cycle) would lower the clock frequency of the whole system. To eliminate this critical path, the present invention designs the operation as a 4-stage pipeline, as shown in Fig. 4: six samples are fed in on each beat, and after 4 beats one interpolation result is output every beat.
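The formula above and one possible 4-stage split can be sketched as follows. The direct form follows the formula in the text; the division into `stage1` to `stage4` is a hypothetical decomposition, since the description does not spell out what each stage of Fig. 4 computes. Integer floor division is assumed for the division by 32; any negative sum is clipped to 0 afterwards, so the rounding direction for negative values does not change the result.

```python
def clip(value: int, lo: int = 0, hi: int = 255) -> int:
    """Clamp value into [lo, hi]."""
    return min(max(value, lo), hi)


def six_tap(window) -> int:
    """Direct form of the formula in the text for one 6-sample window."""
    in0, in1, in2, in3, in4, in5 = window
    s = in0 - 5 * in1 + 20 * in2 + 20 * in3 - 5 * in4 + in5 + 16
    return clip(s // 32)


# Hypothetical 4-stage split of the same computation.
def stage1(window):
    """Add the symmetric sample pairs."""
    in0, in1, in2, in3, in4, in5 = window
    return (in0 + in5, in1 + in4, in2 + in3)


def stage2(sums):
    """Apply the filter weights 1, 5 and 20."""
    outer, middle, inner = sums
    return (outer, 5 * middle, 20 * inner)


def stage3(terms):
    """Combine the weighted terms and add the rounding offset."""
    outer, middle5, inner20 = terms
    return outer - middle5 + inner20 + 16


def stage4(total):
    """Divide by 32 and clip to the 8-bit range."""
    return clip(total // 32)


def pipelined(windows):
    """Feed one window per beat; return a list of (beat, result) pairs."""
    stages = [stage1, stage2, stage3, stage4]
    regs = [None] * 4                  # output register of each stage
    results = []
    feed = list(windows) + [None] * 3  # extra beats to drain the pipeline
    for beat, window in enumerate(feed, start=1):
        new = [None] * 4
        new[0] = stage1(window) if window is not None else None
        for i in range(1, 4):
            if regs[i - 1] is not None:
                new[i] = stages[i](regs[i - 1])
        regs = new
        if regs[3] is not None:
            results.append((beat, regs[3]))
    return results


flat = (100, 100, 100, 100, 100, 100)
print(six_tap(flat))            # 100
print(pipelined([flat, flat]))  # [(4, 100), (5, 100)]
```

In this sketch the first result appears on beat 4 and one further result appears on every following beat, matching the behaviour described above; each stage contains only a small amount of arithmetic, which is the purpose of splitting the critical path.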
(6) The design is scalable: depending on the application, interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units.
An interpolation arithmetic unit, shown in Fig. 4, can independently interpolate a series of points. Changing the number of interpolation arithmetic units raises interpolation speed or lowers resource usage.
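The speed versus resource trade-off can be approximated with a simple model. This sketch assumes that the points of a block are divided evenly among identical units working in parallel and that each unit is a 5-stage pipeline accepting one point per beat. Both assumptions, and the `beats_for_block` helper, are illustrative rather than taken from the patent.

```python
import math


def beats_for_block(n_points: int, n_units: int, depth: int = 5) -> int:
    """Beats to interpolate n_points integer pixels with n_units units.

    Each unit handles ceil(n_points / n_units) points back to back, plus
    depth - 1 beats for the first point to fill the pipeline.
    """
    return depth - 1 + math.ceil(n_points / n_units)


for units in (1, 2, 4):
    print(units, beats_for_block(256, units))
# 1 260
# 2 132
# 4 68
```

Doubling the number of units roughly halves the time in this model, at the cost of duplicating the arithmetic hardware.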
Practical application
We have successfully applied this method in an H.264 decoding verification chip that we developed, with significant effect: the chip can currently decode 35 frames per second of standard-definition video (720 × 576).
The design is scalable: interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units. The method can be applied to the design of video encoders based on the MPEG4 protocol.

Claims (7)

1. A real-time 1/4 interpolation method based on a multi-stage pipeline structure, in which the 1/4 interpolation process specified in the MPEG4 protocol is carried out together with the motion estimation process in pipeline order, and the interpolation process is organized into a 5-stage fully pipelined structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, row-and-column 1/4 interpolation, and data output, comprising the following steps:
(1) interpolating in real time the current block used in the motion estimation process;
(2) organizing the motion estimation and 1/4 interpolation processes into a 2-stage pipeline structure;
(3) eliminating repeated data reads by setting up an intermediate buffer;
(4) organizing the 1/4 interpolation process into a 5-stage pipeline structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation, and data output;
(5) designing the 1/2 interpolation process as a 4-stage pipeline structure;
(6) making the design scalable: depending on the application, interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units.
2. The method according to claim 1, characterized in that: the motion estimation process and the 1/4 interpolation process are carried out together, and the current block used in the motion estimation process is interpolated in real time, thereby saving storage resources.
3. The method according to claim 1, characterized in that: the motion estimation and 1/4 interpolation processes are organized into a 2-stage pipeline structure, reducing the share of interpolation time in the whole encoding process.
4. The method according to claim 1, characterized in that: an intermediate buffer is set up to buffer part of the input data, eliminating repeated reads of the input data of adjacent integer pixels during interpolation, so that each integer pixel relevant to the interpolated data block needs to be read only once.
5. The method according to claim 1, characterized in that: the 1/4 interpolation process is organized into a 5-stage pipeline structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation, and data output, speeding up 1/4 interpolation.
6. The method according to claim 1, characterized in that: the 6th-order linear interpolation used by the 1/2 interpolation process is designed as a 4-stage pipeline structure, and one result is computed every beat after the pipeline starts, thereby greatly increasing interpolation speed.
7. The method according to claim 1, characterized in that: the number of interpolation arithmetic units is configurable, so that, depending on the application, interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units.
CN 03152502 2003-08-01 2003-08-01 Real time 1/4 interpolation method based on multistage pipeline architecture Expired - Fee Related CN1186939C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 03152502 CN1186939C (en) 2003-08-01 2003-08-01 Real time 1/4 interpolation method based on multistage pipeline architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 03152502 CN1186939C (en) 2003-08-01 2003-08-01 Real time 1/4 interpolation method based on multistage pipeline architecture

Publications (2)

Publication Number Publication Date
CN1481169A 2004-03-10
CN1186939C 2005-01-26

Family

ID=34156542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 03152502 Expired - Fee Related CN1186939C (en) 2003-08-01 2003-08-01 Real time 1/4 interpolation method based on multistage pipeline architecture

Country Status (1)

Country Link
CN (1) CN1186939C (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100394797C (en) * 2006-04-13 2008-06-11 上海交通大学 VLSI Implementation Method of Luma Interpolator Based on AVS Motion Compensation
CN100592799C (en) * 2008-06-12 2010-02-24 四川虹微技术有限公司 Rapid reading method of motion compensating data based on H.264 standard
CN102340668A (en) * 2011-09-30 2012-02-01 上海交通大学 A Realization Method of MPEG2 Brightness Interpolation Based on Reconfigurable Technology
CN102340668B (en) * 2011-09-30 2013-07-17 上海交通大学 Reconfigurable technology-based implementation method of MPEG2 (Moving Pictures Experts Group 2) luminance interpolation
CN103425722A (en) * 2012-04-30 2013-12-04 Sap股份公司 Logless atomic data movement
CN103425722B (en) * 2012-04-30 2017-08-15 Sap欧洲公司 The method and system of atomic data movement

Also Published As

Publication number Publication date
CN1186939C (en) 2005-01-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20050126

Termination date: 20190801
