CN1481169A - Real time 1/4 interpolation method based on multistage pipeline architecture - Google Patents
- Publication number
- CN1481169A (application CN03152502A)
- Authority
- CN
- China
- Prior art keywords
- interpolation
- data
- row
- stage pipeline
- motion estimation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The method combines the 1/4 interpolation specified in the MPEG4 protocol with the motion estimation procedure: 1/4 interpolation is carried out only for the blocks used in motion estimation. The interpolation procedure is organized as a 5-stage pipeline: data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation and data output. The method thereby saves the storage resources used in interpolation and speeds up interpolation. The method includes: interpolating in real time the blocks used by the codec in motion estimation; executing interpolation and motion estimation in pipeline order; setting up an intermediate buffer to eliminate repeated data reads; organizing the interpolation procedure as a 5-stage pipeline; and designing the 1/2 interpolation procedure as a 4-stage pipeline. The invention is applicable to the design of video encoders.
Description
Technical field
The present invention relates to the field of video coding and decoding, and in particular to a real-time 1/4 interpolation method, based on a multi-stage pipeline architecture, for codecs that use temporal prediction between video frames.
Background technology
In video coding, images are usually divided into two kinds: intra-frame images and inter-frame images. Inter-frame images are coded with motion estimation, mainly because the pixels of adjacent image blocks are strongly correlated in time. The main idea of this coding method is to find, in a reference image, the block that best matches the block being coded and to use it as the predicted value (prediction block) of that block. The better the coding block and the prediction block match, the higher the coding efficiency. To improve the match, the precision of motion estimation has to be improved. MPEG1 uses integer-pixel precision, MPEG2 and H.263 use half-pixel precision, and MPEG4 uses 1/4-pixel precision, which improves coding efficiency. However, 1/4-precision motion estimation requires 1/4 interpolation of the reference image, and 1/4 interpolation is computationally expensive: interpolating around one integer pixel requires 6-tap linear interpolation and bilinear interpolation over the 6 × 6 integer pixels surrounding that point. As shown in Fig. 1, an image block becomes 16 times its original size after 1/4 interpolation. Each integer pixel in the original image block becomes 16 points after 1/4 interpolation, as shown in Fig. 2. Interpolating point A into 16 points takes the 6 × 6 integer pixels around A as input and proceeds as follows:
(1) Apply 6-tap linear interpolation to each row of the 6 × 6 data to obtain points 1 and 5.
(2) Apply 6-tap linear interpolation along the columns containing points A, 1 and 4 to obtain points 2, 3 and 4.
(3) Apply linear interpolation along the rows containing points A, 2 and C to obtain points a, b, h, i, o and p.
(4) Apply linear interpolation along the columns containing points A, a, 1 and b to obtain points c, d, e, f, j, k, m and l.
(5) Low-pass filter point m using points A, B, C and D.
After these 5 steps, the 16 points around A produced by 1/4 interpolation are obtained: A, a, 1, b, c, d, e, f, 2, h, 3, i, j, k, l, m. A close analysis of this interpolation process reveals the following problems:
(1) If a whole frame is interpolated before motion estimation, the data volume of each frame expands to that of 16 frames, so a large storage space is needed to hold the interpolation results.
(2) Running interpolation and motion estimation fully serially increases the inter-frame coding time.
(3) The interpolation inputs of adjacent integer pixels such as points A, B, C and D overlap heavily.
Summary of the invention
The object of the present invention is to provide a real-time 1/4 interpolation method based on a multi-stage pipeline architecture. By executing interpolation together with motion estimation and organizing the interpolation process as a multi-stage pipeline, it reduces the time and memory occupied by 1/4 interpolation in video encoding and decoding, and thereby improves codec efficiency. Reducing storage usage and raising the speed of 1/4 interpolation play an important role in speeding up video encoding and decoding.
The technical scheme is as follows:
A real-time 1/4 interpolation method based on a multi-stage pipeline architecture. The method executes the 1/4 interpolation process specified in the MPEG4 protocol together with the motion estimation process in pipeline order, and organizes the interpolation process as a fully pipelined 5-stage structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, row/column 1/4 interpolation and data output. This not only saves memory during interpolation but also greatly speeds up interpolation. The method comprises the following steps:
(1) interpolate in real time the current block used in the motion estimation process;
(2) organize motion estimation and 1/4 interpolation as a 2-stage pipeline;
(3) eliminate repeated data reads by setting up an intermediate buffer;
(4) organize the 1/4 interpolation process as a 5-stage pipeline of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation and data output;
(5) design the 1/2 interpolation process as a 4-stage pipeline;
(6) make the design scalable: depending on the application, the number of interpolation arithmetic units can easily be increased to raise interpolation speed or decreased to lower resource usage.
The method executes motion estimation and 1/4 interpolation together and interpolates in real time the current block used in motion estimation, which saves memory.
The method organizes motion estimation and 1/4 interpolation as a 2-stage pipeline, reducing the share of time that interpolation takes in the whole coding process.
The method sets up an intermediate buffer to hold part of the input data, eliminating repeated reads of the input data shared by adjacent integer pixels during interpolation, so that each integer pixel related to the interpolated data block is read only once.
The method organizes the 1/4 interpolation process as a 5-stage pipeline of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation and data output, which speeds up 1/4 interpolation.
The method designs the 6-tap linear interpolation used in 1/2 interpolation as a 4-stage pipeline; once the pipeline has started, one result is computed every clock cycle, which greatly raises interpolation speed.
In the method, the number of interpolation arithmetic units is configurable: depending on the application, it can easily be increased to raise interpolation speed or decreased to lower resource usage.
Description of drawings
Fig. 1 is a schematic diagram of the 1/4 interpolation of a data block;
Fig. 2 is a schematic diagram of the interpolation of point A;
Fig. 3 is a schematic diagram of the overlap between the input data of points A and B (the white area in the figure is the overlapping region);
Fig. 4 is the overall structure diagram of the 1/4 interpolator (where clip: min(max(0, s7), 255)).
The method comprises the following features:
(1) The current block used in the motion estimation process is interpolated in real time.
The H.264 protocol defines data blocks of 7 shapes: 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8 and 4 × 4. Video encoding and decoding work macroblock by macroblock. For each block shape, motion estimation of one macroblock uses only 48 × 48 interpolation results (decoding needs only 16 × 16), and these results are used once and never reused. With real-time interpolation, therefore, only the data about to be used needs to be interpolated before motion estimation of a macroblock, and the results are placed in a 48 × 48 buffer (16 × 16 for decoding). When motion estimation moves on to the next macroblock, the new interpolation results overwrite those of the previous macroblock. However large the frame being encoded, the whole motion estimation process then needs only 48 × 48 bytes of storage (16 × 16 bytes for decoding) for interpolation results.
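The storage saving can be illustrated with a little arithmetic. The sketch below assumes one byte per sample and uses the 720 × 576 frame size quoted later in the text; the function name `interpolated_bytes` is illustrative only and does not come from the patent.

```python
def interpolated_bytes(width: int, height: int, scale: int = 4) -> int:
    """Bytes needed to store a picture interpolated to 1/scale-pixel
    precision, assuming one byte per sample."""
    return (width * scale) * (height * scale)


# Whole-frame interpolation: every integer pixel becomes 16 samples.
full_frame = interpolated_bytes(720, 576)

# Real-time interpolation: only the per-macroblock result buffer is kept.
encoder_buffer = 48 * 48   # bytes, encoding case from the text
decoder_buffer = 16 * 16   # bytes, decoding case from the text

print("whole frame :", full_frame, "bytes")
print("encoder     :", encoder_buffer, "bytes")
print("decoder     :", decoder_buffer, "bytes")
print("ratio       :", full_frame // encoder_buffer)
```

Under these assumptions the whole-frame buffer is 16 times the size of the original frame, while the real-time buffer stays constant regardless of frame size.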
(2) Motion estimation and interpolation are organized in pipeline order.
To increase the parallelism between motion estimation and 1/4 interpolation, the invention organizes the two in pipeline order, as shown in Fig. 3: as soon as the interpolator has computed one row of interpolation results, motion estimation starts on that row while the interpolator computes the next row. The time taken by 1/4 interpolation is thus largely absorbed into the motion estimation process.
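A simple timing model shows why overlapping the two stages hides most of the interpolation time. This is a sketch under assumed numbers: the per-row costs `t_interp` and `t_me` are hypothetical and are not given in the patent.

```python
def serial_cycles(rows: int, t_interp: int, t_me: int) -> int:
    """Interpolate every row first, then run motion estimation on every row."""
    return rows * (t_interp + t_me)


def pipelined_cycles(rows: int, t_interp: int, t_me: int) -> int:
    """2-stage pipeline: motion estimation on row i overlaps with
    interpolation of row i+1. The first interpolation and the last
    motion-estimation step cannot be overlapped with anything."""
    return t_interp + (rows - 1) * max(t_interp, t_me) + t_me


# Hypothetical costs: 10 cycles to interpolate a row, 12 to search it.
rows, t_interp, t_me = 48, 10, 12
print("serial   :", serial_cycles(rows, t_interp, t_me))
print("pipelined:", pipelined_cycles(rows, t_interp, t_me))
```

When motion estimation is the slower stage, the pipelined total is close to the motion estimation time alone, which matches the claim that interpolation time is largely absorbed.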
(3) Repeated data reads are eliminated by setting up an intermediate buffer.
Interpolating around point A requires the surrounding 6 × 6 data block as input, as shown in Fig. 2, and interpolating around point B likewise requires its surrounding 6 × 6 data block. Because A and B are adjacent, their inputs overlap heavily: as shown in Fig. 3, the last 5 columns of A's input and the first 5 columns of B's input are the same data. For the same reason, the interpolation input of point C in Fig. 2 shares 5 rows by 6 columns with the input of A. To reduce repeated input, the invention uses an intermediate buffer that first caches the first 6 rows of the interpolation input for the whole data block, and then interpolates each point of the data block in turn. When the integer pixels of the first row are interpolated, the data come directly from the internal storage area and the external storage area need not be accessed again. One additional row buffer holds the next row to be read (row 7), so that interpolating the first row and reading row 7 proceed at the same time. By the time interpolation of the first row of integer pixels finishes, row 7 should already be in the buffer; rows 2 to 6 together with the newly read row 7 then form the interpolation input for the second row of integer pixels. This repeats until 1/4 interpolation of the whole data block is complete, and the results are stored in the 48 × 48 intermediate buffer for the motion estimation process to use. The advantage of this structure is that the external storage area is accessed rarely: each datum is read only once.
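The row-buffering scheme can be sketched as a sliding window of 6 rows that is advanced by one freshly read row at a time. This is a software illustration only: `row_windows` and `read_row` are hypothetical names, and the code fetches rows sequentially, whereas the hardware described above reads the next row while the current one is being interpolated.

```python
from collections import deque


def row_windows(read_row, num_rows: int, taps: int = 6):
    """Yield successive windows of `taps` rows, fetching each row from
    external memory (modelled by `read_row`) exactly once."""
    window = deque(maxlen=taps)
    for r in range(taps):
        window.append(read_row(r))          # initial fill: first 6 rows
    yield list(window)
    for r in range(taps, num_rows):
        window.append(read_row(r))          # new row in, oldest row dropped
        yield list(window)


reads = []                                   # log of external accesses


def read_row(r: int):
    reads.append(r)
    return [r] * 8                           # dummy row: 8 samples tagged with r


windows = list(row_windows(read_row, num_rows=10))
print("windows produced:", len(windows))
print("external reads  :", reads)
```

Each window shares 5 rows with the previous one, yet the read log shows every row index only once.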
(4) The interpolator is designed as a 5-stage pipeline.
The interpolator takes a 6 × 6 data block as its input unit and is divided into a five-stage pipeline: data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation and output. As shown in Fig. 3, each clock cycle one 6 × 6 data block is fed into the interpolator to interpolate one point A; after passing through the five stages, the 16 interpolation results for that point are obtained. Once the pipeline is full, 16 interpolation results are output every clock cycle.
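The latency and throughput of such a pipeline can be checked with a small cycle-level model. The sketch below only tracks which input block occupies which stage; it performs no arithmetic, and the stage names and the `simulate` function are illustrative assumptions.

```python
STAGES = ["read", "row_half", "col_half", "bilinear", "output"]


def simulate(num_blocks: int):
    """Return, for each clock cycle, the index of the block in the output
    stage (None while the pipeline is still filling)."""
    pipe = [None] * len(STAGES)              # pipe[i] = block index in stage i
    outputs = []
    fed = 0
    done = 0
    while done < num_blocks:
        incoming = fed if fed < num_blocks else None
        if incoming is not None:
            fed += 1
        pipe = [incoming] + pipe[:-1]        # every block advances one stage
        outputs.append(pipe[-1])
        if pipe[-1] is not None:
            done += 1
    return outputs


trace = simulate(3)
print(trace)
```

For three blocks the trace contains four idle cycles followed by one completed block per cycle, which is the behaviour the paragraph describes once the pipeline is full.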
(5) The 1/2 interpolation process is designed as a 4-stage pipeline.
The 6-tap linear interpolation used in 1/2 interpolation is computationally expensive. For example, let 6 consecutive samples in a row be (in0, in1, in2, in3, in4, in5). The 6-tap linear interpolation at in2 is computed as: out = min(max((in0 - 5*in1 + 20*in2 + 20*in3 - 5*in4 + in5 + 16)/32, 0), 255). Completing this operation within one clock cycle would lower the clock frequency of the whole system. To remove this critical path, the invention designs the operation as a 4-stage pipeline, as shown in Fig. 4: 6 samples are fed in every cycle and, after 4 cycles, one interpolation result is output every cycle.
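The formula above, and one possible way to split it across four pipeline stages, can be sketched as follows. The stage partition (pairwise sums, weighting, accumulation, shift and clip) is an assumption made for illustration; the text only refers to Fig. 4 for the actual partition. The division by 32 is written as a right shift by 5, which gives the same result as the formula after clipping to 0..255.

```python
def clip255(x: int) -> int:
    return min(max(x, 0), 255)


def half_pel(s) -> int:
    """Direct evaluation of the 6-tap formula for samples (in0..in5)."""
    in0, in1, in2, in3, in4, in5 = s
    acc = in0 - 5 * in1 + 20 * in2 + 20 * in3 - 5 * in4 + in5 + 16
    return clip255(acc >> 5)


# Hypothetical 4-stage split of the same computation.
def stage1(s):   return (s[0] + s[5], s[1] + s[4], s[2] + s[3])  # pair sums
def stage2(p):   return (p[0], -5 * p[1], 20 * p[2])             # weights
def stage3(t):   return t[0] + t[1] + t[2] + 16                  # accumulate
def stage4(acc): return clip255(acc >> 5)                        # scale, clip


def run_pipeline(inputs):
    """Feed one 6-sample tuple per cycle; return the per-cycle output,
    with None while the pipeline is still filling."""
    regs = [None, None, None]                # registers between the stages
    results = []
    for s in list(inputs) + [None] * 3:      # 3 extra cycles to flush
        out = stage4(regs[2]) if regs[2] is not None else None
        regs[2] = stage3(regs[1]) if regs[1] is not None else None
        regs[1] = stage2(regs[0]) if regs[0] is not None else None
        regs[0] = stage1(s) if s is not None else None
        results.append(out)
    return results


samples = [(100,) * 6, (0, 0, 255, 255, 0, 0), (10, 20, 30, 40, 50, 60)]
print(run_pipeline(samples))
```

The first result appears in the fourth cycle and one result follows per cycle afterwards, while each stage only performs a few additions or one small multiplication, which is what shortens the critical path.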
(6) The design is scalable: depending on the application, the number of interpolation arithmetic units can easily be increased to raise interpolation speed or decreased to lower resource usage.
One interpolation arithmetic unit is shown in Fig. 4. It can complete the interpolation of a series of points independently, so changing the number of interpolation arithmetic units raises interpolation speed or lowers resource usage.
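The speed versus resource trade-off can be estimated with a rough model. It assumes that each unit accepts one point per cycle, that all units run in parallel, and that the pipeline depth is 5; `cycles_for_block` is a hypothetical helper, not part of the patent.

```python
import math


def cycles_for_block(num_points: int, num_units: int,
                     pipeline_depth: int = 5) -> int:
    """Approximate cycles to interpolate `num_points` integer pixels with
    `num_units` identical units working in parallel."""
    if num_units < 1:
        raise ValueError("at least one interpolation unit is required")
    fill = pipeline_depth - 1                # cycles before the first result
    return fill + math.ceil(num_points / num_units)


# A 16 x 16 macroblock has 256 integer pixels.
for units in (1, 2, 4):
    print(units, "unit(s):", cycles_for_block(256, units), "cycles")
```

Doubling the number of units roughly halves the cycle count at the cost of duplicated arithmetic hardware.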
Practical application
We have successfully applied this method in the H.264 decoding verification chip that we developed, with significant effect: the chip can currently decode 35 frames per second of standard-definition video (720 × 576).
The design is scalable: interpolation speed can be raised or resource usage lowered simply by increasing or decreasing the number of interpolation arithmetic units. The method can be applied to the design of video encoders based on the MPEG4 protocol.
Claims (7)
1. A real-time 1/4 interpolation method based on a multi-stage pipeline architecture, in which the 1/4 interpolation process specified in the MPEG4 protocol is executed together with the motion estimation process in pipeline order, and the interpolation process is organized as a fully pipelined 5-stage structure of data read-in, row 1/2 interpolation, column 1/2 interpolation, row/column 1/4 interpolation and data output, comprising the following steps:
(1) interpolating in real time the current block used in the motion estimation process;
(2) organizing motion estimation and 1/4 interpolation as a 2-stage pipeline;
(3) eliminating repeated data reads by setting up an intermediate buffer;
(4) organizing the 1/4 interpolation process as a 5-stage pipeline of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation and data output;
(5) designing the 1/2 interpolation process as a 4-stage pipeline;
(6) making the design scalable: depending on the application, the number of interpolation arithmetic units can easily be increased to raise interpolation speed or decreased to lower resource usage.
2. The method according to claim 1, characterized in that motion estimation and 1/4 interpolation are executed together and the current block used in motion estimation is interpolated in real time, thereby saving memory.
3. The method according to claim 1, characterized in that motion estimation and 1/4 interpolation are organized as a 2-stage pipeline, reducing the share of time that interpolation takes in the whole coding process.
4. The method according to claim 1, characterized in that an intermediate buffer is set up to hold part of the input data, eliminating repeated reads of the input data shared by adjacent integer pixels during interpolation, so that each integer pixel related to the interpolated data block is read only once.
5. The method according to claim 1, characterized in that the 1/4 interpolation process is organized as a 5-stage pipeline of data read-in, row 1/2 interpolation, column 1/2 interpolation, bilinear interpolation and data output, speeding up 1/4 interpolation.
6. The method according to claim 1, characterized in that the 6-tap linear interpolation used in 1/2 interpolation is designed as a 4-stage pipeline; once the pipeline has started, one result is computed every clock cycle, which greatly raises interpolation speed.
7. The method according to claim 1, characterized in that the number of interpolation arithmetic units is configurable: depending on the application, it can easily be increased to raise interpolation speed or decreased to lower resource usage.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 03152502 CN1186939C (en) | 2003-08-01 | 2003-08-01 | Real time 1/4 interpolation method based on multistage pipeline architecture |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 03152502 CN1186939C (en) | 2003-08-01 | 2003-08-01 | Real time 1/4 interpolation method based on multistage pipeline architecture |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1481169A true CN1481169A (en) | 2004-03-10 |
CN1186939C CN1186939C (en) | 2005-01-26 |
Family
ID=34156542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 03152502 Expired - Fee Related CN1186939C (en) | 2003-08-01 | 2003-08-01 | Real time 1/4 interpolation method based on multistage pipeline architecture |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1186939C (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100394797C (en) * | 2006-04-13 | 2008-06-11 | 上海交通大学 | Method of realizing VLSI of brightness interpolator based on AVS movement compensation |
CN100592799C (en) * | 2008-06-12 | 2010-02-24 | 四川虹微技术有限公司 | Rapid reading method of motion compensating data based on H.264 standard |
CN102340668A (en) * | 2011-09-30 | 2012-02-01 | 上海交通大学 | Reconfigurable technology-based implementation method of MPEG2 (Moving Pictures Experts Group 2) luminance interpolation |
CN103425722A (en) * | 2012-04-30 | 2013-12-04 | Sap股份公司 | Logless atomic data movement |
-
2003
- 2003-08-01 CN CN 03152502 patent/CN1186939C/en not_active Expired - Fee Related
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100394797C (en) * | 2006-04-13 | 2008-06-11 | 上海交通大学 | Method of realizing VLSI of brightness interpolator based on AVS movement compensation |
CN100592799C (en) * | 2008-06-12 | 2010-02-24 | 四川虹微技术有限公司 | Rapid reading method of motion compensating data based on H.264 standard |
CN102340668A (en) * | 2011-09-30 | 2012-02-01 | 上海交通大学 | Reconfigurable technology-based implementation method of MPEG2 (Moving Pictures Experts Group 2) luminance interpolation |
CN102340668B (en) * | 2011-09-30 | 2013-07-17 | 上海交通大学 | Reconfigurable technology-based implementation method of MPEG2 (Moving Pictures Experts Group 2) luminance interpolation |
CN103425722A (en) * | 2012-04-30 | 2013-12-04 | Sap股份公司 | Logless atomic data movement |
CN103425722B (en) * | 2012-04-30 | 2017-08-15 | Sap欧洲公司 | The method and system of atomic data movement |
Also Published As
Publication number | Publication date |
---|---|
CN1186939C (en) | 2005-01-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101376385B1 (en) | Image processing apparatus and image processing method | |
CN1129320C (en) | Improved contour approximation method for representing contour of object | |
CN101610413B (en) | Video coding/decoding method and device | |
CN1305313C (en) | System for discrete cosine transforms/inverse discrete cosine transforms based on pipeline architecture | |
US20060133512A1 (en) | Video decoder and associated methods of operation | |
JP2011530222A (en) | Video encoder with integrated temporal filter for noise removal | |
CN1325220A (en) | Motion-vector coding method | |
CN1665299A (en) | Method for designing architecture of scalable video coder decoder | |
CN1852442A (en) | Layering motion estimation method and super farge scale integrated circuit | |
CN1245839C (en) | Decode system and method for distributed video data stream | |
CN1968420A (en) | Methods of image processing for video encoder and decoder | |
CN1779716A (en) | Realization of rapid coding-decoding circuit with run-length | |
CN1186939C (en) | Real time 1/4 interpolation method based on multistage pipeline architecture | |
KR20050012733A (en) | Adaptive method and system for mapping parameter values to codeword indexes | |
CN1428051A (en) | Approximate IDCT for scalable video and image decoding of computational complexity | |
CN1941903A (en) | Code-transferring system and method for realizing multiple code flow output simultaneouslly | |
CN1745587A (en) | Method of video coding for handheld apparatus | |
CN1625266A (en) | Apparatus for calculating absolute difference value, and motion estimation apparatus and motion picture encoding apparatus | |
CN1520187A (en) | System and method for video data compression | |
CN1254979C (en) | Conversion method of coded video data | |
CN1825960A (en) | Multi-pipeline phase information sharing method based on data buffer storage | |
CN105635731A (en) | Intra-frame prediction reference point preprocessing method for high efficiency video coding | |
CN1306823C (en) | Method and device for concurrent processing run-length coding, inverse scanning inverse quantization | |
CN1909663A (en) | Combined processing method for entropy decoding and converting flow line stage | |
Yang et al. | An effective dictionary-based display frame compressor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20050126; Termination date: 20190801 |