CN101986709B - Video decoding method for high-efficiency fetching of matching block and circuit - Google Patents

Video decoding method for high-efficiency fetching of matching block and circuit

Info

Publication number
CN101986709B
Authority
CN
China
Prior art keywords
macroblock
fetch
syntax element
parse
residual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010534938
Other languages
Chinese (zh)
Other versions
CN101986709A (en)
Inventor
陈恒明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rockchip Electronics Co Ltd
Original Assignee
Fuzhou Rockchip Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou Rockchip Electronics Co Ltd
Priority to CN 201010534938
Publication of CN101986709A
Application granted
Publication of CN101986709B
Active
Anticipated expiration

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video decoding method and circuit for high-efficiency fetching of matching blocks. In the method, several macroblocks are combined into a macroblock group, reference data is fetched with the macroblock group as the unit, and once the fetch information of the macroblocks has been obtained it is merged and optimized. Because decoding uses two macroblock stages, the earlier-stage macroblock decoding hardware is free to serve the next macroblock as soon as the later macroblock starts its merged fetch, which yields a large degree of macroblock-level parallelism. The spatial locality of the reference data is exploited as far as possible, the accesses within one macroblock group are merged and optimized, and bus utilization is improved by using long burst transfers on a wide bus.

Description

Video decoding method and circuit for high-efficiency fetching of matching blocks
Technical field
The invention belongs to the technical field of video decoding, and specifically relates to a video decoding method and circuit for high-efficiency fetching of matching blocks.
Background
In video decoding, a macroblock coded in inter prediction (inter) mode requires, during motion compensation, that a matching block be fetched from an already decoded picture and used as the prediction. The residual data parsed from the bitstream is then added to the prediction to obtain the reconstructed macroblock.
Video picture sizes have grown rapidly in recent years, from QCIF at 176x144 to 1080p at 1920x1080 within a few years. With ever larger pictures, a decoder reading data from external memory generates enormous access bandwidth, and the high access latency of the high-latency DDR memory adopted to solve the bandwidth problem has become the performance bottleneck of many decoders.
Video standards have developed at the same time. To improve compression, H.264 lets a single macroblock be predicted from matching blocks of different shapes taken from different positions in different reference frames. Fractional-pixel interpolation at different positions further requires different numbers of extra border pixels. Together these make the fetch process very complicated and tend to produce highly irregular external memory accesses. In addition, on-chip bus widths keep increasing, from 16 bits to 64 or even 128 bits, so a large number of scattered, unaligned, small accesses at 8-bit granularity wastes a great deal of bus bandwidth and makes fetching extremely inefficient. High-performance decoders therefore need a completely different fetch architecture to solve these problems.
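To make the cost of unaligned small accesses concrete, the sketch below counts how many bus beats a single short row read occupies. The function names, the 9-byte row and the byte addresses are hypothetical, and the model assumes that each bus beat transfers one full, aligned bus word of 8-bit pixels.

```python
def beats_for_row(addr, nbytes, bus_bytes):
    """Number of aligned bus words touched by bytes addr .. addr+nbytes-1."""
    first = addr // bus_bytes
    last = (addr + nbytes - 1) // bus_bytes
    return last - first + 1

def row_efficiency(addr, nbytes, bus_bytes):
    """Useful bytes divided by bytes actually moved over the bus."""
    return nbytes / (beats_for_row(addr, nbytes, bus_bytes) * bus_bytes)

# The same 9-byte row on a 16-bit bus and on a 128-bit bus (hypothetical addresses).
narrow = row_efficiency(addr=3, nbytes=9, bus_bytes=2)
wide_aligned = row_efficiency(addr=3, nbytes=9, bus_bytes=16)
wide_straddling = row_efficiency(addr=12, nbytes=9, bus_bytes=16)
print(narrow, wide_aligned, wide_straddling)
```

Under these assumptions the short row uses most of a narrow bus but only a fraction of a wide one, and it does worse still when it straddles a bus-word boundary.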
The basic video decoding flow comprises the following steps: parse the syntax elements related to the macroblock type; parse the syntax elements related to motion compensation to obtain the fetch information; parse the syntax elements related to the residual coefficients and obtain the residual pixel data through inverse scan (Inverse Scan), inverse transform (Inverse Transform) and inverse quantization (Inverse Quantization); the fetch module fetches from external memory according to the fetch information; the motion compensation module performs interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels; the predicted pixels and residual pixels are added to obtain the reconstructed pixels, which are deblock-filtered and output. In a typical hardware macroblock decoding flow, as shown in Fig. 1, the type module decodes the macroblock type and sub-block type. If the macroblock is of inter type, the Get_MV module obtains ref_idx and mv, derives the fetch address and size, and passes them to the fetch module. The remaining coefficient bitstream is sent to the Get_Residual module, which parses the residual coefficients and applies inverse scan, inverse transform and inverse quantization to obtain the residual pixels. At the same time, the fetch module fetches from external memory and sends the fetched data to the motion compensation (Motion Compensation) module for interpolation and weighting to obtain the predicted pixels. The residual pixels and predicted pixels are added to obtain the reconstructed pixels, which are then sent to the final deblocking filter (deblock) module for filtering and output.
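The final reconstruction step of the flow above can be sketched as follows. This is a minimal illustration rather than the hardware described here: the function name `reconstruct`, the nested-list pixel layout and the 8-bit default are assumptions, and clipping to the valid pixel range is the usual practice in H.264-style decoders rather than something stated in this text.

```python
def reconstruct(pred, residual, bit_depth=8):
    """Add residual pixels to predicted pixels and clip to the valid range.

    pred and residual are equally sized lists of rows (hypothetical layout).
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]

# A tiny 2x2 block: sums outside 0..255 are clipped.
block = reconstruct([[250, 10], [128, 64]], [[10, -20], [3, -4]])
print(block)
```

Deblocking would then run on the reconstructed pixels; it is omitted here because it depends on neighbouring blocks.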
In the common decoding flow, the MC path that starts at the fetch module suffers from inefficient, irregular external memory accesses, so it is far slower than the residual path, which runs at high speed inside the hardware. The speed of the whole hardware is therefore limited by the external memory access bandwidth.
Chinese invention patent No. 20041009125.4 discloses a video image motion compensation device. This scheme adds interpolation calculation between fetching from external memory and storing the data, but does not improve the efficiency of the fetch process.
Chinese invention patent No. 20051000487.2 discloses an image storage method for compressed video signal decoding. This scheme stores the reference pictures in external memory in a special format; the access procedure is complicated and may even reduce efficiency in some special cases.
Chinese invention patent No. 20051009873.7 discloses a motion compensation data loading device and method. This scheme merges two-dimensional fetches into one-dimensional fetches, which partly improves fetch efficiency, but it applies only to a single macroblock.
Chinese invention patent No. 20071004671 discloses a method and device for controlling parallel reading and writing of on-chip memory data in a decoding device. This scheme parallelizes fetches at the granularity of the partitions within a macroblock and cannot achieve the desired effect with high-latency memory such as DDR.
Chinese invention patent No. 20071004692.9 discloses a data prefetching system for video processing. This scheme applies to the non-cache-based data prefetch instructions of processors such as CPUs or DSPs.
Chinese invention patent No. 20081030211.6 discloses a fast data reading method based on H.264 standard motion compensation. This scheme reads data in small 9x9 pieces, which in practice greatly increases the extra row-switch and burst overhead on the bus.
Summary of the invention
The technical problem to be solved by the invention is to provide a video decoding method and circuit for high-efficiency fetching of matching blocks. The method optimizes fetch efficiency on the bus, speeds up external memory access, and improves the overall efficiency of video decoding.
The invention solves the above technical problem through the following technical solution:
A video decoding method for high-efficiency fetching of matching blocks is provided, comprising the following steps:
Step 10: merge several macroblocks into one macroblock group;
Step 20: parse the syntax elements related to the macroblock type of the first macroblock in the macroblock group;
Step 30: parse the syntax elements related to motion compensation of the first macroblock to obtain the fetch information of the first macroblock;
Step 40: parse the syntax elements related to the residual coefficients of the first macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the first macroblock;
Step 50: parse the syntax elements related to the macroblock type of the second macroblock in the macroblock group;
Step 60: parse the syntax elements related to motion compensation of the second macroblock to obtain the fetch information of the second macroblock; then step 70 is performed in parallel with steps 80 to 90, where step 90 obtains the reference block data from step 80; the flow then proceeds to step 100;
Step 70: parse the syntax elements related to the residual coefficients of the second macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the second macroblock;
Step 80: the fetch module merges and optimizes the fetch information of the first and second macroblocks and fetches from external memory using the optimized fetch information; steps 20, 30, 40, 50 and 60 are in no particular order;
Step 90: the motion compensation module performs interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels of the first macroblock and of the second macroblock respectively;
Step 100: add the predicted pixels and residual pixels of the first macroblock and of the second macroblock respectively to obtain the reconstructed pixels, then deblock-filter and output the reconstructed pixels.
A video decoding circuit is provided, comprising:
A macroblock merging unit, configured to merge several macroblocks into one macroblock group;
A first parsing unit, configured to parse the syntax elements related to the macroblock type of the first macroblock in the macroblock group; parse the syntax elements related to motion compensation of the first macroblock to obtain the fetch information of the first macroblock; and parse the syntax elements related to the residual coefficients of the first macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the first macroblock;
A second parsing unit, configured to parse the syntax elements related to the macroblock type of the second macroblock in the macroblock group; parse the syntax elements related to motion compensation of the second macroblock to obtain the fetch information of the second macroblock; and parse the syntax elements related to the residual coefficients of the second macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the second macroblock;
A fetch unit, configured to merge and optimize the fetch information of the first and second macroblocks and to fetch from external memory using the optimized fetch information;
A motion compensation unit, configured to perform interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels of the first macroblock and of the second macroblock respectively;
A reconstruction unit, configured to add the predicted pixels and residual pixels of the first macroblock and of the second macroblock respectively to obtain the reconstructed pixels, and to deblock-filter and output the reconstructed pixels.
The advantages of the invention are as follows: several macroblocks are merged into one macroblock group and data is fetched with the macroblock group as the unit, so the spatial locality of the reference data is exploited as far as possible, the accesses within a macroblock group are merged and optimized, and long burst transfers on a wide bus improve bus utilization.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments.
Fig. 1 is the basic decoding flowchart of an inter macroblock in the prior art.
Fig. 2 is the basic decoding flowchart of the invention, in which macroblocks are merged into a macroblock group to optimize the fetch process.
Fig. 3 is a schematic diagram of the macroblock group parallel process of the invention.
Detailed description of the embodiments
In a high-performance decoder, the memory access latency is often much longer than the time needed to parse the bitstream of one macroblock. In high-performance decoder design, performance is the primary concern and resource usage is not the main issue. Moreover, external memory access can be decoupled from bitstream decoding; that is, the memory locations to be accessed can be obtained by parsing well in advance.
The invention therefore proposes merging several macroblocks into one macroblock group and fetching with the macroblock group as the unit, so as to exploit the spatial locality of the reference data as far as possible, merge and optimize the accesses within a macroblock group, and improve bus utilization through long burst transfers on a wide bus.
The specific steps of the invention are as follows:
Step 10: merge several macroblocks into one macroblock group;
Step 20: parse the syntax elements related to the macroblock type of the first macroblock in the macroblock group;
Step 30: parse the syntax elements related to motion compensation of the first macroblock to obtain the fetch information of the first macroblock;
Step 40: parse the syntax elements related to the residual coefficients of the first macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the first macroblock;
Step 50: parse the syntax elements related to the macroblock type of the second macroblock in the macroblock group;
Step 60: parse the syntax elements related to motion compensation of the second macroblock to obtain the fetch information of the second macroblock; then steps 70, 80 and 90 are performed concurrently; the flow then proceeds to step 100;
Step 70: parse the syntax elements related to the residual coefficients of the second macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the second macroblock;
Step 80: the fetch module merges and optimizes the fetch information of the first and second macroblocks and fetches from external memory using the optimized fetch information;
Step 90: the motion compensation module performs interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels of each of the two macroblocks;
Step 100: add the predicted pixels and residual pixels of each of the two macroblocks to obtain the reconstructed pixels, then deblock-filter and output the reconstructed pixels.
Step 80 runs in parallel with steps 20 to 70 of the next iteration.
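The text does not spell out how step 80 merges the fetch information, so the following is only one plausible sketch. It assumes 8-bit pixels, a single reference frame, a 32-bit bus with burst4 transfers (16 bytes per burst), and fetch requests expressed as hypothetical `(x, y, w, h)` rectangles; all function names are illustrative. Each request is widened to burst-aligned byte spans per row, and spans that overlap or touch are coalesced so that shared bursts are issued once.

```python
BURST_BYTES = 16  # assumed: 32-bit bus x burst4, one byte per pixel

def aligned_span(x, w, burst=BURST_BYTES):
    """Burst-aligned [start, end) byte interval covering pixels x .. x+w-1."""
    start = (x // burst) * burst
    end = -(-(x + w) // burst) * burst  # ceiling to the next burst boundary
    return start, end

def merge_fetches(rects, burst=BURST_BYTES):
    """Map each row to a sorted list of coalesced, burst-aligned spans."""
    rows = {}
    for x, y, w, h in rects:
        span = aligned_span(x, w, burst)
        for row in range(y, y + h):
            rows.setdefault(row, []).append(span)
    merged = {}
    for row, spans in rows.items():
        spans.sort()
        out = [spans[0]]
        for start, end in spans[1:]:
            if start <= out[-1][1]:  # overlaps or touches the previous span
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[row] = out
    return merged

def burst_count(merged, burst=BURST_BYTES):
    """Total number of bursts needed to read every merged span."""
    return sum((end - start) // burst
               for spans in merged.values() for start, end in spans)

# Two horizontally adjacent 23x23 reference blocks (hypothetical coordinates).
mb_a, mb_b = (0, 0, 23, 23), (23, 0, 23, 23)
separate = burst_count(merge_fetches([mb_a])) + burst_count(merge_fetches([mb_b]))
together = burst_count(merge_fetches([mb_a, mb_b]))
print(separate, together)
```

A hardware fetch unit would more likely work on addresses within a tiled or linear frame buffer; the row-wise interval merge here is chosen only because it is easy to follow.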
Referring to Fig. 2 and Fig. 3, the core idea of the invention is to merge and optimize the fetch information of several macroblocks once it has been obtained, thereby optimizing fetch efficiency on the bus and speeding up external memory access. In addition, because decoding uses two macroblock stages, when the later macroblock starts its merged fetch the earlier-stage macroblock decoding hardware can already serve the decoding of the next macroblock, which achieves a large degree of macroblock-level parallelism.
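The benefit of overlapping the merged fetch of one macroblock group with the parsing of the next can be illustrated with a toy timing model. The cycle counts below are hypothetical, and the model assumes that the parser never stalls waiting for the fetch stage; neither comes from the patent text.

```python
def serial_time(parse, fetch):
    """Each group is parsed and then fetched before the next group starts."""
    return sum(p + f for p, f in zip(parse, fetch))

def pipelined_time(parse, fetch):
    """Fetch of group i overlaps with parsing of group i+1 (two-stage pipeline)."""
    parse_done = 0
    fetch_done = 0
    for p, f in zip(parse, fetch):
        parse_done += p  # parser runs back to back (assumed never to stall)
        # A fetch starts once its group is parsed and the previous fetch is done.
        fetch_done = max(parse_done, fetch_done) + f
    return fetch_done

# Three macroblock groups with hypothetical per-group cycle counts.
parse_cycles = [4, 4, 4]
fetch_cycles = [6, 6, 6]
print(serial_time(parse_cycles, fetch_cycles),
      pipelined_time(parse_cycles, fetch_cycles))
```

In this model the pipelined total approaches the sum of the slower stage alone, which is why shortening the fetch stage by merging accesses matters most when memory access dominates.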
Take as an example the P_skip macroblocks that often occur consecutively in static scenes. With quarter-pixel interpolation, a normal 16x16 macroblock needs a 23x23 block to be fetched: 23 rows of 23 pixels each. On a 32-bit bus using burst4 transfers, fetching two macroblocks separately takes two burst4 transfers per row per macroblock, 92 transfers in total, so the effective data transfer ratio is (2*23*23)/((4*4)*2*23*2) = 71.9%. With the multi-macroblock merged fetch of the invention, again using 32-bit burst4 transfers, each merged row takes three burst4 transfers and the ratio becomes (2*23*23)/((4*4*3)*23) = 95.8%, so bus bandwidth utilization is greatly improved.
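The arithmetic in this example can be checked with a short script. It follows the text's own model, with assumptions made explicit: 8-bit pixels, row starts aligned to a burst boundary, and the two 23-pixel rows treated as contiguous when merged. The function and parameter names are illustrative.

```python
import math

def burst_bytes(bus_bits=32, burst_len=4):
    """Bytes moved by one burst: bus width in bytes times beats per burst."""
    return (bus_bits // 8) * burst_len

def fetch_cost(block_w, block_h, n_blocks, merged, bus_bits=32, burst_len=4):
    """Return (bursts, utilization) for fetching n_blocks reference blocks."""
    per_burst = burst_bytes(bus_bits, burst_len)
    if merged:
        # One contiguous row spanning all blocks, rounded up to whole bursts.
        bursts = math.ceil(n_blocks * block_w / per_burst) * block_h
    else:
        # Every block rounds its own rows up to whole bursts.
        bursts = n_blocks * math.ceil(block_w / per_burst) * block_h
    useful = n_blocks * block_w * block_h
    return bursts, useful / (bursts * per_burst)

separate = fetch_cost(23, 23, 2, merged=False)
together = fetch_cost(23, 23, 2, merged=True)
print(separate, together)
```

Under these assumptions the separate fetch uses 92 bursts and the merged fetch 69, which matches the 71.9% and 95.8% figures quoted above.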
The invention also provides an embodiment of a video decoding circuit, comprising:
A macroblock merging unit, configured to merge several macroblocks into one macroblock group;
A first parsing unit, configured to parse the syntax elements related to the macroblock type of the first macroblock in the macroblock group; parse the syntax elements related to motion compensation of the first macroblock to obtain the fetch information of the first macroblock; and parse the syntax elements related to the residual coefficients of the first macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the first macroblock;
A second parsing unit, configured to parse the syntax elements related to the macroblock type of the second macroblock in the macroblock group; parse the syntax elements related to motion compensation of the second macroblock to obtain the fetch information of the second macroblock; and parse the syntax elements related to the residual coefficients of the second macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the second macroblock;
A fetch unit, configured to merge and optimize the fetch information of the first and second macroblocks and to fetch from external memory using the optimized fetch information;
A motion compensation unit, configured to perform interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels of the first macroblock and of the second macroblock respectively;
A reconstruction unit, configured to add the predicted pixels and residual pixels of the first macroblock and of the second macroblock respectively to obtain the reconstructed pixels, and to deblock-filter and output the reconstructed pixels.
With a group of two macroblocks as the minimum macroblock group and decoding performed with the above architecture, the external memory access bandwidth across different video sources is 20 to 40% lower than with the original decoder, and the overall decoding speed improves by more than 10% on average.

Claims (2)

1. A video decoding method for high-efficiency fetching of matching blocks, characterized by comprising the following steps:
Step 10: merge several macroblocks into one macroblock group;
Step 20: parse the syntax elements related to the macroblock type of the first macroblock in the macroblock group;
Step 30: parse the syntax elements related to motion compensation of the first macroblock to obtain the fetch information of the first macroblock;
Step 40: parse the syntax elements related to the residual coefficients of the first macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the first macroblock;
Step 50: parse the syntax elements related to the macroblock type of the second macroblock in the macroblock group;
Step 60: parse the syntax elements related to motion compensation of the second macroblock to obtain the fetch information of the second macroblock; then step 70 is performed in parallel with steps 80 to 90, where step 90 obtains the reference block data from step 80; the flow then proceeds to step 100;
Step 70: parse the syntax elements related to the residual coefficients of the second macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the second macroblock;
Step 80: the fetch module merges and optimizes the fetch information of the first and second macroblocks and fetches from external memory using the optimized fetch information; steps 20, 30, 40, 50 and 60 are in no particular order;
Step 90: the motion compensation module performs interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels of the first macroblock and of the second macroblock respectively;
Step 100: add the predicted pixels and residual pixels of the first macroblock and of the second macroblock respectively to obtain the reconstructed pixels, then deblock-filter and output the reconstructed pixels.
2. A video decoding circuit, characterized by comprising:
A macroblock merging unit, configured to merge several macroblocks into one macroblock group;
A first parsing unit, configured to parse the syntax elements related to the macroblock type of the first macroblock in the macroblock group; parse the syntax elements related to motion compensation of the first macroblock to obtain the fetch information of the first macroblock; and parse the syntax elements related to the residual coefficients of the first macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the first macroblock;
A second parsing unit, configured to parse the syntax elements related to the macroblock type of the second macroblock in the macroblock group; parse the syntax elements related to motion compensation of the second macroblock to obtain the fetch information of the second macroblock; and parse the syntax elements related to the residual coefficients of the second macroblock and, through inverse scan, inverse transform and inverse quantization, obtain the residual pixel data of the second macroblock;
A fetch unit, configured to merge and optimize the fetch information of the first and second macroblocks and to fetch from external memory using the optimized fetch information;
A motion compensation unit, configured to perform interpolation and weighted prediction on the fetched reference block data to obtain the predicted pixels of the first macroblock and of the second macroblock respectively;
A reconstruction unit, configured to add the predicted pixels and residual pixels of the first macroblock and of the second macroblock respectively to obtain the reconstructed pixels, and to deblock-filter and output the reconstructed pixels.
CN 201010534938 2010-11-08 2010-11-08 Video decoding method for high-efficiency fetching of matching block and circuit Active CN101986709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010534938 CN101986709B (en) 2010-11-08 2010-11-08 Video decoding method for high-efficiency fetching of matching block and circuit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010534938 CN101986709B (en) 2010-11-08 2010-11-08 Video decoding method for high-efficiency fetching of matching block and circuit

Publications (2)

Publication Number Publication Date
CN101986709A CN101986709A (en) 2011-03-16
CN101986709B true CN101986709B (en) 2013-04-10

Family

ID=43711007

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010534938 Active CN101986709B (en) 2010-11-08 2010-11-08 Video decoding method for high-efficiency fetching of matching block and circuit

Country Status (1)

Country Link
CN (1) CN101986709B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101399978A (en) * 2007-09-28 2009-04-01 上海杰得微电子有限公司 Reference frame data reading method in hardware decoder and apparatus thereof
JP2009123088A (en) * 2007-11-16 2009-06-04 Toshiba Tec Corp Data code reader and its method

Also Published As

Publication number Publication date
CN101986709A (en) 2011-03-16

Similar Documents

Publication Publication Date Title
US7924925B2 (en) Flexible macroblock ordering with reduced data traffic and power consumption
JP5290429B2 (en) Intelligent buffering of decoded pictures
US20150092834A1 (en) Context re-mapping in cabac encoder
Kang et al. MPEG4 AVC/H. 264 decoder with scalable bus architecture and dual memory controller
US9948934B2 (en) Estimating rate costs in video encoding operations using entropy encoding statistics
US9392292B2 (en) Parallel encoding of bypass binary symbols in CABAC encoder
JP5969914B2 (en) Video compression / decompression device
CN101527849B (en) Storing system of integrated video decoder
CN101252694A (en) Address mapping system and frame storage compression of video frequency decoding based on blocks
CN103634604A (en) Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method
Xu et al. Methods for power/throughput/area optimization of H. 264/AVC decoding
US20150092841A1 (en) Method and Apparatus for Multi-core Video Decoder
CN101986709B (en) Video decoding method for high-efficiency fetching of matching block and circuit
KR100891116B1 (en) Apparatus and method for bandwidth aware motion compensation
CN114727116A (en) Encoding method and device
CN100438630C (en) Multi-pipeline phase information sharing method based on data buffer storage
JP2003230148A (en) Image data coding unit
Song et al. High-performance memory interface architecture for high-definition video coding application
JP2009152710A (en) Unit and method for image processing
WO2022206199A1 (en) Method and apparatus for performing image processing in video decoding apparatus, and system
Sanghvi et al. A 28nm programmable and low power ultra-HD video codec engine
Zhou et al. A frame-parallel 2 gpixel/s video decoder chip for uhdtv and 3-dtv/ftv applications
Li et al. Design of memory sub-system with constant-rate bumping process for H. 264/AVC decoder
KR100821922B1 (en) Local memory controller for mpeg decoder
Gao et al. Design and implementation of motion compensator in memory reduced HDTV decoder with embedded compression engine

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 350000 Fuzhou Gulou District, Fujian, software Avenue, building 89, No. 18

Patentee after: FUZHOU ROCKCHIP ELECTRONICS CO., LTD.

Address before: 350003 Fujian city of Fuzhou Province Copper Road Software Park A District No. 18

Patentee before: Fuzhou Rockchip Semiconductor Co., Ltd.

CP01 Change in the name or title of a patent holder

Address after: 350000 building, No. 89, software Avenue, Gulou District, Fujian, Fuzhou 18, China

Patentee after: Ruixin Microelectronics Co., Ltd

Address before: 350000 building, No. 89, software Avenue, Gulou District, Fujian, Fuzhou 18, China

Patentee before: Fuzhou Rockchips Electronics Co.,Ltd.

CP01 Change in the name or title of a patent holder