CN101090504B - Coding decoding apparatus for video standard application - Google Patents

Info

Publication number
CN101090504B
Authority
CN
China
Prior art keywords
data
unit
storage
thumchip
macro block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200710119326
Other languages
Chinese (zh)
Other versions
CN101090504A (en)
Inventor
孙义和
张延军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN 200710119326 priority Critical patent/CN101090504B/en
Publication of CN101090504A publication Critical patent/CN101090504A/en
Application granted granted Critical
Publication of CN101090504B publication Critical patent/CN101090504B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a codec for video-standard applications in the field of video communication. The codec is a configurable very long instruction word (VLIW) THUMchip digital signal processor core comprising an instruction memory, an instruction fetch unit, an instruction dispatch unit, configurable functional units, a global register file, an interrupt control module, control and status registers, a data memory, and user-customized functional units. It conforms to the H.263 low-bit-rate video coding standard and achieves relatively high image quality at low bit rates.

Description

A codec for video-standard applications
Technical field
The invention belongs to the field of video communication and relates in particular to a codec for video-standard applications. Specifically, the codec is based on a configurable digital signal processor core and can achieve relatively high image quality at low bit rates. It is intended for low-bit-rate video communication such as video mobile phones and conforms to a low-bit-rate video coding standard.
Background art
H.263 is a low-bit-rate video coding standard. The Gingko application-specific processor architecture is a configurable processor architecture developed at Tsinghua University; Fig. 1 is a block diagram of this architecture. As shown in the figure, to improve processor performance, the core of the architecture is a configurable processor core based on a very long instruction word structure. By analyzing the characteristics of the target application, the processor designer can configure this core in a targeted way and thereby obtain a processor core that meets the application's requirements. Because multimedia signal processing algorithms contain some functional modules that are poorly suited to a digital signal processor (such as variable-length coding and variable-length decoding), the architecture also provides a user-defined dedicated functional unit interface. The processor designer can design special-purpose modules that implement these functions and integrate them into the target processor through this interface.
The configurable digital signal processor core is the heart of the overall architecture. To meet the heavy computational demands of multimedia signal processing applications, the core adopts a very long instruction word (VLIW) structure. The processor core consists mainly of an instruction memory, a data memory, an instruction fetch unit, an instruction dispatch unit, instruction execution units (configurable functional units), general-purpose registers, interrupt control logic, control registers, and external device access ports (customized functional units). So that target processors can be designed flexibly for a variety of applications, every part of the architecture is configurable, and the processor designer can choose a suitable implementation of each part according to the characteristics of the target application.
A customized functional unit is designed by the user specifically for the target application and mainly implements those functions of the application that are poorly suited to a digital signal processor. The digital signal processor core controls the customized functional unit through control registers and exchanges data with it through the data memory.
The processor core uses control registers and status registers to control the user-defined functional units. Each dedicated functional unit has a number of control registers that receive control instructions from the processor core; the designer sets the number of control registers according to the needs of the custom functional unit. The processor core controls a custom functional unit by rewriting the contents of the corresponding control register. A user-defined functional unit can report its state in two ways. The first uses a status register: each user-defined unit may request one status register, which only that unit can write. To the processor core this status register is read-only, and the core determines the state of the unit by reading its contents. The second is for the user-defined unit to trigger an external interrupt of the processor core. In this mode, each time the user-defined module finishes a task assigned by the processor core it triggers an external interrupt, informing the core that it can assign the next task.
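The status-register mode described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name CustomUnit, the state codes IDLE/BUSY/DONE, and the cycle-based tick() method are hypothetical and do not come from the patent, which does not specify register encodings.

```python
class CustomUnit:
    """Hypothetical model of a user-defined functional unit (not from the patent)."""
    IDLE, BUSY, DONE = 0, 1, 2   # assumed state codes

    def __init__(self, work_cycles):
        self.control = 0               # control register, written by the processor core
        self._status = self.IDLE       # status register, written only by the unit itself
        self._work_cycles = work_cycles
        self._remaining = 0

    @property
    def status(self):
        # No setter is defined, so the core can only read this register.
        return self._status

    def write_control(self, value):
        # The core starts a task by rewriting the control register.
        self.control = value
        self._remaining = self._work_cycles
        self._status = self.BUSY

    def tick(self):
        # One clock cycle of work inside the unit.
        if self._status == self.BUSY:
            self._remaining -= 1
            if self._remaining == 0:
                self._status = self.DONE


def run_by_polling(unit, command):
    """Core side: issue a command, then poll the status register until DONE."""
    unit.write_control(command)
    cycles = 0
    while unit.status != CustomUnit.DONE:
        unit.tick()
        cycles += 1
    return cycles
```

In the interrupt mode, the polling loop would be replaced by an interrupt handler that assigns the next task; that variant is not modeled here.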
Summary of the invention
The object of the present invention is to provide a codec for video-standard applications. By suitably configuring the Gingko processor architecture and designing several dedicated hardware units for H.263, a high-performance H.263 bitstream encoding and decoding processor is realized. The codec is characterized in that it is a configurable VLIW THUMchip digital signal processor core comprising an instruction memory (program memory), an instruction fetch unit, an instruction dispatch unit, configurable functional units (instruction execution units), a global register file, an interrupt control module, control and status registers, a data memory, and customized functional units (external device access ports).
The configurable functional units are the core processing units of THUMchip. The THUMchip data path is configured with eight data paths. Following the clustered management scheme of the Gingko architecture, the eight functional units are divided into four clusters, A, B, C and D. Each cluster contains two functional units and a corresponding local register file, and all eight functional units are selected from the configurable functional units that the Gingko architecture provides.
The data memory in THUMchip adopts a dedicated-memory scheme. In video encoding and decoding, the data that mainly needs to be stored comprises the original and reconstructed video data, the data produced by the discrete cosine transform or quantization, and the motion vector data of the image. According to the characteristics of the H.263 standard, dedicated data storage units are designed for the different stages of the encoding and decoding process, comprising:
The main memory stores the original data and the reconstructed data of the video image. In the H.263 coding algorithm the coding object is a macroblock, and coding the current macroblock requires the previous frame as a reference. In THUMchip, the main memory holds the original data of the video image to be encoded and the reconstructed data of the previous frame.
The scratchpad memory mainly stores the intermediate results of the macroblock currently being processed. Encoding or decoding a macroblock involves operations such as the discrete cosine transform, quantization, inverse quantization, and the inverse discrete cosine transform. The intermediate results of these operations are all stored in the scratchpad memory, and to preserve processing precision they are stored as 16-bit data.
The motion vector memory stores the motion vector (MV) of each macroblock. A motion vector comprises four parts: the integer part of the x component, the integer part of the y component, the fractional part of the x component, and the fractional part of the y component. THUMchip sets aside dedicated memory space to store the motion vector of each macroblock.
The size of the scratchpad memory equals the data volume of one macroblock. Each macroblock consists of four luminance (Y) signals and two chrominance (U and V) signals.
Each part of the motion vector is stored in 8 bits, so each macroblock needs a 32-bit space for its motion vector. Under the H.263 standard, the motion vectors of all macroblocks in a group of blocks must be stored.
The customized functional units comprise a variable-length coding module, a variable-length decoding module, and a dedicated DMA unit. The encoding process of the H.263 standard mainly comprises the DCT, motion estimation and prediction, quantization, and variable-length coding; the decoding process mainly comprises the IDCT, motion compensation, inverse quantization, and variable-length decoding. In the THUMchip design, the DCT, IDCT, motion estimation and compensation, quantization, and inverse quantization are all performed by the processor core, whereas variable-length coding and variable-length decoding are implemented as dedicated units, and a dedicated DMA unit tailored to the H.263 standard carries out data transfers.
THUMchip uses the dedicated DMA unit for data transfer. When THUMchip operates in encoding mode, the DMA unit mainly transfers the signal data to be encoded into the corresponding memory of the encoder and sends the encoded data off-chip. When THUMchip operates in decoding mode, the DMA unit mainly transfers the encoded bitstream into the corresponding memory of the decoder and sends the decoded data off-chip.
The variable-length coding module comprises a memory address generation module, a finite state machine, a coding unit, a multiplexing unit, and a buffer. In the H.263 coding standard, video data that has passed through the discrete cosine transform, motion estimation, and quantization must be variable-length coded to reduce the bit rate of the output bitstream. In THUMchip, the processor core outputs data in response to the on-chip memory address produced by the memory address generation module and the memory read request sent by the finite state machine. The coding unit encodes this data and passes it to the multiplexing unit, which multiplexes the bitstream indication signals produced by the finite state machine with the coded data according to the standard to produce an H.263-compliant bitstream.
The variable-length decoding module comprises a memory address generation module, a finite state machine, a decoding unit, a demultiplexing unit, and a buffer, and serves the decoding process in THUMchip. The operation of variable-length decoding can be regarded as the inverse of variable-length coding: the module demultiplexes and decodes the data to be decoded and passes the result to the processor core, which then performs inverse quantization, motion compensation, the inverse discrete cosine transform, and related operations to obtain the decoded data stream.
The beneficial effect of the invention is a codec, based on a configurable digital signal processor core, that conforms to the H.263 low-bit-rate video coding standard and can achieve relatively high image quality at low bit rates. It is applicable to low-bit-rate video communication such as video mobile phones.
Description of drawings
Fig. 1 is a schematic diagram of the overall structure of the Gingko processor.
Fig. 2 is a schematic diagram of the THUMchip processor structure.
Fig. 3 shows the scratchpad memory format.
Fig. 4 shows the structure of the variable-length coding module.
Fig. 5 shows the structure of the variable-length decoding module.
Embodiment
The invention provides a codec for video-standard applications. By configuring the Gingko processor architecture shown in Fig. 1 and designing several dedicated hardware units for H.263, a high-performance H.263 bitstream encoding and decoding processor is realized. The codec is a configurable VLIW THUMchip digital signal processor core (as shown in Fig. 2) comprising an instruction memory (program memory), an instruction fetch unit, an instruction dispatch unit, configurable functional units (instruction execution units), a global register file, an interrupt control module, control and status registers, a data memory, and customized functional units (external device access ports).
The configurable functional units are the core processing units of THUMchip. The THUMchip data path is configured with eight data paths. Following the clustered management scheme of the Gingko architecture, the eight functional units are divided into four clusters, A, B, C and D. Each cluster contains two functional units and a corresponding local register file, and all eight functional units are selected from the configurable functional units that the Gingko architecture provides.
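The clustering arithmetic above (eight functional units, four clusters, two units per cluster) can be written down as a small lookup. This is a sketch only: numbering the units 0 to 7 and assigning them to clusters in order is an assumption made for illustration, since the patent does not say which unit belongs to which cluster.

```python
CLUSTERS = ("A", "B", "C", "D")   # cluster names from the description
UNITS_PER_CLUSTER = 2             # two functional units per cluster


def cluster_of(unit_index):
    """Return the cluster of a functional unit, assuming units are numbered 0..7 in order."""
    if not 0 <= unit_index < len(CLUSTERS) * UNITS_PER_CLUSTER:
        raise ValueError("unit index out of range")
    return CLUSTERS[unit_index // UNITS_PER_CLUSTER]


def units_in(cluster):
    """Return the (assumed) unit indices that share one local register file."""
    base = CLUSTERS.index(cluster) * UNITS_PER_CLUSTER
    return [base + i for i in range(UNITS_PER_CLUSTER)]
```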
The data memory in THUMchip adopts a dedicated-memory scheme. In video encoding and decoding, the data that mainly needs to be stored comprises the original and reconstructed video data, the data produced by the discrete cosine transform or quantization, and the motion vector data of the image. According to the characteristics of the H.263 standard, dedicated data storage units are designed for the different stages of the encoding and decoding process, comprising:
The main memory stores the original data and the reconstructed data of the video image. In the H.263 coding algorithm the coding object is a macroblock, and coding the current macroblock requires the previous frame as a reference. In THUMchip, the main memory holds the original data of the video image to be encoded and the reconstructed data of the previous frame.
The scratchpad memory mainly stores the intermediate results of the macroblock currently being processed. Encoding or decoding a macroblock involves operations such as the discrete cosine transform, quantization, inverse quantization, and the inverse discrete cosine transform. The intermediate results of these operations are all stored in the scratchpad memory, and to preserve processing precision they are stored as 16-bit data.
The motion vector memory stores the motion vector (MV) of each macroblock. A motion vector comprises four parts: the integer part of the x component, the integer part of the y component, the fractional part of the x component, and the fractional part of the y component. THUMchip sets aside dedicated memory space to store the motion vector of each macroblock.
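The four-part motion vector can be pictured as one 32-bit word with an 8-bit field per part, which is the width the description and claim 3 give. The sketch below is illustrative only: the field order (x integer in the most significant byte, then y integer, x fraction, y fraction) and the function names pack_mv and unpack_mv are assumptions, because the patent does not specify the layout inside the word.

```python
def pack_mv(x_int, y_int, x_frac, y_frac):
    """Pack four 8-bit motion vector fields into one 32-bit word (assumed field order)."""
    word = 0
    for field in (x_int, y_int, x_frac, y_frac):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in 8 bits")
        word = (word << 8) | field
    return word


def unpack_mv(word):
    """Split a 32-bit word back into (x_int, y_int, x_frac, y_frac)."""
    return tuple((word >> shift) & 0xFF for shift in (24, 16, 8, 0))
```

The fields are treated here as raw 8-bit patterns; a negative component would first have to be reduced to its 8-bit two's-complement form (for example with `value & 0xFF`), which is again an assumption about the encoding.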
The size of the scratchpad memory equals the data volume of one macroblock. Each macroblock consists of four luminance (Y) signals and two chrominance (U and V) signals (as shown in Fig. 3). Each part of the motion vector described above is stored in 8 bits, so each macroblock needs a 32-bit space for its motion vector. Under the H.263 standard, the motion vectors of all macroblocks in a group of blocks must be stored. To improve data throughput in THUMchip, while the data of one macroblock is being encoded, the dedicated DMA module prepares the data of the next macroblock for the processor. THUMchip therefore has two scratchpad memories, DM1 and DM2: while the processor is processing the data in one of them, the dedicated DMA module stores the data of the next macroblock to be encoded in the other.
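The scratchpad sizing and the DM1/DM2 alternation can be sketched as follows. Two points are assumptions rather than statements from the patent: the byte count relies on each of the six signals being an 8x8 block, as in H.263, combined with the 16-bit intermediate format described earlier, and the function encode_frame is a hypothetical sequential model, whereas in hardware the DMA fill and the processing of the other scratchpad would overlap in time.

```python
BLOCKS_PER_MACROBLOCK = 6     # four Y blocks plus one U and one V block
SAMPLES_PER_BLOCK = 8 * 8     # assumption: 8x8 blocks, as used by H.263
BYTES_PER_SAMPLE = 2          # 16-bit intermediate results
SCRATCHPAD_BYTES = BLOCKS_PER_MACROBLOCK * SAMPLES_PER_BLOCK * BYTES_PER_SAMPLE


def encode_frame(macroblocks, process):
    """Alternate between two scratchpads (index 0 for DM1, index 1 for DM2).

    Returns a list of (scratchpad_index, result) pairs, one per macroblock.
    """
    scratchpads = [None, None]
    results = []
    if not macroblocks:
        return results
    scratchpads[0] = macroblocks[0]          # DMA preloads the first macroblock
    for i in range(len(macroblocks)):
        active = i % 2                       # scratchpad the processor works on
        idle = 1 - active                    # scratchpad the DMA fills meanwhile
        if i + 1 < len(macroblocks):
            scratchpads[idle] = macroblocks[i + 1]
        results.append((active, process(scratchpads[active])))
    return results
```

Under these assumptions each scratchpad holds 768 bytes, and consecutive macroblocks are processed from alternating memories.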
The customized functional units comprise a variable-length coding module, a variable-length decoding module, and a dedicated DMA unit. The encoding process of the H.263 standard mainly comprises the DCT, motion estimation and prediction, quantization, and variable-length coding; the decoding process mainly comprises the IDCT, motion compensation, inverse quantization, and variable-length decoding. In the THUMchip design, the DCT, IDCT, motion estimation and compensation, quantization, and inverse quantization are all performed by the processor core, whereas variable-length coding and variable-length decoding are implemented as dedicated units, and a dedicated DMA unit tailored to the H.263 standard carries out data transfers.
THUMchip uses the dedicated DMA unit for data transfer. When THUMchip operates in encoding mode, the DMA unit mainly transfers the signal data to be encoded into the corresponding memory of the encoder and sends the encoded data off-chip. When THUMchip operates in decoding mode, the DMA unit mainly transfers the encoded bitstream into the corresponding memory of the decoder and sends the decoded data off-chip. The dedicated DMA unit is controlled by the processor core through control registers. When a data transfer is needed, the processor core sends the start address of the data and the number of data items to transfer to the dedicated DMA unit through the control registers. On receiving the instruction, the dedicated DMA unit begins the transfer and reports its current state to the processor core through the status register. The dedicated DMA module accesses memory using a request-acknowledge scheme. To read from memory, it first issues the memory address and a read request signal. When the memory management unit in the processor core receives the request, it checks the state of the memory; if the memory is idle, it reads the data out, passes it to the DMA unit, and asserts an acknowledge signal. To write to memory, the dedicated DMA module first asserts a write request signal and presents the memory address and the data to be written; the memory management unit writes the data to that address once the memory is idle.
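The request-acknowledge access described above can be modeled in a few lines. This is a sketch under stated assumptions: the names MemoryManager, read_request, write_request and dma_read, the busy flag, and the retry limit are all hypothetical, and signal timing is reduced to function calls that either return an acknowledge or do not.

```python
class MemoryManager:
    """Hypothetical stand-in for the memory management unit in the processor core."""

    def __init__(self, size):
        self.cells = [0] * size
        self.busy = False            # True while the core itself is using the memory

    def read_request(self, address):
        """Return (ack, data); no acknowledge is given while the memory is busy."""
        if self.busy:
            return False, None
        return True, self.cells[address]

    def write_request(self, address, value):
        """Write only when the memory is idle; return the acknowledge flag."""
        if self.busy:
            return False
        self.cells[address] = value
        return True


def dma_read(memory, start, count, max_retries=100):
    """DMA side: transfer `count` items from `start`, retrying until acknowledged."""
    out = []
    for address in range(start, start + count):
        for _ in range(max_retries):
            ack, data = memory.read_request(address)
            if ack:
                out.append(data)
                break
        else:
            raise TimeoutError("memory never became idle")
    return out
```

The start address and item count passed to dma_read correspond to the two values the processor core is said to send through the control registers.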
The variable-length coding module comprises a memory address generation module, a finite state machine, a coding unit, a multiplexing unit, and a buffer. In the H.263 coding standard, video data that has passed through the discrete cosine transform, motion estimation, and quantization must be variable-length coded to reduce the bit rate of the output bitstream. In THUMchip, the processor core outputs data in response to the on-chip memory address produced by the memory address generation module and the memory read request sent by the finite state machine. The coding unit encodes this data and passes it to the multiplexing unit, which multiplexes the bitstream indication signals produced by the finite state machine with the coded data according to the standard to produce an H.263-compliant bitstream.
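The coding and multiplexing steps can be illustrated with a toy example. The code table below is a made-up prefix code, not one of the H.263 variable-length code tables, and the function name multiplex and the header bit string are hypothetical; the sketch only shows how variable-length codewords and indication bits are concatenated into a byte stream.

```python
# Hypothetical prefix code for illustration; NOT the H.263 VLC tables.
TOY_VLC = {0: "1", 1: "01", 2: "001", 3: "0001"}


def multiplex(header_bits, symbols):
    """Concatenate header bits and variable-length codewords, then pack into bytes."""
    bits = header_bits + "".join(TOY_VLC[s] for s in symbols)
    bits += "0" * (-len(bits) % 8)            # zero-pad to a whole number of bytes
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

For instance, a 4-bit header followed by the symbols 0, 1 and 2 yields ten bits, which are padded to two bytes. Frequent symbols get the shortest codewords, which is how variable-length coding lowers the output bit rate.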
The variable-length decoding module comprises a memory address generation module, a finite state machine, a decoding unit, a demultiplexing unit, and a buffer, and serves the decoding process in THUMchip. The operation of variable-length decoding can be regarded as the inverse of variable-length coding: the module demultiplexes and decodes the data to be decoded and passes the result to the processor core, which then performs inverse quantization, motion compensation, the inverse discrete cosine transform, and related operations to obtain the decoded data stream.
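Decoding as the inverse of coding can be shown with the same kind of toy prefix code. As before, the table is hypothetical and is not an H.263 table, and the function name decode_symbols is an assumption; the sketch works on a bit string without padding and only demonstrates that a prefix code can be decoded by reading bits until a codeword matches.

```python
# Hypothetical prefix code for illustration; NOT the H.263 VLC tables.
TOY_VLC_DECODE = {"1": 0, "01": 1, "001": 2, "0001": 3}


def decode_symbols(bits):
    """Read bits one at a time and emit a symbol whenever a codeword is complete."""
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in TOY_VLC_DECODE:
            symbols.append(TOY_VLC_DECODE[current])
            current = ""
    if current:
        raise ValueError("bitstream ends inside a codeword")
    return symbols
```

In the patent's decoder the recovered symbols would then go to the processor core for inverse quantization, motion compensation, and the inverse DCT, which this sketch does not model.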
In addition, during video encoding and decoding the processor must read the data stream to be encoded from outside and send the decoded data stream to the display module. Having the processor core perform these transfers would reduce its efficiency, so THUMchip includes a dedicated DMA unit, tailored to the H.263 standard, to carry out the data transfers.

Claims (3)

1. A codec for the H.263 standard, wherein the data that needs to be stored during video encoding and decoding comprises the original data and reconstructed data of the video image, the data produced by the DCT or quantization, and the motion vector data of the video image, characterized in that:
the codec is a configurable VLIW THUMchip digital signal processor core comprising a program memory, an instruction fetch unit, an instruction dispatch unit, configurable functional units, a global register file, an interrupt control module, control and status registers, a data memory, and customized functional units;
the configurable functional units are the core processing units of THUMchip; the THUMchip data path is configured with eight data paths; following the clustered management scheme of the Gingko architecture, the eight functional units are divided into four clusters A, B, C and D; each cluster comprises two functional units and a corresponding local register file; and all eight functional units are selected from the configurable functional units provided by the Gingko architecture;
the customized functional units comprise a variable-length coding module, a variable-length decoding module, and a dedicated DMA unit;
the data memory adopts a dedicated-memory scheme in the THUMchip digital signal processor core, and, according to the H.263 standard, dedicated data storage units are designed for the different stages of the video encoding and decoding process, comprising:
a main memory for storing the original data and reconstructed data of the video image, wherein in the H.263 coding algorithm the coding object is a macroblock and coding the current macroblock requires the previous frame as a reference, and wherein in the THUMchip digital signal processor core the main memory holds the original data of the video image to be encoded and the reconstructed data of the previous video frame;
a scratchpad memory for storing the intermediate results of the macroblock currently being processed, wherein encoding or decoding a macroblock involves the DCT, quantization, inverse quantization, and IDCT operations, the intermediate results of these operations are all stored in the scratchpad memory, and, to preserve processing precision, the intermediate results are stored as 16-bit data;
a motion vector memory for storing the motion vector of each macroblock, wherein a motion vector comprises four parts: the integer part of the x component, the integer part of the y component, the fractional part of the x component, and the fractional part of the y component, and wherein THUMchip sets aside dedicated memory space to store the motion vector of each macroblock.
2. The codec for the H.263 standard according to claim 1, characterized in that the size of the scratchpad memory equals the data volume of one macroblock, and each macroblock consists of four luminance signals Y and two chrominance signals U and V.
3. The codec for the H.263 standard according to claim 1, characterized in that each of the four parts of the motion vector is stored in 8 bits, each macroblock needs a 32-bit space to store its motion vector, and, under the H.263 standard, the motion vectors of all macroblocks in a group of blocks must be stored.
CN 200710119326 2007-07-20 2007-07-20 Coding decoding apparatus for video standard application Expired - Fee Related CN101090504B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200710119326 CN101090504B (en) 2007-07-20 2007-07-20 Coding decoding apparatus for video standard application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200710119326 CN101090504B (en) 2007-07-20 2007-07-20 Coding decoding apparatus for video standard application

Publications (2)

Publication Number Publication Date
CN101090504A CN101090504A (en) 2007-12-19
CN101090504B true CN101090504B (en) 2010-06-23

Family

ID=38943625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200710119326 Expired - Fee Related CN101090504B (en) 2007-07-20 2007-07-20 Coding decoding apparatus for video standard application

Country Status (1)

Country Link
CN (1) CN101090504B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101054644B1 (en) * 2008-11-25 2011-08-08 에스케이 텔레콤주식회사 Motion Estimation Based Image Coding / Decoding Apparatus and Method
CN102118537B (en) * 2009-12-31 2015-04-15 深圳富泰宏精密工业有限公司 Picture error concealment system and method
CN102223530A (en) * 2010-04-13 2011-10-19 承景科技股份有限公司 Edge filter with shared framework and method for sharing edge filter
CN102592640A (en) * 2011-12-27 2012-07-18 长春希达电子技术有限公司 Decoding chain customizing system and method
CN105611295B (en) * 2015-12-23 2018-10-02 中国航天时代电子公司 A kind of system and method for realizing video sampling and compressing transmission on SOC

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1189058A (en) * 1996-08-19 1998-07-29 三星电子株式会社 Video data handling procedure and unit
CN1289212A (en) * 2000-10-27 2001-03-28 清华大学 Hierarchy programmable parallel video signal processor structure for motion estimation algorithm
CN1820503A (en) * 2003-07-09 2006-08-16 广阔逻辑网络技术股份有限公司 Method and system for providing a high speed multi-stream mpeg processor
CN1912925A (en) * 2006-08-29 2007-02-14 西安交通大学 Design and implementing method of multimedia expansion instructionof flow input read

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yanjun Zhang, et al. A new Register File Access Architecture for Software Pipelining in VLIW Processors. Proceedings of the 2005 Asia and South Pacific Design Automation Conference. 2005, 627-630.
Yanjun Zhang, et al. A new Register File Access Architecture for Software Pipelining in VLIW Processors. Proceedings of the 2005 Asia and South Pacific Design Automation Conference. 2005, 627-630. *

Also Published As

Publication number Publication date
CN101090504A (en) 2007-12-19

Similar Documents

Publication Publication Date Title
CN101908035B (en) Video coding and decoding method, GPU (Graphics Processing Unit) and its interacting method with CPU (Central Processing Unit), and system
CN101116341B (en) Caching method and apparatus for video motion compensation
US9351003B2 (en) Context re-mapping in CABAC encoder
CN101090504B (en) Coding decoding apparatus for video standard application
US7634776B2 (en) Multi-threaded processing design in architecture with multiple co-processors
CN103609117B (en) Code and decode the method and device of image
CN101527849B (en) Storing system of integrated video decoder
US9143793B2 (en) Video processing system, computer program product and method for managing a transfer of information between a memory unit and a decoder
CN105791823A (en) Methods and apparatus for adaptive template matching prediction for video encoding and decoding
CN102088603B (en) Entropy coder for video coder and implementation method thereof
CN102231830A (en) Arithmetic unit used for context arithmetic encoding and decoding
Nachtergaele et al. System-level power optimization of video codecs on embedded cores: a systematic approach
CN105323586B (en) A kind of shared drive interface for multi-core parallel concurrent Video coding and decoding
CN103778086B (en) Coarse-grained dynamic reconfigurable system based multi-mode data access device and method
CN104798373A (en) Video coding including shared motion estimation between multiple independent coding streams
CN107209663B (en) Data format conversion device, buffer chip and method
CN105103192B (en) Method and apparatus for vertex error correction
CN104113757A (en) Color Buffer Compression
Chaoui et al. Open multimedia application platform: enabling multimedia applications in third generation wireless terminals through a combined risc/dsp architecture
CN100456832C (en) Method of video coding for handheld apparatus
CN102404561A (en) Method for achieving moving picture experts group (MPEG) 4I frame encoding on compute unified device architecture (CUDA)
CN101795408B (en) Dual stage intra-prediction video encoding system and method
CN102547291A (en) Field programmable gate array (FPGA)-based joint photographic experts group (JPEG) 2000 image decoding device and method
CN103235717A (en) Processor with polymorphic instruction set architecture
US20220109838A1 (en) Methods and apparatus to process video frame pixel data using artificial intelligence video frame segmentation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100623

Termination date: 20110720