CN101720040B - Video decoding optimizing method fusing high speed memory and DMA channel

Info

Publication number: CN101720040B
Application number: CN 200910216191
Authority: CN
Inventors: 徐锦亮; 刘柏良
Original assignee: Sichuan Changhong Electric Co Ltd
Current assignee: Sichuan Changhong Electric Co Ltd
Priority date: 2009-11-11
Filing date: 2009-11-11
Publication date: 2011-05-11
Anticipated expiration: 2029-11-11
Also published as: CN101720040A

Abstract

The invention discloses a video decoding optimization method combining high-speed memory and a DMA channel. The method comprises: the decoder obtains in advance the starting address P_DATA in main memory of the data to be read in the next decoding step, together with the data length L_DATA; the starting address of the address space to which the on-chip SRAM is mapped is obtained; an idle on-chip DMA channel is selected, and its controller transfers the data to be decoded from the specified memory location to the specified SRAM location through the DMA channel; the decoder accesses the SRAM to perform the decoding operation; and the decoding result is transferred back from the SRAM to main memory by DMA for storage. The disclosed method effectively improves video decoding speed.

Description

Video decoding optimization method combining high-speed memory and a DMA channel

Technical field

The present invention relates to a video decoding optimization method for embedded platforms, and in particular to a video decoding optimization method that combines high-speed memory and DMA channels on a Samsung ARM platform.

Background art

Portable multimedia products are a focus of both development in the embedded application field and market consumption. Audio and video playback, as an indispensable multimedia function, places higher demands on the processor, memory and other resources of an embedded system. Many portable multimedia products currently on the market adopt system-on-chip (SoC) solutions based on an ARM core, such as the Samsung S3C2451. The S3C2451 uses the ARM926EJ core, runs at a clock frequency of up to 533 MHz, supports 128 MB of DDR2 memory, and has 16 KB data and instruction caches. The hardware configuration of such SoC chips meets the basic resource requirements for real-time decoding during audio and video playback. However, for certain cases, such as video streams with high decoding complexity or high resolution, these hardware resources alone are not enough to achieve real-time decoding. Video decoding therefore needs to be optimized by various means.

The read/write latency caused by frequent accesses to data in main memory is a key factor limiting video decoding speed. Common optimization methods that address this include program-level optimization of the decoding library and use of the acceleration instruction set specific to the SoC chip. Program-level optimization of the decoding library reduces memory accesses by allocating decoding variables to registers wherever possible. Although register access is faster than memory access, the number of on-chip registers is very limited, so it is difficult to allocate a register to each of the many decoding variables. Methods based on the chip's acceleration instruction set typically use single-instruction multiple-data (SIMD) operations to access several decoded data items in one memory access, which reduces the number of memory reads and writes and hence the latency. However, the number of data items a single instruction can carry is also very limited, so the number of memory accesses cannot be reduced significantly. To address the shortcomings of these methods, the present invention proposes a video decoding optimization method combining high-speed memory and DMA channels, which effectively reduces memory read/write latency during video decoding and thereby improves decoding speed.

Summary of the invention

The purpose of the present invention is to make full use of the hardware resources provided by the SoC platform and to provide an effective decoding optimization method for video decoding libraries running on that platform.

SoC chips generally include on-chip SRAM, which can store a limited amount of data and provides fast access. The present invention uses this feature to redirect data accesses during video decoding from main memory to the SRAM, improving the data access speed of decoding. SoC chips also generally provide multiple DMA channels, used for data exchange within the system bus or between the system bus and the peripheral bus so as to improve CPU efficiency. The present invention therefore uses an on-chip DMA channel to transfer data between main memory and the SRAM, reducing the CPU load of data transfer and improving decoding speed. The method consists mainly of the following steps:

1) The decoder obtains in advance the starting memory address P_DATA of the data to be read in the next decoding step, and the data length L_DATA;

2) The starting address P_SRAM of the address space to which the on-chip SRAM of the SoC is mapped is obtained;

3) An idle on-chip DMA channel is selected and the registers of its DMA controller are configured, so that the data to be decoded is transferred through the DMA channel from the specified memory location to the specified SRAM location;

4) When the decoder reaches the point where it needs to read the above data, it first checks whether the data has been transferred into the SRAM through the DMA channel:

a) If all the data has been read into the SRAM, the decoder reads data only from the SRAM to perform the decoding operation, and the result is likewise kept in the SRAM;

b) If the required data has not yet been fully transferred into the SRAM, the decoder waits until all the data has been transferred into the SRAM through the DMA channel, and then performs the same operation as in a).

5) The DMA channel is used again to transfer the result directly from the capacity-limited SRAM to main memory, for final storage of the decoding result.
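The five steps above can be sketched in C as a host-side simulation. This is only an illustration under stated assumptions: the `sram` array, the `dma_channel` type, the helpers `dma_start`, `dma_wait` and `decode_in_sram`, and the 4096-byte capacity are all hypothetical names and values that do not appear in the specification, and `memcpy` stands in for a hardware DMA transfer that would normally run in the background.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SRAM_SIZE 4096u            /* hypothetical SRAM capacity in bytes */
static uint8_t sram[SRAM_SIZE];    /* stands in for the mapped on-chip SRAM */

/* Simulated DMA channel; real hardware copies in the background. */
typedef struct {
    volatile int busy;
} dma_channel;

static void dma_start(dma_channel *ch, uint8_t *dst, const uint8_t *src,
                      size_t len)
{
    ch->busy = 1;
    memcpy(dst, src, len);         /* simulation: completes immediately */
    ch->busy = 0;
}

static void dma_wait(const dma_channel *ch)
{
    while (ch->busy) {
        /* step 4 b): the decoder waits until the transfer has finished */
    }
}

/* Placeholder for the real decoding arithmetic; it touches only SRAM. */
static void decode_in_sram(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)(buf[i] + 1u);
}

/* One decoding step. p_data and l_data are known in advance (step 1).
 * Returns 0 on success, -1 if the data does not fit in the SRAM. */
static int decode_step(dma_channel *ch, const uint8_t *p_data, size_t l_data,
                       uint8_t *p_result)
{
    assert(ch != NULL && p_data != NULL && p_result != NULL);
    if (l_data > SRAM_SIZE)
        return -1;

    uint8_t *p_sram = sram;                     /* step 2: SRAM start address */
    dma_start(ch, p_sram, p_data, l_data);      /* step 3: memory -> SRAM */
    dma_wait(ch);                               /* step 4: data present? */
    decode_in_sram(p_sram, l_data);             /* step 4 a): work in SRAM */
    dma_start(ch, p_result, p_sram, l_data);    /* step 5: SRAM -> memory */
    dma_wait(ch);
    return 0;
}
```

A caller would pass the source buffer, its length and a result buffer, for example `decode_step(&ch, frame, sizeof frame, out)`. The capacity check reflects the limited SRAM size mentioned in step 5; how larger data would be split is not covered by the text and is left out here.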

Description of drawings

The present invention is illustrated by way of example and with reference to the accompanying drawing, in which:

Fig. 1 is a detailed process flowchart of the present invention.

Embodiment

All features disclosed in this specification, and all steps of any disclosed method or process, may be combined in any manner, except for mutually exclusive features and/or steps.

Any feature disclosed in this specification (including any accompanying claims, abstract and drawings) may, unless specifically stated otherwise, be replaced by other equivalent or alternative features serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.

A video decoding optimization based on the Samsung S3C2451 chip platform is given below to describe the implementation of the present invention in detail:

1) During video decoding, the decoder obtains in advance the starting memory address P_DATA of the data to be read in the next decoding step, and the data length L_DATA;

2) The starting address P_SRAM of the address space to which the on-chip SRAM is mapped is obtained;

3) An idle on-chip DMA channel is selected to transfer the data from main memory to the SRAM. For example, if the 8th on-chip DMA channel is selected, the registers of the corresponding DMA controller are set as follows (if another DMA controller is chosen for the transfer, its registers are set in the same way):

● Initial source control register DISRCC: set bits [1:0] to 00, so that the DMA controller reads the source data from the system bus with an incrementing address.

● Initial destination register DIDST: set bits [30:0] to the address value P_SRAM, so that the DMA controller transfers the source data to the SRAM space starting at P_SRAM.

● Initial destination control register DIDSTC: set bits [1:0] to 00, so that the DMA controller stores the data on the system bus with an incrementing address.

● Control register DCON:

a) Set bits [31:30] to 11, so that the DMA controller uses handshake mode and the DMA DREQ and DACK signals are synchronized to the HCLK clock.

b) Set bits [29:28] to 00, so that the DMA controller generates no interrupt when all data has been transferred, and uses burst mode for each atomic operation.

c) Set bit [27] to 1, so that the DMA controller uses whole service mode.

d) Set bit [22] to 1, so that the DMA channel is turned off when all data has been transferred.

e) Set bits [21:20] to 10, so that the data width of each atomic operation is 32 bits.

f) Set bits [19:0] to L_DATA / 4, i.e. initialize the DMA transfer counter to L_DATA / 4; when the counter reaches 0, all data has been transferred through the DMA channel.
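As an illustration of the DCON settings a) to f), the following C sketch assembles the 32-bit register value from the bit ranges listed above. The macro names and the function `dcon_value` are hypothetical; only the bit positions and values come from the text. The sketch assumes L_DATA is a byte count and a multiple of 4, which is consistent with a 32-bit transfer width and a counter of L_DATA / 4. The register's memory-mapped address is not given in the text, so the write to hardware is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions follow settings a) to f); the macro names are illustrative. */
#define DCON_HANDSHAKE_HCLK   (0x3u << 30)  /* a) bits [31:30] = 11 */
#define DCON_INT_AND_TRANSFER (0x0u << 28)  /* b) bits [29:28] = 00 */
#define DCON_WHOLE_SERVICE    (0x1u << 27)  /* c) bit  [27]    = 1  */
#define DCON_OFF_WHEN_DONE    (0x1u << 22)  /* d) bit  [22]    = 1  */
#define DCON_WIDTH_32BIT      (0x2u << 20)  /* e) bits [21:20] = 10 */
#define DCON_TC_MASK          0xFFFFFu      /* f) bits [19:0], transfer counter */

/* Builds the DCON value for a transfer of l_data bytes (assumed unit). */
static uint32_t dcon_value(uint32_t l_data)
{
    uint32_t tc = l_data / 4u;     /* one 32-bit word per counted transfer */

    assert(l_data % 4u == 0u);     /* assumption: whole words only */
    assert(tc <= DCON_TC_MASK);    /* counter must fit in 20 bits */

    return DCON_HANDSHAKE_HCLK | DCON_INT_AND_TRANSFER | DCON_WHOLE_SERVICE |
           DCON_OFF_WHEN_DONE | DCON_WIDTH_32BIT | tc;
}
```

For L_DATA = 1024 bytes the counter is 256 (0x100) and `dcon_value(1024)` yields 0xC8600100. Because the counter field is 20 bits wide, a single transfer configured this way can cover at most 0xFFFFF words.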

b) If the required data has not yet been fully transferred into the SRAM, the decoder waits until all the data has been transferred into the SRAM by DMA, and then performs the same operation as in a).
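The waiting behaviour in b) can be pictured as polling a transfer counter until it reaches zero. The C sketch below is a simulation only: the `sim_dma` type and the functions `dma_done` and `wait_for_data` are hypothetical, and the counter is decremented inside the poll purely to imitate hardware progress, which a real DMA controller would make on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated channel state: the number of 32-bit words still to be moved. */
typedef struct {
    uint32_t words_left;
} sim_dma;

/* Returns 1 once the counter has reached zero, 0 otherwise. */
static int dma_done(sim_dma *ch)
{
    assert(ch != NULL);
    if (ch->words_left > 0u)
        ch->words_left--;          /* simulation only: one word per poll */
    return ch->words_left == 0u;
}

/* The decoder waits here; returns how many polls found the data not ready. */
static uint32_t wait_for_data(sim_dma *ch)
{
    uint32_t polls = 0u;
    while (!dma_done(ch))
        polls++;
    return polls;
}
```

If the transfer has already finished when the decoder arrives, `wait_for_data` returns immediately with 0 and the decoder proceeds straight to reading from the SRAM.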

The present invention is not limited to the foregoing embodiment. It extends to any new feature or any new combination of features disclosed in this specification, and to any new method or process step, or any new combination thereof, that is disclosed.

Claims

1. A video decoding optimization method combining high-speed memory and a DMA channel, characterized in that it comprises:

Step S1: the decoder obtains in advance the starting memory address P_DATA of the data to be read in the next decoding step, and the data length L_DATA;

Step S2: the starting address P_SRAM of the address space to which the on-chip SRAM is mapped is obtained;

Step S3: an idle on-chip DMA channel is selected, and the data to be decoded is transferred through said DMA channel from the memory space starting at P_DATA to the SRAM space starting at P_SRAM;

Step S4: the decoder accesses the SRAM to perform the decoding operation;

Step S5: the decoding result is transferred back from the SRAM to main memory by DMA and stored.

2. The video decoding optimization method according to claim 1, characterized in that step S3 further comprises: configuring the registers of the DMA controller.

3. The video decoding optimization method according to claim 1, characterized in that step S4 further comprises: determining whether the data to be decoded has been fully transferred into the SRAM.

4. The video decoding optimization method according to claim 3, characterized in that it comprises:

Step S41: if all the data has been transferred into the SRAM, the decoder reads the data from the SRAM to perform the decoding operation, and the result is kept in the SRAM;

Step S42: if the data has not been fully transferred into the SRAM, the decoder waits until all the data has been transferred into the SRAM through the DMA channel.