CN101720040B - Video decoding optimizing method fusing high speed memory and DMA channel - Google Patents

Publication number
CN101720040B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 200910216191
Other languages
Chinese (zh)
Other versions
CN101720040A (en)
Inventor
徐锦亮
刘柏良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Application filed by Sichuan Changhong Electric Co Ltd
Priority to CN 200910216191
Publication of CN101720040A
Application granted
Publication of CN101720040B

Abstract

The invention discloses a video decoding optimization method that fuses high-speed on-chip memory with a DMA channel. The method comprises the following: the decoder obtains in advance the starting address P_DATA in main memory of the data to be read by the next decoding step, together with the data length L_DATA; the starting address of the address space to which the on-chip SRAM is mapped is obtained; an idle DMA channel on the chip is selected and its controller is configured to transfer the data to be decoded from the designated main-memory location to the designated SRAM location; the decoder accesses the SRAM to perform the decoding operation; and the decoding result is transferred back from the SRAM to main memory by DMA for storage. The method effectively improves video decoding speed.

Description

Video decoding optimization method fusing high-speed memory and a DMA channel
Technical field
The present invention relates to a video decoding optimization method for embedded platforms, and in particular to a video decoding optimization method that fuses high-speed memory with a DMA channel on a Samsung ARM platform.
Background
Portable multimedia products are both a development focus in embedded applications and a focus of consumer demand. Audio and video playback, as an essential multimedia function, places high demands on the processor, memory, and other resources of an embedded system. Many portable multimedia products currently on the market use a system-on-chip (SoC) solution based on an ARM core, such as the Samsung S3C2451. The S3C2451 uses an ARM926EJ core with a clock frequency of up to 533 MHz, supports 128 MB of DDR2 memory, and has 16 KB data and instruction caches. The hardware of such SoCs meets the basic resource requirements of real-time audio and video decoding. For certain cases, however, such as streams with high decoding complexity or high resolution, these hardware resources alone are not enough to decode in real time. Video decoding therefore has to be optimized by various means.
Read and write latency caused by frequent accesses to main memory is a key factor limiting video decoding speed. Common optimization methods that address it include program-level optimization of the decoding library and use of the acceleration instruction set specific to the SoC. Program-level optimization assigns decoding variables to registers wherever possible to reduce main-memory accesses. Register access is much faster than memory access, but the number of on-chip registers is very limited, so it is difficult to assign a register to each of the many decoding variables. Methods based on the chip's acceleration instruction set typically use single-instruction, multiple-data (SIMD) instructions to access several data items in one memory operation, which reduces the number of memory reads and writes and hence the latency. The number of data items a single instruction can carry is also very limited, however, so the number of memory accesses cannot be reduced substantially. To address these shortcomings, the present invention proposes a video decoding optimization method fusing high-speed memory and a DMA channel, which effectively reduces memory read and write latency during decoding and thereby improves decoding speed.
Summary of the invention
The purpose of the present invention is to make full use of the hardware resources provided by the SoC platform and to provide an effective decoding optimization method for the video decoding library running on that platform.
SoCs generally include on-chip SRAM, which stores a limited amount of data and provides fast access to it. The invention exploits this by redirecting the decoder's data accesses from main memory to the SRAM, which increases data access speed during decoding. SoCs also generally provide several DMA channels, used to exchange data within the system bus or between the system bus and the peripheral bus so as to improve CPU efficiency. The invention therefore uses an on-chip DMA channel to move data between main memory and the SRAM, which reduces the CPU time spent on data transfer and improves decoding speed. The method consists of the following steps:
1) The decoder obtains in advance the starting address P_DATA in main memory of the data that the next decoding step will read, and the data length L_DATA.
2) Based on the address space to which the on-chip SRAM is mapped, the starting address P_SRAM of that address space is obtained.
3) An idle DMA channel on the chip is selected and the registers of its DMA controller are configured, so that the data to be decoded is transferred over the DMA channel from the designated main-memory location to the designated SRAM location.
4) When the decoder reaches the point where it needs to read this data, it first checks whether the data has been fully transferred into the SRAM over the DMA channel:
a) If all of the data is in the SRAM, the decoder reads it only from the SRAM to perform the decoding operation, and the result is likewise kept in the SRAM.
b) If the required data has not yet been fully transferred into the SRAM, the decoder waits until the DMA channel has transferred all of it, and then proceeds as in a).
5) A DMA channel is used again to transfer the result directly from the capacity-limited SRAM to main memory, where the decoding result is finally stored.
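The five steps above can be pictured with a small software model. The following is only a sketch under stated assumptions, not the patented implementation: `DmaChannel`, `wait_for`, `decode_step`, and the `kernel` callback are hypothetical names, main memory and SRAM are modelled as byte arrays, and the simulated channel moves one 4-byte unit per call to `step()`. For simplicity the sketch starts the transfer and waits for it immediately, whereas the method gains its speed by starting the transfer ahead of time so that it overlaps with other decoding work.

```python
class DmaChannel:
    """Toy model of one DMA channel; each step() moves one 4-byte unit."""
    UNIT = 4

    def __init__(self):
        self.remaining = 0  # transfer counter; 0 means idle or finished

    def start(self, src, src_off, dst, dst_off, length):
        if length % self.UNIT:
            raise ValueError("length must be a multiple of 4 bytes")
        self.src, self.src_off = src, src_off
        self.dst, self.dst_off = dst, dst_off
        self.remaining = length // self.UNIT

    def step(self):
        """Advance the simulated hardware by one unit transfer."""
        if self.remaining:
            s, d = self.src_off, self.dst_off
            self.dst[d:d + self.UNIT] = self.src[s:s + self.UNIT]
            self.src_off += self.UNIT
            self.dst_off += self.UNIT
            self.remaining -= 1

    def done(self):
        return self.remaining == 0


def wait_for(channel):
    """Step 4b: block until the transfer counter reaches zero."""
    while not channel.done():
        channel.step()  # on real hardware the DMA engine advances by itself


def decode_step(memory, p_data, l_data, p_out, sram, p_sram, channel, kernel):
    # Steps 1-2: p_data, l_data and p_sram are assumed known to the caller.
    # Step 3: stage the input block in SRAM over the DMA channel.
    channel.start(memory, p_data, sram, p_sram, l_data)
    # Step 4: make sure the block has fully arrived, then decode inside SRAM.
    wait_for(channel)
    block = sram[p_sram:p_sram + l_data]
    sram[p_sram:p_sram + l_data] = kernel(block)  # result stays in SRAM
    # Step 5: copy the result back to main memory over the DMA channel.
    channel.start(sram, p_sram, memory, p_out, l_data)
    wait_for(channel)
```

Here `kernel` stands in for the actual decoding arithmetic and must return as many bytes as it receives; a call such as `decode_step(memory, 0, 16, 16, sram, 8, DmaChannel(), lambda b: bytes(x ^ 0xFF for x in b))` stages 16 bytes, transforms them in the SRAM model, and writes the result back at offset 16.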
Description of drawings
The invention will be described by way of example with reference to the accompanying drawing, in which:
Fig. 1 is a detailed flowchart of the method of the invention.
Embodiment
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any way, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any accompanying claims, abstract, and drawings) may, unless expressly stated otherwise, be replaced by an equivalent or alternative feature serving a similar purpose. That is, unless expressly stated otherwise, each feature is only one example of a series of equivalent or similar features.
A video decoding optimization on the Samsung S3C2451 chip platform is given below to describe an implementation of the invention in detail:
1) During video decoding, the decoder obtains in advance the starting address P_DATA in main memory of the data that the next decoding step will read, and the data length L_DATA.
2) Based on the address space to which the on-chip SRAM is mapped, the starting address P_SRAM of that address space is obtained.
3) An idle DMA channel on the chip is selected to transfer the data from main memory to the SRAM. For example, if the 8th DMA channel on the chip is selected, the registers of the corresponding DMA controller are set as follows (if another DMA controller is selected for the transfer, its registers are set in the same way):
● Initial source control register DISRCC: bits [1:0] are set to 00, so that the DMA controller reads the source data from the system bus with an incrementing address.
● Initial destination register DIDST: bits [30:0] are set to the address P_SRAM, so that the DMA controller transfers the source data to the SRAM region starting at P_SRAM.
● Initial destination control register DIDSTC: bits [1:0] are set to 00, so that the DMA controller stores the data on the system bus with an incrementing address.
● Control register DCON:
a) Bits [31:30] are set to 11, so that the DMA controller uses handshake mode and DREQ and DACK are synchronized to the HCLK clock.
b) Bits [29:28] are set to 00, so that the DMA controller does not generate an interrupt when all data has been transferred and uses burst mode for each atomic operation.
c) Bit [27] is set to 1, so that the DMA controller uses whole service mode.
d) Bit [22] is set to 1, so that the DMA channel is turned off when all data has been transferred.
e) Bits [21:20] are set to 10, so that the data width of each atomic operation is 32 bits.
f) Bits [19:0] are set to L_DATA/4, that is, the DMA transfer counter is initialized to L_DATA/4; when the counter reaches 0, all data has been transferred over the DMA channel.
4) When the decoder reaches the point where it needs to read this data, it first checks whether the data has been fully transferred into the SRAM over the DMA channel:
a) If all of the data is in the SRAM, the decoder reads it only from the SRAM to perform the decoding operation, and the result is likewise kept in the SRAM.
b) If the required data has not yet been fully transferred into the SRAM, the decoder waits until the DMA has transferred all of it, and then proceeds as in a).
5) A DMA channel is used again to transfer the result directly from the capacity-limited SRAM to main memory, where the decoding result is finally stored.
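To make the register settings of step 3 concrete, the sketch below composes the register words from the bit fields listed in the text. It is an illustration only: `dcon_value` and `channel_registers` are hypothetical helper names, the values are computed rather than written to hardware, and the field meanings in the comments simply repeat what the text states. The source address register is not listed in the text and is left out here.

```python
def dcon_value(l_data):
    """Compose the DCON word from the bit fields listed in the text."""
    if l_data % 4:
        raise ValueError("L_DATA must be a multiple of 4 bytes (32-bit units)")
    count = l_data // 4
    if count >= 1 << 20:
        raise ValueError("transfer count does not fit in bits [19:0]")
    return (
        (0b11 << 30)    # [31:30] handshake mode, DREQ/DACK synchronized to HCLK
        | (0b00 << 28)  # [29:28] both bits cleared, as specified in the text
        | (1 << 27)     # [27] whole service mode
        | (1 << 22)     # [22] turn the channel off when the transfer completes
        | (0b10 << 20)  # [21:20] 32-bit data width per atomic operation
        | count         # [19:0] initial transfer count, L_DATA / 4
    )


def channel_registers(p_sram, l_data):
    """Return the register values for a memory-to-SRAM transfer as a dict."""
    if not 0 <= p_sram < 1 << 31:
        raise ValueError("P_SRAM must fit in bits [30:0] of DIDST")
    return {
        "DISRCC": 0b00,            # source on the system bus, address increments
        "DIDST": p_sram,           # destination start address P_SRAM
        "DIDSTC": 0b00,            # destination on the system bus, address increments
        "DCON": dcon_value(l_data),
    }
```

For a 1024-byte block the counter field is 1024 / 4 = 256 and `dcon_value(1024)` evaluates to `0xC8600100`. Because the counter is 20 bits wide and each unit is 4 bytes, a single transfer set up this way can cover at most (2^20 - 1) × 4 = 4,194,300 bytes; the SRAM capacity will normally be the tighter limit.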
The invention is not limited to the embodiment described above. It extends to any new feature or any new combination of features disclosed in this specification, and to any new method or process step, or any new combination of steps, disclosed herein.

Claims (4)

1. A video decoding optimization method fusing high-speed memory and a DMA channel, characterized in that it comprises:
Step S1: the decoder obtains in advance the starting address P_DATA in main memory of the data that the next decoding step will read, and the data length L_DATA;
Step S2: based on the address space to which the on-chip SRAM is mapped, the starting address P_SRAM of that address space is obtained;
Step S3: an idle DMA channel on the chip is selected, and the data to be decoded is transferred over that DMA channel from the main-memory region starting at P_DATA to the SRAM region starting at P_SRAM;
Step S4: the decoder accesses the SRAM to perform the decoding operation;
Step S5: the decoding result is transferred from the SRAM back to main memory by DMA and stored there.
2. The video decoding optimization method according to claim 1, characterized in that step S3 further comprises: configuring the registers of the DMA controller.
3. The video decoding optimization method according to claim 1, characterized in that step S4 further comprises: determining whether the data to be decoded has been fully transferred into the SRAM.
4. The video decoding optimization method according to claim 3, characterized in that it comprises:
Step S41: if the data has been fully transferred into the SRAM, the decoder reads the data from the SRAM to perform the decoding operation, and the result is kept in the SRAM;
Step S42: if the data has not been fully transferred into the SRAM, the decoder waits until the DMA channel has transferred all of it into the SRAM.
CN 200910216191 2009-11-11 2009-11-11 Video decoding optimizing method fusing high speed memory and DMA channel Active CN101720040B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200910216191 CN101720040B (en) 2009-11-11 2009-11-11 Video decoding optimizing method fusing high speed memory and DMA channel

Publications (2)

Publication Number Publication Date
CN101720040A (en) 2010-06-02
CN101720040B (en) 2011-05-11

Family

ID=42434542

Country Status (1)

Country Link
CN (1) CN101720040B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6219724B1 (en) * 1997-11-29 2001-04-17 Electronics And Telecommunications Research Institute Direct memory access controller
CN1374802A (en) * 2001-03-09 2002-10-16 汤姆森特许公司 Video-unit, especially video decoder and its storage control process
CN1794214A (en) * 2005-12-22 2006-06-28 北京中星微电子有限公司 Method of direct storage access for non-volatibility storage and its device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JP特开2007-148900A 2007.06.14

Also Published As

Publication number Publication date
CN101720040A (en) 2010-06-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant