CN101720040B - Video decoding optimizing method fusing high speed memory and DMA channel - Google Patents
- Publication number: CN101720040B
- Application number: CN 200910216191
- Authority: CN (China)
- Prior art keywords: data, memory, SRAM, DMA, decoding
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses a video decoding optimization method that combines high-speed memory with a DMA channel. In the method, the decoder obtains in advance the starting address P_DATA in main memory of the data to be read in the next decoding step, together with the data length L_DATA; obtains the starting address of the address space to which the on-chip SRAM is mapped; selects an idle on-chip DMA channel and uses its controller to transfer the data to be decoded from the specified memory location to the specified SRAM location; accesses the SRAM to perform the decoding operation; and returns the decoding result from the SRAM to main memory by DMA for storage. The method effectively increases video decoding speed.
Description
Technical field
The present invention relates to a video decoding optimization method for embedded platforms, and in particular to a video decoding optimization method that combines high-speed memory with a DMA channel on a Samsung ARM platform.
Background technology
Portable multimedia products are a focus of both development and consumer demand in the embedded application field. Audio and video playback, as an indispensable multimedia function, places high demands on the processor, memory, and other resources of an embedded system. Many portable multimedia products currently on the market adopt system-on-chip (SoC) solutions based on an ARM core, such as the Samsung S3C2451. The S3C2451 uses an ARM926EJ core with a clock frequency of up to 533 MHz, supports 128 MB of DDR2 memory, and has 16 KB data and instruction caches. The hardware of these SoC chips meets the basic resource requirements for real-time video decoding during playback, but for certain cases, such as video streams with high decoding complexity or high resolution, these hardware resources alone are not enough to decode in real time. Various means of optimizing video decoding are therefore needed.
Read and write latency caused by frequent memory accesses during video decoding is a key factor limiting decoding speed. Common optimization methods that address this include program-level optimization of the decoding library and use of the acceleration instruction set specific to the SoC chip. Program-level optimization assigns decoding variables to registers wherever possible to reduce memory accesses. Although register access is faster than memory access, the number of on-chip registers is very limited, so it is difficult to assign a register to each of the many decoding variables. Methods based on the chip-specific acceleration instruction set usually use single-instruction, multiple-data (SIMD) operations to access several pieces of decoding data in one memory access, reducing the number of memory reads and writes and hence the latency. However, the number of data items a single instruction can carry is also very limited, so the number of memory accesses cannot be reduced significantly. To address the shortcomings of these methods, the present invention proposes a video decoding optimization method that combines high-speed memory with a DMA channel, which effectively reduces memory read/write latency during video decoding and thereby increases decoding speed.
Summary of the invention
The purpose of the present invention is to make full use of the hardware resources provided by the SoC chip platform and to give the video decoding library running on the platform an effective decoding optimization method.
SoC chips are generally equipped with on-chip SRAM, which can store a limited amount of data and provides fast access to it. The present invention uses this feature to redirect the data accesses made during video decoding from main memory to the SRAM, increasing the data access speed of decoding. SoC chips also generally provide several DMA channels for exchanging data within the system bus, or between the system bus and the peripheral bus, so as to improve CPU efficiency. The present invention therefore uses an on-chip DMA channel to transfer data between main memory and the SRAM, which reduces the CPU time consumed by data transfer and increases decoding speed. The method consists mainly of the following steps:
1) The decoder obtains in advance the starting memory address P_DATA and the length L_DATA of the data that the next decoding step will read;
2) From the specific address space to which the on-chip SRAM is mapped, the starting address P_SRAM of that address space is obtained;
3) An idle on-chip DMA channel is selected, the parameters of the DMA controller's registers are set, and the data to be decoded is transferred through the DMA channel from the specified memory location to the specified SRAM location;
4) When the decoder reaches the point where it needs to read that data, it first checks whether the data has been transferred into the SRAM through the DMA channel:
a) If all of the data has been transferred into the SRAM, the decoder reads the data only from the SRAM to perform the decoding operation, and the result is likewise kept in the SRAM;
b) If the required data has not yet been fully transferred into the SRAM, the decoder waits until the DMA channel has transferred all of it, and then performs the same operation as in a).
5) A DMA channel is used again to transfer the result directly from the capacity-limited SRAM to main memory, where the decoding result is finally stored.
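The five steps above can be sketched as a small host-side model. This is only an illustration under stated assumptions, not the patented firmware: `ModelDma`, `decode_step`, the byte-array stand-ins for main memory and SRAM, and the toy `decode` function (a bitwise inversion) are all hypothetical names introduced here, and the DMA channel is modelled as a copy that advances a few bytes per `tick()`.

```python
"""Host-side model of the five-step flow (illustrative only, not firmware)."""


class ModelDma:
    """Stand-in for one DMA channel: copies a fixed number of bytes per tick."""

    def __init__(self, bytes_per_tick=4):
        self.bytes_per_tick = bytes_per_tick
        self.remaining = 0

    def start(self, src, src_off, dst, dst_off, length):
        """Program a transfer of `length` bytes from src[src_off:] to dst[dst_off:]."""
        self.src, self.src_off = src, src_off
        self.dst, self.dst_off = dst, dst_off
        self.remaining = length
        self.done = 0

    def tick(self):
        """Advance the transfer by one step; return True once it has finished."""
        n = min(self.bytes_per_tick, self.remaining)
        s, d = self.src_off + self.done, self.dst_off + self.done
        self.dst[d:d + n] = self.src[s:s + n]
        self.done += n
        self.remaining -= n
        return self.remaining == 0


def decode_step(memory, sram, p_data, l_data, p_sram, p_out, decode):
    """Run steps 1-5 for one block and return the length of the stored result."""
    dma = ModelDma()
    # Steps 1-3: P_DATA, L_DATA and P_SRAM are known, so start the transfer.
    dma.start(memory, p_data, sram, p_sram, l_data)
    # Step 4: wait until the whole block is in SRAM, then decode from SRAM.
    while not dma.tick():
        pass
    result = decode(bytes(sram[p_sram:p_sram + l_data]))
    assert p_sram + len(result) <= len(sram), "result must fit in SRAM"
    sram[p_sram:p_sram + len(result)] = result
    # Step 5: a second transfer writes the result back to main memory.
    dma.start(sram, p_sram, memory, p_out, len(result))
    while not dma.tick():
        pass
    return len(result)


# Usage: 16 bytes of input data followed by 16 bytes reserved for output.
memory = bytearray(range(16)) + bytearray(16)
sram = bytearray(8)
n = decode_step(memory, sram, p_data=4, l_data=8, p_sram=0, p_out=16,
                decode=lambda block: bytes(b ^ 0xFF for b in block))
```

In this model the wait loop is the only place where the decoder stalls, which corresponds to case b) of step 4; on real hardware the decoder would do other work between starting the transfer and needing the data.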
Description of drawings
The present invention is illustrated by way of example with reference to the accompanying drawing, in which:
Fig. 1 is a detailed flowchart of the processing of the present invention.
Embodiment
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any way, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any accompanying claims, abstract, and drawings) may, unless specifically stated otherwise, be replaced by an equivalent or alternative feature serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
A video decoding optimization based on the Samsung S3C2451 chip platform is given below to describe the implementation of the present invention in detail:
1) During video decoding, the decoder obtains in advance the starting memory address P_DATA and the length L_DATA of the data that the next decoding step will read;
2) From the specific address space to which the on-chip SRAM is mapped, the starting address P_SRAM of that address space is obtained;
3) An idle on-chip DMA channel is selected to transfer the data from main memory to the SRAM. For example, if the 8th DMA channel on the chip is selected, the corresponding DMA controller registers are set as follows (if another DMA channel is selected for the transfer, its registers are set in the same way):
● Initial source control register DISRCC: set bits [1:0] to 00, so that the DMA controller reads the source data from the system bus with an incrementing address.
● Initial destination register DIDST: set bits [30:0] to the address value P_SRAM, so that the DMA controller transfers the source data to the SRAM storage space whose starting address is P_SRAM.
● Initial destination control register DIDSTC: set bits [1:0] to 00, so that the DMA controller stores the data on the system bus with an incrementing address.
● Control register DCON:
A) Set bits [31:30] to 11, so that the DMA controller selects handshake mode and the DMA DREQ and DACK signals are synchronized to the HCLK clock.
B) Set bits [29:28] to 00, so that the DMA controller does not generate an interrupt when all data has been transferred, and uses burst mode for each atomic operation.
C) Set bit [27] to 1, so that the DMA controller selects whole service mode.
D) Set bit [22] to 1, so that the DMA channel is turned off when all data has been transferred.
E) Set bits [21:20] to 10, so that the data width of each atomic operation of the DMA controller is 32 bits.
F) Set bits [19:0] to L_DATA/4, that is, initialize the DMA transfer counter to L_DATA/4. When the transfer counter reaches 0, all data has been transferred through the DMA channel.
4) When the decoder reaches the point where it needs to read that data, it first checks whether the data has been transferred into the SRAM through the DMA channel:
A) If all of the data has been transferred into the SRAM, the decoder reads the data only from the SRAM to perform the decoding operation, and the result is likewise kept in the SRAM;
B) If the required data has not yet been fully transferred into the SRAM, the decoder waits until the DMA has transferred all of it, and then performs the same operation as in A).
5) A DMA channel is used again to transfer the result directly from the capacity-limited SRAM to main memory, where the decoding result is finally stored.
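The register settings listed under step 3 can be checked on a host machine by packing them into 32-bit values. The following is a minimal sketch that only encodes the bit positions and values stated in the text; `pack_dcon`, `channel_registers` and `SRAM_BASE_PLACEHOLDER` are hypothetical names, the SRAM base address is a placeholder rather than a value taken from this document, and the text does not list a source-address register, so none appears here. The field meanings in the comments follow the description above and should be confirmed against the chip's datasheet before use.

```python
"""Pack the DMA register fields described in step 3 (host-side check only)."""

SRAM_BASE_PLACEHOLDER = 0x40000000  # assumption: stands in for P_SRAM


def pack_dcon(l_data):
    """Build the DCON value from the field settings listed in the text."""
    if l_data % 4:
        raise ValueError("L_DATA must be a multiple of 4 for 32-bit transfers")
    tc = l_data // 4  # transfer counter counts 32-bit words
    if not 0 < tc < (1 << 20):
        raise ValueError("transfer count must fit in bits [19:0]")
    return (
        (0b11 << 30)    # bits [31:30]: handshake mode, DREQ/DACK synced to HCLK
        | (0b00 << 28)  # bits [29:28]: no completion interrupt (as in the text)
        | (1 << 27)     # bit [27]: whole service mode
        | (1 << 22)     # bit [22]: channel turns off when the transfer completes
        | (0b10 << 20)  # bits [21:20]: 32-bit data width
        | tc            # bits [19:0]: initial transfer count L_DATA / 4
    )


def channel_registers(p_sram, l_data):
    """Return the register values of step 3 for one memory-to-SRAM transfer."""
    if p_sram >> 31:
        raise ValueError("P_SRAM must fit in DIDST bits [30:0]")
    return {
        "DISRCC": 0b00,   # source on the system bus, incrementing address
        "DIDST": p_sram,  # destination start address P_SRAM
        "DIDSTC": 0b00,   # destination on the system bus, incrementing address
        "DCON": pack_dcon(l_data),
    }


# Usage: a 1024-byte block gives a transfer count of 256 words.
regs = channel_registers(SRAM_BASE_PLACEHOLDER, 1024)
print({name: hex(value) for name, value in regs.items()})
```

Because the transfer counter occupies 20 bits and counts 32-bit words, a single transfer configured this way can move at most (2^20 - 1) x 4 bytes, and L_DATA must be a multiple of 4.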
The present invention is not limited to the embodiment described above. It extends to any new feature or any new combination of features disclosed in this specification, and to any new method or process step, or any new combination of steps, disclosed.
Claims (4)
1. A video decoding optimization method combining high-speed memory and a DMA channel, characterized in that it comprises:
Step S1: the decoder obtains in advance the starting memory address P_DATA and the data length L_DATA of the data that the next decoding step will read;
Step S2: from the specific address space to which the on-chip SRAM is mapped, the starting address P_SRAM of that address space is obtained;
Step S3: an idle on-chip DMA channel is selected, and the data to be decoded is transferred through the DMA channel from the memory space whose starting address is P_DATA to the SRAM storage space whose starting address is P_SRAM;
Step S4: the decoder accesses the SRAM to perform the data decoding operation;
Step S5: the decoding result is transferred from the SRAM back to main memory by DMA and stored.
2. The video decoding optimization method according to claim 1, characterized in that step S3 further comprises: setting the parameters of the DMA controller's registers.
3. The video decoding optimization method according to claim 1, characterized in that step S4 further comprises: determining whether the data to be decoded has been fully transferred into the SRAM.
4. The video decoding optimization method according to claim 3, characterized in that it comprises:
Step S41: if all of the data has been transferred into the SRAM, the decoder reads the data from the SRAM to perform the decoding operation, and the result is kept in the SRAM;
Step S42: if the data has not been fully transferred into the SRAM, the decoder waits until the DMA channel has transferred all of it into the SRAM.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200910216191 CN101720040B (en) | 2009-11-11 | 2009-11-11 | Video decoding optimizing method fusing high speed memory and DMA channel |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101720040A (en) | 2010-06-02 |
CN101720040B (en) | 2011-05-11 |
Family ID: 42434542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200910216191 Active CN101720040B (en) | 2009-11-11 | 2009-11-11 | Video decoding optimizing method fusing high speed memory and DMA channel |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101720040B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |