CN106325822A - Method for improving DSP (digital signal processor) operation performance - Google Patents
- Publication number
- CN106325822A
- Authority
- CN
- China
- Prior art keywords
- code
- processor
- read
- memory
- dsp
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3808—Instruction prefetching for instruction reuse, e.g. trace cache, branch target cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0862—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7807—System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
- G06F15/7817—Specially adapted for signal processing, e.g. Harvard architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7839—Architectures of general purpose stored program computers comprising a single central processing unit with memory
- G06F15/7842—Architectures of general purpose stored program computers comprising a single central processing unit with memory on one IC chip (single chip microcontrollers)
- G06F15/7846—On-chip cache and off-chip main memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/45—Caching of specific data in cache memory
- G06F2212/452—Instruction code
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention relates to a software running method for a digital signal processor (DSP), and in particular to a method for improving DSP operation performance. The method is characterized in that a cache region dedicated to program instructions is set aside within the address space of the processor's original internal memory. When the processor starts, code is first read from an external non-volatile memory into the memory mounted around the processor, that is, the code is read into memory. Once reading is complete, execution of the functional code begins from the first instruction in memory. Part of the code is read into the cache region and runs there. Because the cache is read much faster than memory, and the cache speed is the closest to the processor's operating clock frequency, the execution speed of the code can be greatly improved, so that a large amount of data can be processed within a short time.
Description
Technical field
The present invention relates to a software running method for a digital signal processor, and in particular to a method for improving DSP operation performance.
Background technology
A digital signal processor, DSP for short, is generally used for complex computation, which places demands on operational performance. Operational performance generally depends on several factors: the algorithm, code optimization, the clock frequency, and memory. The memory available to a DSP falls into the following classes: external flash, external DDR, internal flash, internal RAM, internal cache, and internal registers. Ordered by read/write speed: internal register > internal cache > internal RAM > internal flash > external DDR > external flash.
At present, digital signal processing can mostly meet the required operational performance. In some critical processing stages, however, the read/write speed of memory may prevent the computation from meeting its requirements, for example when critical data must be handled promptly within a very short time slot. A high-performance computing system needs to process the observation data of hundreds of channels promptly within a very short time. Because there are many channels, the DSP spends a great deal of time in loops and therefore repeatedly accesses the same instructions. Because the time is extremely short and the number of channels is large, the amount of data that must be processed in time is large. If processing is not timely, subsequent equation calculations are affected and overall performance declines.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for improving DSP operation performance that increases the processing speed of DSP operations.
The method for improving DSP operation performance of the present invention is as follows:
Within the address space of the processor's original internal memory, a cache region dedicated to program instructions is set aside. After the processor starts, code is first read from the external non-volatile memory into the memory mounted around the processor, that is, the code is read into memory. Once reading is complete, execution of the functional code begins from the first instruction in memory. Part of the code is read into the above cache region and runs there.
The code that runs in the cache region is selected key code or repeated code.
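The startup sequence described above can be sketched in portable C as two copy steps. This is only a host-side illustration, not the patent's implementation: the arrays `main_memory` and `code_cache`, the sizes `MAIN_MEMORY_SIZE` and `CODE_CACHE_SIZE`, and the functions `boot_load` and `load_hot_code` are all hypothetical names standing in for fixed hardware address ranges and boot code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define MAIN_MEMORY_SIZE 64u  /* memory mounted around the processor */
#define CODE_CACHE_SIZE  16u  /* region reserved for program instructions */

/* Simulated memories; on real hardware these would be fixed address ranges. */
uint8_t main_memory[MAIN_MEMORY_SIZE];
uint8_t code_cache[CODE_CACHE_SIZE];

/* Step 1: at startup, read the code image from non-volatile memory
 * into main memory. Returns 0 on success, -1 if the image is invalid
 * or does not fit. */
int boot_load(const uint8_t *nv_image, size_t len)
{
    if (nv_image == NULL || len > MAIN_MEMORY_SIZE) {
        return -1;
    }
    memcpy(main_memory, nv_image, len);
    return 0;
}

/* Step 2: read part of the code (key or repeated code) from main memory
 * into the cache region. Returns 0 on success, -1 if the requested range
 * lies outside main memory or is larger than the cache region. */
int load_hot_code(size_t offset, size_t len)
{
    if (len > CODE_CACHE_SIZE || offset > MAIN_MEMORY_SIZE ||
        len > MAIN_MEMORY_SIZE - offset) {
        return -1;
    }
    assert(offset + len <= MAIN_MEMORY_SIZE);
    memcpy(code_cache, main_memory + offset, len);
    return 0;
}
```

A caller would invoke `boot_load` once with the image read from non-volatile memory, then `load_hot_code` with the offset and length of the section chosen to run from the cache region. How that section is chosen is not specified in the source beyond "key code or repeated code".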
When the method of the present invention runs, the cache is read far faster than memory, and the cache speed is the closest to the processor's operating clock frequency, so the execution speed of the code is greatly improved. This makes it possible to process a large amount of data within an extremely short time slice.
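The speed argument can be made concrete with a simple cost model for instruction fetches in a loop. The sketch below is an assumption-laden illustration: the type `fetch_cost_t`, the function `loop_fetch_cycles`, and any cycle counts passed to it are hypothetical and do not come from the source, and the model ignores the one-time cost of copying code into the cache region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-instruction fetch latencies in CPU cycles. */
typedef struct {
    uint32_t cache_cycles;   /* fetch from the cache region */
    uint32_t memory_cycles;  /* fetch from ordinary memory */
} fetch_cost_t;

/* Total fetch cycles for a loop body of `body_len` instructions executed
 * `iterations` times, when `cached_len` of those instructions reside in
 * the cache region and the rest are fetched from memory. */
uint64_t loop_fetch_cycles(fetch_cost_t cost, uint32_t body_len,
                           uint32_t cached_len, uint32_t iterations)
{
    assert(cached_len <= body_len);
    uint64_t per_iteration =
        (uint64_t)cached_len * cost.cache_cycles +
        (uint64_t)(body_len - cached_len) * cost.memory_cycles;
    return per_iteration * iterations;
}
```

With assumed latencies of 1 cycle for the cache region and 10 cycles for memory, a 50-instruction loop run 100 times costs 50000 fetch cycles from memory and 5000 when fully cached. The more often a loop repeats, the more the placement of its instructions dominates the total.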
Detailed description of the invention
Embodiment:
The embodiment of the method is applied to an OMAPL138 processor, whose internal 256M space is allocated by default as RAM and is used to store the global variables, local variables, and macro definitions needed while the code runs. These variables and macro definitions can equally be stored in memory such as DDR. By partitioning, half of the internal memory is used as the code cache region, and the code that repeatedly processes data within 0.505 milliseconds is read into this cache region to run.
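The partitioning step in this embodiment amounts to splitting one address range into two halves. The following C sketch shows that arithmetic only. The names `region_t`, `partition_t`, `split_internal_memory`, and `fits_in_code_cache` are hypothetical, and the choice of the upper half as the code cache region is an assumption, since the source does not say which half is used.

```c
#include <assert.h>
#include <stdint.h>

/* A contiguous address range. */
typedef struct {
    uint32_t base;
    uint32_t size;
} region_t;

/* Internal memory split into a data region and a code cache region. */
typedef struct {
    region_t data;        /* variables remain here */
    region_t code_cache;  /* repeated or key code is placed here */
} partition_t;

/* Split the internal memory [base, base + size) in half. The upper half
 * becomes the code cache region (an assumption of this sketch); if size
 * is odd, the extra byte stays in the data region. */
partition_t split_internal_memory(uint32_t base, uint32_t size)
{
    assert(size <= UINT32_MAX - base);  /* range must not wrap around */
    partition_t p;
    uint32_t half = size / 2u;
    p.data.base = base;
    p.data.size = size - half;
    p.code_cache.base = base + p.data.size;
    p.code_cache.size = half;
    return p;
}

/* Returns 1 if a code section of `code_size` bytes fits in the cache region. */
int fits_in_code_cache(const partition_t *p, uint32_t code_size)
{
    return code_size <= p->code_cache.size;
}
```

On a real target the resulting region would typically be named in the linker configuration so that selected functions are placed in it. That toolchain-specific step is outside what the source describes.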
Claims (2)
1. A method for improving DSP operation performance, characterized in that:
Within the address space of the processor's original internal memory, a cache region dedicated to program instructions is set aside. After the processor starts, code is first read from the external non-volatile memory into the memory mounted around the processor, that is, the code is read into memory. Once reading is complete, execution of the functional code begins from the first instruction in memory. Part of the code is read into the above cache region and runs there.
2. The method for improving DSP operation performance according to claim 1, characterized in that the code that runs in the cache region is selected key code or repeated code.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610719389.4A CN106325822A (en) | 2016-08-25 | 2016-08-25 | Method for improving DSP (digital signal processor) operation performance |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610719389.4A CN106325822A (en) | 2016-08-25 | 2016-08-25 | Method for improving DSP (digital signal processor) operation performance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106325822A true CN106325822A (en) | 2017-01-11 |
Family
ID=57790732
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610719389.4A Pending CN106325822A (en) | 2016-08-25 | 2016-08-25 | Method for improving DSP (digital signal processor) operation performance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106325822A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020184040A1 (en) * | 2001-05-30 | 2002-12-05 | Hideo Okano | Data reproducing apparatus, data processing apparatus, and data transfer system |
CN1828538A (en) * | 2006-03-31 | 2006-09-06 | 浙江大学 | Method for realizing operating procedure directly from file system in embedded system |
CN1838078A (en) * | 2005-01-19 | 2006-09-27 | 威盛电子股份有限公司 | Method and system for swapping code in a digital signal processor |
CN101251810A (en) * | 2008-03-11 | 2008-08-27 | 浙江大学 | Method for optimizing embedded type operating system process scheduling based on SPM |
CN101266577A (en) * | 2008-03-27 | 2008-09-17 | 上海交通大学 | Programmable on-chip memorizer interface NOR flash memory reading quickening control method |
CN101295240A (en) * | 2008-06-03 | 2008-10-29 | 浙江大学 | Method for instruction buffering based on SPM in embedded system |
CN101526924A (en) * | 2009-04-22 | 2009-09-09 | 东南大学 | Method for accessing optimal digital signal processing chip data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7533242B1 (en) | Prefetch hardware efficiency via prefetch hint instructions | |
US20210232329A1 (en) | Apparatuses and methods to change data category values | |
KR101817397B1 (en) | Inter-architecture compatability module to allow code module of one architecture to use library module of another architecture | |
CN109416636B (en) | Shared machine learning data structure | |
US10261796B2 (en) | Processor and method for executing in-memory copy instructions indicating on-chip or off-chip memory | |
EP3005127A1 (en) | Systems and methods for preventing unauthorized stack pivoting | |
US20200117475A1 (en) | Function evaluation using multiple values loaded into registers by a single instruction | |
CN103279428A (en) | Explicit multi-core Cache consistency active management method facing flow application | |
US11921634B2 (en) | Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host | |
US8886535B2 (en) | Utilizing multiple processing units for rapid training of hidden markov models | |
CN107844380A (en) | A kind of multi-core buffer WCET analysis methods for supporting instruction prefetch | |
Carroll et al. | An improved abstract gpu model with data transfer | |
CN106649143B (en) | Cache access method and device and electronic equipment | |
CN114661442B (en) | Processing method and device, processor, electronic equipment and storage medium | |
CN103106097A (en) | Stack operation optimization method in just-in-time compiling system | |
CN106325822A (en) | Method for improving DSP (digital signal processor) operation performance | |
US11327768B2 (en) | Arithmetic processing apparatus and memory apparatus | |
JPS5995660A (en) | Data processor | |
Nishimura et al. | Accelerating the Smith-waterman algorithm using bitwise parallel bulk computation technique on GPU | |
US11921626B2 (en) | Processing-in-memory and method and apparatus with memory access | |
KR20240007582A (en) | System, pim device, and cuckoo hash querying method based on pim device | |
CN115858417A (en) | Cache data processing method, device, equipment and storage medium | |
CN102663051A (en) | Method and system for searching content addressable memory | |
Sakr et al. | High performance iris recognition system on GPU | |
CN107169313A (en) | The read method and computer-readable recording medium of DNA data files |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170111 |