CN105975048A - DSP chip and construction method thereof - Google Patents

DSP chip and construction method thereof

Info

Publication number
CN105975048A
CN105975048A CN201610290943.1A
Authority
CN
China
Prior art keywords
task
data
channel
dsp chip
address bus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610290943.1A
Other languages
Chinese (zh)
Inventor
高靳旭
谷晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201610290943.1A priority Critical patent/CN105975048A/en
Publication of CN105975048A publication Critical patent/CN105975048A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26Power supply means, e.g. regulation thereof
    • G06F1/32Means for saving power
    • G06F1/3203Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234Power saving characterised by the action undertaken

Abstract

The invention discloses a DSP chip and a construction method thereof. The DSP chip comprises a plurality of task channels for completing algorithm tasks assigned by a host CPU. Each task channel includes a DMA controller, an arithmetic unit, and a memory, allowing it to complete its algorithm task independently; each task channel is connected by a data bus to a plurality of interface modules corresponding to the algorithm task; and each task channel is connected to a preset memory management unit, which in turn is connected to a data memory by the data bus. By realizing a parallel multi-task system in hardware, the DSP chip of the invention eliminates task switching, runs at a lower clock frequency, and reduces energy consumption.

Description

A DSP chip and a construction method thereof
Technical field
The invention belongs to the field of chips, and more specifically relates to a DSP chip and a construction method thereof.
Background art
To perform digital signal processing operations quickly, high-performance digital signal processing (DSP) chips generally adopt specialized hardware and software structures. Taking the classic TMS320 as an example, the basic structure of current mainstream DSP chips is introduced below:
1, Harvard structure
The Harvard architecture is a parallel architecture that differs from the traditional von Neumann architecture. Its main characteristic is that programs and data are stored in separate memory spaces: the program memory and the data memory are two independent memories, each addressed and accessed independently.
2, instruction execution pipeline
Related to the Harvard architecture, DSP chips make wide use of pipelining to reduce the time per instruction and thereby increase processor throughput. The pipeline depth of TMS320-series processors ranges from 2 to 6 stages; that is, the processor can process 2 to 6 instructions in parallel, each instruction occupying a different pipeline stage. In a three-stage pipeline, the fetch, decode, and execute operations are handled independently, so instruction execution can overlap completely. Within each instruction cycle, three different instructions are active, each at a different stage. For example, while the N-th instruction is being fetched, the (N-1)-th instruction is being decoded and the (N-2)-th instruction is being executed.
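The speed-up from overlapping fetch, decode, and execute can be sketched with a small cycle-count model. This is an illustration only, not part of the patent; the cycle costs are idealized (no stalls or hazards):

```python
# Illustrative sketch: cycle counts for a 3-stage fetch/decode/execute
# pipeline vs. purely sequential execution.

def sequential_cycles(n_instructions, stages=3):
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * stages

def pipelined_cycles(n_instructions, stages=3):
    """Stages overlap: after the pipeline fills in (stages - 1) cycles,
    one instruction completes per cycle."""
    return n_instructions + (stages - 1)

if __name__ == "__main__":
    n = 100
    print(sequential_cycles(n))  # 300
    print(pipelined_cycles(n))   # 102
```

For long instruction streams the pipelined count approaches one instruction per cycle, which is the effect the text describes.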
3, specialized hardware
In a typical FIR filter, multiplication is the dominant DSP operation: for each filter tap, one multiplication and one addition must be performed. The faster the multiplication, the higher the performance of the DSP processor. In a general-purpose microprocessor, a multiply instruction is realized as a series of additions and therefore takes many instruction cycles to complete. By contrast, a distinguishing feature of DSP chips is a dedicated hardware multiplier. In the TMS320 series, thanks to the dedicated hardware multiplier, a multiplication completes within a single instruction cycle.
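The per-tap multiply-accumulate that the hardware multiplier accelerates looks, written out in software form, like this (an illustrative sketch; the function name `fir` is chosen here, not taken from the patent):

```python
# Illustrative sketch: direct-form FIR filter, y[n] = sum_k h[k] * x[n-k].
# Each tap costs exactly one multiply and one add -- the operation pair
# a DSP's dedicated multiplier performs in a single cycle.

def fir(x, h):
    """Filter input samples x with tap coefficients h."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += coeff * x[n - k]  # one multiply-accumulate (MAC)
        y.append(acc)
    return y

# Feeding an impulse recovers the tap coefficients:
print(fir([1, 0, 0, 0], [3, 2, 1]))  # [3, 2, 1, 0]
```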
4, special instruction
Another feature of DSP chips is the use of special instructions. For example, DMOV is a dedicated DSP instruction that performs a data-move (shift) operation. Delay operations are extremely important in digital signal processing, and this delay is implemented by DMOV. Another special instruction in the TMS32010 is LTD, which completes the LT, DMOV, and APAC instructions within a single instruction cycle. Together, the LTD and MPY instructions reduce an FIR filter tap computation from 4 instructions to 2. Second-generation processors such as the TMS320C25 add 2 further specialized instructions, RPT and MACD; using these, the number of instructions per tap can be reduced further from 2 to 1.
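The delay operation that DMOV implements is a one-position shift of the filter's delay line; a minimal sketch, with `dmov_shift` as a hypothetical name for illustration:

```python
# Illustrative sketch: the delay-line shift performed by the DMOV
# instruction. After each sample, every stored value moves one slot
# deeper, realizing x[n-k] -> x[n-(k+1)] for the next output.

def dmov_shift(delay_line, new_sample):
    """Shift the delay line by one position and insert the new sample."""
    return [new_sample] + delay_line[:-1]

line = [0, 0, 0]          # 3-tap delay line, initially empty
line = dmov_shift(line, 5)
line = dmov_shift(line, 7)
print(line)  # [7, 5, 0]
```

On a general-purpose CPU this shift costs a load and a store per tap; fusing it into the MAC instruction (LTD, MACD) is what drives the 4 → 2 → 1 instruction-count reduction described above.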
5, the quick instruction cycle
Combining the Harvard architecture, pipelined operation, dedicated hardware multipliers, special DSP instructions, and optimized integrated-circuit design greatly shortens the instruction cycle of a DSP chip. With advances in integrated-circuit technology, the instruction cycle of a typical DSP processor has dropped to the nanosecond level.
Together, the features above make a DSP chip's processing capability for DSP-class algorithms far exceed that of a general-purpose processor, enabling many real-time embedded applications.
For the sake of versatility and ease of software development, today's DSPs still follow the basic structure of a general-purpose processor, with a CPU core at the center of the hardware system. This structure maximizes the DSP's flexibility across different applications and keeps it easy for programmers to use.
DSP processor progress can be summarized along two directions:
Maximizing the number of operations completed by a single instruction. The Harvard architecture, special instructions, and dedicated hardware all belong to this direction.
Minimizing the time taken by a single instruction. Instruction pipelining and process-technology improvements belong to this direction.
Chip-design professionals can see that efforts in both directions make computation faster, but they do little to improve energy efficiency. That is, the number of flip-flop toggles required to complete the same amount of computation is not greatly reduced. For applications with strict power requirements, a general-purpose DSP cannot be optimized much further from the standpoint of saving power. The underlying causes are as follows:
A CPU-centered architecture makes it difficult for DSP algorithms to reach peak efficiency
Although many optimizations have been made for the particularities of DSP applications, computation is still fundamentally driven by central-processor instructions, executed just as on a general-purpose CPU. During execution the CPU therefore cycles without pause through fetch, decode, and execute; the execute phase can itself be divided into reading from data memory, computing in the ALU or dedicated hardware, and writing results back to memory.
Yet most DSP algorithms share a common pattern: the operation type is largely fixed, the amount of data to be processed is large, and the workload amounts to executing the same instruction over a long contiguous block of stored data. For the CPU, every operation thus repeats the useless work of fetching and decoding the same instruction, which inevitably wastes power.
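The waste described above can be made concrete with a toy cost model (all unit costs below are assumptions for illustration, not figures from the patent):

```python
# Illustrative cost model: relative "work units" for processing N samples
# on a CPU that re-fetches and re-decodes the same instruction each
# iteration, vs. a datapath configured once and then fed data directly.

FETCH, DECODE, EXECUTE, SETUP = 1, 1, 1, 10  # assumed unit costs

def cpu_loop_cost(n):
    """Every sample pays fetch + decode + execute."""
    return n * (FETCH + DECODE + EXECUTE)

def configured_datapath_cost(n):
    """One-time setup, then only the useful execute work per sample."""
    return SETUP + n * EXECUTE

n = 10_000
print(cpu_loop_cost(n))            # 30000
print(configured_datapath_cost(n)) # 10010
```

The fetch/decode share grows linearly with the data volume, which is why the patent targets exactly this overhead.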
Relying on raising the clock frequency causes extra power consumption in hardware:
Chips raise their operating frequency in step with advances in integrated-circuit technology, and the power consumed by auxiliary circuits unrelated to the actual computation grows noticeably as a result.
The clock tree is the most typical example. Digital chips mostly use synchronous design, and clock-tree techniques effectively overcome clock skew, but at a large power cost: sometimes 20% to 30% of the whole chip's power. As clock frequencies rise and chip scale increases, this share tends to grow further.
" multitask " is originally a Concept of Software in operating system, refers to that computer or CPU perform simultaneously The ability of multiple tasks.
The typical method of general-purpose processor system be by different task between frequent switching, make single cpu " seem " to perform multiple task at the same time. actually this is the software magic that operating system is played.Only The frequency wanting task to switch is sufficiently high (general each second more than 100 times), it is possible to the sensation of the people that out-tricks. By the way, but one of the three of multiple task management operating system big basic functions.
Powerful CPU coordinates produced multitask effect with being stationed silent the cutting of operating system software therein Really, in the data flow processing system of extremely low power dissipation, it is difficult to the work of smoothness.One of trouble goes out in office In business switching, task switches the on-the-spot preservation of inevitable requirement task and recovery frequently, causes the most managerial Operation overhead.In the system processing continuous data stream, this problem is seriously weakened.The two of trouble go out In system work dominant frequency, the multitask system of single CPU, certainly will require that the work dominant frequency of system is greatly improved, Thus on SOC design, must be introduced into multi-level buffer, data pipeline, synchronised clock tree and signal Powerful driving etc., these all will cause the rising of individual part energy consumption.
Therefore, in the prior art, DSP chips suffer from excessive energy consumption because they are built around a general-purpose processor.
Summary of the invention
The present invention discloses a DSP chip and a construction method thereof, for solving the prior-art problem that DSP chips consume too much energy because of their general-purpose processor.
To achieve the above object, according to one aspect of the present invention, a DSP chip is provided, using the following technical scheme:
A DSP chip comprises: a plurality of task channels for completing algorithm tasks assigned by a host CPU. Each task channel includes a DMA controller, an arithmetic unit, and a memory, so that it can complete its algorithm task independently; each task channel is connected by a data bus to a plurality of interface modules corresponding to the algorithm task; and each task channel is connected to a preset memory management unit, which is connected to a data memory by the data bus.
Further, the plurality of task channels includes: a first channel, connected by a data bus to the host CPU, a floating-point unit (FPU), a PWM interface, a USB interface, and a GPIO management module; a second channel, connected by a data bus to a first DMA controller, a first sequence-statistics module, a first arithmetic logic unit (ALU), and a lookup-table transform module; a third channel, connected by a data bus to a second DMA controller, a second sequence-statistics module, and a second ALU; a fourth channel, connected by a data bus to a third DMA controller, a third sequence-statistics module, and a multiply-accumulate array unit; and a fifth channel, connected by a data bus to a fourth DMA controller, an ADC module, a DAC module, and a serial-stream I/O interface.
Further, there may be one or more third channels.
Further, the plurality of task channels also includes one or more sixth channels, each comprising a fifth DMA controller and a port for an external SDRAM interface.
Further, the PWM interface is an 8-channel PWM interface.
Further, the multiply-accumulate array unit is an 8*8 multiply-accumulate array unit.
Further, the ADC module and the DAC module are 16-bit.
Further, each data bus is a local bus.
According to another aspect of the present invention, a construction method of a DSP chip is provided, using the following technical scheme:
The construction method of a DSP chip comprises: constructing a plurality of task channels for completing algorithm tasks assigned by a host CPU; connecting each task channel to a DMA controller, an arithmetic unit, and a memory, so that each channel can complete its algorithm task independently; connecting each task channel by a data bus to a plurality of interface modules corresponding to the algorithm task; and connecting each task channel to a preset memory management unit, which is connected to a data memory by the data bus.
In the technical scheme of the invention, each independent algorithm task is handled by its own group consisting of a channel, a DMA controller, an arithmetic unit, and a memory. Such a structure has the following advantages for a low-power implementation:
Task channels are opened according to the amount of work; channels with no task are fully suspended, so hardware power consumption is strictly positively correlated with the amount of computation.
The data bus of each hardware channel is a local bus, so parasitic capacitance and the required average drive current are greatly reduced, and the extra power cost of the hardware implementation is small.
The CPU acts merely as a commander and manager of peripherals; during large-volume computations the CPU can be fully suspended, with no useless fetching or decoding.
Multitasking is realized by multiple simple controllers, so compared with a general-purpose DSP the chip's clock frequency can be greatly reduced, and the resulting power reduction is considerable.
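The first advantage, power tracking the number of active channels, can be sketched as follows (the per-channel figures are purely illustrative assumptions, not values from the patent):

```python
# Illustrative power model: with per-channel suspension, idle channels
# draw essentially no dynamic power, so total power scales with the
# active workload rather than with chip size.

CHANNEL_ACTIVE_MW = 4.0  # assumed active power per channel, mW
CHANNEL_IDLE_MW = 0.0    # a suspended channel draws ~no dynamic power

def chip_power(active_channels, total_channels=6):
    """Total dynamic power for a chip with the given channel activity."""
    idle = total_channels - active_channels
    return active_channels * CHANNEL_ACTIVE_MW + idle * CHANNEL_IDLE_MW

print(chip_power(1))  # 4.0  -> one light task
print(chip_power(6))  # 24.0 -> full load
```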
Brief description of the drawings
The accompanying drawings provide a further understanding of the invention and form part of this application. The schematic embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 shows a structural diagram of a DSP chip according to an embodiment of the invention;
Fig. 2 shows a flow diagram of the construction method of the DSP chip according to an embodiment of the invention.
Detailed description of the invention
Embodiments of the invention are described in detail below with reference to the drawings, but the invention can be implemented in the many different ways defined and covered by the claims.
Fig. 1 shows a structural diagram of a DSP chip according to an embodiment of the invention.
As shown in Fig. 1, a DSP chip comprises a plurality of task channels, shown in Fig. 1 as a first channel 10, a second channel 20, a third channel 30, a fourth channel 40, a fifth channel 50, and a sixth channel 60, for completing algorithm tasks assigned by a host CPU 11. Each task channel includes a DMA controller, an arithmetic unit, and a memory, so that it can complete its algorithm task independently; each task channel is connected by a data bus to a plurality of interface modules corresponding to the algorithm task; and each task channel is connected to a preset memory management unit 1, which is connected to a data memory 2 by the data bus.
This embodiment starts from the requirements of ultra-low-power design and proposes a new DSP chip hardware structure. It eliminates the invalid operations the CPU performs during computation, and its hardware-level multitasking needs no large number of context switches, greatly improving the efficiency with which DSP algorithms execute. The clock frequency can therefore be greatly reduced, which further reduces the extra power consumed by synchronization techniques. In application environments with strict low-power requirements, it is a hardware structure that meets power requirements well.
Preferably, the plurality of task channels includes: a first channel 10, connected by a data bus to the host CPU 11, a floating-point unit 12, a PWM interface 13, a USB interface 14, and a GPIO management module 15; a second channel 20, connected by a data bus to a first DMA controller 21, a first sequence-statistics module 22, a first ALU 23, and a lookup-table transform module 24; a third channel 30, connected by a data bus to a second DMA controller 31, a second sequence-statistics module 32, and a second ALU 33; a fourth channel 40, connected by a data bus to a third DMA controller 41, a third sequence-statistics module 42, and a multiply-accumulate array unit 43; and a fifth channel 50, connected by a data bus to a fourth DMA controller 51, an ADC module 52, a DAC module 54, and a serial-stream I/O interface 53.
This embodiment provides a typical logic diagram of a hardware-level parallel multitask system, explained as follows:
The system can support six independent tasks simultaneously. The controller of the first channel 10 is the host CPU 11, which is mainly responsible for management and configuration but can also run tasks using the host CPU 11 as the arithmetic unit. The second channel 20 is mainly for data transfer. The fourth channel 40 is dedicated to the 8x8 multiplier array. The fifth channel 50 is dedicated to fixed-period data streams and is typically assigned to the ADC module 52 or the DAC module 54. The sixth channel 60 is the dedicated channel through which the external SDRAM interface 62 exchanges data with the on-chip data RAM. Other arithmetic units use the third channel 30, of which there can be one or more.
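The six-channel layout of Fig. 1 can be written out as plain data for reference (the `TaskChannel` type and module labels below are an illustration built from the embodiment's description, not the patent's implementation):

```python
# Illustrative sketch: the six task channels of Fig. 1 as plain data.
from dataclasses import dataclass, field

@dataclass
class TaskChannel:
    number: int
    controller: str
    modules: list = field(default_factory=list)

channels = [
    TaskChannel(1, "host CPU",  ["FPU", "PWM", "USB", "GPIO"]),
    TaskChannel(2, "DMA-1",     ["seq-stats-1", "ALU-1", "LUT transform"]),
    TaskChannel(3, "DMA-2",     ["seq-stats-2", "ALU-2"]),
    TaskChannel(4, "DMA-3",     ["seq-stats-3", "8x8 MAC array"]),
    TaskChannel(5, "DMA-4",     ["ADC", "DAC", "serial-stream I/O"]),
    TaskChannel(6, "DMA-5",     ["external SDRAM port"]),
]

# Six independent tasks can run at once, one per channel.
print(len(channels))  # 6
```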
Preferably, the PWM interface 13 is an 8-channel PWM interface.
Preferably, the multiply-accumulate array unit 43 is an 8*8 multiply-accumulate array unit.
Preferably, the ADC module 52 and the DAC module 54 are 16-bit.
Preferably, each data bus is a local bus.
Fig. 2 shows a flow diagram of the construction method of the DSP chip according to an embodiment of the invention.
As shown in Fig. 2, the construction method of a DSP chip comprises:
S101: constructing a plurality of task channels for completing algorithm tasks assigned by a host CPU;
S103: connecting each task channel to a DMA controller, an arithmetic unit, and a memory, so that each channel can complete its algorithm task independently;
S105: connecting each task channel by a data bus to a plurality of interface modules corresponding to the algorithm task;
S107: connecting each task channel to a preset memory management unit, which is connected to a data memory by a data bus.
In the technical scheme of this embodiment, step S101 constructs a plurality of hardware-level task channels equipped with multiple data buses, so that every task channel can independently complete an algorithm task assigned by the host CPU. To this end, in step S103 each task channel is connected to a DMA controller, an arithmetic unit, and a memory; this requires the memory to provide multiple ports to serve the multiple task channels, and the dedicated arithmetic units are not limited to multipliers but are hardware operation units customized to the algorithm tasks. In steps S105 to S107, each task channel is connected by a data bus to the hardware units corresponding to its algorithm task. In addition, hardware-level multitasking requires the programmer to allocate memory use and arrange the hardware task channels sensibly.
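Steps S101 to S107 can be sketched as a small builder (all names below are illustrative assumptions, not the patent's implementation):

```python
# Illustrative sketch of steps S101-S107 as a builder function.

def build_dsp_chip(n_channels):
    mmu = {"connected_channels": [], "data_memory": "data RAM"}  # S107 target
    chip = {"mmu": mmu, "channels": []}
    for i in range(n_channels):                  # S101: construct channels
        channel = {
            "dma": f"DMA-{i}",                   # S103: DMA controller
            "arithmetic_unit": f"ALU-{i}",       # S103: arithmetic unit
            "memory": f"RAM-{i}",                # S103: local memory
            "interface_modules": [],             # S105: filled per task
        }
        mmu["connected_channels"].append(i)      # S107: attach to MMU
        chip["channels"].append(channel)
    return chip

chip = build_dsp_chip(6)
print(len(chip["channels"]))              # 6
print(chip["mmu"]["connected_channels"])  # [0, 1, 2, 3, 4, 5]
```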
The method proposed by the invention uses a hardware-level parallel multitask system. Put simply, each task occupies its own hardware, so no task switching is needed at all. Multiple groups of hardware work concurrently, and the operating frequency stays low. Practice shows that reducing the operating frequency by tens of times yields power savings of a similar order; low-power MCUs such as ARM's Cortex-M0 run at roughly such frequencies. For digital signal processing algorithms, the basic operators are all fairly simple and need no CPU-level control process at all; complex, ill-suited control mechanisms only increase power consumption.
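The frequency argument follows from first-order CMOS dynamic power, P = C·V²·f: lowering f scales power down directly, and often permits a lower supply voltage V for a quadratic extra gain. A sketch with illustrative numbers (not figures from the patent):

```python
# Illustrative sketch: first-order CMOS dynamic power, P = C * V^2 * f.

def dynamic_power(c_farads, v_volts, f_hz):
    """Dynamic switching power in watts."""
    return c_farads * v_volts ** 2 * f_hz

C = 1e-9  # assumed switched capacitance, 1 nF

high = dynamic_power(C, 1.2, 200e6)  # 200 MHz at 1.2 V
low = dynamic_power(C, 0.9, 10e6)    # 10 MHz at 0.9 V

print(round(high / low, 1))  # 35.6 -> ~35x less power
```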
In the technical scheme of the invention, each independent algorithm task is handled by its own group consisting of a channel, a DMA controller, an arithmetic unit, and a memory. Such a structure has the following advantages for a low-power implementation:
Task channels are opened according to the amount of work; channels with no task are fully suspended, so hardware power consumption is strictly positively correlated with the amount of computation.
The data bus of each hardware channel is a local bus, so parasitic capacitance and the required average drive current are greatly reduced, and the extra power cost of the hardware implementation is small.
The CPU acts merely as a commander and manager of peripherals; during large-volume computations the CPU can be fully suspended, with no useless fetching or decoding.
Multitasking is realized by multiple simple controllers, so compared with a general-purpose DSP the chip's clock frequency can be greatly reduced, and the resulting power reduction is considerable.
The above are preferred embodiments of the present invention. It should be noted that those skilled in the art can make improvements and modifications without departing from the principles of the invention, and such improvements and modifications should also be regarded as falling within the scope of protection of the invention.

Claims (9)

1. A DSP chip, characterized by comprising:
a plurality of task channels for completing algorithm tasks assigned by a host CPU;
wherein each task channel includes a DMA controller, an arithmetic unit, and a memory for independently completing said algorithm task;
each task channel is connected by a data bus to a plurality of interface modules corresponding to said algorithm task; and
each task channel is connected to a preset memory management unit, said memory management unit being connected to a data memory by a data bus.
2. The DSP chip of claim 1, characterized in that the plurality of task channels comprises:
a first channel connected by a data bus to the host CPU, a floating-point unit, a PWM interface, a USB interface, and a GPIO management module;
a second channel connected by a data bus to a first DMA controller, a first sequence-statistics module, a first arithmetic logic unit, and a lookup-table transform module;
a third channel connected by a data bus to a second DMA controller, a second sequence-statistics module, and a second arithmetic logic unit;
a fourth channel connected by a data bus to a third DMA controller, a third sequence-statistics module, and a multiply-accumulate array unit; and
a fifth channel connected by a data bus to a fourth DMA controller, an ADC module, a DAC module, and a serial-stream I/O interface.
3. The DSP chip of claim 2, characterized in that there are one or more third channels.
4. The DSP chip of claim 2, characterized in that the plurality of task channels further comprises one or more sixth channels, each comprising a fifth DMA controller and a port for an external SDRAM interface.
5. The DSP chip of claim 2, characterized in that the PWM interface is an 8-channel PWM interface.
6. The DSP chip of claim 2, characterized in that the multiply-accumulate array unit is an 8*8 multiply-accumulate array unit.
7. The DSP chip of claim 2, characterized in that the ADC module and the DAC module are 16-bit.
8. The DSP chip of any one of claims 1-7, characterized in that each data bus is a local bus.
9. A construction method of a DSP chip, characterized by comprising:
constructing a plurality of task channels for completing algorithm tasks assigned by a host CPU;
connecting each task channel to a DMA controller, an arithmetic unit, and a memory for independently completing said algorithm task;
connecting each task channel by a data bus to a plurality of interface modules corresponding to said algorithm task; and
connecting each task channel to a preset memory management unit, said memory management unit being connected to a data memory by a data bus.
CN201610290943.1A 2016-05-05 2016-05-05 DSP chip and construction method thereof Pending CN105975048A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610290943.1A CN105975048A (en) 2016-05-05 2016-05-05 DSP chip and construction method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610290943.1A CN105975048A (en) 2016-05-05 2016-05-05 DSP chip and construction method thereof

Publications (1)

Publication Number Publication Date
CN105975048A true CN105975048A (en) 2016-09-28

Family

ID=56994657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610290943.1A Pending CN105975048A (en) 2016-05-05 2016-05-05 DSP chip and construction method thereof

Country Status (1)

Country Link
CN (1) CN105975048A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1126340A (en) * 1994-02-17 1996-07-10 皮尔金顿德国第二有限公司 Re-configurable application specific device
CN1556956A (en) * 2001-09-21 2004-12-22 ض� Multi-channel interface for communications between devices
CN101042684A (en) * 2006-03-21 2007-09-26 国际商业机器公司 System and method for improving system DMA mapping while substantially reducing memory fragmentation
CN102508643A (en) * 2011-11-16 2012-06-20 刘大可 Multicore-parallel digital signal processor and method for operating parallel instruction sets
CN104637483A (en) * 2015-02-03 2015-05-20 中国电子科技集团公司第五十八研究所 Multichannel-based low-speed voice coding/decoding system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108363615A (en) * 2017-09-18 2018-08-03 清华大学无锡应用技术研究院 Method for allocating tasks and system for reconfigurable processing system
CN108363615B (en) * 2017-09-18 2019-05-14 清华大学 Method for allocating tasks and system for reconfigurable processing system
US10705878B2 (en) 2017-09-18 2020-07-07 Wuxi Research Institute Of Applied Technologies Tsinghua University Task allocating method and system capable of improving computational efficiency of a reconfigurable processing system

Similar Documents

Publication Publication Date Title
CN108805266B (en) Reconfigurable CNN high-concurrency convolution accelerator
CN104317768B (en) Matrix multiplication accelerating method for CPU+DSP (Central Processing Unit + Digital Signal Processor) heterogeneous system
CN102508643A (en) Multicore-parallel digital signal processor and method for operating parallel instruction sets
CN1142484C (en) Vector processing method of microprocessor
CN102306139A (en) Heterogeneous multi-core digital signal processor for orthogonal frequency division multiplexing (OFDM) wireless communication system
CN103970720A (en) Embedded reconfigurable system based on large-scale coarse granularity and processing method of system
MY122682A (en) System and method for performing context switching and rescheduling of a processor
CN101387952A (en) Single-chip multi-processor task scheduling and managing method
CN102402415B (en) Device and method for buffering data in dynamic reconfigurable array
CN103984677A (en) Embedded reconfigurable system based on large-scale coarseness and processing method thereof
CN101504599A (en) Special instruction set micro-processing system suitable for digital signal processing application
CN102306141B (en) Method for describing configuration information of dynamic reconfigurable array
Zhong et al. An optimized mapping algorithm based on simulated annealing for regular NoC architecture
Metzlaff et al. A real-time capable many-core model
CN101789044A (en) Method of implementing cooperative work of software and hardware of genetic algorithm
CN105975048A (en) DSP chip and construction method thereof
CN102023846B (en) Shared front-end assembly line structure based on monolithic multiprocessor system
Tan et al. A pipelining loop optimization method for dataflow architecture
Abdelhamid et al. Condensing an overload of parallel computing ingredients into a single architecture recipe
CN108228242B (en) Configurable and flexible instruction scheduler
CN111008042A (en) Efficient general processor execution method and system based on heterogeneous pipeline
CN108196849A (en) A kind of low latency instruction scheduler
Awatramani et al. Perf-Sat: Runtime detection of performance saturation for GPGPU applications
CN104699520B (en) A kind of power-economizing method based on virtual machine (vm) migration scheduling
CN202281998U (en) Scalar floating-point operation accelerator

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160928

RJ01 Rejection of invention patent application after publication