CN107103113B - Automated design method, apparatus, and optimization method for a neural network processor - Google Patents
Automated design method, apparatus, and optimization method for a neural network processor
- Publication number
- CN107103113B CN107103113B CN201710178281.3A CN201710178281A CN107103113B CN 107103113 B CN107103113 B CN 107103113B CN 201710178281 A CN201710178281 A CN 201710178281A CN 107103113 B CN107103113 B CN 107103113B
- Authority
- CN
- China
- Prior art keywords
- neural network
- data
- hardware
- network processor
- automation design
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
Abstract
The present invention provides an automated design method, an apparatus, and an optimization method for a neural network processor. The method comprises: step 1, obtaining a neural network model description file and hardware resource constraint parameters, where the hardware resource constraint parameters include the available hardware resources and a target operating speed; step 2, according to the neural network model description file and the hardware resource constraint parameters, searching a pre-built neural network component library for suitable units and generating from those units the hardware description language code of a neural network processor corresponding to the neural network model; step 3, converting the hardware description language code into the hardware circuit of the neural network processor.
Description
Technical field
The present invention relates to the field of neural network processor architecture, and in particular to an automated design method, apparatus, and optimization method for neural network processors.
Background technique
The rapid development of deep learning and neural network techniques has provided new solutions for large-scale data processing tasks. New neural network models perform outstandingly on complex, abstract problems, and new applications in fields such as visual image processing, speech recognition, and intelligent robotics keep emerging.
At present, real-time task analysis with deep neural networks mostly relies on large-scale high-performance processors or general-purpose graphics processors. These devices are costly and power-hungry, and when applied to portable intelligent devices they suffer from a series of problems such as large circuit scale, high energy consumption, and high product cost. Therefore, for application fields that demand real-time, energy-efficient processing, such as embedded devices and small low-cost data centers, accelerating neural network model computation with a dedicated neural network processor rather than in software is a more effective solution. However, the topology and parameter design of a neural network model change with the application scenario, and neural network models themselves evolve quickly; providing a single general-purpose, efficient neural network processor that covers all application scenarios and all neural network models is extremely difficult. This makes designing hardware-accelerated solutions for different application demands very inconvenient for high-level application developers.
Existing neural network hardware acceleration techniques fall into two categories: application-specific integrated circuit (Application Specific Integrated Circuit, ASIC) chips and field programmable gate arrays (Field Programmable Gate Array, FPGA). Under the same process conditions, an ASIC chip runs fast with low power consumption, but its design flow is complex, its tape-out cycle is long, and its development cost is high, so it cannot keep up with the rapid iteration of neural network models. An FPGA offers flexible circuit configuration and a short development cycle, but its operating speed is relatively low and its hardware overhead and power consumption are relatively large. With either acceleration technique, neural network model and algorithm developers must master not only the network topology and data flow patterns but also the stages of hardware development, including processor architecture design, hardware code writing, simulation verification, and place-and-route. For higher-layer application developers who focus on neural network model and structure research and lack hardware design ability, this development difficulty is high. Therefore, to enable high-level developers to apply neural network techniques efficiently, an automated design method and tool for neural network processors that supports a variety of neural network models is urgently needed.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes an automated design method, apparatus, and optimization method for neural network processors.
The present invention proposes an automated design method for a neural network processor, comprising:
Step 1: obtaining a neural network model description file and hardware resource constraint parameters, where the hardware resource constraint parameters include the available hardware resources and a target operating speed;
Step 2: according to the neural network model description file and the hardware resource constraint parameters, searching a pre-built neural network component library for suitable units, and generating from those units the hardware description language code of a neural network processor corresponding to the neural network model;
Step 3: converting the hardware description language code into the hardware circuit of the neural network processor.
The neural network processor comprises a storage structure, a control structure, and a computation structure.
The neural network model description file comprises three parts: basic attributes, parameter descriptions, and connection information, where the basic attributes include the layer name and layer type, the parameter descriptions include the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.
The reusable neural network unit library comprises two parts: hardware description files and configuration scripts.
The reusable neural network unit library includes neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units, and control units.
The neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
The method further includes determining the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determining the data resource sharing mode according to the characteristics of the intermediate layers of the neural network;
generating the address access stream of the memory according to the hardware structure and network characteristics, the address access stream being described in the form of a finite state machine;
and generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.
The method further includes generating a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
The present invention also provides an automated design apparatus for a neural network processor, comprising:
a data acquisition module for obtaining a neural network model description file and hardware resource constraint parameters, where the hardware resource constraint parameters include the available hardware resources and a target operating speed;
a hardware description language code generation module for searching a pre-built neural network component library for suitable units according to the neural network model description file and the hardware resource constraint parameters, and generating from those units the hardware description language code of a neural network processor corresponding to the neural network model;
a hardware circuit generation module for converting the hardware description language code into the hardware circuit of the neural network processor.
The neural network processor comprises a storage structure, a control structure, and a computation structure.
The neural network model description file comprises three parts: basic attributes, parameter descriptions, and connection information, where the basic attributes include the layer name and layer type, the parameter descriptions include the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.
The reusable neural network unit library comprises two parts: hardware description files and configuration scripts.
The reusable neural network unit library includes neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units, and control units.
The neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
The apparatus further determines the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determines the data resource sharing mode according to the characteristics of the intermediate layers of the neural network;
it generates the address access stream of the memory according to the hardware structure and network characteristics, the address access stream being described in the form of a finite state machine;
and it generates hardware description language code, which is then converted into the hardware circuit of the neural network processor.
The apparatus further generates a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
The present invention also proposes an optimization method based on the above automated design method for a neural network processor, comprising:
Step 1: define the convolution kernel size as k*k, the stride as s, the memory width as d, and the number of data maps as t. If k^2 = d^2, divide the data into blocks of size k*k, so that the data width matches the memory width; this guarantees that the data are stored contiguously in memory;
Step 2: if k^2 != d^2 and the stride s is the greatest common divisor of k and the memory width d, divide the data into blocks of size s*s; this guarantees that within a single data map the data are stored contiguously in memory;
Step 3: if neither of the above conditions holds, compute the greatest common divisor f of the stride s, the kernel size k, and the memory width d, divide the data into blocks of size f*f, and store the t data maps alternately.
As can be seen from the above scheme, the present invention has the following advantages:
The present invention can map a neural network model to a hardware circuit, automatically optimize the circuit structure and data storage scheme according to the hardware resource constraints and network characteristics, and at the same time generate the corresponding control instruction stream, thereby realizing automated hardware/software co-design of a neural network hardware accelerator. It shortens the neural network processor design cycle while improving the operating energy efficiency of the neural network processor.
Detailed description of the invention
Fig. 1 is a workflow diagram of the automatic FPGA implementation tool for the neural network processor provided by the present invention;
Fig. 2 is a schematic diagram of the neural network processor system that the present invention can generate automatically;
Fig. 3 is a schematic diagram of the reusable neural network unit library used by the present invention;
Fig. 4 is a schematic diagram of the address generation circuit interface used by the present invention.
Specific embodiment
In order to make the purpose, technical solution, design method, and advantages of the present invention clearer, the present invention is described in more detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
The present invention is intended to provide an automated design method, apparatus, and optimization method for neural network processors. The apparatus includes a hardware generator and a compiler. The hardware generator automatically produces the hardware description language code of a neural network processor according to the neural network type and the hardware resource constraints; a designer then turns the hardware description language into a processor hardware circuit using existing hardware circuit design methods. The compiler generates the control and data scheduling instruction stream according to the neural network processor circuit structure.
Fig. 1 is a schematic diagram of the automated generation technique for the neural network processor provided by the present invention. The specific steps are as follows:
Step 1: the apparatus reads a neural network model description file, which contains the network topology and the definition of each operation layer;
Step 2: the apparatus reads in the hardware resource constraint parameters, which include the available hardware resources and the target operating speed; the apparatus generates the corresponding circuit structure according to these constraint parameters;
Step 3: the apparatus indexes suitable units in a pre-built neural network component library according to the neural network model description script and the hardware resource constraints; the hardware circuit generator included in the apparatus uses this unit library to generate the hardware description language code of the neural network processor corresponding to the neural network model;
Step 4: the compiler included in the apparatus generates the data storage mapping and the control instruction stream according to the neural network model, the logic resource constraints, and the generated hardware description language code;
Step 5: the hardware description language is converted into a hardware circuit by existing hardware design methods.
The neural network processor that the present invention can generate automatically is based on a storage-control-computation structure:
the storage structure stores the data participating in the computation, the neural network weights, and the processor operation instructions;
the control structure includes a decoding circuit and a control logic circuit, which parse operation instructions and generate control signals used to schedule and store on-chip data and to control the neural network computation process;
the computation structure includes computing units that carry out the neural network computations in the processor.
Fig. 2 is a schematic diagram of the neural network processor system 101 that the present invention can generate automatically. The architecture of the neural network processor system 101 is composed of the following parts: an input data storage unit 102, a control unit 103, an output data storage unit 104, a weight storage unit 105, an instruction storage unit 106, and a computing unit 107.
The input data storage unit 102 stores the data participating in the computation, including the original feature map data and the data involved in intermediate-layer computation; the output data storage unit 104 stores the computed neuron responses; the instruction storage unit 106 stores the instruction information involved in the computation, and the instructions are parsed into a control stream to schedule the neural network computation; the weight storage unit 105 stores the trained neural network weights.
The control unit 103 is connected to the output data storage unit 104, the weight storage unit 105, the instruction storage unit 106, and the computing unit 107. The control unit 103 fetches the instructions stored in the instruction storage unit 106 and parses them, and then controls the computing unit to perform the neural network computation according to the parsed control signals.
The computing unit 107 performs the corresponding neural network computation according to the control signals generated by the control unit 103. The computing unit 107 is associated with one or more storage units: it obtains the data to be computed from the data storage part of the associated input data storage unit 102, and writes data to the associated output data storage unit 104. The computing unit 107 carries out most of the operations in the neural network algorithm, i.e., vector multiply-add operations and the like.
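As an illustration only (the patent does not specify a concrete datapath, and the function name here is hypothetical), the vector multiply-add operation that the computing unit carries out for one neuron can be sketched in software as:

```python
def vector_mac(inputs, weights, acc=0.0):
    """Multiply-accumulate sketch: returns acc + sum(inputs[i] * weights[i]).

    A software illustration of the vector multiply-add operation a
    neural network computing unit performs for one neuron response.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have equal length")
    for x, w in zip(inputs, weights):
        acc += x * w  # one multiply-add step per input/weight pair
    return acc
```

For example, `vector_mac([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` accumulates to 3.0; in the actual processor this loop is realized in parallel by the computing unit's hardware.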
The present invention describes the features of a neural network model through the neural network description file format. The contents of this description file comprise three parts: basic attributes, parameter descriptions, and connection information. The basic attributes include the layer name and layer type; the parameter descriptions include the number of output layers, the convolution kernel size, and the stride; the connection information includes the connection name, connection direction, and connection type.
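The patent does not fix a concrete syntax for the description file. As a purely hypothetical illustration of the three parts named above, a single layer entry might be represented and checked like this (all field names are assumptions, not the patent's format):

```python
# Hypothetical in-memory form of one layer entry of a neural network
# model description file; the field names are illustrative only.
layer_entry = {
    "basic": {"layer_name": "conv1", "layer_type": "convolution"},
    "params": {"output_layers": 32, "kernel_size": 3, "stride": 1},
    "connections": [
        {"name": "conv1_to_pool1", "direction": "forward", "type": "data"},
    ],
}

def validate_layer(entry):
    """Check that a layer entry contains the three required parts."""
    required = {"basic", "params", "connections"}
    missing = required - set(entry)
    if missing:
        raise ValueError(f"missing sections: {sorted(missing)}")
    return True
```

A design tool of this kind would read such entries for every layer to recover the network topology before selecting units from the component library.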
To accommodate the hardware design of various neural network models, the present invention provides the reusable neural network unit library shown in Fig. 3; the unit library comprises two parts, hardware description files and configuration scripts. The reusable unit library provided by the present invention includes, but is not limited to: neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units, control units, and so on.
When building a neural network processor system from the above reusable unit library, the present invention reads the neural network model description file and the hardware resource constraints to optimize the selection of units called from the library.
During operation of the neural network processor, the processor needs to obtain the address streams of on-chip and off-chip memory data automatically. In the present invention, the storage address streams are determined and generated by the compiler, and the memory access patterns determined by those address streams are passed to the hardware generator through text interaction. The memory access patterns include a main access pattern, a data access pattern, a weight access pattern, and so on.
The hardware generator builds the address generation units (AGUs) according to the memory access patterns.
The neural network processor circuit designed with the design tool provided by the present invention includes three types of address generation units: a main address generation unit, a data address generation unit, and a weight address generation unit. The main address generation unit is responsible for data exchange between on-chip memory and off-chip memory; the data address generation unit is responsible for two kinds of data exchange, reading data from on-chip memory into the computing units and storing the computing units' intermediate and final results back to the storage units; the weight address generation unit is responsible for reading weight data from on-chip memory into the computing units.
In the present invention, the hardware circuit generator and the compiler cooperate to realize the design of the address generation circuit. The specific design steps are:
Step 1: the apparatus determines the data path according to the neural network model and hardware constraints specified by the designer, and determines the data resource sharing mode according to the characteristics of the intermediate layers of the neural network;
Step 2: the compiler generates the storage address access stream according to the hardware structure and network characteristics, and describes this address access stream in the form of a finite state machine;
Step 3: the hardware generator maps the finite state machine to the hardware description language of an address generation circuit, which is then mapped to a hardware circuit by hardware circuit design methods.
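The patent leaves the concrete state machine to the compiler. As a minimal sketch under the assumption of a simple sequential read of one data block, the address access stream of step 2 might be described by a three-state machine (IDLE, RUN, DONE); the real artifact would be HDL generated from such a description, and all names here are hypothetical:

```python
def address_stream(start_addr, block_size):
    """Finite-state-machine sketch that emits the read addresses of one
    data block: IDLE -> RUN (emit consecutive addresses) -> DONE.
    """
    state, count = "IDLE", 0
    addresses = []
    while state != "DONE":
        if state == "IDLE":
            state = "RUN"                # start signal received
        elif state == "RUN":
            addresses.append(start_addr + count)
            count += 1
            if count == block_size:      # end condition reached
                state = "DONE"           # assert the end signal
    return addresses
```

For instance, `address_stream(0x100, 4)` yields the four consecutive addresses starting at 0x100; a real AGU state machine would additionally handle the mode, stride, and layer-count parameters listed below.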
Fig. 4 is a schematic diagram of the general structure of the address generation circuit provided by the present invention. The address generation circuit of the present invention has a universal signal interface, whose interface signals include:
a start address signal, i.e., the first address of the data;
a block size signal, the amount of data in one fetch;
a memory flag signal, which determines the number of the memory storing the data;
an operating mode signal, divided into a large-convolution-kernel data fetch mode, a small-convolution-kernel data fetch mode, a pooling mode, a full convolution mode, and so on;
a convolution kernel size signal, which defines the convolution kernel size;
a length signal, which defines the output image size;
an input layer number signal, which labels the number of input layers;
an output layer number signal, which labels the number of output layers;
a reset signal: when this signal is 1, the address generation circuit is initialized;
a write enable signal, which directs the accessed memory to perform a write operation;
a read enable signal, which directs the accessed memory to perform a read operation;
an address signal, which provides the access storage address;
an end signal, which marks the end of an access.
These parameters ensure that the AGU supports multiple operating modes and that correct read/write address streams are generated in the different operating modes and throughout the neural network communication process.
For different target networks, the tool selects the necessary parameters from this template to build the address generator and provide the on-chip and off-chip memory access patterns.
The neural network processor provided by the present invention builds its architecture in a data-driven fashion; therefore the address generation circuit not only provides access addresses but also drives the execution of the different neural layers and of the data blocks within a layer.
Because of resource constraints, a neural network model usually cannot be fully unrolled in the form described by the model when it is mapped to a hardware circuit. The design tool proposed by the present invention therefore optimizes the data storage and access mechanism using a hardware/software co-design method, which includes two parts: first, the compiler analyzes the computation throughput and on-chip memory size of the neural network processor, and divides the neural network feature data and weight data into appropriate sets of data blocks for storage and access; second, the data within each block are further partitioned according to the computing unit scale, the memory, and the data bit width.
Based on the above optimization mechanism, the present invention proposes an optimization method for data storage and access. The specific implementation steps are:
Step 1: define the convolution kernel size as k*k, the stride as s, the memory width as d, and the number of data maps as t. If k^2 = d^2, divide the data into blocks of size k*k, so that the data width matches the memory width; this guarantees that the data are stored contiguously in memory;
Step 2: if k^2 != d^2 and s is the greatest common divisor of k and d, divide the data into blocks of size s*s; this guarantees that within a single data map the data are stored contiguously in memory;
Step 3: if neither of the above conditions holds, compute the greatest common divisor f of s, k, and d, divide the data into blocks of size f*f, and store the t data maps alternately.
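The three cases above amount to a small decision rule for the block edge length. A minimal sketch of that rule (the function name is illustrative, not from the patent) is:

```python
from math import gcd

def choose_block_size(k: int, s: int, d: int) -> int:
    """Pick the data-block edge length for a k*k kernel with stride s
    and memory width d, following the three cases of the optimization
    method: k when k^2 == d^2, s when s == gcd(k, d), else gcd(s, k, d).
    """
    if k * k == d * d:           # case 1: kernel area matches memory width squared
        return k
    if s == gcd(k, d):           # case 2: stride is the gcd of k and d
        return s
    return gcd(gcd(s, k), d)     # case 3: gcd of all three parameters
```

For example, with k=6, s=2, d=4, case 2 applies (gcd(6, 4) = 2 = s) and the data are split into 2*2 blocks; the t data maps would then be laid out according to the chosen case.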
The computation data of a neural network include the input feature data and the trained weight data. A good data storage layout reduces the internal data bandwidth of the processor and improves the utilization efficiency of the storage space. The automated design tool provided by the present invention improves the computational efficiency of the processor by increasing the locality of the processor's data.
In conclusion the present invention provides a the Automation Design tool towards neural network processor, which has
The hardware identification code of description neural network processor is mapped as, according to hardware resource constraints optimized processor frame from neural network model
Structure flows the functions such as instruction with control is automatically generated, and realizes the Automation Design of neural network processor, reduces neural network
The design cycle of processor has adapted to nerual network technique network model updating decision, arithmetic speed requires block, energy efficiency requirement
High application characteristic.
Although this specification is described in terms of various embodiments, not every embodiment contains only one independent technical solution; this manner of description is merely for clarity. Those skilled in the art should take the specification as a whole; the technical solutions in the various embodiments may also be suitably combined to form other embodiments that those skilled in the art can understand.
The present invention also proposes an automated design apparatus for a neural network processor, comprising:
a data acquisition module for obtaining a neural network model description file and hardware resource constraint parameters, where the hardware resource constraint parameters include the available hardware resources and a target operating speed;
a hardware description language code generation module for searching a pre-built neural network component library for suitable units according to the neural network model description file and the hardware resource constraint parameters, and generating from those units the hardware description language code of a neural network processor corresponding to the neural network model;
a hardware circuit generation module for converting the hardware description language code into the hardware circuit of the neural network processor.
The neural network processor comprises a storage structure, a control structure, and a computation structure.
The neural network model description file comprises three parts: basic attributes, parameter descriptions, and connection information, where the basic attributes include the layer name and layer type, the parameter descriptions include the number of output layers, the convolution kernel size, and the stride, and the connection information includes the connection name, connection direction, and connection type.
The neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
The apparatus further determines the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determines the data resource sharing mode according to the characteristics of the intermediate layers of the neural network;
it generates the address access stream of the memory according to the hardware structure and network characteristics, the address access stream being described in the form of a finite state machine.
The reusable neural network unit library comprises two parts: hardware description files and configuration scripts.
The reusable neural network unit library includes neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units, and control units.
The finite state machine is mapped to the address generation circuit, and hardware description language code is generated and then converted into the hardware circuit of the neural network processor.
The apparatus further generates a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
The foregoing are merely illustrative specific embodiments of the present invention and are not intended to limit its scope. Any equivalent variations, modifications, and combinations made by those skilled in the art without departing from the concept and principle of the present invention shall fall within the scope of protection of the present invention.
Claims (17)
1. An automated design method for a neural network processor, characterized by comprising:
Step 1: obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include a hardware resource size and a target operating speed;
Step 2: searching a unit library from a constructed neural network component library according to the neural network model description file and the hardware resource constraint parameters, and generating, according to the unit library, hardware description language code for the neural network processor corresponding to the neural network model;
Step 3: determining a data path according to the neural network model and hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to features of the intermediate layers of the neural network; a compiler generating a storage address access stream according to the hardware structure and network features, and a hardware circuit generator cooperating with the compiler to implement an address generating circuit; the address generating circuit including a working mode signal, divided into a large-convolution-kernel data-fetch mode, a small-convolution-kernel data-fetch mode, a pooling mode, and a full-convolution mode; the address generating circuit further including a block size signal indicating the amount of data fetched in one access;
Step 4: converting the hardware description language code into a hardware circuit of the neural network processor;
wherein Step 4 further includes: dividing the neural network feature data and weight data into sets of data blocks for storage and access according to the computing throughput and on-chip memory size of the neural network processor, and performing data segmentation within each data block according to the computing unit scale, memory, and data bit width of the neural network processor.
2. The automated design method for a neural network processor according to claim 1, wherein the neural network processor includes a storage structure, a control structure, and a computing structure.
3. The automated design method for a neural network processor according to claim 1, wherein the neural network model description file includes three parts: basic attributes, parameter descriptions, and connection information; the basic attributes include a layer name and a layer type, the parameter descriptions include the number of output layers, a convolution kernel size, and a step size, and the connection information includes a connection name, a connection direction, and a connection type.
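A layer entry in such a three-part description file might be sketched as follows; all field names and values are illustrative assumptions rather than the patent's actual format.

```python
# Hypothetical layer entry mirroring the three claimed parts of the
# model description file (field names are illustrative, not the patent's).
layer_entry = {
    "basic_attributes": {"layer_name": "conv1", "layer_type": "convolution"},
    "parameter_description": {"output_layers": 64, "kernel_size": 3, "step_size": 1},
    "connection_information": {"connection_name": "conv1_to_pool1",
                               "connection_direction": "forward",
                               "connection_type": "sequential"},
}
```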
4. The automated design method for a neural network processor according to claim 1, wherein the unit library includes two parts: hardware description files and configuration scripts.
5. The automated design method for a neural network processor according to claim 1, wherein the unit library includes a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.
6. The automated design method for a neural network processor according to claim 1, wherein the neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
7. The automated design method for a neural network processor according to claim 1, further comprising:
determining a data path according to the neural network model and hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to features of the intermediate layers of the neural network;
generating an address access stream for the memory according to the hardware structure and network features, the address access stream being described by means of a finite state machine; and
generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.
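The finite-state-machine description of an address access stream can be illustrated with a small generator that walks k*k convolution windows over a feature map, yielding one memory address per state transition; the function, its parameters, and the row-major layout are assumptions for illustration, not the claimed circuit.

```python
def conv_address_stream(width, height, k, s):
    """Yield row-major memory addresses for every k*k convolution window
    over a width*height feature map with step s (FSM-style sketch; the
    actual circuit states are not specified by the claim)."""
    row = col = 0
    while row + k <= height:          # state: current window origin
        for dy in range(k):           # emit the k*k addresses of this window
            for dx in range(k):
                yield (row + dy) * width + (col + dx)
        col += s                      # transition: slide window right
        if col + k > width:           # wrap to the next row of windows
            col = 0
            row += s

addrs = list(conv_address_stream(4, 4, 2, 2))  # 4 windows of 4 addresses each
```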
8. The automated design method for a neural network processor according to claim 1, further comprising: generating a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
9. An automated design apparatus for a neural network processor, characterized by comprising:
a data acquisition module, for obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include a hardware resource size and a target operating speed;
a hardware description language code generation module, for searching a unit library from a constructed neural network component library according to the neural network model description file and the hardware resource constraint parameters, and generating, according to the unit library, hardware description language code for the neural network processor corresponding to the neural network model;
a hardware/software co-design module, for determining a data path according to the neural network model and hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to features of the intermediate layers of the neural network, a compiler generating a storage address access stream according to the hardware structure and network features, and a hardware circuit generator cooperating with the compiler to implement an address generating circuit; the address generating circuit including a working mode signal, divided into a large-convolution-kernel data-fetch mode, a small-convolution-kernel data-fetch mode, a pooling mode, and a full-convolution mode, and further including a block size signal indicating the amount of data fetched in one access;
a hardware circuit generation module, for converting the hardware description language code into a hardware circuit of the neural network processor;
wherein the hardware circuit generation module is further configured to: divide the neural network feature data and weight data into sets of data blocks for storage and access according to the computing throughput and on-chip memory size of the neural network processor, and perform data segmentation within each data block according to the computing unit scale, memory, and data bit width of the neural network processor.
10. The automated design apparatus for a neural network processor according to claim 9, wherein the neural network processor includes a storage structure, a control structure, and a computing structure.
11. The automated design apparatus for a neural network processor according to claim 9, wherein the neural network model description file includes three parts: basic attributes, parameter descriptions, and connection information; the basic attributes include a layer name and a layer type, the parameter descriptions include the number of output layers, a convolution kernel size, and a step size, and the connection information includes a connection name, a connection direction, and a connection type.
12. The automated design apparatus for a neural network processor according to claim 9, wherein the unit library includes two parts: hardware description files and configuration scripts.
13. The automated design apparatus for a neural network processor according to claim 9, wherein the unit library includes a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a lookup table unit, an address generation unit, and a control unit.
14. The automated design apparatus for a neural network processor according to claim 9, wherein the neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
15. The automated design apparatus for a neural network processor according to claim 9, further comprising:
determining a data path according to the neural network model and hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to features of the intermediate layers of the neural network;
generating an address access stream for the memory according to the hardware structure and network features, the address access stream being described by means of a finite state machine; and
generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.
16. The automated design apparatus for a neural network processor according to claim 9, further comprising: generating a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
17. An optimization method based on the automated design method for a neural network processor according to any one of claims 1-8, characterized by comprising:
defining the convolution kernel size as k*k, the step as s, the memory width as d, and the number of data maps as t;
if k^2 = d^2, dividing the data into data blocks of size k*k, so that the data width is consistent with the memory width and the data are stored contiguously in memory;
if k^2 != d^2 and the step s is the greatest common divisor of k and the memory width d, dividing the data into data blocks of size s*s, so that the data within one data map are stored contiguously in memory;
if k^2 != d^2 and the step s is not the greatest common divisor of k and the memory width d, finding the greatest common divisor f of the step s, k, and the memory width d, dividing the data into data blocks of size f*f, and storing the t data maps alternately.
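The three block-size cases of this optimization method can be transcribed directly into code; this sketch assumes the reading that f is the greatest common divisor of s, k, and d, and is an illustration rather than production code.

```python
from math import gcd

def choose_block_size(k, s, d):
    """Pick the tile edge per the claimed optimization: kernel k*k,
    step s, memory width d (direct transcription of the three cases)."""
    if k * k == d * d:
        return k                  # case 1: block width matches memory width
    if s == gcd(k, d):
        return s                  # case 2: s*s blocks, contiguous within one data map
    return gcd(gcd(s, k), d)      # case 3: f*f blocks, t data maps stored alternately
```

For example, with k = 6, s = 4, d = 8, neither of the first two cases applies, so the block edge falls back to f = gcd(gcd(4, 6), 8).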
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710178281.3A CN107103113B (en) | 2017-03-23 | 2017-03-23 | The Automation Design method, apparatus and optimization method towards neural network processor |
PCT/CN2018/080207 WO2018171717A1 (en) | 2017-03-23 | 2018-03-23 | Automated design method and system for neural network processor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710178281.3A CN107103113B (en) | 2017-03-23 | 2017-03-23 | The Automation Design method, apparatus and optimization method towards neural network processor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107103113A CN107103113A (en) | 2017-08-29 |
CN107103113B true CN107103113B (en) | 2019-01-11 |
Family
ID=59676152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710178281.3A Active CN107103113B (en) | 2017-03-23 | 2017-03-23 | The Automation Design method, apparatus and optimization method towards neural network processor |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107103113B (en) |
WO (1) | WO2018171717A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11790664B2 (en) | 2019-02-19 | 2023-10-17 | Tesla, Inc. | Estimating object properties using visual image data |
US11797304B2 (en) | 2018-02-01 | 2023-10-24 | Tesla, Inc. | Instruction set architecture for a vector computational unit |
US11816585B2 (en) | 2018-12-03 | 2023-11-14 | Tesla, Inc. | Machine learning models operating at different frequencies for autonomous vehicles |
US11841434B2 (en) | 2018-07-20 | 2023-12-12 | Tesla, Inc. | Annotation cross-labeling for autonomous control systems |
US11893774B2 (en) | 2018-10-11 | 2024-02-06 | Tesla, Inc. | Systems and methods for training machine models with augmented data |
US11893393B2 (en) | 2017-07-24 | 2024-02-06 | Tesla, Inc. | Computational array microprocessor system with hardware arbiter managing memory requests |
Families Citing this family (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107103113B (en) * | 2017-03-23 | 2019-01-11 | 中国科学院计算技术研究所 | The Automation Design method, apparatus and optimization method towards neural network processor |
WO2018176000A1 (en) | 2017-03-23 | 2018-09-27 | DeepScale, Inc. | Data synthesis for autonomous control systems |
CN107341761A (en) * | 2017-07-12 | 2017-11-10 | 成都品果科技有限公司 | A kind of calculating of deep neural network performs method and system |
US11409692B2 (en) | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
US10671349B2 (en) | 2017-07-24 | 2020-06-02 | Tesla, Inc. | Accelerated mathematical engine |
US11157441B2 (en) | 2017-07-24 | 2021-10-26 | Tesla, Inc. | Computational array microprocessor system using non-consecutive data formatting |
CN107633295B (en) * | 2017-09-25 | 2020-04-28 | 南京地平线机器人技术有限公司 | Method and device for adapting parameters of a neural network |
CN109697509B (en) * | 2017-10-24 | 2020-10-20 | 上海寒武纪信息科技有限公司 | Processing method and device, and operation method and device |
CN109726805B (en) * | 2017-10-30 | 2021-02-09 | 上海寒武纪信息科技有限公司 | Method for designing neural network processor by using black box simulator |
US11521046B2 (en) | 2017-11-08 | 2022-12-06 | Samsung Electronics Co., Ltd. | Time-delayed convolutions for neural network device and method |
CN110097180B (en) * | 2018-01-29 | 2020-02-21 | 上海寒武纪信息科技有限公司 | Computer device, data processing method, and storage medium |
EP3614260A4 (en) | 2017-11-20 | 2020-10-21 | Shanghai Cambricon Information Technology Co., Ltd | Task parallel processing method, apparatus and system, storage medium and computer device |
CN110097179B (en) * | 2018-01-29 | 2020-03-10 | 上海寒武纪信息科技有限公司 | Computer device, data processing method, and storage medium |
KR20200100528A (en) * | 2017-12-29 | 2020-08-26 | 캠브리콘 테크놀로지스 코퍼레이션 리미티드 | Neural network processing method, computer system and storage medium |
CN111582464B (en) * | 2017-12-29 | 2023-09-29 | 中科寒武纪科技股份有限公司 | Neural network processing method, computer system and storage medium |
CN108563808B (en) * | 2018-01-05 | 2020-12-04 | 中国科学技术大学 | Design method of heterogeneous reconfigurable graph computing accelerator system based on FPGA |
CN108388943B (en) * | 2018-01-08 | 2020-12-29 | 中国科学院计算技术研究所 | Pooling device and method suitable for neural network |
CN108154229B (en) * | 2018-01-10 | 2022-04-08 | 西安电子科技大学 | Image processing method based on FPGA (field programmable Gate array) accelerated convolutional neural network framework |
CN108389183A (en) * | 2018-01-24 | 2018-08-10 | 上海交通大学 | Pulmonary nodule detects neural network accelerator and its control method |
WO2019181137A1 (en) * | 2018-03-23 | 2019-09-26 | ソニー株式会社 | Information processing device and information processing method |
CN108921289B (en) * | 2018-06-20 | 2021-10-29 | 郑州云海信息技术有限公司 | FPGA heterogeneous acceleration method, device and system |
US11215999B2 (en) | 2018-06-20 | 2022-01-04 | Tesla, Inc. | Data pipeline and deep learning system for autonomous driving |
US11636333B2 (en) | 2018-07-26 | 2023-04-25 | Tesla, Inc. | Optimizing neural network structures for embedded systems |
US11562231B2 (en) | 2018-09-03 | 2023-01-24 | Tesla, Inc. | Neural networks for embedded devices |
CN110955380B (en) * | 2018-09-21 | 2021-01-12 | 中科寒武纪科技股份有限公司 | Access data generation method, storage medium, computer device and apparatus |
CN111078293B (en) * | 2018-10-19 | 2021-03-16 | 中科寒武纪科技股份有限公司 | Operation method, device and related product |
CN111079924B (en) * | 2018-10-19 | 2021-01-08 | 中科寒武纪科技股份有限公司 | Operation method, system and related product |
CN111079916B (en) * | 2018-10-19 | 2021-01-15 | 安徽寒武纪信息科技有限公司 | Operation method, system and related product |
CN111079907B (en) * | 2018-10-19 | 2021-01-26 | 安徽寒武纪信息科技有限公司 | Operation method, device and related product |
CN111079914B (en) * | 2018-10-19 | 2021-02-09 | 中科寒武纪科技股份有限公司 | Operation method, system and related product |
CN111079911B (en) * | 2018-10-19 | 2021-02-09 | 中科寒武纪科技股份有限公司 | Operation method, system and related product |
WO2020078446A1 (en) * | 2018-10-19 | 2020-04-23 | 中科寒武纪科技股份有限公司 | Computation method and apparatus, and related product |
CN111079909B (en) * | 2018-10-19 | 2021-01-26 | 安徽寒武纪信息科技有限公司 | Operation method, system and related product |
CN111079912B (en) * | 2018-10-19 | 2021-02-12 | 中科寒武纪科技股份有限公司 | Operation method, system and related product |
CN111079910B (en) * | 2018-10-19 | 2021-01-26 | 中科寒武纪科技股份有限公司 | Operation method, device and related product |
US11196678B2 (en) | 2018-10-25 | 2021-12-07 | Tesla, Inc. | QOS manager for system on a chip communications |
CN111144561B (en) * | 2018-11-05 | 2023-05-02 | 杭州海康威视数字技术股份有限公司 | Neural network model determining method and device |
WO2020093304A1 (en) * | 2018-11-08 | 2020-05-14 | 北京比特大陆科技有限公司 | Method, apparatus, and device for compiling neural network, storage medium, and program product |
CN109491956B (en) * | 2018-11-09 | 2021-04-23 | 北京灵汐科技有限公司 | Heterogeneous collaborative computing system |
US11537811B2 (en) | 2018-12-04 | 2022-12-27 | Tesla, Inc. | Enhanced object detection for autonomous vehicles based on field view |
KR20200069901A (en) | 2018-12-07 | 2020-06-17 | 삼성전자주식회사 | A method for slicing a neural network and a neuromorphic apparatus |
CN111325311B (en) * | 2018-12-14 | 2024-03-29 | 深圳云天励飞技术有限公司 | Neural network model generation method for image recognition and related equipment |
CN109726797B (en) * | 2018-12-21 | 2019-11-19 | 北京中科寒武纪科技有限公司 | Data processing method, device, computer system and storage medium |
CN109685203B (en) * | 2018-12-21 | 2020-01-17 | 中科寒武纪科技股份有限公司 | Data processing method, device, computer system and storage medium |
US11610117B2 (en) | 2018-12-27 | 2023-03-21 | Tesla, Inc. | System and method for adapting a neural network model on a hardware platform |
CN111461296B (en) * | 2018-12-29 | 2023-09-22 | 中科寒武纪科技股份有限公司 | Data processing method, electronic device, and readable storage medium |
CN109754084B (en) * | 2018-12-29 | 2020-06-12 | 中科寒武纪科技股份有限公司 | Network structure processing method and device and related products |
US10997461B2 (en) | 2019-02-01 | 2021-05-04 | Tesla, Inc. | Generating ground truth for machine learning from time series elements |
US11150664B2 (en) | 2019-02-01 | 2021-10-19 | Tesla, Inc. | Predicting three-dimensional features for autonomous driving |
US11567514B2 (en) | 2019-02-11 | 2023-01-31 | Tesla, Inc. | Autonomous and user controlled vehicle summon to a target |
CN109978160B (en) * | 2019-03-25 | 2021-03-02 | 中科寒武纪科技股份有限公司 | Configuration device and method of artificial intelligence processor and related products |
CN109739802B (en) * | 2019-04-01 | 2019-06-18 | 上海燧原智能科技有限公司 | Computing cluster and computing cluster configuration method |
KR20200139909A (en) | 2019-06-05 | 2020-12-15 | 삼성전자주식회사 | Electronic apparatus and method of performing operations thereof |
CN112132271A (en) * | 2019-06-25 | 2020-12-25 | Oppo广东移动通信有限公司 | Neural network accelerator operation method, architecture and related device |
CN111126572B (en) * | 2019-12-26 | 2023-12-08 | 北京奇艺世纪科技有限公司 | Model parameter processing method and device, electronic equipment and storage medium |
CN111339027B (en) * | 2020-02-25 | 2023-11-28 | 中国科学院苏州纳米技术与纳米仿生研究所 | Automatic design method of reconfigurable artificial intelligent core and heterogeneous multi-core chip |
CN111488969B (en) * | 2020-04-03 | 2024-01-19 | 北京集朗半导体科技有限公司 | Execution optimization method and device based on neural network accelerator |
CN111949405A (en) * | 2020-08-13 | 2020-11-17 | Oppo广东移动通信有限公司 | Resource scheduling method, hardware accelerator and electronic equipment |
CN111931926A (en) * | 2020-10-12 | 2020-11-13 | 南京风兴科技有限公司 | Hardware acceleration system and control method for convolutional neural network CNN |
JP2023032348A (en) * | 2021-08-26 | 2023-03-09 | 国立大学法人 東京大学 | Information processing device, and program |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106022468A (en) * | 2016-05-17 | 2016-10-12 | 成都启英泰伦科技有限公司 | Artificial neural network processor integrated circuit and design method therefor |
WO2016179533A1 (en) * | 2015-05-06 | 2016-11-10 | Indiana University Research And Technology Corporation | Sensor signal processing using an analog neural network |
CN106447034A (en) * | 2016-10-27 | 2017-02-22 | 中国科学院计算技术研究所 | Neural network processor based on data compression, design method and chip |
CN106529670A (en) * | 2016-10-27 | 2017-03-22 | 中国科学院计算技术研究所 | Neural network processor based on weight compression, design method, and chip |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107103113B (en) * | 2017-03-23 | 2019-01-11 | 中国科学院计算技术研究所 | The Automation Design method, apparatus and optimization method towards neural network processor |
- 2017-03-23: CN application CN201710178281.3A filed (patent CN107103113B/en, status Active)
- 2018-03-23: PCT application PCT/CN2018/080207 filed (publication WO2018171717A1/en, Application Filing)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016179533A1 (en) * | 2015-05-06 | 2016-11-10 | Indiana University Research And Technology Corporation | Sensor signal processing using an analog neural network |
CN106022468A (en) * | 2016-05-17 | 2016-10-12 | 成都启英泰伦科技有限公司 | Artificial neural network processor integrated circuit and design method therefor |
CN106447034A (en) * | 2016-10-27 | 2017-02-22 | 中国科学院计算技术研究所 | Neural network processor based on data compression, design method and chip |
CN106529670A (en) * | 2016-10-27 | 2017-03-22 | 中国科学院计算技术研究所 | Neural network processor based on weight compression, design method, and chip |
Non-Patent Citations (1)
Title |
---|
Research on Embedded System Architecture Based on Neural Networks; Ye Liya et al.; Journal of Hangzhou Dianzi University; 2005-04-30; pp. 61-64 *
Also Published As
Publication number | Publication date |
---|---|
WO2018171717A1 (en) | 2018-09-27 |
CN107103113A (en) | 2017-08-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107103113B (en) | The Automation Design method, apparatus and optimization method towards neural network processor | |
CN107016175B (en) | It is applicable in the Automation Design method, apparatus and optimization method of neural network processor | |
US11783227B2 (en) | Method, apparatus, device and readable medium for transfer learning in machine learning | |
CN106951926A (en) | The deep learning systems approach and device of a kind of mixed architecture | |
CN106201651A (en) | The simulator of neuromorphic chip | |
CN110392902A (en) | Use the operation of sparse volume data | |
CN107169563A (en) | Processing system and method applied to two-value weight convolutional network | |
CN111860828B (en) | Neural network training method, storage medium and equipment | |
CN110163353A (en) | A kind of computing device and method | |
CN109522945A (en) | One kind of groups emotion identification method, device, smart machine and storage medium | |
CN104375805A (en) | Method for simulating parallel computation process of reconfigurable processor through multi-core processor | |
CN110309911A (en) | Neural network model verification method, device, computer equipment and storage medium | |
CN110163350A (en) | A kind of computing device and method | |
CN108171328A (en) | A kind of convolution algorithm method and the neural network processor based on this method | |
CN115828831A (en) | Multi-core chip operator placement strategy generation method based on deep reinforcement learning | |
CN110263328A (en) | A kind of disciplinary capability type mask method, device, storage medium and terminal device | |
CN113168552A (en) | Artificial intelligence application development system, computer device and storage medium | |
CN110442753A (en) | A kind of chart database auto-creating method and device based on OPC UA | |
CN117574767A (en) | Simulation method and simulator for software and hardware systems of in-memory computing architecture | |
CN106844900A (en) | The erection method of electromagnetic transient simulation system | |
KR102188044B1 (en) | Framework system for intelligent application development based on neuromorphic architecture | |
CN104991884B (en) | Heterogeneous polynuclear SoC architecture design method | |
CN110276413A (en) | A kind of model compression method and device | |
Li et al. | Liquid state machine applications mapping for noc-based neuromorphic platforms | |
KR20220061835A (en) | Apparatus and method for hardware acceleration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||