CN107103113B - Automated design method, apparatus and optimization method for a neural network processor - Google Patents

Automated design method, apparatus and optimization method for a neural network processor

Info

Publication number
CN107103113B
CN107103113B (application CN201710178281.3A, publication CN 107103113 B)
Authority
CN
China
Prior art keywords
neural network
data
hardware
network processor
automation design
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710178281.3A
Other languages
Chinese (zh)
Other versions
CN107103113A (en)
Inventor
韩银和
许浩博
王颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201710178281.3A priority Critical patent/CN107103113B/en
Publication of CN107103113A publication Critical patent/CN107103113A/en
Priority to PCT/CN2018/080207 priority patent/WO2018171717A1/en
Application granted granted Critical
Publication of CN107103113B publication Critical patent/CN107103113B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Devices For Executing Special Programs (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The present invention provides an automated design method, apparatus and optimization method for a neural network processor. The method comprises: step 1, obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and a target operating speed; step 2, according to the neural network model description file and the hardware resource constraint parameters, retrieving a unit library from a pre-built neural network component library, and generating, from the unit library, the hardware description language code of a neural network processor corresponding to the neural network model; and step 3, converting the hardware description language code into the hardware circuit of the neural network processor.

Description

Automated design method, apparatus and optimization method for a neural network processor
Technical field
The present invention relates to the technical field of neural network processor architectures, and in particular to an automated design method, apparatus and optimization method for a neural network processor.
Background technique
The rapid development of deep learning and neural network techniques has provided new solutions for large-scale data processing tasks. Various new neural network models perform outstandingly on complex abstract problems, and new applications in fields such as visual image processing, speech recognition and intelligent robotics emerge one after another.
At present, real-time task analysis with deep neural networks mostly relies on large-scale high-performance processors or general-purpose graphics processors. These devices are costly and power-hungry, and when applied to portable intelligent devices they suffer from a series of problems such as large circuit scale, high energy consumption and high product cost. Therefore, for applications in fields such as embedded devices and small low-cost data centers that require real-time, energy-efficient processing, accelerating neural network model computation with a dedicated neural network processor, rather than in software, is a more effective solution. However, the topology and parameter design of a neural network model change with the application scenario, and neural network models themselves evolve rapidly, so providing a single universal, efficient neural network processor that covers all application scenarios and all neural network models is extremely difficult. This makes it very inconvenient for high-level application developers to design hardware-accelerated solutions for different application demands.
Existing neural network hardware acceleration techniques fall into two categories: application-specific integrated circuit (ASIC) chips and field programmable gate arrays (FPGAs). Under the same process conditions, an ASIC chip runs fast and consumes little power, but its design flow is complex, its tape-out cycle is long and its development cost is high, so it cannot adapt to the rapid updating of neural network models. An FPGA offers flexible circuit configuration and a short development cycle, but its operating speed is relatively low and its hardware overhead and power consumption are relatively large. With either acceleration technique, neural network model and algorithm developers must master both the network topology and dataflow pattern and every link of the hardware development flow, including processor architecture design, hardware code writing, simulation verification, and placement and routing. These skills pose a high development barrier for higher-layer application developers who focus on neural network model and structure research and lack hardware design ability. Therefore, to let high-level developers efficiently develop neural network applications, an automated design method and tool for neural network processors that supports a variety of neural network models is urgently needed.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes an automated design method, apparatus and optimization method for a neural network processor.
The present invention proposes an automated design method for a neural network processor, comprising:
Step 1, obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and a target operating speed;
Step 2, according to the neural network model description file and the hardware resource constraint parameters, retrieving a unit library from a pre-built neural network component library, and generating, from the unit library, the hardware description language code of a neural network processor corresponding to the neural network model;
Step 3, converting the hardware description language code into the hardware circuit of the neural network processor.
The neural network processor comprises a storage structure, a control structure and a computation structure.
The neural network model description file comprises three parts: essential attributes, parameter description and connection information, wherein the essential attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size and the step size, and the connection information includes the connection name, connection direction and connection type.
The neural network reusable unit library comprises two parts: hardware description files and configuration scripts.
The neural network reusable unit library includes neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units and control units.
The neural network processor includes a main address generation unit, a data address generation unit and a weight address generation unit.
The method further includes determining the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determining the data resource sharing mode according to the features of the intermediate layers of the neural network;
generating the memory address access stream according to the hardware structure and network features, the address access stream being described by means of a finite state machine;
and generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.
The method further includes generating a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters and the hardware description language code.
The invention also includes an automated design apparatus for a neural network processor, comprising:
a data acquisition module for obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and a target operating speed;
a hardware description language code generation module for retrieving, according to the neural network model description file and the hardware resource constraint parameters, a unit library from a pre-built neural network component library, and generating, from the unit library, the hardware description language code of a neural network processor corresponding to the neural network model;
a hardware circuit generation module for converting the hardware description language code into the hardware circuit of the neural network processor.
The neural network processor comprises a storage structure, a control structure and a computation structure.
The neural network model description file comprises three parts: essential attributes, parameter description and connection information, wherein the essential attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size and the step size, and the connection information includes the connection name, connection direction and connection type.
The neural network reusable unit library comprises two parts: hardware description files and configuration scripts.
The neural network reusable unit library includes neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units and control units.
The neural network processor includes a main address generation unit, a data address generation unit and a weight address generation unit.
The apparatus further determines the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determines the data resource sharing mode according to the features of the intermediate layers of the neural network;
it generates the memory address access stream according to the hardware structure and network features, the address access stream being described by means of a finite state machine;
it generates hardware description language code, which is then converted into the hardware circuit of the neural network processor.
The apparatus further generates a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters and the hardware description language code.
The present invention also proposes an optimization method based on the above automated design method for a neural network processor, comprising:
Step 1, defining the convolution kernel size as k*k, the stride as s, the memory width as d, and the number of data maps as t; if k^2 = d^2, dividing the data into data blocks of size k*k, so that the data width matches the memory width and the data are stored contiguously in memory;
Step 2, if k^2 != d^2 and the stride s is the greatest common divisor of k and the memory width d, dividing the data into data blocks of size s*s, ensuring that within one data map the data are stored contiguously in memory;
Step 3, if neither of the above conditions holds, finding the greatest common divisor f of the stride s, the kernel size k and the memory width d, dividing the data into data blocks of size f*f, and storing the t data maps alternately.
As can be seen from the above scheme, the present invention has the following advantages:
The present invention can map a neural network model to a hardware circuit, automatically optimize the circuit structure and the data storage scheme according to hardware resource constraints and network features, and generate the corresponding control instruction stream, thereby realizing automated hardware/software co-design of a neural network accelerator. It shortens the neural network processor design cycle while improving the processor's operating energy efficiency.
Detailed description of the invention
Fig. 1 is a workflow diagram of the FPGA-based automatic implementation tool for the neural network processor provided by the present invention;
Fig. 2 is a schematic diagram of a neural network processor system that the present invention can automatically generate;
Fig. 3 is a schematic diagram of the neural network reusable unit library used by the present invention;
Fig. 4 is a schematic diagram of the address generation circuit interface used by the present invention.
Specific embodiment
In order to make the purpose, technical solution, design method and advantages of the present invention clearer, the present invention is described in more detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
The present invention is intended to provide an automated design method, apparatus and optimization method for a neural network processor. The apparatus includes a hardware generator and a compiler. The hardware generator can automatically generate the hardware description language code of a neural network processor according to the neural network type and the hardware resource constraints; a designer then produces the processor hardware circuit from the hardware description language using existing hardware circuit design methods. The compiler can generate the control and data scheduling instruction stream according to the neural network processor circuit structure.
Fig. 1 is a schematic diagram of the automated generation technique for the neural network processor provided by the present invention. The specific steps are as follows:
Step 1, the apparatus of the present invention reads the neural network model description file, which contains the network topology and the definition of each operation layer;
Step 2, the apparatus reads in the hardware resource constraint parameters, which include the hardware resource size and the target operating speed; the apparatus can generate the corresponding circuit structure according to these constraint parameters;
Step 3, the apparatus indexes a suitable unit library from the pre-built neural network component library according to the neural network model description script and the hardware resource constraints; the hardware circuit generator included in the apparatus uses this unit library to generate the neural network processor hardware description language code corresponding to the neural network model;
Step 4, the compiler included in the apparatus constructs the data storage mapping and the control instruction stream according to the neural network model, the logic resource constraints and the generated hardware description language code;
Step 5, the hardware description language is converted into a hardware circuit by existing hardware design methods.
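As a hedged illustration only, the five steps above can be sketched as a single flow. Every name and data shape below (layer dicts, the toy unit library, the placeholder HDL strings) is an assumption for exposition, not an interface defined in the patent:

```python
# Illustrative sketch of the five-step automated design flow described above.
# All names and data shapes are assumptions, not APIs from the patent.

def automated_design_flow(model, constraints):
    """model: list of layer dicts; constraints: hardware resource parameters."""
    # Steps 1-2: read the model description and the hardware constraint parameters.
    assert "resource_size" in constraints and "target_speed" in constraints

    # Step 3: index the reusable units needed by this model's layer types.
    library = {"conv": "neuron_unit", "pool": "pooling_unit", "fc": "accumulator_unit"}
    units = sorted({library[layer["type"]] for layer in model})

    # Step 3 (cont.): emit placeholder HDL instantiations for each selected unit.
    hdl = [f"instantiate {u};" for u in units]

    # Step 4: the compiler derives a data-storage mapping and an instruction stream.
    mapping = {layer["name"]: i for i, layer in enumerate(model)}
    instructions = [f"run {layer['name']}" for layer in model]

    # Step 5 (synthesis to an actual circuit) is left to existing hardware tools.
    return hdl, mapping, instructions

model = [{"name": "conv1", "type": "conv"}, {"name": "pool1", "type": "pool"}]
hdl, mapping, instrs = automated_design_flow(model, {"resource_size": 4096, "target_speed": 100})
```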
The neural network processor that the present invention can automatically generate is based on a storage-control-computation structure:
the storage structure stores the data participating in the computation, the neural network weights and the coprocessor operation instructions;
the control structure includes a decoding circuit and a control logic circuit, used for parsing operation instructions and generating control signals that govern on-chip data scheduling and storage as well as the neural network computation process;
the computation structure includes computing units that carry out the neural network computations in the processor.
Fig. 2 is a schematic diagram of a neural network processor system 101 that the present invention can automatically generate. The architecture of the neural network processor system 101 consists of seven parts, including an input data storage unit 102, a control unit 103, an output data storage unit 104, a weight storage unit 105, an instruction storage unit 106 and a computing unit 107.
The input data storage unit 102 stores the data participating in the computation, including the original feature map data and the data participating in intermediate-layer computation; the output data storage unit 104 stores the computed neuron responses; the instruction storage unit 106 stores the instruction information participating in the computation, the instructions being parsed into a control stream to schedule the neural network computation; the weight storage unit 105 stores the trained neural network weights.
The control unit 103 is connected to the output data storage unit 104, the weight storage unit 105, the instruction storage unit 106 and the computing unit 107. The control unit 103 obtains the instructions stored in the instruction storage unit 106 and parses them, and controls the computing unit to perform neural network computations according to the control signals obtained from parsing.
The computing unit 107 performs the corresponding neural network computations according to the control signals generated by the control unit 103. The computing unit 107 is associated with one or more storage units: it can obtain the data to be computed from its associated input data storage unit 102, and it can write data to its associated output data storage unit 104. The computing unit 107 completes most of the operations in the neural network algorithm, i.e. vector multiply-accumulate operations and the like.
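As a toy illustration of the vector multiply-accumulate operation that the computing unit performs (a plain-software sketch, not the patent's circuit):

```python
# Generic vector multiply-accumulate, the core operation the computing unit
# applies to input data and weights; a plain-Python sketch, not the circuit.

def mac(inputs, weights, bias=0):
    acc = bias
    for x, w in zip(inputs, weights):
        acc += x * w          # one multiply-accumulate step per element pair
    return acc
```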
The present invention describes the neural network model features by providing a neural network description file format. The file content includes three parts: essential attributes, parameter description and connection information, wherein the essential attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size and the step size, and the connection information includes the connection name, connection direction and connection type.
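As a hedged illustration of the three-part description file: the patent names the parts and their fields but not a concrete syntax, so the dict representation and field names below are assumptions:

```python
# Hypothetical representation of one layer entry in the model description file.
# The three parts mirror the text (essential attributes, parameter description,
# connection information); the dict syntax and key names are assumptions.

layer = {
    "essential": {"layer_name": "conv1", "layer_type": "convolution"},
    "parameters": {"output_layers": 32, "kernel_size": 3, "step_size": 1},
    "connection": {"name": "conv1_to_pool1", "direction": "forward", "type": "full"},
}

def validate_layer(entry):
    """Check that a layer entry carries all three parts and their fields."""
    required = {
        "essential": {"layer_name", "layer_type"},
        "parameters": {"output_layers", "kernel_size", "step_size"},
        "connection": {"name", "direction", "type"},
    }
    return all(required[part] <= set(entry.get(part, {})) for part in required)
```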
In order to adapt to the hardware design of various neural network models, the present invention provides a neural network reusable unit library, shown in Fig. 3. The unit library includes two parts: hardware description files and configuration scripts. The reusable unit library provided by the present invention includes, but is not limited to: neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units, control units, etc.
When constructing a neural network processor system from the above reusable unit library, the present invention reads the neural network model description file and the hardware resource constraints, and invokes units from the library with reasonable optimization.
During operation of the neural network processor, the processor needs to automatically obtain the address streams of on-chip and off-chip memory data. In the present invention, the storage address stream is determined and generated by the compiler, and the memory access patterns determined by the storage address stream are passed to the hardware generator through text interaction. The memory access patterns include the main access pattern, the data access pattern, the weight access pattern, etc.
The hardware generator generates the address generation unit (AGU) according to the memory access patterns.
The neural network processor circuit designed with the neural network processor design tool provided by the present invention includes three types of address generation units: a main address generation unit, a data address generation unit and a weight address generation unit. The main address generation unit is responsible for data exchange between on-chip memory and off-chip memory; the data address generation unit is responsible for reading data from on-chip memory to the computing unit, and for storing the computing unit's intermediate and final results back to the storage unit; the weight address generation unit is responsible for reading weight data from on-chip memory to the computing unit.
In the present invention, the hardware circuit generator and the compiler cooperate to realize the design of the address generation circuit. The specific design steps are:
Step 1, the apparatus of the present invention determines the data path according to the neural network model and hardware constraints specified by the designer, and determines the data resource sharing mode according to the features of the intermediate layers of the neural network;
Step 2, the compiler generates the storage address access stream according to the hardware structure and network features; the address access stream is described by the compiler by means of a finite state machine;
Step 3, the hardware generator maps the finite state machine to the hardware description language of the address generation circuit, which is then mapped to a hardware circuit by hardware circuit design methods.
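A minimal sketch of step 2 under stated assumptions (a row-major w*w feature map and a single sliding convolution window; all sizes and the layout are illustrative, not from the patent): the address access stream can be described as a small state machine whose state is the window origin and whose transitions advance the window by the stride:

```python
# Finite-state-machine sketch of a read-address stream for a k*k convolution
# window sliding with stride s over a w*w row-major feature map. The state is
# the current window origin; each step emits the k*k addresses of one window.

def window_address_stream(w, k, s, base=0):
    row, col = 0, 0                       # FSM state: window origin
    while row + k <= w:
        yield [base + (row + i) * w + (col + j)
               for i in range(k) for j in range(k)]
        col += s                          # transition: advance column...
        if col + k > w:                   # ...and wrap to the next row block
            col = 0
            row += s

windows = list(window_address_stream(w=4, k=2, s=2))
```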
Fig. 4 is a schematic diagram of the general structure of the address generation circuit provided by the present invention. The address generation circuit of the present invention has a universal signal interface, whose interface signals are:
a start address signal, i.e. the first address of the data;
a block size signal, the amount of data fetched in one access;
a memory flag signal, identifying the number of the memory storing the data;
an operating mode signal, divided into a large-kernel data-fetch mode, a small-kernel data-fetch mode, a pooling mode, a full convolution mode, etc.;
a convolution kernel size signal, defining the convolution kernel size;
a length signal, defining the output image size;
an input layer count signal, labelling the number of input layers;
an output layer count signal, labelling the number of output layers;
a reset signal, which when set to 1 initializes the address generation circuit;
a write enable signal, directing the accessed memory to perform a write operation;
a read enable signal, directing the accessed memory to perform a read operation;
an address signal, providing the memory access address;
an end signal, signalling the end of the access.
These parameters ensure that the AGU supports multiple operating modes and can generate correct read/write address streams in the different operating modes and throughout the neural network communication process.
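Purely as an illustrative software model of the interface listed above (the field names are assumptions mapped from the signal descriptions; the patent defines hardware signals, not this structure):

```python
# Illustrative model of the AGU's universal signal interface. Field names are
# assumptions derived from the signal list above, not a defined API.

from dataclasses import dataclass

MODES = ("large_kernel_fetch", "small_kernel_fetch", "pooling", "full_convolution")

@dataclass
class AguInterface:
    start_address: int      # first address of the data
    block_size: int         # amount of data fetched in one access
    memory_flag: int        # number of the memory holding the data
    operating_mode: str     # one of MODES
    kernel_size: int        # convolution kernel size
    length: int             # output image size
    input_layers: int       # number of input layers
    output_layers: int      # number of output layers
    reset: int = 0          # 1 initializes the address generation circuit
    write_enable: int = 0   # directs the accessed memory to write
    read_enable: int = 0    # directs the accessed memory to read

    def mode_ok(self):
        return self.operating_mode in MODES

agu = AguInterface(start_address=0, block_size=16, memory_flag=1,
                   operating_mode="pooling", kernel_size=2, length=14,
                   input_layers=3, output_layers=8, read_enable=1)
```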
For different target networks, the tool selects the necessary parameters from this template to construct the address generator, and provides the on-chip and off-chip memory access patterns.
The neural network processor provided by the present invention constructs the processor architecture in a data-driven manner, so the address generation circuit not only provides access addresses but also drives the execution of different neural layers and of the data blocks within a layer.
Due to resource constraints, a neural network model cannot be fully unrolled according to its model description when mapped to a hardware circuit. The design tool proposed by the present invention therefore optimizes the data storage and access mechanisms using a hardware/software co-design method, in two parts: first, the compiler analyses the computation throughput and on-chip memory size of the neural network processor, and partitions the neural network feature data and weight data into appropriate data block sets for storage and access; second, data within each block are partitioned according to the computing unit scale, the memory and the data bit width.
Based on the above optimization mechanism, the present invention proposes an optimization method for data storage and access, whose specific implementation steps are:
Step 1, define the convolution kernel size as k*k, the stride as s, the memory width as d, and the number of data maps as t; if k^2 = d^2, divide the data into data blocks of size k*k, so that the data width matches the memory width and the data are stored contiguously in memory;
Step 2, if k^2 != d^2 and s is the greatest common divisor of k and d, divide the data into data blocks of size s*s, ensuring that within one data map the data can be stored contiguously in memory;
Step 3, if neither of the above conditions holds, find the greatest common divisor f of s, k and d, divide the data into data blocks of size f*f, and store the t data maps alternately.
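The three-case selection above can be sketched directly. This is a hedged reading of the text: `math.gcd` stands in for the greatest-common-divisor computations, and the storage order of each case is only labelled, not implemented:

```python
# Sketch of the three-case block-size selection in the optimization method.
# k: convolution kernel size (k*k), s: stride, d: memory width.

from math import gcd

def choose_block(k, s, d):
    """Return (block_side, storage_label) following the three cases above."""
    if k * k == d * d:                 # case 1: kernel area matches memory width
        return k, "contiguous"
    if s == gcd(k, d):                 # case 2: stride is the gcd of k and d
        return s, "contiguous-per-map"
    f = gcd(gcd(s, k), d)              # case 3: fall back to gcd of s, k and d
    return f, "alternate-t-maps"      # the t data maps are stored alternately
```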
The computation data of a neural network include the input feature data and the trained weight data. A good data storage layout can reduce the processor's internal data bandwidth and improve storage space utilization. The automated design tool provided by the present invention improves the processor's computational efficiency by increasing the locality of processor data storage.
In conclusion, the present invention provides an automated design tool for neural network processors, which can map a neural network model description to the hardware code of a neural network processor, optimize the processor architecture according to hardware resource constraints, and automatically generate control and instruction streams. It realizes the automated design of neural network processors, shortens the neural network processor design cycle, and adapts to the application characteristics of rapidly changing neural network models, demanding operating speed requirements and high energy efficiency requirements.
It should be appreciated that although this specification is described in terms of various embodiments, not every embodiment contains only one independent technical solution. This manner of description is adopted merely for clarity; those skilled in the art should treat the specification as a whole, and the technical solutions in the various embodiments may also be suitably combined to form other embodiments understandable to those skilled in the art.
The present invention also proposes an automated design apparatus for a neural network processor, comprising:
a data acquisition module for obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and a target operating speed;
a hardware description language code generation module for retrieving, according to the neural network model description file and the hardware resource constraint parameters, a unit library from a pre-built neural network component library, and generating, from the unit library, the hardware description language code of a neural network processor corresponding to the neural network model;
a hardware circuit generation module for converting the hardware description language code into the hardware circuit of the neural network processor.
The neural network processor comprises a storage structure, a control structure and a computation structure.
The neural network model description file comprises three parts: essential attributes, parameter description and connection information, wherein the essential attributes include the layer name and layer type, the parameter description includes the number of output layers, the convolution kernel size and the step size, and the connection information includes the connection name, connection direction and connection type.
The neural network processor includes a main address generation unit, a data address generation unit and a weight address generation unit.
The apparatus further determines the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determines the data resource sharing mode according to the features of the intermediate layers of the neural network;
it generates the memory address access stream according to the hardware structure and network features, the address access stream being described by means of a finite state machine.
The neural network reusable unit library comprises two parts: hardware description files and configuration scripts.
The neural network reusable unit library includes neuron units, accumulator units, pooling units, classifier units, local response normalization units, look-up table units, address generation units and control units.
The finite state machine is mapped to the address generation circuit and hardware description language code is generated, which is then converted into the hardware circuit of the neural network processor.
The apparatus further generates a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters and the hardware description language code.
The foregoing are merely schematic specific embodiments of the present invention and are not intended to limit its scope. Any equivalent variations, modifications and combinations made by those skilled in the art without departing from the concept and principle of the present invention shall fall within the protection scope of the present invention.

Claims (17)

1. An automated design method for a neural network processor, characterized by comprising:
Step 1, obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include the hardware resource size and a target operating speed;
Step 2, according to the neural network model description file and the hardware resource constraint parameters, retrieving a unit library from a pre-built neural network component library, and generating, from the unit library, the hardware description language code of a neural network processor corresponding to the neural network model;
Step 3, determining the data path according to the neural network model and hardware resource constraint parameters specified by the user, and determining the data resource sharing mode according to the features of the intermediate layers of the neural network; the compiler generating the storage address access stream according to the hardware structure and network features; the hardware circuit generator and the compiler cooperating to realize the address generation circuit, the address generation circuit including an operating mode signal divided into a large-kernel data-fetch mode, a small-kernel data-fetch mode, a pooling mode and a full convolution mode, and further including a block size signal giving the amount of data fetched in one access;
Step 4, converting the hardware description language code into the hardware circuit of the neural network processor;
wherein step 4 further includes: according to the computation throughput and on-chip memory size of the neural network processor, partitioning the neural network feature data and weight data into data block sets for storage and access, and, according to the computing unit scale, memory and data bit width of the neural network processor, performing data partitioning within the data blocks.
2. The automated design method for a neural network processor according to claim 1, wherein the neural network processor includes a storage structure, a control structure, and a computation structure.
3. The automated design method for a neural network processor according to claim 1, wherein the neural network model description file includes three parts: basic attributes, parameter description, and connection information; the basic attributes include layer name and layer type, the parameter description includes the number of output layers, convolution kernel size, and stride size, and the connection information includes connection name, connection direction, and connection type.
4. The automated design method for a neural network processor according to claim 1, wherein the cell library comprises two parts: hardware description files and configuration scripts.
5. The automated design method for a neural network processor according to claim 1, wherein the cell library includes a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a look-up table unit, an address generation unit, and a control unit.
6. The automated design method for a neural network processor according to claim 1, wherein the neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
7. The automated design method for a neural network processor according to claim 1, further comprising: determining a data path according to the neural network model and the hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to the features of the intermediate layers of the neural network;
generating a memory address access stream according to the hardware configuration and the network characteristics, the address access stream being described by means of a finite state machine; and
generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.
8. The automated design method for a neural network processor according to claim 1, further comprising generating a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
9. An automated design apparatus for a neural network processor, comprising:
a data acquisition module for obtaining a neural network model description file and hardware resource constraint parameters, wherein the hardware resource constraint parameters include hardware resource size and target operating speed;
a hardware description language code generation module for searching a cell library from a constructed neural network component library according to the neural network model description file and the hardware resource constraint parameters, and generating, according to the cell library, hardware description language code for a neural network processor corresponding to the neural network model;
a hardware/software co-design module for determining a data path according to the neural network model and the hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to the features of the intermediate layers of the neural network, wherein a compiler generates a memory address access stream according to the hardware configuration and the network characteristics, and a hardware circuit generator and the compiler cooperate to implement an address generation circuit; the address generation circuit includes a working mode signal, which selects among a large-convolution-kernel data-fetch mode, a small-convolution-kernel data-fetch mode, a pooling mode, and a full-convolution mode; the address generation circuit further includes a block size signal indicating the amount of data fetched in one access;
a hardware circuit generation module for converting the hardware description language code into the hardware circuit of the neural network processor;
wherein the hardware circuit generation module further divides the neural network feature data and weight data into sets of data blocks for storage and access according to the computational throughput and on-chip memory size of the neural network processor, and performs data partitioning within each data block according to the computing unit scale, memory, and data bit width of the neural network processor.
10. The automated design apparatus for a neural network processor according to claim 9, wherein the neural network processor includes a storage structure, a control structure, and a computation structure.
11. The automated design apparatus for a neural network processor according to claim 9, wherein the neural network model description file includes three parts: basic attributes, parameter description, and connection information; the basic attributes include layer name and layer type, the parameter description includes the number of output layers, convolution kernel size, and stride size, and the connection information includes connection name, connection direction, and connection type.
12. The automated design apparatus for a neural network processor according to claim 9, wherein the cell library comprises two parts: hardware description files and configuration scripts.
13. The automated design apparatus for a neural network processor according to claim 9, wherein the cell library includes a neuron unit, an accumulator unit, a pooling unit, a classifier unit, a local response normalization unit, a look-up table unit, an address generation unit, and a control unit.
14. The automated design apparatus for a neural network processor according to claim 9, wherein the neural network processor includes a main address generation unit, a data address generation unit, and a weight address generation unit.
15. The automated design apparatus for a neural network processor according to claim 9, further comprising: determining a data path according to the neural network model and the hardware resource constraint parameters specified by a user, and determining a data resource sharing mode according to the features of the intermediate layers of the neural network;
generating a memory address access stream according to the hardware configuration and the network characteristics, the address access stream being described by means of a finite state machine; and
generating hardware description language code, which is then converted into the hardware circuit of the neural network processor.
16. The automated design apparatus for a neural network processor according to claim 9, further comprising generating a data storage mapping and a control instruction stream according to the neural network model, the hardware resource constraint parameters, and the hardware description language code.
17. An optimization method based on the automated design method for a neural network processor according to any one of claims 1 to 8, comprising:
defining the convolution kernel size as k*k, the stride as s, the memory width as d, and the number of data maps as t; if k^2 = d^2, dividing the data into data blocks of size k*k, so that the data width is consistent with the memory width and the data are stored contiguously in memory;
if k^2 != d^2 and the stride s is the greatest common divisor of k and the memory width d, dividing the data into data blocks of size s*s, so that the data within one data map are stored contiguously in memory;
if k^2 != d^2 and the stride s is not the greatest common divisor of k and the memory width d, finding the greatest common divisor f of the stride s, k, and the memory width d, dividing the data into data blocks of size f*f, and storing the t data maps alternately.
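The block-size selection in the optimization method above can be sketched in a few lines of Python (a hedged reading of claim 17; the function name and the interleaving comment are illustrative, not part of the claims):

```python
from math import gcd

def choose_block_edge(k, s, d):
    """Return the edge length of the square data blocks per the rule:

    k: convolution kernel edge, s: stride, d: memory width.
    - k when k^2 == d^2 (block width matches the memory width);
    - s when s is the greatest common divisor of k and d;
    - f = gcd(s, k, d) otherwise, with the t data maps then
      stored alternately (interleaved) across blocks.
    """
    if k * k == d * d:
        return k
    if s == gcd(k, d):
        return s
    return gcd(s, gcd(k, d))

# k=4, d=4: block width matches memory width      -> 4
# k=6, s=2, d=4: gcd(6, 4) == 2 == s              -> 2
# k=6, s=4, d=8: s != gcd(6, 8), so gcd(4, 6, 8)  -> 2
```

In each case the chosen edge divides the memory width, which is what keeps a block's data contiguous in memory.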
CN201710178281.3A 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor Active CN107103113B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710178281.3A CN107103113B (en) 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor
PCT/CN2018/080207 WO2018171717A1 (en) 2017-03-23 2018-03-23 Automated design method and system for neural network processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710178281.3A CN107103113B (en) 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor

Publications (2)

Publication Number Publication Date
CN107103113A CN107103113A (en) 2017-08-29
CN107103113B true CN107103113B (en) 2019-01-11

Family

ID=59676152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710178281.3A Active CN107103113B (en) 2017-03-23 2017-03-23 The Automation Design method, apparatus and optimization method towards neural network processor

Country Status (2)

Country Link
CN (1) CN107103113B (en)
WO (1) WO2018171717A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11790664B2 (en) 2019-02-19 2023-10-17 Tesla, Inc. Estimating object properties using visual image data
US11797304B2 (en) 2018-02-01 2023-10-24 Tesla, Inc. Instruction set architecture for a vector computational unit
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11841434B2 (en) 2018-07-20 2023-12-12 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11893774B2 (en) 2018-10-11 2024-02-06 Tesla, Inc. Systems and methods for training machine models with augmented data
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests

Families Citing this family (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103113B (en) * 2017-03-23 2019-01-11 中国科学院计算技术研究所 The Automation Design method, apparatus and optimization method towards neural network processor
WO2018176000A1 (en) 2017-03-23 2018-09-27 DeepScale, Inc. Data synthesis for autonomous control systems
CN107341761A (en) * 2017-07-12 2017-11-10 成都品果科技有限公司 Deep neural network computation execution method and system
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
CN107633295B (en) * 2017-09-25 2020-04-28 南京地平线机器人技术有限公司 Method and device for adapting parameters of a neural network
CN109697509B (en) * 2017-10-24 2020-10-20 上海寒武纪信息科技有限公司 Processing method and device, and operation method and device
CN109726805B (en) * 2017-10-30 2021-02-09 上海寒武纪信息科技有限公司 Method for designing neural network processor by using black box simulator
US11521046B2 (en) 2017-11-08 2022-12-06 Samsung Electronics Co., Ltd. Time-delayed convolutions for neural network device and method
CN110097180B (en) * 2018-01-29 2020-02-21 上海寒武纪信息科技有限公司 Computer device, data processing method, and storage medium
EP3614260A4 (en) 2017-11-20 2020-10-21 Shanghai Cambricon Information Technology Co., Ltd Task parallel processing method, apparatus and system, storage medium and computer device
CN110097179B (en) * 2018-01-29 2020-03-10 上海寒武纪信息科技有限公司 Computer device, data processing method, and storage medium
KR20200100528A (en) * 2017-12-29 2020-08-26 캠브리콘 테크놀로지스 코퍼레이션 리미티드 Neural network processing method, computer system and storage medium
CN111582464B (en) * 2017-12-29 2023-09-29 中科寒武纪科技股份有限公司 Neural network processing method, computer system and storage medium
CN108563808B (en) * 2018-01-05 2020-12-04 中国科学技术大学 Design method of heterogeneous reconfigurable graph computing accelerator system based on FPGA
CN108388943B (en) * 2018-01-08 2020-12-29 中国科学院计算技术研究所 Pooling device and method suitable for neural network
CN108154229B (en) * 2018-01-10 2022-04-08 西安电子科技大学 Image processing method based on FPGA (field programmable Gate array) accelerated convolutional neural network framework
CN108389183A (en) * 2018-01-24 2018-08-10 上海交通大学 Pulmonary nodule detection neural network accelerator and control method therefor
WO2019181137A1 (en) * 2018-03-23 2019-09-26 ソニー株式会社 Information processing device and information processing method
CN108921289B (en) * 2018-06-20 2021-10-29 郑州云海信息技术有限公司 FPGA heterogeneous acceleration method, device and system
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
CN110955380B (en) * 2018-09-21 2021-01-12 中科寒武纪科技股份有限公司 Access data generation method, storage medium, computer device and apparatus
CN111078293B (en) * 2018-10-19 2021-03-16 中科寒武纪科技股份有限公司 Operation method, device and related product
CN111079924B (en) * 2018-10-19 2021-01-08 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079916B (en) * 2018-10-19 2021-01-15 安徽寒武纪信息科技有限公司 Operation method, system and related product
CN111079907B (en) * 2018-10-19 2021-01-26 安徽寒武纪信息科技有限公司 Operation method, device and related product
CN111079914B (en) * 2018-10-19 2021-02-09 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079911B (en) * 2018-10-19 2021-02-09 中科寒武纪科技股份有限公司 Operation method, system and related product
WO2020078446A1 (en) * 2018-10-19 2020-04-23 中科寒武纪科技股份有限公司 Computation method and apparatus, and related product
CN111079909B (en) * 2018-10-19 2021-01-26 安徽寒武纪信息科技有限公司 Operation method, system and related product
CN111079912B (en) * 2018-10-19 2021-02-12 中科寒武纪科技股份有限公司 Operation method, system and related product
CN111079910B (en) * 2018-10-19 2021-01-26 中科寒武纪科技股份有限公司 Operation method, device and related product
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
CN111144561B (en) * 2018-11-05 2023-05-02 杭州海康威视数字技术股份有限公司 Neural network model determining method and device
WO2020093304A1 (en) * 2018-11-08 2020-05-14 北京比特大陆科技有限公司 Method, apparatus, and device for compiling neural network, storage medium, and program product
CN109491956B (en) * 2018-11-09 2021-04-23 北京灵汐科技有限公司 Heterogeneous collaborative computing system
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
KR20200069901A (en) 2018-12-07 2020-06-17 삼성전자주식회사 A method for slicing a neural network and a neuromorphic apparatus
CN111325311B (en) * 2018-12-14 2024-03-29 深圳云天励飞技术有限公司 Neural network model generation method for image recognition and related equipment
CN109726797B (en) * 2018-12-21 2019-11-19 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109685203B (en) * 2018-12-21 2020-01-17 中科寒武纪科技股份有限公司 Data processing method, device, computer system and storage medium
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
CN111461296B (en) * 2018-12-29 2023-09-22 中科寒武纪科技股份有限公司 Data processing method, electronic device, and readable storage medium
CN109754084B (en) * 2018-12-29 2020-06-12 中科寒武纪科技股份有限公司 Network structure processing method and device and related products
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11150664B2 (en) 2019-02-01 2021-10-19 Tesla, Inc. Predicting three-dimensional features for autonomous driving
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
CN109978160B (en) * 2019-03-25 2021-03-02 中科寒武纪科技股份有限公司 Configuration device and method of artificial intelligence processor and related products
CN109739802B (en) * 2019-04-01 2019-06-18 上海燧原智能科技有限公司 Computing cluster and computing cluster configuration method
KR20200139909A (en) 2019-06-05 2020-12-15 삼성전자주식회사 Electronic apparatus and method of performing operations thereof
CN112132271A (en) * 2019-06-25 2020-12-25 Oppo广东移动通信有限公司 Neural network accelerator operation method, architecture and related device
CN111126572B (en) * 2019-12-26 2023-12-08 北京奇艺世纪科技有限公司 Model parameter processing method and device, electronic equipment and storage medium
CN111339027B (en) * 2020-02-25 2023-11-28 中国科学院苏州纳米技术与纳米仿生研究所 Automatic design method of reconfigurable artificial intelligent core and heterogeneous multi-core chip
CN111488969B (en) * 2020-04-03 2024-01-19 北京集朗半导体科技有限公司 Execution optimization method and device based on neural network accelerator
CN111949405A (en) * 2020-08-13 2020-11-17 Oppo广东移动通信有限公司 Resource scheduling method, hardware accelerator and electronic equipment
CN111931926A (en) * 2020-10-12 2020-11-13 南京风兴科技有限公司 Hardware acceleration system and control method for convolutional neural network CNN
JP2023032348A (en) * 2021-08-26 2023-03-09 国立大学法人 東京大学 Information processing device, and program

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022468A (en) * 2016-05-17 2016-10-12 成都启英泰伦科技有限公司 Artificial neural network processor integrated circuit and design method therefor
WO2016179533A1 (en) * 2015-05-06 2016-11-10 Indiana University Research And Technology Corporation Sensor signal processing using an analog neural network
CN106447034A (en) * 2016-10-27 2017-02-22 中国科学院计算技术研究所 Neural network processor based on data compression, design method and chip
CN106529670A (en) * 2016-10-27 2017-03-22 中国科学院计算技术研究所 Neural network processor based on weight compression, design method, and chip

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107103113B (en) * 2017-03-23 2019-01-11 中国科学院计算技术研究所 The Automation Design method, apparatus and optimization method towards neural network processor


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on neural-network-based embedded system architecture; Ye Liya et al.; Journal of Hangzhou Dianzi University; 2005-04-30; pp. 61-64 *


Also Published As

Publication number Publication date
WO2018171717A1 (en) 2018-09-27
CN107103113A (en) 2017-08-29

Similar Documents

Publication Publication Date Title
CN107103113B (en) Automated design method, apparatus, and optimization method for a neural network processor
CN107016175B Automated design method, apparatus, and optimization method applicable to a neural network processor
US11783227B2 Method, apparatus, device and readable medium for transfer learning in machine learning
CN106951926A Deep learning system method and device based on a hybrid architecture
CN106201651A Simulator for a neuromorphic chip
CN110392902A Operation using sparse volumetric data
CN107169563A Processing system and method applied to a binary-weight convolutional network
CN111860828B Neural network training method, storage medium, and device
CN110163353A Computing device and method
CN109522945A Group emotion recognition method, device, smart device, and storage medium
CN104375805A Method for simulating the parallel computation process of a reconfigurable processor with a multi-core processor
CN110309911A Neural network model verification method, device, computer equipment, and storage medium
CN110163350A Computing device and method
CN108171328A Convolution operation method and neural network processor based on the method
CN115828831A Multi-core chip operator placement strategy generation method based on deep reinforcement learning
CN110263328A Subject capability labeling method, device, storage medium, and terminal device
CN113168552A Artificial intelligence application development system, computer device, and storage medium
CN110442753A OPC UA-based graph database automatic creation method and device
CN117574767A Simulation method and simulator for the software and hardware system of an in-memory computing architecture
CN106844900A Setup method for an electromagnetic transient simulation system
KR102188044B1 Framework system for intelligent application development based on neuromorphic architecture
CN104991884B Heterogeneous multi-core SoC architecture design method
CN110276413A Model compression method and device
Li et al. Liquid state machine applications mapping for NoC-based neuromorphic platforms
KR20220061835A Apparatus and method for hardware acceleration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant