CN109409510A - Neuron circuit, chip, system and method, storage medium - Google Patents
- Publication number: CN109409510A (application CN201811076248.0A; granted publication CN109409510B)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation using electronic means
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The present invention, applicable to the field of computer technology, provides a neuron circuit, chip, system and method, and storage medium. The neuron circuit comprises the following structure: a computing module; a configuration information storage module for storing neuron processing mode configuration information; and a control module for controlling the computing module, according to the processing mode configuration information, to adjust to a corresponding computing architecture and execute corresponding neural network layer node data processing. In this way, the complex and diverse neural network computing demands of rapid iteration can be met, the circuit can be widely applied in fields where computing resources are limited and a degree of neural network architecture reconfigurability is required, and the application range of deep learning chips is extended.
Description
Technical field
The invention belongs to the field of computer technology, and more particularly relates to a neuron circuit, a chip, a system and method, and a storage medium.
Background technique
In recent years, with the wide application of deep learning technology based on artificial neural networks in fields such as computer vision, natural language processing and intelligent system decision-making, artificial intelligence chip technology for accelerating neural network computation has attracted the attention of both academia and industry.
Existing application-specific integrated circuit (ASIC) chips customized for neural network computation are mostly based on a pre-specified network structure and algorithm. They pursue power consumption and speed at the expense of flexibility, so their hardware structure is fixed and offers no reconfigurability of the neural network architecture. As a result, they cannot deploy the complex and diverse neural network structures of today's rapid iteration, and cannot be widely applied in fields where computing resources are limited and a degree of architectural reconfigurability is required, such as mobile Internet-of-Things terminals, unmanned aerial vehicles and autonomous driving; the application of ASIC chips is therefore restricted.
Summary of the invention
The purpose of the present invention is to provide a neuron circuit, chip, system and method, and storage medium, intended to solve the problem in the prior art that the neural network architecture cannot be reconfigured, which limits the application of deep learning chips.
In one aspect, the present invention provides a neuron circuit, the neuron circuit comprising:
a computing module;
a configuration information storage module, for storing neuron processing mode configuration information; and
a control module, for controlling the computing module, according to the processing mode configuration information, to adjust to a corresponding computing architecture and execute corresponding neural network layer node data processing.
In another aspect, the present invention provides a deep learning chip, the deep learning chip comprising:
a storage unit, for storing a deep learning instruction set and the data targeted by deep learning, the deep learning instruction set comprising several neural network layer instructions with a predetermined processing order;
a neuron array composed of several neuron circuits as described above;
a central controller, for controlling, according to the deep learning instruction set, so that: the neuron circuits in the neuron array are loaded from the storage unit with the current processing mode configuration information corresponding to the current neural network layer instruction and the corresponding data to be processed, and after the current neural network layer processing task indicated by the current neural network layer instruction is completed, the next neural network layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed; and
an input-output unit, for realizing the transmission of data between the storage unit and the neuron array.
In another aspect, the present invention further provides a deep learning chip cascade system, the deep learning chip cascade system comprising: at least two deep learning chips as described above, with cascade connections between them.
In another aspect, the present invention further provides a deep learning system, the deep learning system comprising: at least one deep learning chip as described above, and peripheral components connected to the deep learning chip.
In another aspect, the present invention further provides a neuron control method, the neuron control method comprising the following steps:
obtaining neuron processing mode configuration information;
according to the processing mode configuration information, controlling a computing module to adjust to a corresponding computing architecture and execute corresponding neural network layer node data processing.
In another aspect, the present invention further provides a deep learning control method, the deep learning control method comprising the following steps:
obtaining a deep learning instruction set, the deep learning instruction set comprising several neural network layer instructions with a predetermined processing order;
according to the deep learning instruction set, controlling so that: the neuron circuits in a neuron array are loaded with the current processing mode configuration information corresponding to the current neural network layer instruction and the corresponding data to be processed, wherein each neuron circuit adjusts to a corresponding computing architecture according to the current processing mode configuration information and executes corresponding neural network layer node data processing, and after the current neural network layer processing task indicated by the current neural network layer instruction is completed, the next neural network layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed.
In another aspect, the present invention further provides a computer-readable storage medium, the computer-readable storage medium storing a computer program which, when executed by a processor, realizes the steps of the above methods.
In another aspect, the present invention further provides a deep learning method, the deep learning method being based on the above deep learning chip or the above deep learning chip cascade system, the deep learning method comprising the following steps:
placing the deep learning instruction set and the data in the storage unit;
the central controller controlling, according to the deep learning instruction set, so that: the neuron circuits in the neuron array are loaded with the current processing mode configuration information corresponding to the current neural network layer instruction and the corresponding data to be processed, wherein each neuron circuit adjusts to a corresponding computing architecture according to the current processing mode configuration information and executes corresponding neural network layer node data processing, and after the current neural network layer processing task indicated by the current neural network layer instruction is completed, the next neural network layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed.
In the present invention, the neuron circuit comprises the following structure: a computing module; a configuration information storage module for storing neuron processing mode configuration information; and a control module for controlling the computing module, according to the processing mode configuration information, to adjust to a corresponding computing architecture and execute corresponding neural network layer node data processing. In this way, the neuron circuit, and the deep learning chip in which it is applied, can be flexibly configured according to demands such as scene function, neural network type, neural network scale and neuron operation mode, so that both can be reconfigured according to actual neural network computing needs. The complex and diverse neural network computing demands of rapid iteration can thus be met, the circuit can be widely applied in fields where computing resources are limited and a degree of architectural reconfigurability is required, and the application range of deep learning chips is extended.
Detailed description of the invention
Fig. 1 is a structural schematic diagram of the neuron circuit provided by embodiment one of the present invention;
Fig. 2 is a structural schematic diagram of the neuron circuit provided by embodiment two of the present invention;
Fig. 3 is a structural schematic diagram of the neuron circuit provided by embodiment three of the present invention;
Fig. 4 is a structural schematic diagram of the neuron circuit provided by embodiment four of the present invention;
Fig. 5 is a structural schematic diagram of the deep learning chip provided by embodiment five of the present invention;
Fig. 6 is a structural schematic diagram of the deep learning chip cascade system provided by embodiment eight of the present invention;
Fig. 7 is a structural schematic diagram of the deep learning system provided by embodiment nine of the present invention;
Fig. 8 is a flow diagram of the neuron control method provided by embodiment ten of the present invention;
Fig. 9 is a flow diagram of the deep learning control method provided by embodiment eleven of the present invention;
Fig. 10 is a schematic diagram of the data structure of a convolutional network layer instruction in an application example of the invention;
Fig. 11 is a schematic diagram of the data structure of a pooling network layer instruction in an application example of the invention;
Fig. 12 is a schematic diagram of the data structure of a fully connected network layer instruction in an application example of the invention;
Fig. 13 is a schematic diagram of the data structure of an activation function network layer instruction in an application example of the invention;
Fig. 14 is a schematic diagram of the data structure of a state action network layer instruction in an application example of the invention;
Fig. 15 is a structural schematic diagram of the CRNA-architecture chip in an application example of the invention;
Fig. 16 is a structural schematic diagram of a neuron circuit in an application example of the invention;
Fig. 17 is a schematic diagram of the 128-state state machine control flow of the central controller in an application example of the invention.
Specific embodiment
In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Specific implementations of the invention are described in detail below in conjunction with specific embodiments:
Embodiment one:
Fig. 1 shows the structure of the neuron circuit provided by embodiment one of the present invention; in particular, a digital neuron circuit used to build a deep learning neural network. The deep learning neural network performs orderly processing of the input data through each required neural network layer, while the neuron circuit executes the data processing required at the corresponding node of a neural network layer. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
A computing module 101, whose computing architecture can be adjusted to execute different neural network layer node data processing. In the present embodiment, the computing module 101 can be used to perform single operations such as multiplication, addition or activation with an activation function, or a flexible combination of different operations. That the computing module 101 can execute at least two different kinds of neural network layer node data processing carries at least the following three meanings. First, it corresponds to different types of neural network layers, such as convolutional network layers, pooling network layers, fully connected network layers, activation function network layers and state action network layers, whose data processing requirements differ; the computing module 101 can satisfy at least two such requirements. This ability to adapt to different kinds of neural network layer data processing can be realized by flexibly combining operations such as multiplication, addition and activation as required: when a computing module 101 needs to perform convolutional network layer data processing, one flexible combination of the above operations satisfies the convolutional layer's requirement; when the same computing module 101 needs to perform fully connected layer data processing, another flexible combination satisfies that requirement. This flexible combination for different demands is realized together with the configuration information storage module 102 and the control module 103 described below; in this sense, a neuron circuit can perform the data processing of a node in one neural network layer and also of a node in another neural network layer. Second, a neuron circuit can perform the data processing corresponding to different nodes in neural network layers of the same type; for example, one neuron circuit can process a node in a first convolutional network layer and also a node in a second convolutional network layer. Third, a neuron circuit can perform the data processing corresponding to different nodes within the same neural network layer; that is, the neuron circuit is multiplexed across the data processing required by that layer. When different node data processing is performed, the computing architecture of the computing module 101 differs accordingly, and the computing module 101 can be controlled to adjust its computing architecture to adapt to the different node data processing. When the data processing of all nodes in a neural network layer is completed, the data processing of that layer is completed; when every layer of the neural network is completed, the data processing of the whole neural network is completed.
A configuration information storage module 102, for storing neuron processing mode configuration information. In the present embodiment, the processing mode configuration information indicates the configuration required when the neuron circuit performs the corresponding neural network layer node data processing; it can indicate, for example, which node operations the neuron circuit needs to realize.
A control module 103, for controlling the computing module, according to the processing mode configuration information, to adjust to a corresponding computing architecture and execute corresponding neural network layer node data processing.
With the present embodiment, a neuron circuit can perform the data processing of a node in any neural network layer of the whole deep learning neural network. The neuron circuit can therefore be flexibly configured according to demands such as scene function, neural network type, neural network scale and neuron operation mode, and reconfigured according to actual neural network computing needs, meeting the complex and diverse neural network computing demands of rapid iteration. It can be widely applied in fields where computing resources are limited and a degree of architectural reconfigurability is required, extending the application range of deep learning chips.
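The three-module structure above can be sketched in software, purely for illustration (this is not the patented hardware): a configuration store holds the processing mode, and a control step dispatches to the computing architecture that mode selects. All class names, mode names and fields here are assumptions for the sketch.

```python
# Illustrative software model of embodiment one: configuration store,
# control module, and a computing module "reconfigured" per layer node.

class NeuronCircuit:
    def __init__(self):
        self.config = None  # configuration information storage module

    def load_config(self, mode_config):
        """Store the neuron processing mode configuration information."""
        self.config = mode_config

    def process(self, inputs, weights=None):
        """Control module: dispatch to the computing architecture that the
        current mode configuration selects."""
        mode = self.config["mode"]
        if mode == "conv":          # multiply-accumulate node
            return sum(x * w for x, w in zip(inputs, weights))
        if mode == "pool_max":      # pooling node needs no parameters
            return max(inputs)
        if mode == "relu":          # activation-function node
            return [x if x > 0 else 0 for x in inputs]
        raise ValueError(f"unknown mode {mode!r}")

neuron = NeuronCircuit()
neuron.load_config({"mode": "conv"})
print(neuron.process([1, 2, 3], [0.5, 0.5, 0.5]))  # 3.0
neuron.load_config({"mode": "pool_max"})           # reconfigure in place
print(neuron.process([1, 5, 3]))                   # 5
```

The point of the sketch is the last four lines: the same circuit serves a convolutional node, then, after a new configuration is loaded, a pooling node.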
Embodiment two:
This embodiment, on the basis of the neuron circuits of the other embodiments, further provides the following content:
As shown in Fig. 2, the neuron circuit further includes:
A parameter storage module 201, for storing the parameters required by neural network layer node data processing. In the present embodiment, the parameters can be the neural network parameters obtained by training.
An address generation module 202, controlled by the control module 103, for looking up the parameters corresponding to the data targeted by the neural network layer node data processing; the parameters found can be input to the computing module 101 to participate in the corresponding data processing.
In the present embodiment, since the data processing of some types of neural network layers, such as pooling network layers and activation function network layers, does not need to call parameters, a neuron circuit can be configured without the above parameter storage module 201 and address generation module 202. Other layers, such as convolutional network layers, fully connected network layers and state action network layers, need to call neural network parameters during data processing, so the parameter storage module 201 and address generation module 202 need to be configured in the neuron circuit. In this way, the broad applicability of the neuron circuit can be enhanced.
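The parameter store and address generator can be sketched as follows; the flat-array parameter layout and the base-plus-offset addressing scheme are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch of embodiment two: a parameter store plus an address
# generation module that, under control-module direction, looks up the
# trained weight matching each incoming datum.

class ParameterStore:
    def __init__(self, params):
        self.params = list(params)   # trained neural network parameters

    def read(self, addr):
        return self.params[addr]

class AddressGenerator:
    """Generates the parameter address for node `node_id`, input `i`
    (assumed layout: node-major, inputs_per_node weights per node)."""
    def __init__(self, inputs_per_node):
        self.inputs_per_node = inputs_per_node

    def addr(self, node_id, i):
        return node_id * self.inputs_per_node + i

store = ParameterStore([0.1, 0.2, 0.3, 0.4])   # weights for 2 nodes x 2 inputs
gen = AddressGenerator(inputs_per_node=2)

# Multiply-accumulate for node 1, with addresses generated per datum:
data = [10, 20]
acc = sum(d * store.read(gen.addr(1, i)) for i, d in enumerate(data))
print(acc)  # 10*0.3 + 20*0.4 = 11.0
```

A pooling or activation node would simply skip both modules, matching the optional-configuration point above.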
Embodiment three:
This embodiment, on the basis of the neuron circuits of the other embodiments, further provides the following content:
As shown in Fig. 3, the neuron circuit further includes:
A temporary storage module 301, for storing the intermediate data of neural network layer node data processing.
In the present embodiment, some types of neural networks, such as convolutional networks and local area networks, do not need the intermediate data produced in the neuron circuit to be stored for subsequent processing, so a neuron circuit can be configured without the above temporary storage module 301. Other networks, such as reinforcement learning networks and recurrent networks, need the neuron circuit to use the intermediate data produced by processing, so the temporary storage module 301 needs to be configured in the neuron circuit. This likewise enhances the broad applicability of the neuron circuit.
Example IV:
This embodiment, on the basis of the neuron circuits of the other embodiments, further provides the following content:
As shown in Fig. 4, the computing module 101 includes:
A basic computing module 401, comprising a multiplier, an adder and/or an activation function module, etc.
A gating module 402, for executing corresponding gating actions under the control of the control module 103 so that the basic computing module 401 forms the corresponding computing architecture; the gating module 402 can include a multiplexer (MUX) and/or a demultiplexer (DEMUX), etc.
In the present embodiment, the basic computing module 401 can carry out basic operations such as multiplication, addition and activation with an activation function, obtaining any required parameters from the parameter storage module 201 described above. The gating module 402 adjusts the basic computing module 401 as required during basic processing, obtaining the computing architecture needed by the current neural network layer node data processing and thereby realizing the reconfiguration of the computing module 101.
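In software terms, the gating idea can be modelled as a multiplexer select line choosing which datapath the basic elements are wired into. The two datapaths shown are illustrative assumptions; a real MUX/DEMUX network would gate signals, not function objects.

```python
# Illustrative sketch of embodiment four: basic computing elements plus a
# MUX-style gating stage that wires them into the datapath a layer needs.

def multiply(a, b):           # multiplier
    return a * b

def add(a, b):                # adder
    return a + b

def relu(x):                  # activation function module
    return x if x > 0 else 0

def mux(select, *paths):
    """Multiplexer: route execution to the selected datapath."""
    return paths[select]

def conv_node(xs, ws):        # MAC datapath: multipliers feeding an adder
    acc = 0
    for x, w in zip(xs, ws):
        acc = add(acc, multiply(x, w))
    return acc

def activation_node(xs, _ws): # activation-only datapath; weights unused
    return [relu(x) for x in xs]

# The control module drives the select line to reconfigure the datapath:
datapath = mux(0, conv_node, activation_node)
print(datapath([1, -2, 3], [1, 1, 1]))   # 2
datapath = mux(1, conv_node, activation_node)
print(datapath([1, -2, 3], None))        # [1, 0, 3]
```

Changing only the select value reuses the same basic elements in a different arrangement, which is the reconfiguration the embodiment describes.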
Embodiment five:
Fig. 5 shows the structure of the deep learning chip provided by embodiment five of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
A storage unit 501, for storing a deep learning instruction set and the data targeted by deep learning, the deep learning instruction set comprising several neural network layer instructions with a predetermined processing order. In the present embodiment, the deep learning instruction set includes all the neural network layer instructions covered by the deep learning task, such as convolutional network layer instructions, pooling network layer instructions, fully connected network layer instructions, activation function network layer instructions and state action network layer instructions. This instruction set usually also includes other information needed to complete the deep learning task, such as neural network information and neural network structure information. The neural network information can indicate whether the network is a convolutional network, a local area network, a recurrent network, a reinforcement learning network, etc.; the neural network structure information can include the number of neural network layers, the number of nodes in each layer, and indication information on which operations each layer needs to realize. This indication information corresponds to the processing mode configuration information stored in the configuration information storage module 102 described above.
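As a rough illustration, such an instruction set could be represented as an ordered list of per-layer instructions plus network-level information. All field names below are assumptions; the patent's Figs. 10-14 define the actual per-layer-type instruction formats.

```python
# Illustrative data-structure sketch of embodiment five's deep learning
# instruction set: network information plus ordered layer instructions.

deep_learning_instruction_set = {
    "network_type": "convolutional",   # conv / local area / recurrent / RL
    "num_layers": 3,                   # neural network structure information
    "layer_instructions": [            # predetermined processing order
        {"op": "conv", "nodes": 16, "kernel": (3, 3), "stride": 1},
        {"op": "pool", "nodes": 16, "window": (2, 2), "mode": "max"},
        {"op": "full", "nodes": 10},
    ],
}

# The central controller walks the instructions in order, deriving each
# neuron circuit's processing mode configuration from the current one:
for instr in deep_learning_instruction_set["layer_instructions"]:
    mode_config = {"mode": instr["op"], "nodes": instr["nodes"]}
    print(mode_config["mode"], mode_config["nodes"])
```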
A neuron array 504 composed of several neuron circuits 502 as described above. Their specific functions and structures are as described in the other embodiments and are not repeated here.
A central controller 503, for controlling, according to the deep learning instruction set, so that: the neuron circuits 502 in the neuron array 504 are loaded from the storage unit 501 with the current processing mode configuration information corresponding to the current neural network layer instruction and the corresponding data to be processed, and after the current neural network layer processing task indicated by the current neural network layer instruction is completed, the next neural network layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed. In the present embodiment, various types of controllers can be used as the central controller 503, such as Advanced RISC Machine (ARM) controllers, Intel series controllers or HiSilicon series controllers. A finite state machine can be used in its architecture, completing state transitions according to conditions so as to control the workflow of the neural network it covers, specifically including the configuration flow, the neural network operation flow and the data transmission flow. This may involve the neuron array 504 of the whole deep learning chip performing single-batch processing of a neural network layer, or performing multi-batch processing of a neural network layer, where multi-batch processing involves multiplexing the neuron circuits 502. The central controller 503 is mainly used to configure the deep learning chip constituting the neural network, so that the neuron array 504 can carry out orderly data processing according to the neural network layer instructions in the deep learning instruction set. During the operation of the whole neural network, the central controller 503 realizes the main operations of the neural network, including instruction update and content decoding.
An input-output unit 505, for realizing the transmission of data between the storage unit 501 and the neuron array 504.
The processing flow of a deep learning task is roughly as follows:
The current neural network layer processing task is executed first. The current processing mode configuration information in the storage unit 501 corresponding to the current neural network layer instruction is loaded through the input-output unit 505 into the neuron circuits 502 of the neuron array 504, completing the configuration of the neuron circuits 502. The data to be processed in the storage unit 501 are then loaded through the input-output unit 505 into the neuron circuits 502 of the neuron array 504, and the neuron circuits 502, on the basis of the completed configuration, process the loaded data; the processing results serve as the data to be processed by the next neural network layer processing task. The next neural network layer processing task is then executed in the same way, until all neural network layer processing tasks are completed, finally completing the deep learning task.
In addition, if neural network parameters need to be introduced into the processing of the deep learning task, then in the above flow, after the processing mode configuration information is loaded into the neuron circuits 502, the corresponding parameters can also be loaded from the storage unit 501 into the neuron circuits 502 of the neuron array 504 before the data are loaded and processed.
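The flow above can be condensed into a runnable sketch: for each layer instruction the controller configures the array, loads parameters only where needed, streams the data in, and feeds each layer's output to the next. The layer operations (`scale`, `relu`) and all field names are toy stand-ins for illustration, not the patented circuits.

```python
# Illustrative sketch of the embodiment-five processing flow: configure,
# optionally load parameters, process, and chain outputs between layers.

def run_deep_learning_task(instruction_set, data, param_store):
    activations = data
    for instr in instruction_set:                  # predetermined order
        # 1) place the processing-mode configuration in the array
        mode = instr["op"]
        # 2) place parameters only for layers that need them
        params = param_store.get(instr["layer_id"]) if instr.get("has_params") else None
        # 3) place the data and process; output feeds the next layer
        if mode == "scale":
            activations = [x * params for x in activations]
        elif mode == "relu":
            activations = [x if x > 0 else 0 for x in activations]
    return activations

params = {0: 2}
instrs = [
    {"layer_id": 0, "op": "scale", "has_params": True},
    {"layer_id": 1, "op": "relu"},
]
print(run_deep_learning_task(instrs, [1, -3, 4], params))  # [2, 0, 8]
```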
With the present embodiment, the neuron circuits, and the deep learning chip in which they are applied, can be flexibly configured according to demands such as scene function, neural network type, neural network scale and neuron operation mode, so that both can be reconfigured according to actual neural network computing needs, meeting the complex and diverse neural network computing demands of rapid iteration. They can be widely applied in fields where computing resources are limited and a degree of architectural reconfigurability is required, extending the application range of deep learning chips.
Embodiment six:
This embodiment, on the basis of the deep learning chips of the other embodiments, further provides the following:
The input-output unit 505 is a stream-in/stream-out shift register, and independent data transmission paths are established between the neuron circuits 502 and the input-output unit 505.
In the present embodiment, the data to be processed for each node of a neural network layer, stored in the storage unit 501, are transmitted through the input shift register and the independent data transmission paths to the corresponding neuron circuits 502 in the neuron array 504 for processing. After processing is completed, the results are transmitted back through the independent data transmission paths and the output shift register to be stored in the storage unit 501. If the processing results of all nodes of the current neural network layer are the data to be processed by the next neural network layer, then after all nodes of the current layer complete their data processing, all the processing results serve as the data to be processed by the next layer.
With the present embodiment, stream-in/stream-out pipelined data transmission can be realized. Compared with the multiple fan-out circuits required by traditional designs in which multiple neurons access most of the data, there is no longer any need to compute access addresses for data storage; reads and writes are greatly simplified, the bandwidth requirement on the memory is reduced, and input-output power consumption is considerably lowered. Moreover, the use of shift registers, and of data transmission paths established between the neuron circuits 502 and the shift registers, makes the neuron circuits 502 relatively independent of each other, avoiding the contention that arises when the many cores of a neuron array access the same storage. There is no need for the arbitration mechanisms on the communication bus and the complex cache synchronization mechanisms required by traditional multi-core processor systems to avoid conflicts. Thus, thanks to the multi-array cascaded input-output system formed by introducing the stream-in/stream-out registers, the computing throughput can grow linearly with the number of neuron circuits 502, while the storage access mechanism is optimized and useless computation is avoided.
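A behavioural sketch of the stream-in/stream-out idea: one datum shifts in per cycle, each neuron circuit reads its own fixed cell over a private path, and no neuron ever computes a memory address or arbitrates for a shared bus. The deque-based model is an illustrative assumption, not the register-level design.

```python
# Illustrative model of embodiment six: a stream-in/stream-out shift
# register with one fixed tap per neuron circuit.

from collections import deque

class ShiftRegister:
    def __init__(self, length):
        self.cells = deque([None] * length, maxlen=length)

    def shift_in(self, value):
        """One clock: push a value in; the oldest value streams out."""
        out = self.cells[0]
        self.cells.append(value)
        return out

    def taps(self):
        """Each neuron circuit reads its own fixed cell (independent path)."""
        return list(self.cells)

sr = ShiftRegister(3)
for v in [10, 20, 30]:
    sr.shift_in(v)
print(sr.taps())          # [10, 20, 30] - one value per neuron, no addressing
print(sr.shift_in(40))    # 10 streams out as 40 streams in
```

Because each neuron only ever reads its own tap, adding neurons adds taps without adding bus traffic, which is the mechanism behind the linear-throughput claim above.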
Embodiment seven:
This embodiment, on the basis of the deep learning chips of the other embodiments, further provides the following:
The storage unit 501 is also used to store the intermediate data of the current neural network layer node data processing.
The purpose of this embodiment is the same as that of embodiment three above and is not repeated here.
Embodiment eight:
Fig. 6 shows the structure of the deep learning chip cascade system provided by embodiment eight of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
At least two deep learning chips 601, as in any of the above embodiments, with cascade connections between them.
With the present embodiment, multiple chips can be cascaded to expand parallel processing capability and accelerate the chips, meeting the use demands of different scenes.
Embodiment nine:
Fig. 7 shows the structure of the deep learning system provided by embodiment nine of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, including:
At least one deep learning chip 701 as described above, and peripheral components 702 connected to the deep learning chip 701. In the present embodiment, when there are at least two deep learning chips 701, the chips can be cascaded, or can be mutually independent without cascading. The peripheral components 702 can be other embedded processors, sensors, etc.
Embodiment ten:
Fig. 8 shows the flow of the neuron control method provided by embodiment ten of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, involving the following steps:
In step S801, neuron processing mode configuration information is obtained.
In step S802, according to the processing mode configuration information, the computing module is controlled to adjust to a corresponding computing architecture and execute corresponding neural network layer node data processing.
The content involved in the processing of steps S801 and S802 is presented in detail in the related content of the other embodiments, which can be referred to here and is not repeated.
Embodiment eleven:
Fig. 9 shows the flow of the deep learning control method provided by the eleventh embodiment of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown, involving the following steps:
In step S901, a deep learning instruction set is obtained, which includes several neural net layer instructions with a predetermined processing order.
In step S902, according to the deep learning instruction set, control is exercised so that: the neuron circuits in the neuron array are loaded with the processing mode configuration information corresponding to the current neural net layer instruction and the corresponding data to be processed, wherein each neuron circuit adjusts to the corresponding computing architecture according to the current processing mode configuration information and executes the corresponding neural net layer node data processing; after the current neural net layer processing task indicated by the current neural net layer instruction is completed, the next neural net layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed.
The content involved in steps S901 and S902 is presented in detail in the related content of the other embodiments, which may be referenced here and is not repeated.
Embodiment twelve:
In the embodiment of the present invention, a computer readable storage medium is provided, which stores a computer program; when the computer program is executed by a processor, the steps in the above method embodiments ten or eleven are realized, for example, steps S801 to S802 shown in Fig. 8.
The computer readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, memories such as ROM/RAM, magnetic disks, optical discs, and flash memory.
Embodiment thirteen:
The flow of the deep learning method provided by the thirteenth embodiment of the present invention is based on a deep learning chip as described above, or a deep learning chip cascade system as described above, or a deep learning system as described above. For ease of description, only the parts related to the embodiment of the present invention are shown, involving the following steps:
The deep learning instruction set and data are placed in the storage unit 501;
The central controller 503, according to the deep learning instruction set, exercises control so that: the neuron circuits 502 in the neuron array 504 are loaded with the processing mode configuration information corresponding to the current neural net layer instruction and the corresponding data to be processed, wherein each neuron circuit 502 adjusts to the corresponding computing architecture according to the current processing mode configuration information and executes the corresponding neural net layer node data processing; after the current neural net layer processing task indicated by the current neural net layer instruction is completed, the next neural net layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed.
Below, by way of an application example, the neuron circuit, chip, system, methods, and storage medium involved in the above embodiments are described in detail.
This application example specifically relates to a deep learning instruction set and the design and application of a Coarse-grained Reconfigurable Neuromorphic Array (CRNA) architecture based on that instruction set; the design and application cover the related content of the neuron circuit, chip, system, methods, and storage medium involved in the above embodiments. This application example uses a fully digital circuit design for the neurons and the neuron array, introduces a pipelined design method, and, by way of dynamic configuration, flexibly realizes the configuration of the neural network type, the neural network structure (the number of neural net layer nodes and the number of neural net layers), the combined application of multiple neural networks, the operating modes of the neurons, and so on. Using this application example, the data processing speed can be increased substantially, and the demands of today's fast-iterating neural network algorithms can be met. It has the characteristics of low power consumption, fast processing speed, and reconfigurability, is particularly suitable for usage scenarios with limited computing resources, small memory capacity, strict power consumption requirements, and demands for fast processing, and broadens the application field of neural-network-based hardware and software.
First, the deep learning instruction set is described. The instruction set is the core of processor design and is the interface between the software system and the hardware chip. The instruction set of this application example supports a layer-by-layer description of the neural network, and specifically involves the following five (or more) types of neural net layer instructions. In this application example, the instruction width is 96 bits; of course, in other application examples, the instruction width may be adapted. The instruction types are: the convolutional network layer instruction shown in Fig. 10, the pooling network layer instruction shown in Fig. 11, the fully connected network layer instruction shown in Fig. 12, the activation function network layer instruction shown in Fig. 13, and the state-action network layer instruction shown in Fig. 14. Values can be assigned to the corresponding data bits of an instruction to realize the corresponding function. For example, in Fig. 10, bit 70 may be assigned "1" to indicate padding, or "0" to indicate no padding, and bits 65-67 may be assigned "001" to indicate a convolution kernel of size 1 × 1, "010" to indicate a kernel of size 2 × 2, and so on. In Fig. 11, bits 65-67 may be assigned "000" to indicate a max-pooling strategy, "001" to indicate min-pooling, "010" to indicate average-pooling, and so on, and bit 70 may be assigned "1" to indicate forward, or "0" to indicate reverse. In Fig. 13, bits 5-9 may be assigned "00001" to indicate that the activation function mode is the Rectified Linear Unit (ReLU) function, "00010" to indicate the Sigmoid function, while "00011"-"11111" indicate that the coding is extensible. In Fig. 14, bits 45-47 may be assigned "000" to indicate that the iterative strategy adopts the Deep Q-learning (DQN) algorithm, "001" to indicate the State-Action-Reward-State-Action (SARSA) algorithm, and "010"-"111" indicate that the coding is extensible; the E-greedy probability may take values from 0 to 100, and so on.
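To make the bit-field assignments above concrete, the sketch below packs and unpacks a 96-bit instruction word using the positions quoted in the text (bit 70, bits 65-67, bits 5-9, bits 45-47). The helper names and the exact bit ordering inside each field are illustrative assumptions, not the patent's actual encoder.

```python
# Illustrative sketch (not the patent's encoder): 96-bit instruction words
# are modeled as Python ints, with fields written at the bit positions
# quoted in the description. Field ordering within a field is assumed.

def set_field(word: int, lsb: int, width: int, value: int) -> int:
    """Write `value` into bits [lsb, lsb+width) of an instruction word."""
    assert 0 <= value < (1 << width), "value does not fit in the field"
    mask = ((1 << width) - 1) << lsb
    return (word & ~mask) | (value << lsb)

def get_field(word: int, lsb: int, width: int) -> int:
    """Read the `width`-bit field starting at bit `lsb`."""
    return (word >> lsb) & ((1 << width) - 1)

# Convolutional layer instruction: bit 70 = padding flag ("1" = pad),
# bits 65..67 = kernel-size code ("001" -> 1x1, "010" -> 2x2, ...).
conv = 0
conv = set_field(conv, 70, 1, 0b1)    # enable padding
conv = set_field(conv, 65, 3, 0b010)  # 2x2 convolution kernel
assert get_field(conv, 70, 1) == 1
assert get_field(conv, 65, 3) == 0b010

# Pooling layer instruction: bits 65..67 select the pooling strategy
# ("000" = max, "001" = min, "010" = average), bit 70 = direction.
POOL_MAX, POOL_MIN, POOL_AVG = 0b000, 0b001, 0b010
pool = set_field(0, 65, 3, POOL_AVG)
assert get_field(pool, 65, 3) == POOL_AVG
```

The same two helpers cover the activation field (bits 5-9) and the reinforcement-learning strategy field (bits 45-47) of Figs. 13 and 14.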
Secondly, the design and application of the CRNA architecture chip are described. The CRNA architecture chip proposed by this application example generally comprises a storage unit 1501 as shown in Fig. 15, an input-output unit 1502, several neuron circuits 1503, and a central controller 1504. The architecture breaks the limitation of the traditional von Neumann architecture: through distributed memory, storage use is optimized, and through dynamic configuration, different network modes, neural network structures, and combined applications of various modes of neural networks are flexibly realized. Based on the configuration of the control modules in the neuron circuits 1503 and the coordination of the central controller 1504, storage and a pipelined fast computing function for the neural network are realized, and through the optimized hardware design of the artificial neuron, the computing capability of the entire CRNA architecture is greatly improved. The CRNA architecture makes full use of memory resources, breaks through the von Neumann architecture to further improve computing capability, effectively reduces the volume of transmitted data, and greatly lowers power consumption. The CRNA architecture proposed by this application example supports the deployment of multiple hybrid neural net layers, and has the advantages of flexible reconfigurability, low power consumption, and high computing capability.
The functions of the units in the CRNA architecture may be as follows:
(1) Memory:
The memory includes the storage unit 1501 shown in Fig. 15 and the parameter memory module 15031 in each neuron circuit 1503. The storage unit 1501 can be deployed in a distributed manner as a first storing sub-unit 15011, a second storing sub-unit 15012, and a third storing sub-unit 15013; these storing sub-units may also be deployed together in one physical store. They are described as follows:
The first storing sub-unit 15011 stores the data addressed by neural network processing, including: input data, inter-layer storage data of the neural network, output data, and so on.
The parameter memory module stores the parameters needed by the trained neural network's node data processing; the storage of the parameters can be completed in the neural network initialization stage. When the neural network is in the operation stage, the neuron circuit 1503 can read the corresponding parameters in the parameter memory module to complete the corresponding neural net layer node operation; each neuron circuit 1503 reads only its local parameters, thereby avoiding any possibility of data access between neurons.
The second storing sub-unit 15012 determines the neural network type of the CRNA architecture (convolutional network, regional network, recurrent network, or reinforcement learning network) and the neural network structure (the number of neural net layer nodes, the number of neural net layers, the operation realized by each neural net layer), and so on.
The third storing sub-unit 15013 is used particularly for the reinforcement learning network mode or the recurrent network mode, and stores the intermediate data produced by the operation of the reinforcement learning network or recurrent network.
(2) Input and output:
The input-output unit 1502 realizes the serial shifting in of input data and the serial shifting out of output data through an input shift register and an output shift register, respectively; see the sixth embodiment above for details, which are not repeated here.
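As a rough behavioral model of this serial-in/serial-out arrangement, the sketch below shifts values through a register one clock at a time. The class name, depth, and interface are illustrative assumptions, not the actual design of the input-output unit 1502.

```python
from collections import deque

class ShiftRegister:
    """Behavioral sketch of a serial-in/serial-out shift register of the
    kind the input-output unit 1502 is described as using (illustrative)."""
    def __init__(self, depth: int):
        self.stages = deque([0] * depth, maxlen=depth)

    def shift(self, value_in: int) -> int:
        """One clock edge: shift a new value in, return the value shifted out."""
        value_out = self.stages[-1]
        self.stages.appendleft(value_in)
        return value_out

# Feeding a word through a 4-stage register: after `depth` shifts the first
# input value appears at the output, matching the pipeline-fill latency.
sr = ShiftRegister(4)
outputs = [sr.shift(v) for v in [1, 2, 3, 4, 0, 0, 0, 0]]
assert outputs == [0, 0, 0, 0, 1, 2, 3, 4]
```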
(3) Artificial neuron:
The neuron circuit 1503 can, according to its configuration, perform the neural operation of the designated mode on the input data of the neural network and obtain the operation result. The artificial neuron design method of the CRNA architecture can flexibly realize the deployment of a single kind of neural network or the combined deployment of multiple kinds of neural networks, described as follows:
The neuron circuit 1503 may include the structure shown in Fig. 16, involving: a computing module, a configuration information memory module 1601, a control module 1602, a parameter memory module 1603, an address generation module 1604, a temporary storage module 1605, an operation cache module 1606, a configuration information/parameter input module 1607, a data input module 1608, a data output module 1609, and so on. The configuration information memory module 1601 may use a configuration chain register, the operation cache module 1606 may be an accumulator register, and the computing module may include a multiplier, an adder, an activation function module 1610, a gating module, and so on. The functions of the modules are as follows:
The configuration information/parameter input module 1607 inputs the neuron's processing mode configuration information and the neural network parameters to the neuron circuit 1503; the processing mode configuration information configures the operating mode of the neuron.
The gating module can be embodied as multiplexers (MUX) and/or demultiplexers (DEMUX); the MUXes are labeled M1 and M2 in the figure, and the DEMUXes are labeled DM1 and DM2. M1 is used to choose whether to skip the multiplication unit, which is skipped if the parameter read is 0; DM1 is used to control whether the destination of the input content is the configuration information memory module 1601 or the parameter memory module 1603; DM2 is used to specify the activation function or to skip activation processing; M2 selects the output of the activation function. The content selected by M1, M2, DM1, and DM2 is specified by the control module 1602.
The address generation module 1604 guarantees, in real time, that the input data of the neuron matches the parameters read from the parameter memory.
The multiplier and adder form a multiply-add module for carrying out multiply-add operations on data and parameters; the result is stored in the operation cache module 1606 and read in the next cycle as one of the addition inputs. If the calculated result needs to be backed up, the result is also stored in the temporary storage module 1605.
The control module 1602 controls the operating mode of the entire neuron according to the configuration information, including the selection of the MUXes and DEMUXes, the operating mode of the address generation module 1604, and so on.
The data output module 1609 outputs the calculated result of the neuron circuit 1503.
The workflow of the neuron circuit 1503 is approximately as follows:
First, the content serially input through the configuration information/parameter input module 1607 is used to configure the neuron, with the neural network parameters and the configuration information stored in the parameter memory module 1603 and the configuration information memory module 1601, respectively.
Second, after the neuron configuration is completed, the neuron obtains input data from the data input module 1608, finds in the parameter memory module 1603 the neural network parameters matching the input data, and carries out the needed multiply-add operations.
Then, for the multiply-add result, the corresponding activation function is selected, according to the mode content in the selection field of the activation function layer instruction, to carry out the needed activation processing; the neuron activation result is then stored, according to the network mode, in the corresponding memory (the operation cache module 1606, or both the operation cache module 1606 and the temporary storage module 1605).
When the operations on all input data of the current neural net layer node are completed, the output result of the neuron is output through the data output module 1609, and is output by the input-output unit 1502 of the CRNA architecture to be stored.
It should be understood that the above process contains multiply-add operations and activation processing, but in practical applications, all of these operations are configured on demand.
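The workflow above (configure, matched multiply-add into the accumulator, optional activation, output) can be sketched behaviorally as follows. The module names mirror the text, but the floating-point arithmetic and class interface are assumptions for illustration; the actual circuit uses 8-bit fixed-point hardware.

```python
import math

def relu(x: float) -> float:
    return max(0.0, x)

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class Neuron:
    """Behavioral sketch of the neuron workflow described above;
    illustrative only, not the patent's fixed-point implementation."""
    def __init__(self, params, activation=None):
        self.params = list(params)    # parameter memory module 1603
        self.acc = 0.0                # operation cache module 1606 (accumulator)
        self.activation = activation  # DM2/M2 gating: None = skip activation

    def process(self, inputs):
        # the address generation module pairs each input with its parameter
        for x, w in zip(inputs, self.params):
            if w != 0:                # M1: skip the multiplier on a zero weight
                self.acc += x * w     # multiply-add into the accumulator
        out = self.acc
        if self.activation is not None:   # activation selected by instruction
            out = self.activation(out)
        return out                    # data output module 1609

n = Neuron(params=[0.5, 0.0, -1.0], activation=relu)
assert n.process([2.0, 9.9, 0.5]) == 0.5  # 2*0.5 + (skipped) + 0.5*(-1)
```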
(4) Central control:
The central controller 1504 in the CRNA architecture uses a finite state machine; the state machine completes the transitions between different states according to switching conditions, thereby controlling the workflow of the entire architecture, including, as shown in Fig. 17: a configuration flow S1701, a neural network operation flow S1702, a data transfer flow S1703, and so on. Taking the 128-stage state machine control flow shown in Fig. 17, the details are as follows:
In flow S1701, the second storing sub-unit 15012, the parameter memory module 1603, and the first storing sub-unit 15011 are configured according to the algorithm's requirements.
In flow S1702, the state machine control flow of the 128-stage extra-long vector pipeline is involved, including instruction update and content decoding, realizing the main operations of the neural network. The CRNA architecture design uses 128 neurons; when the number of nodes in one neural net layer is greater than the 128 artificial neurons included in the CRNA architecture, the 128 neurons can be used to calculate repeatedly in batches, i.e., the artificial neuron array constituted by the 128 neurons is constantly multiplexed. By reading and decoding instructions, the processing modes of the 128 neurons are configured; the characteristics of the neural network as a whole are configured through global parameters; and the data flow controls the transitions between neural network layers, the input and output of data, parameter distribution, and so on.
In flow S1703, the output result of the neural network is transferred to the host computer for use.
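A minimal software sketch of the three top-level flows of the central controller's state machine follows. The text names only the flows S1701 to S1703, so the transition conditions and function names here are assumptions for illustration.

```python
# Minimal sketch of the central controller's finite state machine, covering
# only the three top-level flows named in the text: S1701 (configure),
# S1702 (compute), S1703 (transfer). Transition conditions are assumed.
CONFIG, COMPUTE, TRANSFER, DONE = "S1701", "S1702", "S1703", "done"

def next_state(state: str, layers_left: int) -> str:
    if state == CONFIG:
        return COMPUTE                 # configuration written, start computing
    if state == COMPUTE:
        # stay in the compute flow until every layer instruction is finished
        return COMPUTE if layers_left > 0 else TRANSFER
    if state == TRANSFER:
        return DONE                    # results handed to the host computer
    return DONE

# Walk a two-layer network through the flows.
state, layers = CONFIG, 2
trace = [state]
while state != DONE:
    if state == COMPUTE and layers > 0:
        layers -= 1                    # one layer's worth of pipeline work
    state = next_state(state, layers)
    trace.append(state)
assert trace == [CONFIG, COMPUTE, COMPUTE, TRANSFER, DONE]
```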
The process realized by the above CRNA architecture chip mainly involves:
First, initial configuration of the memory is carried out according to the configuration information, parameters, and data corresponding to the deep learning instruction set; specifically, the neural network type (mode), the number of neural net layers, the neuron processing modes, and so on are configured.
Second, the corresponding neural network operations are carried out according to the neural network type. Specifically: the corresponding data processing is executed for the current neural net layer instruction to complete the current neural net layer processing task; then, taking all the data processed by the current neural net layer as the pending data of the next neural net layer processing task, the next neural net layer processing task is executed.
When executing a neural net layer task, generally the configuration information is first read from the memory and placed in the neuron circuits in the neuron array to complete the processing mode configuration of the neuron circuits; then the neural network parameters are read from the memory and placed in the neuron circuits in the neuron array; then the pending data is serially input from the memory and processed, with the neural network parameters being called during processing.
In this way, network operations on the data are realized through the 128-stage extra-long pipeline process, with the operation results continually accumulated into the accumulator register. After all the data addressed by one neural net layer instruction have completed the network operations of the neuron array, the result is passed in turn through the output shift register and stored into memory as the input data for the processing of the next neural net layer. When the last neural net layer is reached, the operation result of the last neural net layer is likewise stored in memory, and, according to the data output flow, the operation result is output in turn to the host computer for its subsequent use.
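The layer-by-layer flow above, including the batched reuse of the 128-neuron array when a layer has more nodes than physical neurons, can be sketched as a driver loop. Everything here is illustrative: `run_batch` is a placeholder for one pass through the pipeline, not the patent's hardware.

```python
# Illustrative driver loop for the layer-by-layer flow: when a layer has
# more nodes than the 128 physical neurons, the array is multiplexed in
# batches, and each layer's output becomes the next layer's input.
ARRAY_SIZE = 128

def run_batch(node_indices, layer_input):
    # Placeholder for one pass of the neuron array over `node_indices`;
    # a real pass would run the configured multiply-add/activation pipeline.
    return [sum(layer_input) for _ in node_indices]

def run_layer(num_nodes, layer_input):
    outputs = []
    for start in range(0, num_nodes, ARRAY_SIZE):       # multiplex the array
        batch = range(start, min(start + ARRAY_SIZE, num_nodes))
        outputs.extend(run_batch(batch, layer_input))   # shift results out
    return outputs                                      # next layer's input

data = [1, 2, 3]
for num_nodes in [300, 10]:          # a 300-node layer needs 3 array passes
    data = run_layer(num_nodes, data)
assert len(data) == 10               # the last layer's result goes to the host
```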
The design of this CRNA architecture chip was simulated and logically synthesized based on the 65nm complementary metal oxide semiconductor (CMOS) process of United Microelectronics Corporation (UMC). A reconfigurable array integrating 128 digital neurons was designed as the computing unit; each neuron includes two 1KB data memories and two 1KB parameter memories, and has a control port that can flexibly shut down the neuron in real time, reducing the neuron's dynamic power consumption. The host state machine can flexibly adjust the working states of the memory and the neuron array, and realizes functions such as inter-layer transitions, data input and output, and parameter distribution through data flow control. The synthesis results show that the design of the distributed memory system greatly reduces the fan-out of storage accesses, reduces the complexity of storage control and the bandwidth of data access, and improves the balance of parameter distribution. Through configuration, neural network models of different types and diverse specifications can be effectively deployed. Some test results follow:
Chip simulation: a fully connected neural network of 10 layers with arbitrary numbers of input and output nodes was configured. The waveform diagrams show that the utilization efficiency of the neuron array, when enabled, approaches 100%, and inter-layer transitions incur a delay of only 2 clock cycles of the state machine. The fully connected network function achieves a complete mapping from algorithm to circuit.
Logic synthesis: the synthesis results indicate that this reconfigurable chip design occupies about 1.5 mm², saving considerable hardware resources and lending itself to integration in resource-constrained terminals, as shown in Table 1 below; the power consumption is at the level of tens of milliwatts, making it easy to integrate into low-power terminal devices, as shown in Table 2 below.
Table 1: basic unit usage and area occupancy
Table 2: module power dissipation ratio
This application example has the following advantages:
First, the proposed instructions are assembler-level instructions. Different from the existing operating-system-based platform-level network deployment model frameworks (TensorFlow, Caffe, etc.), they do not need the support of an operating system, directly change the chip's operating mode, have high programming efficiency, and can be deployed directly in ultra-low-power computing scenarios.
Second, the CRNA architecture uses digital-circuit artificial neurons, so that the neurons have the advantages of strong noise resistance, high reliability, high precision, good scalability, and a mature, standardized design method. The design's computational precision uses 8-bit fixed-point quantization; compared with some present deep learning processors that use 1-bit binary quantized networks, this design has higher computational precision.
Third, the implementation of neural net layers is more flexible. Complicated neural networks are deployed by way of array multiplexing, realizing neural network models with different numbers of nodes block by block; the neural computing units are heavily multiplexed, which greatly improves hardware resource utilization, saves hardware cost, and provides high flexibility.
Fourth, the CRNA architecture has reconfigurability and programmability. The CRNA architecture uses a pipelined distributed storage mode, which reduces delay and power consumption, improves the reliability of the system, and makes each computing unit a relatively complete and independent mini-system in both function and structure. The configuration chain connects directly with each individual neuron, and the configuration process has similarity and progression, which makes reconfigurability easier to realize; different network modes are configured globally through the configuration instruction memory. Reconfigurability and programmability are therefore realized both globally and locally.
Fifth, integrability and scalability. The distributed storage mode reduces delay and power consumption, improves the reliability of the system, and also makes the distribution of data and parameters more uniform, giving better balance relative to a centralized storage mode and thus good integrability; the CRNA architecture can be used in combination with embedded processors and sensors, and multiple accelerator chips can be cascaded to expand parallel processing capability and meet the usage demands of different scenarios.
In sum, the proposed instructions and CRNA architecture of this application example have high speed, low power consumption, and flexible reconfigurable capability, provide a reliable computing platform for today's different types of deep neural networks, and promote the extensive use of deep neural network algorithms in fields such as mobile Internet-of-Things terminal devices, unmanned aerial vehicles, and automatic driving.
It should be understood that each unit or module involved in the above embodiments can be realized by a corresponding hardware or software unit; each unit or module can be an independent software or hardware unit or module, or can be integrated as one software or hardware unit or module, which is not intended to limit the present invention.
The foregoing is merely illustrative of the preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent replacements, and improvements made within the spirit and principle of the present invention should all be included in the protection scope of the present invention.
Claims (13)
1. A neuron circuit, characterized in that the neuron circuit comprises:
a computing module;
a configuration information memory module, for storing neuron processing mode configuration information; and
a control module, for controlling, according to the processing mode configuration information, the computing module to adjust to a corresponding computing architecture and to execute corresponding neural net layer node data processing.
2. The neuron circuit as described in claim 1, characterized in that the neuron circuit further comprises:
a parameter memory module, for storing parameters needed for the neural net layer node data processing; and
an address generation module, for being controlled by the control module to find the parameters corresponding to the data addressed by the neural net layer node data processing.
3. The neuron circuit as described in claim 1, characterized in that the neuron circuit further comprises:
a temporary storage module, for storing the intermediate data of the neural net layer node data processing.
4. The neuron circuit as described in claim 1, characterized in that the computing module comprises:
a basic calculation module, the basic calculation module including: a multiplier, an adder, and/or an activation function module; and
a gating module, for executing corresponding gating actions under the control of the control module so that the basic calculation module constitutes the corresponding computing architecture, the gating module including: a multiplexer and/or a demultiplexer.
5. A deep learning chip, characterized in that the deep learning chip comprises:
a storage unit, for storing a deep learning instruction set and the data addressed by deep learning, the deep learning instruction set including: several neural net layer instructions with a predetermined processing order;
a neuron array constituted by several neuron circuits as described in any one of claims 1 to 4;
a central controller, for controlling, according to the deep learning instruction set, so that: the neuron circuits in the neuron array are loaded from the storage unit with the current processing mode configuration information corresponding to the current neural net layer instruction and the corresponding data to be processed, and, after the current neural net layer processing task indicated by the current neural net layer instruction is completed, the next neural net layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed; and
an input-output unit, for realizing the transmission of data between the storage unit and the neuron array.
6. The deep learning chip as described in claim 5, characterized in that the input-output unit is a serial-in/serial-out shift register, establishing independent data transmission paths between the neuron circuits and the input-output unit.
7. The deep learning chip as described in claim 5, characterized in that the storage unit is also used to: store the intermediate data of the neural net layer node data processing.
8. A deep learning chip cascade system, characterized in that the deep learning chip cascade system comprises: at least two deep learning chips as described in any one of claims 5 to 7, with cascade connections between them.
9. A deep learning system, characterized in that the deep learning system comprises: at least one deep learning chip as described in any one of claims 5 to 7, and peripheral components connected with the deep learning chip.
10. A neuron control method, characterized in that the neuron control method includes the following steps:
obtaining neuron processing mode configuration information;
according to the processing mode configuration information, controlling a computing module to adjust to a corresponding computing architecture and to execute corresponding neural net layer node data processing.
11. A deep learning control method, characterized in that the deep learning control method includes the following steps:
obtaining a deep learning instruction set, the deep learning instruction set including: several neural net layer instructions with a predetermined processing order;
according to the deep learning instruction set, controlling so that: the neuron circuits in a neuron array are loaded with the current processing mode configuration information corresponding to the current neural net layer instruction and the corresponding data to be processed, wherein each neuron circuit adjusts to a corresponding computing architecture according to the current processing mode configuration information and executes corresponding neural net layer node data processing, and, after the current neural net layer processing task indicated by the current neural net layer instruction is completed, the next neural net layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed.
12. A computer readable storage medium, the computer readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps in the method of claim 10 or 11 are realized.
13. A deep learning method, characterized in that the deep learning method is based on the deep learning chip as described in any one of claims 5 to 7, or the deep learning chip cascade system as described in claim 8, or the deep learning system as described in claim 9, the deep learning method including the following steps:
placing the deep learning instruction set and the data in the storage unit;
the central controller, according to the deep learning instruction set, controlling so that: the neuron circuits in the neuron array are loaded with the current processing mode configuration information corresponding to the current neural net layer instruction and the corresponding data to be processed, wherein the neuron circuits adjust to a corresponding computing architecture according to the current processing mode configuration information and execute corresponding neural net layer node data processing, and, after the current neural net layer processing task indicated by the current neural net layer instruction is completed, the next neural net layer processing task is executed, until the deep learning task indicated by the deep learning instruction set is completed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811076248.0A CN109409510B (en) | 2018-09-14 | 2018-09-14 | Neuron circuit, chip, system and method thereof, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811076248.0A CN109409510B (en) | 2018-09-14 | 2018-09-14 | Neuron circuit, chip, system and method thereof, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109409510A (en) | 2019-03-01 |
CN109409510B (en) | 2022-12-23 |
Family
ID=65464184
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811076248.0A Active CN109409510B (en) | 2018-09-14 | 2018-09-14 | Neuron circuit, chip, system and method thereof, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109409510B (en) |
2018-09-14: Application CN201811076248.0A filed; granted as CN109409510B (status: Active)
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8306931B1 (en) * | 2009-08-06 | 2012-11-06 | Data Fusion & Neural Networks, LLC | Detecting, classifying, and tracking abnormal data in a data stream |
US20150310311A1 (en) * | 2012-12-04 | 2015-10-29 | Institute Of Semiconductors, Chinese Academy Of Sciences | Dynamically reconfigurable multistage parallel single-instruction multiple-data array processing system |
CN106295799A (en) * | 2015-05-12 | 2017-01-04 | 核工业北京地质研究院 | Implementation method of a deep learning multilayer neural network |
CN108416436A (en) * | 2016-04-18 | 2018-08-17 | 中国科学院计算技术研究所 | Method and system for neural network partitioning using a multi-core processing module |
CN105930903A (en) * | 2016-05-16 | 2016-09-07 | 浙江大学 | Digital-analog hybrid neural network chip architecture |
CN106022468A (en) * | 2016-05-17 | 2016-10-12 | 成都启英泰伦科技有限公司 | Artificial neural network processor integrated circuit and design method therefor |
CN106971229A (en) * | 2017-02-17 | 2017-07-21 | 清华大学 | Neural network computing core information processing method and system |
CN107016175A (en) * | 2017-03-23 | 2017-08-04 | 中国科学院计算技术研究所 | Automated design method and device for a neural network processor, and optimization method |
CN107169560A (en) * | 2017-04-19 | 2017-09-15 | 清华大学 | Adaptive reconfigurable deep convolutional neural network computing method and device |
CN108364063A (en) * | 2018-01-24 | 2018-08-03 | 福州瑞芯微电子股份有限公司 | Neural network training method and device for allocating resources based on weights |
Non-Patent Citations (1)
Title |
---|
陈志坚 et al.: "Neural-network-based reconfigurable instruction prefetching mechanism and its scalable architecture", 《电子学报》 (Acta Electronica Sinica) * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020051918A1 (en) * | 2018-09-14 | 2020-03-19 | 中国科学院深圳先进技术研究院 | Neuronal circuit, chip, system and method therefor, and storage medium |
CN110516789A (en) * | 2019-08-09 | 2019-11-29 | 苏州浪潮智能科技有限公司 | Method and device for processing instruction set in convolutional network accelerator and related equipment |
CN110516789B (en) * | 2019-08-09 | 2022-02-18 | 苏州浪潮智能科技有限公司 | Method and device for processing instruction set in convolutional network accelerator and related equipment |
CN112446475A (en) * | 2019-09-03 | 2021-03-05 | 芯盟科技有限公司 | Neural network intelligent chip and forming method thereof |
CN112598107A (en) * | 2019-10-01 | 2021-04-02 | 创鑫智慧股份有限公司 | Data processing system and data processing method thereof |
WO2021089009A1 (en) * | 2019-11-08 | 2021-05-14 | 中国科学院深圳先进技术研究院 | Data stream reconstruction method and reconstructable data stream processor |
CN111105023A (en) * | 2019-11-08 | 2020-05-05 | 中国科学院深圳先进技术研究院 | Data stream reconstruction method and reconfigurable data stream processor |
CN111105023B (en) * | 2019-11-08 | 2023-03-31 | 深圳市中科元物芯科技有限公司 | Data stream reconstruction method and reconfigurable data stream processor |
CN111222637A (en) * | 2020-01-17 | 2020-06-02 | 上海商汤智能科技有限公司 | Neural network model deployment method and device, electronic equipment and storage medium |
CN111222637B (en) * | 2020-01-17 | 2023-11-28 | 上海商汤智能科技有限公司 | Neural network model deployment method and device, electronic equipment and storage medium |
CN111651207A (en) * | 2020-08-06 | 2020-09-11 | 腾讯科技(深圳)有限公司 | Neural network model operation chip, method, device, equipment and medium |
CN111651207B (en) * | 2020-08-06 | 2020-11-17 | 腾讯科技(深圳)有限公司 | Neural network model operation chip, method, device, equipment and medium |
WO2022246639A1 (en) * | 2021-05-25 | 2022-12-01 | Nvidia Corporation | Hardware circuit for deep learning task scheduling |
CN114970406A (en) * | 2022-05-30 | 2022-08-30 | 中昊芯英(杭州)科技有限公司 | Method, apparatus, medium and computing device for customizing digital integrated circuit |
Also Published As
Publication number | Publication date |
---|---|
CN109409510B (en) | 2022-12-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109409510A (en) | Neuron circuit, chip, system and method, storage medium | |
Pei et al. | Towards artificial general intelligence with hybrid Tianjic chip architecture | |
CN106650922B (en) | Hardware neural network conversion method, computing device, software and hardware cooperative system | |
US20200026992A1 (en) | Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system | |
CN107578095B (en) | Neural computing device and processor comprising the computing device | |
CN107636638B (en) | General parallel computing architecture | |
CN110334799A (en) | Compute-in-memory-based integrated neural network inference and training accelerator and operation method thereof | |
CN107766935B (en) | Multilayer artificial neural network | |
CN109190756A (en) | Arithmetic unit based on Winograd convolution and neural network processor comprising the same | |
CN104145281A (en) | Neural network computing apparatus and system, and method therefor | |
CN110309911A (en) | Neural network model verification method, device, computer equipment and storage medium | |
CN109711539A (en) | Operation method and device, and related product | |
Stevens et al. | Manna: An accelerator for memory-augmented neural networks | |
Wang et al. | Shenjing: A low power reconfigurable neuromorphic accelerator with partial-sum and spike networks-on-chip | |
Abdelsalam et al. | An efficient FPGA-based overlay inference architecture for fully connected DNNs | |
CN109496319A (en) | Artificial intelligence processor hardware optimization method, system, storage medium, and terminal | |
CN112836814A (en) | Storage and computation integrated processor, processing system and method for deploying algorithm model | |
Geng et al. | CQNN: a CGRA-based QNN framework | |
CN111831354A (en) | Data precision configuration method, device, chip array, equipment and medium | |
Bilaniuk et al. | Bit-slicing FPGA accelerator for quantized neural networks | |
CN109643336A (en) | Artificial intelligence processor design model establishment method, system, storage medium, and terminal | |
CN110490317B (en) | Neural network operation device and operation method | |
Dazzi et al. | 5 parallel prism: A topology for pipelined implementations of convolutional neural networks using computational memory | |
CN113469326B (en) | Integrated circuit device and board for executing pruning optimization in neural network model | |
CN110720095A (en) | General parallel computing architecture |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2022-09-21. Applicant after: Shenzhen Zhongke Yuanwuxin Technology Co.,Ltd., Room 201, Building A, No. 1, Qianwan 1st Road, Qianhai Shenzhen-Hong Kong Cooperation Zone, Shenzhen, Guangdong, 518101 (settled in Shenzhen Qianhai Road Commercial Secretary Co., Ltd.). Applicant before: SHENZHEN INSTITUTES OF ADVANCED TECHNOLOGY CHINESE ACADEMY OF SCIENCES, No. 1068, Xue Yuan Avenue, Shenzhen University Town, Nanshan District, Shenzhen, Guangdong, 518000. ||
GR01 | Patent grant | ||