CN108255514A - For the neuron calculator operation method of cellular array computing system - Google Patents
- Publication number: CN108255514A
- Application number: CN201611238870.8A
- Authority
- CN
- China
- Prior art keywords
- neuron
- cellular array
- computing unit
- computing system
- identification code
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/061—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
Abstract
The invention discloses a neuron calculator operation method for a cellular array computing system. The cellular array computing system comprises a master controller, a bus, and a cellular array formed by multiple computing units. Each computing unit comprises a neuron calculator for performing the calculating operations of a neuron, and a memory unit; each computing unit stores its position in the cellular array as an identification code. The input information of each neuron calculator is made to include the identification code and output data of an upstream neuron. A section of content-addressable memory included in each neuron calculator stores the identification codes of all upstream neurons. For each input, the identification code carried by the input is compared with the contents of said memory; the weight corresponding to the input is located according to the matching identification code, each weight is multiplied by its input to obtain a product, and all products are accumulated to obtain an accumulation signal.
Description
Technical field
The present invention relates to the fields of semiconductor chips and artificial intelligence, and more particularly to a neuron calculator operation method for a cellular array computing system.
Background technology
The human brain is a network formed by a vast number of neurons connected in complex ways. Each neuron is connected through a large number of dendrites to a large number of other neurons, from which it receives information; each connection point is a synapse. When external stimuli accumulate beyond a certain level, the neuron generates a stimulus signal that is sent out through its axon. The axon has a large number of terminals which connect, through synapses, to the dendrites of a large number of other neurons. It is precisely such a network, composed of neurons of simple function, that realizes all the intelligent activities of mankind. It is generally believed that human memory and intelligence are stored in the differing coupling strengths of the individual synapses.
The response frequency of a neuron does not exceed 100 Hz. The CPU of a modern computer is ten million times faster than the human brain, yet its ability to handle many challenging tasks falls short of the brain. This has prompted the computer industry to begin imitating the human brain. The earliest imitations were at the software level.
Neural networks (Neural Networks) are a common algorithm in machine learning. A neuron in a neural network algorithm is simply a function: it has many inputs, and each input corresponds to a weight. The usual algorithm multiplies each input by its weight and sums the results. The output is either 0 or 1 (determined by a threshold) or a value between 0 and 1. A typical neural network is formed by linking the outputs and inputs of a large number of neurons together, usually organized into a multi-level architecture. It contains a great many parameters (weights, thresholds), and the process of learning and training is precisely the adjustment of these parameters. This is a function optimization requiring massive computation. Algorithms of this kind have achieved rich results and are widely applied.
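The neuron model described above can be sketched in a few lines. This is a minimal illustration, not the patent's hardware implementation; the function name and the sigmoid mapping are illustrative choices.

```python
import math

def neuron(inputs, weights, threshold=0.0, binary=False):
    """Multiply each input by its weight and sum; output either a
    0/1 decision against a threshold, or a value between 0 and 1."""
    s = sum(x * w for x, w in zip(inputs, weights))
    if binary:
        return 1 if s > threshold else 0
    return 1.0 / (1.0 + math.exp(-(s - threshold)))  # sigmoid, in (0, 1)

print(neuron([1, 0, 1], [0.5, -0.3, 0.2], threshold=0.4, binary=True))  # 1
```

Training a network then amounts to adjusting the `weights` and `threshold` parameters of many such functions at once.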
The networks in neural network algorithms are all divided into multiple layers. In the earliest networks, every neuron of one layer was connected to every neuron of the next layer, forming a fully-connected network. One problem of fully-connected networks is that, in applications such as image processing, an image has many pixels, and the number of weights needed per layer is proportional to the square of the pixel count; the program therefore occupies too much memory, and the amount of computation is even harder to cope with.
In a convolutional neural network, the several layers at the front are no longer fully connected. The neurons of each layer are arranged in an array like an image. Each neuron of the next layer is connected only to a small region of this layer. The small region is typically a square of side length k, where k is called the kernel size of the convolutional network, as shown in Fig. 1.
The convolutional neural network (Convolutional Neural Network, CNN) takes its name from the fact that the weighted summation over each point of this small region resembles a convolution. This group of weights is the same for every point within one layer (i.e., translation invariance), so compared with a fully-connected network the number of weights is greatly reduced, making high-resolution image processing feasible. A convolutional neural network contains multiple layers connected in this way, together with other kinds of layers.
With popularizing for deep learning application, people start to develop dedicated neural network chip.It is realized with special circuit
The addition and multiplication that neuron calculates, than with CPU GPU much more efficients.
Magnetoresistive random access memory (Magnetic Random Access Memory, MRAM) is a new memory and storage technology. It can be read and written randomly and quickly like SRAM/DRAM, and is faster than DRAM; like flash memory it retains data permanently after power-off, and unlike NAND it can be erased and rewritten an unlimited number of times.
The economics of MRAM are considered good: the silicon area occupied per unit capacity has a large advantage over SRAM (usually used as CPU cache) and is expected to approach the level of DRAM. Its performance is also quite good, with read/write latency close to the best SRAM, and its power consumption is the best among the various memory and storage technologies. Moreover, unlike DRAM and flash memory, MRAM is compatible with standard CMOS semiconductor processes, so it can be integrated with logic circuits on one chip. With MRAM technology, the three functions of memory, storage, and computation can be integrated on a single chip, and new computing architectures become possible.
The hallmark of the human brain is massively parallel computation: not only can a huge number of neurons work simultaneously, but each neuron is connected to thousands of other neurons. For modern integrated circuit technology, integrating a large number of neurons on a single chip is easy, but providing internal communication bandwidth like that of the human brain is extremely difficult. For example, if the input data of one layer of neurons resides in a single block of RAM, at least k clock cycles are needed to read the data out, because different rows of the memory cannot be read or written simultaneously. The speed of reading data, i.e., memory bandwidth, thus becomes the bottleneck of the computation.
Summary of the invention
In view of the above drawbacks of the prior art, the present invention proposes an architecture matched to the neural network structure and based on a cellular array, composed of numerous neuron calculators that combine storage functions with dense network connections. This new architecture of the present invention will find wide application in fields such as massive computation, big data processing, and artificial intelligence.
To achieve the above object, the present invention provides a neuron calculator operation method for a cellular array computing system, wherein the cellular array computing system comprises a master controller, a bus, and a cellular array formed by multiple computing units. Each computing unit of the cellular array comprises one or more neuron calculators for performing the calculating operations of a neuron, and a memory unit; each computing unit stores its position in the cellular array as an identification code. The neuron calculator operation method comprises: making the input information of each neuron calculator include the identification code and output data of an upstream neuron; storing the identification codes of all upstream neurons in a section of content-addressable memory included in each neuron calculator; for each input, comparing the identification code carried by the input with the contents of said memory; locating the weight corresponding to the input according to the matching identification code, multiplying each weight by its input to obtain a product, and accumulating all products to obtain an accumulation signal.
Preferably, each time an identification code stored in said memory matches some input, a match signal is output, and a completion signal is generated from all the match signals.
Preferably, when an identification code matches, a high level is output to indicate the match; all the match signals are connected to a NAND gate, and a low level at the NAND gate output indicates completion.
Preferably, a counter is used: the counter is initialized so that its count equals the number of input neurons; each match decrements the counter by one, and when the counter reaches zero the completion signal is output.
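The counter-based completion detection can be sketched in software. A behavioral illustration only; in the patent this is a hardware counter, and the class name here is illustrative.

```python
class CompletionCounter:
    """Initialize to the number of input neurons; each identification-code
    match decrements the count; at zero, the completion signal is raised."""
    def __init__(self, num_inputs):
        self.count = num_inputs

    def on_match(self):
        self.count -= 1
        return self.count == 0  # True = completion signal

c = CompletionCounter(3)
print([c.on_match() for _ in range(3)])  # [False, False, True]
```

The NAND-gate variant behaves equivalently: completion fires only once every expected input has produced its match signal.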
Preferably, when the completion signal occurs, the neuron calculator applies a mapping to the accumulation signal and outputs the result.
Preferably, the master controller communicates with each computing unit through the bus; the master controller reads and writes the data in the memory unit of each computing unit through the bus, and the master controller communicates with the neuron calculator of each computing unit through the bus.
Preferably, the memory unit is MRAM.
Preferably, the memory unit stores weight parameters.
Preferably, the cellular array includes a communication network so that each computing unit of the cellular array can communicate with its neighboring computing units.
The design of the present invention, its concrete structure, and the technical effects it produces are further described below with reference to the accompanying drawings, so as to fully convey the purpose, features, and effects of the present invention.
Description of the drawings
With reference to the accompanying drawings and the following detailed description, the present invention will be more completely understood, and its attendant advantages and features more easily appreciated, wherein:
Fig. 1 is the architecture of a convolutional neural network.
Fig. 2 is a schematic diagram of the cellular array computing system architecture according to the preferred embodiment of the invention.
Fig. 3 is a schematic diagram of the communication network of the computing units of the cellular array of the cellular array computing system according to the preferred embodiment of the invention.
Fig. 4 is a schematic diagram of an example of a computing unit of the cellular array of the cellular array computing system according to the preferred embodiment of the invention.
Fig. 5 is a schematic diagram of an example of path selection for network communication within the array according to the preferred embodiment of the invention.
Fig. 6 is a schematic diagram with the starting point at a corner of the rectangular area according to the preferred embodiment of the invention.
Fig. 7 is a schematic diagram with the starting point on a side of the rectangular area according to the preferred embodiment of the invention.
Fig. 8 is a schematic diagram with the starting point outside the rectangular area according to the preferred embodiment of the invention.
Fig. 9 is a schematic diagram with the starting point outside the rectangular area according to the preferred embodiment of the invention.
Fig. 10 is a schematic diagram of a specific example of the communication mode according to the preferred embodiment of the invention.
Fig. 11 is a schematic diagram of a specific example of the multicast mode according to the preferred embodiment of the invention.
Fig. 12 is a schematic diagram of the neuron calculator operation method according to the preferred embodiment of the invention.
It should be noted that the accompanying drawings are intended to illustrate, not to limit, the present invention. Note that drawings representing structures may not be drawn to scale. Also, in the drawings, the same or similar elements are denoted by the same or similar labels.
Specific embodiment
<Cellular array computing system>
Fig. 2 is the schematic diagram of cellular array computing system framework according to the preferred embodiment of the invention.
As shown in Fig. 2, the cellular array computing system according to the preferred embodiment of the invention comprises: a master controller 10 (for example, the master controller 10 is a master CPU), a bus 20, and a cellular array formed by multiple computing units 30.
For example, the master controller 10 is an on-chip or off-chip master controller.
Each computing unit 30 of the cellular array comprises: one or more neuron calculators 31 for performing the calculating operations of a neuron (for example, the calculating operations include addition, multiplication, and so on; specifically, for example, each input is multiplied by its weight and all the products are summed), and a memory unit 32.
The memory unit 32 may use SRAM or MRAM; since MRAM is non-volatile and has higher density, preferably the memory unit 32 is MRAM.
The memory unit 32 stores parameters, such as weight parameters.
The master controller 10 communicates with each computing unit 30 through the bus 20. Specifically, for example, the master controller 10 reads and writes the data in the memory unit 32 of each computing unit 30 through the bus 20, and the master controller 10 communicates with the neuron calculator 31 of each computing unit 30 through the bus 20.
Each computing unit 30 stores its position (x, y) in the cellular array as an identification code, and the software and hardware in the computing unit 30 can read this identification code for use in specific operations.
The characteristics of embodiment of the present invention is integrated using MRAM and logic circuit, allows density to reach SRAM
8-20 times the characteristics of;Wherein, the embodiment of the invention proposes to be made of a fritter MRAM and neuron calculator one thin
Born of the same parents form an array, then this array is connected to form a framework by bus by a large amount of cells.This framework pole
The earth has expanded memory bandwidth, improves the overall performance of chip.
<Cellular array mesh communication network>
Fig. 3 is a schematic diagram of the communication network of the computing units of the cellular array of the cellular array computing system according to the preferred embodiment of the invention. As shown in Fig. 3, the cellular array includes a communication network so that each computing unit 30 of the cellular array can communicate with its neighboring computing units 30.
For example, each computing unit 30 can read and write the data in the memory units 32 of neighboring computing units 30 through the bus 20, and each computing unit 30 can communicate with the neuron calculators 31 of neighboring computing units 30 through the bus 20.
The output of each neuron calculator 31 is transmitted, through the bus 20 or the communication network, to the inputs of next-stage neuron calculators.
In this embodiment of the invention, the cellular array architecture solves the bottleneck problems of memory and communication through data multicasting and the internal network. With these bottlenecks removed, the embodiment of the present invention can exploit parallel computation to a greater extent and thereby deliver higher computing capability.
<Cellular array bus broadcast method>
The master controller 10 can, through the bus, multicast instructions or information to the neuron calculators in the computing units of a rectangular area, and/or multicast data to the same relative address in the memory units of the computing units of a rectangular area.
A bus with such broadcast capability can be implemented by the following method:
Fig. 4 is a schematic diagram of an example of a computing unit of the cellular array of the cellular array computing system according to the preferred embodiment of the invention.
As shown in Fig. 4, each computing unit 30 of the cellular array comprises: one or more neuron calculators 31 for performing the calculating operations of a neuron, a memory unit 32, a bus controller 33, and an internal bus 34.
The bus controller 33 of each computing unit 30 is connected to the bus 20.
The memory unit 32 of each computing unit 30 is a slave device of the respective internal bus 34; the bus controller 33 and the neuron calculators 31 of each computing unit 30 are master devices of the respective internal bus 34, with the bus controller having the higher priority.
Moreover, the cellular array bus broadcast method may specifically include the following steps:
The master controller 10 reads/writes a memory unit: the master controller 10 broadcasts a destination address on the bus of the cellular array and sends data, or prepares to read data. A bus controller receives the destination address and, if the destination address lies within its computing unit, connects the memory unit of that computing unit to perform the read/write operation. If a neuron calculator is currently reading or writing the memory unit of that computing unit, the memory unit is connected to the bus after the current read/write operation completes, and the neuron calculator resumes its access to the memory unit afterwards.
The master controller 10 communicates with a neuron calculator: a first reserved section is reserved in the address space of the master controller 10 for communication with neuron calculators. The first reserved section is used to store the identification code of the target computing unit. On receiving the identification code of the target computing unit, the bus controller identifies the neuron calculator targeted by the current communication and connects the target neuron calculator to perform subsequent operations such as instruction execution, data reception, and state reading.
The master controller 10 performs multicast processing: a second reserved section is reserved in the address space of the master controller 10 for multicasting instructions and/or information to neuron calculators. When multicasting, the address stored in the second reserved section contains the identification codes of the start computing unit and the end computing unit of the target rectangular area (the start computing unit and the end computing unit lie on a diagonal of the target rectangular area), so as to send the instructions and/or information.
The master controller 10 performs multicast processing: a third reserved section is reserved in the address space of the master controller 10 for multicasting data. When multicasting data, the address stored in the third reserved section contains the identification codes of the start computing unit and the end computing unit of the target rectangular area (the start computing unit and the end computing unit lie on a diagonal), and the multicast data contains the number of data transfers. In the data transfers of that number, the address contained in each data transfer includes the relative address of a computing unit within the target rectangular area, to indicate which computing unit receives the data and stores it at that relative address.
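The addressing just described, a target rectangle named by the identification codes of two diagonal units, plus relative addresses within it, can be sketched as follows. A behavioral illustration only; the function names are not from the patent, and in hardware this logic lives in each bus controller.

```python
def in_target_rect(own, start, end):
    """A unit accepts a multicast if its (x, y) identification code lies
    in the rectangle whose diagonal runs from start to end."""
    x0, x1 = sorted((start[0], end[0]))
    y0, y1 = sorted((start[1], end[1]))
    return x0 <= own[0] <= x1 and y0 <= own[1] <= y1

def relative_address(own, start, end):
    """Offset of this unit within the target rectangle, used to pick
    out the data item addressed to it."""
    x0 = min(start[0], end[0])
    y0 = min(start[1], end[1])
    return (own[0] - x0, own[1] - y0)

print(in_target_rect((2, 3), (1, 1), (4, 4)))    # True
print(relative_address((2, 3), (1, 1), (4, 4)))  # (1, 2)
```

Every bus controller evaluates the same membership test in parallel, which is what lets a single bus transaction reach a whole rectangular region of cells.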
For example, the above protocol is implemented by the bus controller, which is responsible for decoding the address on the bus and performing the corresponding data exchange with the neuron calculators and the memory unit inside the cell.
The broadcast capability of the cellular array bus can provide great help to fully-connected neural networks, greatly improving the transmission speed of massive data. The concrete operation method is as follows:
A layer of the fully-connected neural network is deployed in a rectangular area (for image processing applications this is quite natural), with each cell performing the function of one or more neurons. The weight of each input of each neuron is stored in the memory unit of that cell.
After a neuron calculator completes its calculation, the result is read from its output port by the master controller 10, and the data is broadcast into the area where the next layer of the network resides, being sent to the neuron calculator of each computing unit.
Alternatively, after the one or more neuron calculators in a computing unit complete their calculation, the results are stored at a preset relative address in the memory unit and read from there, then broadcast by the master controller 10 into the area where the next layer of neurons resides and stored at a preset relative address in each cell.
<Cellular array internal network communication method>
Fig. 5 is a schematic diagram of an example of path selection for network communication within the array according to the preferred embodiment of the invention.
● Each piece of information between computing units (cells) contains the identification codes of the start computing unit and the end computing unit.
● A piece of information travels from the start computing unit to the end computing unit over the connections between neighboring computing units, through multiple relays.
● A network controller is provided in each computing unit to relay information rapidly without interfering with other functions.
● Besides indicating the identification code of the end computing unit, the information also indicates an address within the end computing unit, or a neuron calculator.
■ In the mode that indicates an address within the end computing unit, the information is written by the network controller of that computing unit directly to the appropriate address in its memory unit.
■ In the mode that indicates a neuron calculator within the end computing unit, the information is handed to the neuron calculator inside the cell for processing.
● Every computing unit that sends or relays information must have its network controller select a neighboring computing unit as the next stop.
■ When the start computing unit and the end computing unit are on one line, there is only one reasonable choice.
■ In other cases there are two equally reasonable choices, and the network controller can select the neighboring computing unit whose traffic is relatively lighter.
To send bulk information from one computing unit to a rectangular area, there is one simple method: the information is read by the master controller 10 and then multicast. Another mode is provided herein: the point-to-point communication function between computing units is extended to regional multicast. This mode supports a higher degree of parallelism and much higher total bandwidth, and is well suited to convolutional neural networks. For multicast between computing units, the original sender is responsible for indicating the target area, and delivery is still completed by relaying.
● If the sending computing unit and the relaying computing units are within the target area:
■ At a corner of the rectangular area (as shown in Fig. 6):
◆ If the width of the area is 1, only one neighboring computing unit can be selected as the next-stop relay. The network controller of this computing unit receives the data of the information (if this computing unit is not the sender), forwards the information to this neighboring computing unit, and then updates the target area (the length decreases by 1).
● If this is the last computing unit, relaying stops.
◆ If both the length and width of the area are greater than 1, two neighboring computing units can be selected as next-stop relays. The network controller of this computing unit receives the data of the information (if this computing unit is not the sender), forwards the information to each of the two neighboring computing units, and then updates the target area; one of the resulting regions is a rectangular area of width 1.
■ On a side of the rectangular area (as shown in Fig. 7):
◆ If the width of the area is 1, two neighboring computing units can be selected as next-stop relays. The network controller of this computing unit receives the data of the information (if this cell is not the sender), forwards the information to each of the two neighboring computing units, and then updates the target area.
◆ If the width of the area is greater than 1, three neighboring computing units can be selected as next-stop relays. The network controller of this computing unit receives the data of the information (if this cell is not the sender), forwards the information to each of the three neighboring computing units, and then updates the target area; two of the resulting regions are rectangular areas of width 1.
■ In the interior of the rectangular area, four neighboring computing units can be selected as next-stop relays (in this case the computing unit can only be the sender). The information is forwarded to each of the four neighboring computing units, and the target area is then updated; two of the resulting regions are rectangular areas of width 1.
● If the sending computing unit and the relaying computing units are outside the target area (as shown in Fig. 8 and Fig. 9): considering that a neural network needs to transfer blocks of data over long distances, the communication network easily becomes congested in this case. A protocol, found through study to avoid congestion in convolutional networks, is used here:
■ The transmission direction is indicated when the information is sent.
■ At each relay, the information is forwarded along the transmission direction; once the coordinate in the forward direction enters the coordinate range of the target area, width-1 multicasts are performed laterally, one by one.
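The next-stop selection for point-to-point relaying described above can be sketched as follows. A simplified illustration, not the patent's circuit: it only enumerates the candidate neighbors, leaving the traffic-based tie-break to the caller.

```python
def next_hops(cur, end):
    """Candidate next stops toward the end unit: step along x, along y,
    or either when both coordinates still differ (the network controller
    may then pick the neighbor whose traffic is lighter)."""
    hops = []
    if cur[0] != end[0]:
        hops.append((cur[0] + (1 if end[0] > cur[0] else -1), cur[1]))
    if cur[1] != end[1]:
        hops.append((cur[0], cur[1] + (1 if end[1] > cur[1] else -1)))
    return hops

print(next_hops((0, 0), (0, 3)))  # on one line: one choice, [(0, 1)]
print(next_hops((0, 0), (2, 3)))  # two choices: [(1, 0), (0, 1)]
```

When start and end lie on one line only one candidate remains, matching the single reasonable choice noted earlier; otherwise two equally short routes exist and congestion decides between them.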
A specific implementation method (as shown in Fig. 10):
1. The communication channel between every two neighboring computing units consists of a pair of FIFOs (first-in, first-out): one FIFO written by one computing unit and read by the other, and one FIFO for the opposite direction. From the standpoint of one computing unit, they can be called its input and output FIFOs.
2. The network controller is connected to all the FIFOs (at most 4 pairs) in the computing unit. The network controller is also connected to the MPU in the computing unit and raises interrupts to it (such as FIFO empty, FIFO full, new information arrived, information sent, etc.).
3. The network controller sends, receives, and relays information.
4. If information enters some input FIFO, the network controller first inspects it:
● If the end point is this computing unit, then:
■ If the end point designates a relative address, then since the network controller has DMA capability, the information is stored directly at the appropriate address of the memory unit, and the neuron calculator is notified.
■ If the end point is a neuron calculator, the calculator is notified directly to process the information.
The cellular array network multicast function can provide great help to neural networks, particularly convolutional neural networks, greatly improving the transmission bandwidth of massive data. The concrete operation method is as follows:
1. A layer of the neural network is deployed in a rectangular area (for image processing applications this is quite natural), with each computing unit performing the function of one or more neurons. The weight of each input of each neuron is stored in the memory unit/MRAM of that computing unit.
2. Adjacent layers are deployed in adjacent regions.
3. After all the neurons of this layer complete their calculations, all the computing units multicast synchronously, with the transmission direction pointing toward the region where the next layer resides, as shown in Fig. 11.
The advantages of implementing neural network computation with the present invention are obvious:
1. The parallel computation of a large number of computing units greatly speeds up the arithmetic, so the speed of learning and training improves enormously.
2. The huge bandwidth of the cellular array's internal network communication and its multicast mechanism likewise contribute significantly to the speedup.
3. The non-volatility of MRAM means a successfully trained chip can be directly replicated and sold as a product solving a particular problem.
<Neuron calculator operation method>
The computation performed by a neuron calculator is comparatively simple: the output of each upper-layer neuron connected to it is multiplied by the corresponding weight, and the products are accumulated. Usually the final result also needs a simple mapping, for example to a number between 0 and 1. However, each neuron has at least tens, and at most thousands, of input neurons, and in a network environment it is difficult to guarantee that these inputs arrive in a set order. A fast and efficient method is needed to look up and compare each input, and to determine whether all inputs have arrived.
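The basic computation just described can be sketched as follows. The sigmoid is only one example of the "simple mapping" mentioned above, not a mapping the patent prescribes:

```python
import math

def neuron_output(inputs, weights):
    """Multiply each upper-layer output by its corresponding weight,
    accumulate the products, then apply a simple mapping to a number
    between 0 and 1 (here: a sigmoid, as an illustrative choice)."""
    acc = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-acc))
```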
The present invention proposes a method to solve this problem, as shown in Figure 12:
1. The input information of each neuron calculator is made to include the identification code of the upstream neuron together with its output data.
2. Each neuron calculator contains a section of content-addressable memory (an SRAM-based CAM) that stores the identification codes of all upstream neurons. This memory unit can compare an input against all stored identification codes in a single operation, and produces an output when an identification code matches.
3. For each input, the identification code carried by the input is compared with the contents of the above memory; according to the identification code matched in the comparison, the weight corresponding to that input is found, each weight is multiplied by its input to obtain a product, and all products are accumulated to obtain an accumulated signal. For example, one way to find the weight's address is to store each weight at the address corresponding to its identification code.
4. Each time an identification code stored in the above memory matches some input, a match signal is output, and a completion signal is generated from all of the match signals. There are two implementation methods:
A. When an identification code matches, a high level is output to indicate the match. All match signals are connected to a NAND gate; when the NAND gate outputs a low level, completion is indicated.
B. A counter is used: the counter is initialized so that its count equals the number of input neurons; each match decrements the counter by one, and the completion signal is output when the counter reaches zero.
5. When the completion signal occurs, the neuron calculator maps the accumulated signal and outputs the result.
In this way, operation speed is improved without having to guarantee that the inputs arrive in a set order.
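Steps 1-5 can be sketched in software as follows. This is a minimal model, not the patented circuit: a Python dictionary stands in for the CAM (id code → weight stored at the address corresponding to that code), completion uses the counter-based method B, and all names are illustrative.

```python
import math

class NeuronCalculator:
    """Software sketch of the CAM-based input matching scheme."""

    def __init__(self, weights_by_id):
        self.weights = dict(weights_by_id)   # stands in for the CAM
        self.remaining = len(self.weights)   # counter = number of input neurons
        self.acc = 0.0                       # accumulated signal

    def receive(self, id_code, value):
        """Handle one arriving input: (upstream identification code, output data).
        Returns the mapped output when the completion signal occurs, else None."""
        if id_code not in self.weights:
            return None                      # no CAM match: input is ignored
        self.acc += self.weights[id_code] * value
        self.remaining -= 1                  # one match decrements the counter
        if self.remaining == 0:              # counter reaches zero: completion
            return 1.0 / (1.0 + math.exp(-self.acc))  # example mapping
        return None
```

Note that the inputs may arrive in any order: each one is matched against all stored identification codes at once, and the counter alone decides when the neuron is done.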
<Cellular array three-dimensional communication transmission method>
A three-dimensional cellular array neural network chip can provide even higher performance. According to a preferred embodiment of the invention, the cellular array three-dimensional communication transmission method may include:
stacking multiple layers of cellular array neural network chips (that is, cellular array computing systems) vertically through through-silicon vias (TSVs) into a three-dimensional chip;
connecting the bus of each cellular array layer through the through-silicon vias; and
connecting the computing units in two neighbouring cellular array neural network chips into a network through the through-silicon vias. In this way, the communication network between computing units is extended into a three-dimensional communication network.
Adjacent layers of a multilayer neural network are deployed in neighbouring cellular array neural network chip layers of the three-dimensional chip. When data needs to be transmitted to the next neural network layer through the communication network, a vertical data transfer is first carried out through the through-silicon vias, and then the communication network of each cellular array neural network chip layer is used for horizontal data transmission.
This pattern of the preferred embodiment of the present invention is especially efficient for convolutional neural networks: a few network transmission cycles can complete transfer work that would otherwise require thousands of cycles.
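The vertical-first routing just described can be sketched as follows, assuming hypothetical `(layer, x, y)` coordinates: one TSV hop per chip layer until the destination layer is reached, then horizontal hops through that layer's mesh.

```python
def route_3d(src, dst):
    """Sketch of the vertical-then-horizontal transfer: TSV hops first,
    then in-layer mesh hops (x-then-y, an illustrative choice).
    src, dst: (layer, x, y) coordinates of computing units."""
    layer, x, y = src
    dl, dx, dy = dst
    hops = []
    while layer != dl:                       # vertical transfer through TSVs
        layer += 1 if dl > layer else -1
        hops.append(('tsv', layer, x, y))
    while (x, y) != (dx, dy):                # horizontal mesh transmission
        if x != dx:
            x += 1 if dx > x else -1
        else:
            y += 1 if dy > y else -1
        hops.append(('mesh', layer, x, y))
    return hops
```

Because adjacent neural network layers sit in adjacent chip layers, the vertical portion of the route is usually a single TSV hop, which is what makes layer-to-layer transfer so cheap in this arrangement.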
<Convolutional neural networks implementation method>
The cellular array architecture is highly flexible; another convolutional neural network implementation method is given here:
1. One layer of the convolutional neural network is deployed, according to the internal positional relationships of that layer, in a rectangular region of the cellular array.
2. Using the cellular array bus broadcast method, the weights are sent one by one to the computing units where the neurons in the rectangular region reside, and stored in the memory of those computing units. (The weights of a convolutional neural network are translation-invariant: each neuron's weights correspond to inputs from different neighbouring cells, but all neurons use the same group of weights, which makes the broadcast mechanism very suitable.)
3. For this layer of the convolutional neural network, inputs are received and the neuron calculations are carried out (a neuron calculation consists of multiplying each input by its corresponding weight and then summing all the products); the output of each neuron is temporarily retained in the neuron calculator.
4. Using the cellular array bus broadcast method, the weights of the next layer of the convolutional neural network (the layer following the above-mentioned layer) are sent to the computing units of the rectangular region.
5. Using the cellular array mesh communication network, the outputs of the previous layer's neurons (the neurons of the above-mentioned layer) are transferred to the computing units in the predetermined nearby regions that need those outputs. (In a convolutional neural network, each neuron outputs only to neurons in a small nearby region, which makes the cellular array mesh communication network very suitable for this transfer; only a few clock cycles are needed to complete all data exchanges.)
6. Using the new weights (the weights of the next convolutional layer) and the outputs of the previous layer's neurons, the calculation of the next convolutional layer is carried out.
7. If several more layers, fully connected or convolutional, follow, they can still be processed successively in this way.
The advantages of this method are:
1. If the input image is large, and the convolutional network (as is typical) has a dozen or more layers, a large amount of hardware and memory resources would otherwise be occupied. This method uses the same hardware to carry out the calculations of different layers, dramatically saving hardware resources.
2. Using the cellular array mesh communication network, the data exchange between different layers is very efficient.
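The layer-reuse loop of steps 1-7 can be sketched as follows. The simplifications are deliberate and not part of the patent: a 1-D signal stands in for the image region, each layer is a single broadcast 1-D kernel (translation-invariant, so every "neuron" shares it), and "valid" convolution models the local mesh exchange.

```python
def run_layers(signal, layer_kernels):
    """Reuse the same region of computing units for each successive
    convolutional layer: broadcast the layer's shared kernel, compute
    every neuron's weighted sum, keep outputs local, repeat.

    signal: list of input values (1-D stand-in for the image region).
    layer_kernels: one shared weight kernel per layer."""
    activations = signal
    for kernel in layer_kernels:             # step 4: broadcast next layer's weights
        k = len(kernel)
        activations = [                      # steps 3/6: per-neuron multiply-accumulate
            sum(activations[i + j] * kernel[j] for j in range(k))
            for i in range(len(activations) - k + 1)
        ]                                    # step 5: outputs stay in nearby units
    return activations
```

The point of the loop structure is advantage 1 above: only one layer's weights and activations occupy the hardware at any moment, however deep the network is.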
The preferred embodiments of the present invention have been shown and described above. As stated, it should be understood that the present invention is not limited to the forms disclosed herein; they are not to be taken as excluding other embodiments, and the invention can be used in various other combinations, modifications, and environments, and can be changed within the scope of the inventive concept set forth herein, through the above teachings or through the skill or knowledge of the related art. All changes and modifications made by those skilled in the art that do not depart from the spirit and scope of the present invention shall fall within the scope of protection of the appended claims.
Claims (9)
1. A neuron calculator operation method for a cellular array computing system, wherein the cellular array computing system comprises a master controller, a bus, and a cellular array formed by multiple computing units; wherein each computing unit of the cellular array comprises one or more neuron calculators for performing the calculation operations of neurons, and an internal memory unit; and wherein each computing unit stores its position in the cellular array as an identification code; characterized in that the neuron calculator operation method comprises:
making the input information of each neuron calculator include the identification code and output data of the upstream neuron;
storing the identification codes of all upstream neurons using a section of content-addressable memory included in each neuron calculator;
for each input, comparing the identification code carried by the input with the contents of the above memory; and
according to the identification code matched in the comparison, finding the weight corresponding to that input, multiplying each weight by its input to obtain a product, and accumulating all products to obtain an accumulated signal.
2. The neuron calculator operation method for a cellular array computing system according to claim 1, characterized by further comprising: each time an identification code stored in the above memory matches some input, outputting a match signal, and generating a completion signal from all of the match signals.
3. The neuron calculator operation method for a cellular array computing system according to claim 1 or 2, characterized by further comprising: when an identification code matches, outputting a high level to indicate the match; connecting all high-level match signals to a NAND gate; and indicating completion when the NAND gate outputs a low level.
4. The neuron calculator operation method for a cellular array computing system according to claim 3, characterized in that a counter is used, the counter is initialized so that its count equals the number of input neurons, each match decrements the counter by one, and the completion signal is output when the counter reaches zero.
5. The neuron calculator operation method for a cellular array computing system according to claim 3, characterized in that, when the completion signal occurs, the neuron calculator maps the accumulated signal and outputs the result.
6. The neuron calculator operation method for a cellular array computing system according to claim 1 or 2, characterized in that the master controller communicates with each computing unit through the bus; the master controller reads and writes the data in the internal memory unit of each computing unit through the bus; and the master controller communicates with the neuron calculators of each computing unit through the bus.
7. The neuron calculator operation method for a cellular array computing system according to claim 1 or 2, characterized in that the internal memory unit is MRAM.
8. The neuron calculator operation method for a cellular array computing system according to claim 1 or 2, characterized in that the internal memory unit stores weight parameters.
9. The neuron calculator operation method for a cellular array computing system according to claim 1 or 2, characterized in that the cellular array includes a communication network so that each computing unit of the cellular array can communicate with its neighbouring computing units.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611238870.8A CN108255514A (en) | 2016-12-28 | 2016-12-28 | For the neuron calculator operation method of cellular array computing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108255514A true CN108255514A (en) | 2018-07-06 |
Family
ID=62720269
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611238870.8A Pending CN108255514A (en) | 2016-12-28 | 2016-12-28 | For the neuron calculator operation method of cellular array computing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108255514A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0438800A2 (en) * | 1990-01-24 | 1991-07-31 | Hitachi, Ltd. | Neural network processing system using semiconductor memories |
CN101866446A (en) * | 2010-03-08 | 2010-10-20 | 李爱国 | Method for community correction work and correction device |
CN105740946A (en) * | 2015-07-29 | 2016-07-06 | 上海磁宇信息科技有限公司 | Method for realizing neural network calculation by using cell array computing system |
Legal Events
Date | Code | Title | Description
---|---|---|---
2018-07-06 | PB01 | Publication | Application publication date: 20180706
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | |