CN107992942B - Convolutional neural network chip and convolutional neural network chip operation method - Google Patents
- Publication number
- CN107992942B (application CN201610946130.3A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- convolutional neural
- network chip
- memory
- array
- Prior art date
- Legal status: Active (the status is an assumption, not a legal conclusion; no legal analysis has been performed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Neurology (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a convolutional neural network chip and an operation method thereof. The convolutional neural network chip of the invention comprises a predetermined number of memory arrays for storing the input data array of one layer of a convolutional neural network, wherein the predetermined number is greater than or equal to the kernel size of the convolutional neural network, and each row of the input data array is stored in turn in a corresponding row of the memory arrays. Compared with a conventional memory architecture, the invention achieves a several-fold improvement in memory read speed, and thereby the same overall speedup. The invention thus provides a new on-chip memory architecture for neural network chips that increases memory read speed.
Description
Technical Field
The invention relates to the field of semiconductor chips and artificial intelligence, in particular to a convolutional neural network chip and an operation method of the convolutional neural network chip.
Background
The human brain is a complex network of a great number of interconnected neurons. Each neuron receives information from many other neurons through its numerous dendrites; each connection point is called a synapse. When external stimuli accumulate beyond a certain level, the neuron generates a signal and transmits it out through its axon. An axon has many terminals, which are connected by synapses to the dendrites of many other neurons. It is such a network of functionally simple neurons that implements all human intelligent activity. Human memory and intelligence are generally believed to be stored in the different coupling strengths of the synapses.
Neurons fire at no more than about 100 Hz, and the CPU of a modern computer is some ten million times faster than the human brain, yet its ability to handle many complex problems remains inferior. This has prompted the computer industry to mimic the human brain. The earliest emulation was at the software level: neural network algorithms, which emerged in the 1960s, mimic the function of a neuron with a mathematical function. The function accepts multiple inputs, each with its own weight, and learning (training) is the process of adjusting these weights. The function's output feeds many other neurons, forming a network. These algorithms have achieved rich results and are widely applied.
The networks in neural network algorithms are divided into many layers. In the earliest networks, every neuron in one layer was connected to every neuron in the next, forming a fully connected network. One problem with fully connected networks is that in image processing the number of pixels is large, and the number of weights required per layer grows with the square of the number of pixels, so such a network occupies too much memory and is computationally infeasible.
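To make the quadratic growth concrete, a back-of-the-envelope sketch (the image size below is an illustrative assumption, not taken from the patent):

```python
# Back-of-the-envelope weight count (illustrative numbers, not from the
# patent): a fully connected layer between two images of n_pixels pixels
# each needs n_pixels**2 weights.
n_pixels = 256 * 256          # one 256 x 256 image
fc_weights = n_pixels ** 2    # every output neuron sees every input pixel
print(fc_weights)             # 4294967296, i.e. ~4.3 billion weights
```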
In convolutional neural networks, most of the earlier layers are no longer fully connected. The neurons of each layer are arranged in an array, like an image. Each neuron of the next layer is connected to only a small region of this layer, usually a square region of side length k, called the kernel size of the convolutional network, as shown in fig. 1.
Convolutional neural networks (CNN) are so named because the weighted sum over this small region resembles a convolution. The same set of weights is used at every position in the same layer (translational invariance), which drastically reduces the number of weights compared with a fully connected network and makes high-resolution image processing possible. A convolutional neural network comprises several such convolutional layers, together with other kinds of layers.
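The shared-weight convolution described above can be sketched as follows (a minimal illustration; the function and variable names are assumptions, not from the patent):

```python
import numpy as np

# Minimal sketch of the translation-invariant convolution described above:
# the same k x k set of weights is applied at every position of the image.
def conv2d_valid(image, kernel):
    k = kernel.shape[0]                      # kernel size k (square kernel)
    h, w = image.shape
    out = np.zeros((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Weighted sum over the small k x k region.
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0               # 3x3 averaging kernel
print(conv2d_valid(image, kernel))           # 2x2 output map
```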
With the spread of deep learning applications, dedicated neural network chips began to be developed. The additions and multiplications of neuron computation are implemented by special-purpose circuits, which is far more efficient than a CPU or GPU.
The human brain is characterized by massively parallel computation: a huge number of neurons work simultaneously, and each neuron is connected to thousands of others. With modern integrated circuit technology it is easy to integrate a large number of neurons on one chip, but very difficult to provide internal communication bandwidth like the human brain's. For example, if the input data of a layer of neurons is stored in a single RAM, reading the k rows of a kernel window takes at least k clock cycles, since different rows of the same memory cannot be read or written simultaneously. The speed of reading data, i.e. memory bandwidth, is therefore the bottleneck of the computation.
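This read-bandwidth argument can be captured in a toy cycle-count model (an illustrative sketch under the stated assumption, not part of the patent):

```python
# Toy cycle-count model of the read bottleneck (illustrative assumption:
# each memory delivers one row per clock cycle, and two rows stored in the
# same memory can never be read in the same cycle).
def cycles_to_read_window(k, num_memories):
    """Cycles needed to fetch the k rows of a k x k kernel window."""
    rows_per_memory = -(-k // num_memories)   # ceiling division
    return rows_per_memory

k = 5
print(cycles_to_read_window(k, 1))   # single RAM: 5 cycles
print(cycles_to_read_window(k, k))   # k separate arrays: 1 cycle
```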
Disclosure of Invention
In view of the above-mentioned defects in the prior art, the present invention provides a new neural network on-chip memory architecture capable of increasing the memory readout speed.
In order to achieve the above object, the present invention provides a convolutional neural network chip comprising a predetermined number of memory arrays for storing the input data array of one layer of a convolutional neural network, wherein the predetermined number is greater than or equal to the kernel size of the convolutional neural network, and wherein each row of the input data array is stored in turn in a corresponding row of the memory arrays.
Preferably, the memory array employs magnetoresistive random access memory.
In order to achieve the above object, the present invention provides a convolutional neural network chip operating method, including:
a storage step: storing the input data array of one layer of a convolutional neural network using a convolutional neural network chip comprising a predetermined number of memory arrays, wherein the predetermined number is greater than or equal to the kernel size of the convolutional neural network, and wherein each row of the input data array is stored in turn in a corresponding row of the memory arrays;
a calculation step: the neurons read data simultaneously from a number of arrays equal to the kernel size.
Preferably, in the calculation step, a plurality of neurons located in the same row read data simultaneously from the same row of the plurality of arrays, and the calculations are performed in parallel.
Preferably, the memory array employs magnetoresistive random access memory.
Compared with a conventional memory architecture, the invention achieves a k-fold improvement in memory read speed, and thereby the same overall speedup. The invention thus provides a new neural network chip memory architecture that increases memory read speed.
The conception, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that its objects, features, and effects can be fully understood.
Drawings
Fig. 1 is an architecture of a convolutional neural network.
Fig. 2 is a schematic structural diagram of a convolutional neural network chip according to a preferred embodiment of the present invention.
Fig. 3 is a flow chart of a convolutional neural network chip operation method according to a preferred embodiment of the present invention.
It is to be noted, however, that the appended drawings illustrate rather than limit the invention, and that drawings representing structures may not be drawn to scale. In the drawings, the same or similar elements are denoted by the same or similar reference numerals.
Detailed Description
Fig. 2 is a schematic structural diagram of a convolutional neural network chip according to a preferred embodiment of the present invention.
Specifically, as shown in fig. 2, the convolutional neural network chip according to the preferred embodiment of the present invention comprises a predetermined number of memory arrays for storing the input data array of one layer of the convolutional neural network, wherein the predetermined number is greater than or equal to the kernel size k of the convolutional neural network, and wherein each row of the input data array is stored in turn in a corresponding row of the memory arrays.
Thus, when performing the computation, a neuron can read data from the k arrays simultaneously, obtaining all the data required for a computation in one cycle. Furthermore, multiple neurons in the same row can read data simultaneously from the same row of the k arrays, with the calculations performed in parallel.
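A minimal sketch of this row-interleaved layout (illustrative; mapping row r of the input to array r mod k is one way to realize "each row stored in turn in a respective row of the memory arrays"): any k consecutive rows then lie in k different arrays and can be fetched together.

```python
import numpy as np

# Sketch of the row-interleaved layout (illustrative names, assuming row r
# of the input goes to memory array r mod k): the k rows of any kernel
# window land in k different arrays and can be read in a single cycle.
def stripe_rows(data, k):
    banks = [[] for _ in range(k)]
    for r, row in enumerate(data):
        banks[r % k].append(row)              # row r -> array (r mod k)
    return banks

def read_window_rows(banks, k, top_row):
    # One "cycle": fetch one row from each of the k arrays in parallel.
    return [banks[(top_row + i) % k][(top_row + i) // k] for i in range(k)]

k = 3                                         # kernel size
data = np.arange(30).reshape(6, 5)            # 6x5 input data array
banks = stripe_rows(data, k)
window = read_window_rows(banks, k, top_row=2)  # rows 2, 3, 4 in one cycle
print(np.array(window))
```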
Accordingly, FIG. 3 is a flow chart of a method of operation of a convolutional neural network chip, in accordance with a preferred embodiment of the present invention.
Specifically, as shown in fig. 3, the convolutional neural network chip operation method according to the preferred embodiment of the present invention includes:
storage step S1: storing the input data array of one layer of a convolutional neural network using a convolutional neural network chip comprising a predetermined number of memory arrays, wherein the predetermined number is greater than or equal to the kernel size k of the convolutional neural network, and wherein each row of the input data array is stored in turn in a corresponding row of the memory arrays;
calculation step S2: the neurons read data from the k arrays simultaneously, so that all the data needed for a computation is read in one cycle.
In calculation step S2, a plurality of neurons located in the same row may read data simultaneously from the same row of the k arrays, with the calculations performed in parallel.
Compared with a conventional memory architecture, the invention achieves a k-fold improvement in memory read speed, and thereby the same overall speedup. The invention thus provides a new neural network chip memory architecture that increases memory read speed.
Moreover, the present invention is applicable to any memory technology, but magnetoresistive random access memory (MRAM) is currently the memory technology best suited for integration with logic circuits, so the invention is most suitably applied with MRAM. In particular, the memory array of the present invention employs magnetoresistive random access memory.
While the foregoing shows and describes preferred embodiments of the present invention, it is to be understood that the invention is not limited to the forms disclosed herein; these embodiments are not to be construed as excluding others, and the invention is capable of use in various other combinations, modifications, and environments, and of changes within the scope of the inventive concept described herein, commensurate with the above teachings or the skill and knowledge of the relevant art. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the invention fall within the scope of the appended claims.
Claims (5)
1. A convolutional neural network chip, comprising: a predetermined number of memory arrays for storing the input data array of one layer of a convolutional neural network, wherein the predetermined number is greater than or equal to the kernel size of the convolutional neural network; and wherein each row of the input data array is stored in turn in a corresponding row of the memory arrays.
2. The convolutional neural network chip of claim 1, wherein the memory array employs magnetoresistive random access memory.
3. A convolutional neural network chip operation method, comprising:
a storage step: storing the input data array of one layer of a convolutional neural network using a convolutional neural network chip comprising a predetermined number of memory arrays, wherein the predetermined number is greater than or equal to the kernel size of the convolutional neural network; and wherein each row of the input data array is stored in turn in a corresponding row of the memory arrays;
a calculation step: the neurons read data simultaneously from a number of arrays equal to the kernel size.
4. The convolutional neural network chip operating method of claim 3, wherein in the calculation step, a plurality of neurons located in the same row read data simultaneously from the same row of the plurality of arrays, and the calculations are performed in parallel.
5. The convolutional neural network chip operating method of claim 3 or 4, wherein the memory array employs magnetoresistive random access memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610946130.3A CN107992942B (en) | 2016-10-26 | 2016-10-26 | Convolutional neural network chip and convolutional neural network chip operation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610946130.3A CN107992942B (en) | 2016-10-26 | 2016-10-26 | Convolutional neural network chip and convolutional neural network chip operation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107992942A CN107992942A (en) | 2018-05-04 |
CN107992942B true CN107992942B (en) | 2021-10-01 |
Family
ID=62029151
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610946130.3A Active CN107992942B (en) | 2016-10-26 | 2016-10-26 | Convolutional neural network chip and convolutional neural network chip operation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107992942B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112970037B (en) * | 2018-11-06 | 2024-02-02 | 创惟科技股份有限公司 | Multi-chip system for implementing neural network applications, data processing method suitable for multi-chip system, and non-transitory computer readable medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB9902115D0 (en) * | 1999-02-01 | 1999-03-24 | Axeon Limited | Neural networks |
US7401058B2 (en) * | 2004-04-29 | 2008-07-15 | University Of Massachusetts | Artificial neuron with phase-encoded logic |
US8515885B2 (en) * | 2010-10-29 | 2013-08-20 | International Business Machines Corporation | Neuromorphic and synaptronic spiking neural network with synaptic weights learned using simulation |
KR20130090147A (en) * | 2012-02-03 | 2013-08-13 | 안병익 | Neural network computing apparatus and system, and method thereof |
CN105760931A (en) * | 2016-03-17 | 2016-07-13 | 上海新储集成电路有限公司 | Artificial neural network chip and robot with artificial neural network chip |
CN105789139B (en) * | 2016-03-31 | 2018-08-28 | 上海新储集成电路有限公司 | A kind of preparation method of neural network chip |
- 2016-10-26: CN application CN201610946130.3A filed; patent CN107992942B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN107992942A (en) | 2018-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11055608B2 (en) | Convolutional neural network | |
CN109460817B (en) | Convolutional neural network on-chip learning system based on nonvolatile memory | |
Rathi et al. | STDP-based pruning of connections and weight quantization in spiking neural networks for energy-efficient recognition | |
CN113537471B (en) | Improved spiking neural network | |
TWI661428B (en) | Neuromorphic weight cell and method of forming the same and artificial neural network | |
JP7399517B2 (en) | Memristor-based neural network parallel acceleration method, processor, and device | |
KR101686827B1 (en) | Method for implementing artificial neural networks in neuromorphic hardware | |
WO2019136764A1 (en) | Convolutor and artificial intelligent processing device applied thereto | |
Taha et al. | Memristor crossbar based multicore neuromorphic processors | |
CN111210019B (en) | Neural network inference method based on software and hardware cooperative acceleration | |
JP7150998B2 (en) | Superconducting neuromorphic core | |
KR102618546B1 (en) | 2-dimensional array based neuromorphic processor and operating method for the same | |
CN109496319A (en) | Artificial intelligence process device hardware optimization method, system, storage medium, terminal | |
Sun et al. | Low-consumption neuromorphic memristor architecture based on convolutional neural networks | |
Cho et al. | An on-chip learning neuromorphic autoencoder with current-mode transposable memory read and virtual lookup table | |
CN108154225B (en) | Neural network chip using analog computation | |
Tran et al. | Memcapacitive reservoir computing | |
CN114925320B (en) | Data processing method and related device | |
CN107992942B (en) | Convolutional neural network chip and convolutional neural network chip operation method | |
CN108154226B (en) | Neural network chip using analog computation | |
CN108154227B (en) | Neural network chip using analog computation | |
Sun et al. | Quaternary synapses network for memristor-based spiking convolutional neural networks | |
Hossain et al. | Reservoir computing system using biomolecular memristor | |
CN110178146B (en) | Deconvolutor and artificial intelligence processing device applied by deconvolutor | |
Verma et al. | Advances in neuromorphic spin-based spiking neural networks: a review |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||