CN114239815A - Reconfigurable neural network computing chip - Google Patents

Reconfigurable neural network computing chip

Info

Publication number
CN114239815A
Authority
CN
China
Prior art keywords
neural network
reconfigurable
network computing
chip
computing chip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111347702.3A
Other languages
Chinese (zh)
Other versions
CN114239815B (en)
Inventor
刘洋 (Liu Yang)
罗念祖 (Luo Nianzu)
王雅迪 (Wang Yadi)
王俊杰 (Wang Junjie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202111347702.3A
Publication of CN114239815A
Application granted
Publication of CN114239815B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Logic Circuits (AREA)

Abstract

The invention relates to the field of neural network chip architecture, and in particular to a reconfigurable neural network computing chip. In this invention, the structure at each intersection of the word lines and bit lines forming the neural network is replaced by a structure C that can switch between neuron and synapse functions. By switching the neuron and synapse roles at different intersections, flexible configuration diversifies the chip's functionality, and the utilization of the processing-element (PE) array can be maximized. The architecture is also well suited to applications: the system scale can be expanded, rather than inheriting the rigid architecture of traditional chip design, allowing a more flexible chip design. Because the function of structure C is adjustable, the chips can be configured according to the requirements of the neural network to be mapped and connected in different shapes, so that multiple reconfigurable neural network computing chips serve as subunits from which a large neural network is constructed in a second stage.

Description

Reconfigurable neural network computing chip
Technical Field
The invention relates to the field of neural network chip architecture, and in particular to a reconfigurable neural network computing chip.
Background
An artificial neural network is a nonlinear, adaptive information-processing system built from a large number of interconnected processing units, that is, neurons or nodes. Each node implements a particular output function, called the activation function. Every connection between two nodes carries a weight for the signal passing through it, and these weights serve as the memory of the network. The network's output depends on its connection topology, its weight values, and its activation functions. As a nonlinear statistical modeling tool, a neural network is commonly used to model complex relationships between inputs and outputs or to discover patterns in data, and it acquires the ability to solve practical problems by continually adjusting the weights between neurons. Neural networks are now applied in a wide variety of fields, such as speech recognition, image recognition and understanding, computer vision, market analysis, decision optimization, material allocation and transportation, neurophysiology, psychology, and cognitive science research.
Running neural networks demands large amounts of computational resources, so hardware implementation becomes critical for solving larger problems faster and more efficiently. But because of the high complexity of neural networks themselves, mapping them onto hardware poses significant challenges, especially with regard to power consumption and performance.
Traditional execution hardware includes CPUs, GPUs, and FPGAs, each with drawbacks: a CPU cannot deliver low-latency processing in embedded devices; a GPU can meet the low-latency requirement but its power consumption is far too high for embedded use; and although an FPGA can just barely satisfy both the power and performance requirements, its routing resources and computing units limit execution efficiency across different deep convolutional neural networks. Moreover, because the connection topology of existing execution hardware is fixed, single-chip computing capability is hard to scale, large networks cannot be run, resources cannot be used effectively, and energy-efficient neural network computation is difficult to achieve. Exploring new neural network computing chip architectures has therefore become an increasingly active research focus and discipline frontier.
Disclosure of Invention
To address the problems and defects above, namely that existing neural network technology cannot combine generality with high energy efficiency, the invention provides a reconfigurable neural network computing chip in which the structure at each word-line/bit-line intersection (a neuron or a synapse) is replaced by a structure that can switch between neuron and synapse functions. Switching the neuron and synapse roles at different intersections allows flexible configuration and diversified functionality, so that the utilization of the processing-element (PE) array can be maximized.
The technical scheme of the invention is as follows:
a reconfigurable neural network computing chip, characterized in that: the intersection of the N word lines and the M bit lines forming the neural network adopts a structure C, and the structure C switches between neuron or synapse functions according to requirements so as to reasonably utilize computing resources and achieve the maximum beneficial effect of the utilization rate of the PE array.
Further, the structure C is a memristor.
Further, when a single reconfigurable neural network computing chip operates, the reconfigurable design manifests as follows: a horizontal or vertical working mode is selected according to the requirements of the neural network to be mapped.
By selecting the function of structure C at each intersection, the reconfigurable neural network computing chip of the invention can optimize its use of computing resources. A specific example follows:
When the horizontal working mode is selected: suppose the mapped neural network requires only n input channels and m output channels, with n ≤ (N-1) and m ≤ (M-1). The structures at the intersections of the (n+1)-th word line with the first m bit lines act as neurons, the structures at the intersections of the (m+1)-th bit line with the first n word lines act as neurons, and the structures at the intersections of the first n word lines with the first m bit lines act as synapses. The input data are multiplied by the weights stored in the synapses, and the neurons accumulate and sum the products to output the final result; only (n+1) × (m+1) computing resources are occupied. Any neural network requiring at most (N-1) input channels and (M-1) output channels can thus be mapped onto the neural network computing chip of the invention.
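The horizontal-mode mapping described above can be sketched as a toy Python model (our illustration only; the class and method names such as `Crossbar` and `map_network` are not from the patent, and the real chip operates on analog currents rather than floating-point arrays):

```python
import numpy as np

class Crossbar:
    """Toy model of an N-word-line by M-bit-line array whose intersection
    structures ("structure C") can each be configured as synapse or neuron."""

    def __init__(self, n_wordlines, m_bitlines):
        self.N, self.M = n_wordlines, m_bitlines
        # role[i][j] is 'synapse', 'neuron', or None (unused)
        self.role = [[None] * m_bitlines for _ in range(n_wordlines)]
        self.weight = np.zeros((n_wordlines, m_bitlines))
        self.n = self.m = 0

    def map_network(self, weights):
        """Map an n-input, m-output layer in horizontal mode.
        Requires n <= N-1 and m <= M-1, as stated in the description."""
        n, m = weights.shape
        assert n <= self.N - 1 and m <= self.M - 1
        for i in range(n):
            for j in range(m):
                self.role[i][j] = 'synapse'      # each stores one weight
        for j in range(m):
            self.role[n][j] = 'neuron'           # (n+1)-th word line accumulates
        for i in range(n):
            self.role[i][m] = 'neuron'           # (m+1)-th bit line accumulates
        self.weight[:n, :m] = weights
        self.n, self.m = n, m

    def forward(self, x):
        """Synapses multiply inputs by stored weights; neurons sum the products."""
        return x @ self.weight[:self.n, :self.m]

    def used_resources(self):
        """Only (n+1) x (m+1) intersections are occupied."""
        return (self.n + 1) * (self.m + 1)
```

For example, mapping a 2-input, 3-output layer onto an 8 × 8 array occupies 3 × 4 = 12 of the 64 intersections, leaving the rest free.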
In a traditional neural network computing chip, the function of each intersection structure is fixed and singular: it is either a neuron or a synapse, and its position cannot be adjusted. In that case, the input buffer unit can feed input data only through the word-line input channels in the designated direction; passing through a synapse, the data are multiplied by the stored weight and converted into a synaptic current, which is finally summed and output through the bit-line output channels composed of neurons, realizing the neural network function. This occupies a fixed N × M of computing resources, and because the positions of synapses and neurons are fixed, a traditional neural network computing chip can map only one fixed neural network structure.
Further, as a single chip, the reconfigurable neural network computing chip can select a horizontal or vertical working mode. In horizontal mode, the input buffer unit feeds input data in on the word lines and output data leave on the bit lines. In vertical mode, the input and output directions are interchanged: input data enter on the bit lines and output data leave on the word lines. Compared with a traditional neural network computing chip, the size of the mappable neural network is therefore more flexible, which also facilitates system-level expansion.
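The two working modes amount to using the same weight array in two dataflow directions, which can be sketched as follows (our illustration; the function name `crossbar_forward` is an assumption, not the patent's terminology):

```python
import numpy as np

def crossbar_forward(weights, x, mode="horizontal"):
    """Dataflow sketch for the two working modes described above.
    weights: the N x M array of synapse values.
    - horizontal: inputs enter on the word lines (rows), outputs leave on
      the bit lines, so y[j] = sum_i x[i] * w[i, j].
    - vertical: input and output directions are interchanged, i.e. the same
      array is used transposed: y[i] = sum_j w[i, j] * x[j]."""
    w = np.asarray(weights, dtype=float)
    x = np.asarray(x, dtype=float)
    if mode == "horizontal":
        return x @ w
    if mode == "vertical":
        return w @ x
    raise ValueError(f"unknown mode: {mode}")
```

With a non-square array this shows the flexibility gained: a 2 × 4 chip maps a 2-input, 4-output layer in horizontal mode and a 4-input, 2-output layer in vertical mode, without rewiring.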
Furthermore, when a plurality of reconfigurable neural network computing chips operate together, board-level system expansion is performed.
Adjacent reconfigurable neural network computing chips adopt different initial working directions, with the output of one connected directly to the input of the next. Used this way, the assembly acts as a traditional neural network computing chip and directly maps a fixed neural network. Meanwhile, the working direction of the word lines and bit lines of some adjacent chips can be changed, and the function of each intersection structure can be adjusted so that it operates as a synapse or a neuron, which improves the utilization of computing resources and realizes neural network mapping more flexibly; a large neural network can also be configured across multiple connected chips. Because the function of structure C is adjustable, the chips can be configured according to the requirements of the neural network to be mapped and connected in different shapes, so that multiple reconfigurable neural network computing chips serve as subunits from which a large neural network is constructed in a second stage.
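Chaining chips with alternating working directions can be sketched like this (our simplified illustration: each chip is reduced to its matrix of synapse weights, and the helper name `chain_chips` is an assumption; a real board would also configure the neuron rows/columns on each chip):

```python
import numpy as np

def chain_chips(weight_mats, x):
    """Board-level expansion sketch: adjacent chips adopt different initial
    working directions, so the output lines of one chip connect directly to
    the input lines of the next without extra routing."""
    x = np.asarray(x, dtype=float)
    for i, w in enumerate(weight_mats):
        w = np.asarray(w, dtype=float)
        if i % 2 == 0:
            x = x @ w   # horizontal mode: in on word lines, out on bit lines
        else:
            x = w @ x   # vertical mode: in on bit lines, out on word lines
    return x
```

Because the output side of an even-indexed chip (bit lines) faces the input side of the following odd-indexed chip (also bit lines), the data simply flow through the board, which is the direct-connection property the description relies on.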
In summary, the present invention replaces the intersection structure (neuron or synapse) of the word lines and bit lines forming the neural network with a structure that can switch between neuron and synapse functions. Switching these roles at different intersections enables flexible configuration and diversified functionality, and the utilization of the PE array can be maximized. The architecture is also well suited to applications: the system scale can be expanded, rather than inheriting the rigid architecture of traditional chip design, yielding a more flexible chip design.
Drawings
FIG. 1 is a schematic diagram of a neural network to be mapped;
FIG. 2 is a specific structure diagram of a single reconfigurable neural network computing chip;
FIG. 3 is a schematic diagram of a method for mapping different neural networks of a single reconfigurable neural network computing chip according to an embodiment;
FIG. 4 is a schematic diagram of an embodiment of a plurality of reconfigurable neural network computing chip board-level system expansion;
FIG. 5 is a schematic diagram of FIG. 4 with a single neural network configured on adjacent chips after board level system expansion.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings so that those skilled in the art can better understand it. Detailed descriptions of well-known functions and designs are omitted where they might obscure the subject matter of the invention.
As shown in Fig. 2, which gives the specific structure of a single reconfigurable neural network computing chip of the invention, a structure C (a memristor) is adopted at each intersection of the N word lines and M bit lines. Structure C can serve as either a neuron or a synapse, enabling flexible switching under different requirements.
As a single chip, the neural network computing chip of the invention can select a horizontal or vertical working mode. In horizontal mode, the input buffer unit feeds input data in on the word lines and output data leave on the bit lines; in vertical mode, the input and output directions are interchanged, so input data enter on the bit lines and output data leave on the word lines.
As shown in Fig. 3, taking a 2 × 4 neural network computing chip as an example, the neuron and synapse functions of structure C are interchanged according to requirements. On the same 2 × 4 chip, the left side of the figure shows the detailed function-switching diagram of structure C: in the upper part, structure C is switched to act as a synapse; in the lower part, it is switched to act as a neuron. By changing the function of structure C at each word-line/bit-line intersection in the chip, different neural network functions are realized, that is, different neural networks can be mapped, achieving the reconfigurable design.
Fig. 4 is a schematic diagram of board-level system expansion using multiple reconfigurable neural network computing chips of Fig. 2; the neural network computing chip of the invention can perform board-level system expansion when several chips operate together.
Adjacent reconfigurable neural network computing chips adopt different initial working directions, with the output of one connected directly to the input of the next. Used this way, the assembly acts as a traditional neural network computing chip and directly maps a fixed neural network. Meanwhile, the working direction of the word lines and bit lines of some adjacent chips can be changed, and the function of each intersection structure can be adjusted so that it operates as a synapse or a neuron; this uses computing resources rationally, realizes neural network mapping more flexibly, and allows a large neural network to be configured across multiple connected chips.
FIG. 5 is a schematic diagram of FIG. 4 with a single neural network configured on adjacent chips after board level system expansion.
When four chips, each with the same 5 × 5 computing resources, are interconnected, suppose a neural network is mapped whose first layer is 5 × 4, second layer 5 × 6, third layer 5 × 6, and fourth layer 5 × 4. All structures C at the word-line/bit-line intersections inside the rounded frames act as neurons, all other structures C act as synapses, and the working-mode directions of the word lines and bit lines are as shown in the figure. The input data of each layer are multiplied by the weights stored in the synapses, the neurons accumulate and sum the products to output that layer's result, and the result then enters the input channel of the next layer as its input data.
Thus the first layer of the network occupies only 5 × 4 of the computing resources, and the remaining resources are used to map the second layer, so that layer is configured across adjacent chips; the third layer is likewise configured across two adjacent chips. By extension, any single-layer neural network requiring at most 9 input channels and 9 output channels, or any multilayer network whose layers together use no more than 10 × 10 computing resources, can be mapped onto this neural network computing chip, which therefore offers good reconfigurability for neural networks of both single-layer and multi-layer structure.
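The resource accounting in this example can be checked with a short sketch (our simplification; the helper name `fits_on_board` is an assumption, and real placement must additionally respect chip adjacency and working directions, not just the total cell count):

```python
def fits_on_board(layer_shapes, chips=4, chip_rows=5, chip_cols=5):
    """Simplified capacity check for the example above: four 5 x 5 chips
    offer a pool of 100 intersections, and a multilayer network fits if
    its layers together need no more than that pool."""
    pool = chips * chip_rows * chip_cols
    used = sum(rows * cols for rows, cols in layer_shapes)
    return used <= pool
```

The example network's layers use 5×4 + 5×6 + 5×6 + 5×4 = 100 cells, exactly filling the 10 × 10 pool, so it fits; enlarging any layer would exceed the pool.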

Claims (5)

1. A reconfigurable neural network computing chip, characterized in that: each intersection of the N word lines and M bit lines forming the neural network adopts a structure C, and structure C switches between neuron and synapse functions on demand, so that computing resources are used rationally and the utilization of the PE array is maximized.
2. The reconfigurable neural network computing chip of claim 1, wherein: the structure C is a memristor.
3. The reconfigurable neural network computing chip of claim 1, wherein:
when a single reconfigurable neural network computing chip operates, the reconfigurable design manifests as follows: a horizontal or vertical working mode is selected according to the requirements of the neural network to be mapped.
4. The reconfigurable neural network computing chip of claim 1, wherein: a plurality of reconfigurable neural network computing chips are interconnected on a board to expand the system scale.
5. The reconfigurable neural network computing chip of claim 4, wherein:
when the reconfigurable neural network computing chips are interconnected, adjacent chips adopt different initial working directions, with inputs and outputs directly connected; and each reconfigurable neural network computing chip is configured, by adjusting the function of structure C, according to the requirements of the neural network to be mapped, so that multiple chips serve as subunits from which a large neural network is constructed in a second stage.
CN202111347702.3A 2021-11-15 2021-11-15 Reconfigurable neural network computing chip Active CN114239815B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111347702.3A CN114239815B (en) 2021-11-15 2021-11-15 Reconfigurable neural network computing chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111347702.3A CN114239815B (en) 2021-11-15 2021-11-15 Reconfigurable neural network computing chip

Publications (2)

Publication Number Publication Date
CN114239815A (en) 2022-03-25
CN114239815B CN114239815B (en) 2023-05-12

Family

ID=80749370

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111347702.3A Active CN114239815B (en) 2021-11-15 2021-11-15 Reconfigurable neural network computing chip

Country Status (1)

Country Link
CN (1) CN114239815B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120259804A1 (en) * 2011-04-08 2012-10-11 International Business Machines Corporation Reconfigurable and customizable general-purpose circuits for neural networks
CN105930903A (en) * 2016-05-16 2016-09-07 浙江大学 Digital-analog hybrid neural network chip architecture
US20160364643A1 (en) * 2012-03-08 2016-12-15 Hrl Laboratories Llc Scalable integrated circuit with synaptic electronics and cmos integrated memristors
CN107273973A (en) * 2015-10-23 2017-10-20 株式会社半导体能源研究所 Semiconductor device and electronic equipment
CN110383301A (en) * 2017-03-08 2019-10-25 阿姆有限公司 Spike neural network
CN110751279A (en) * 2019-09-02 2020-02-04 北京大学 Ferroelectric capacitance coupling neural network circuit structure and multiplication method of vector and matrix in neural network
CN111587440A (en) * 2018-01-19 2020-08-25 国际商业机器公司 Neuromorphic chip for updating accurate synaptic weight values
CN111626414A (en) * 2020-07-30 2020-09-04 电子科技大学 Dynamic multi-precision neural network acceleration unit
Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐桂芝; 姚林静; 李子康 (Xu Guizhi; Yao Linjing; Li Zikang): "A survey of memristor-based spiking neural network research" (基于忆阻器的脉冲神经网络研究综述), Journal of Biomedical Engineering *

Also Published As

Publication number Publication date
CN114239815B (en) 2023-05-12

Similar Documents

Publication Publication Date Title
CN110516801B (en) High-throughput-rate dynamic reconfigurable convolutional neural network accelerator
Qin et al. Sigma: A sparse and irregular gemm accelerator with flexible interconnects for dnn training
Chu et al. PIM-prune: Fine-grain DCNN pruning for crossbar-based process-in-memory architecture
CN112149816B (en) Heterogeneous memory-computation fusion system and method supporting deep neural network reasoning acceleration
US10073802B2 (en) Inter-cluster data communication network for a dynamic shared communication platform
CN108647779B (en) Reconfigurable computing unit of low-bit-width convolutional neural network
JP2002533823A (en) Neural processing module with input architecture to make maximum use of weighted synapse array
Qu et al. RaQu: An automatic high-utilization CNN quantization and mapping framework for general-purpose RRAM accelerator
CN112950656A (en) Block convolution method for pre-reading data according to channel based on FPGA platform
CN111222626A (en) Data segmentation operation method of neural network based on NOR Flash module
Xiao et al. FPGA implementation of CNN for handwritten digit recognition
US11954586B2 (en) Neural processing unit being operated based on plural clock signals having multi-phases
Liu et al. CASSANN-v2: A high-performance CNN accelerator architecture with on-chip memory self-adaptive tuning
CN114239815A (en) Reconfigurable neural network computing chip
CN116822600A (en) Neural network search chip based on RISC-V architecture
Nair et al. Fpga acceleration of gcn in light of the symmetry of graph adjacency matrix
CN115587622A (en) Multilayer perceptron device based on integrative device of photoelectric memory
Ji et al. Hubpa: High utilization bidirectional pipeline architecture for neuromorphic computing
US20220058468A1 (en) Field Programmable Neural Array
Kang et al. A 24.3 μJ/Image SNN Accelerator for DVS-Gesture with WS-LOS Dataflow and Sparse Methods
Jiang et al. HARNS: High-level architectural model of RRAM based computing-in-memory NPU
Hossain et al. Energy efficient computing with heterogeneous DNN accelerators
Wen FPGA-Based Deep Convolutional Neural Network Optimization Method
CN112598122B (en) Convolutional neural network accelerator based on variable resistance random access memory
Li et al. Fpga-based object detection acceleration architecture design

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant