CN108805271B - Arithmetic device and method - Google Patents

Arithmetic device and method

Info

Publication number
CN108805271B
Authority
CN
China
Prior art keywords
data
power
neuron
neural network
weight
Prior art date
Legal status
Active
Application number
CN201710312415.6A
Other languages
Chinese (zh)
Other versions
CN108805271A (en)
Inventor
Inventor not disclosed
Current Assignee
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date
Filing date
Publication date
Priority to CN201710312415.6A priority Critical patent/CN108805271B/en
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to CN201811413244.7A priority patent/CN109344965A/en
Priority to EP19199521.6A priority patent/EP3620992A1/en
Priority to CN201811423295.8A priority patent/CN109409515B/en
Priority to EP19199528.1A priority patent/EP3624018B1/en
Priority to EP19199524.0A priority patent/EP3627437B1/en
Priority to CN201811423421.XA priority patent/CN109359736A/en
Priority to CN201880001242.9A priority patent/CN109219821B/en
Priority to EP18780474.5A priority patent/EP3579150B1/en
Priority to PCT/CN2018/081929 priority patent/WO2018184570A1/en
Priority to EP19199526.5A priority patent/EP3633526A1/en
Publication of CN108805271A publication Critical patent/CN108805271A/en
Priority to US16/283,711 priority patent/US10896369B2/en
Priority to US16/520,082 priority patent/US11010338B2/en
Priority to US16/520,041 priority patent/US11551067B2/en
Priority to US16/520,654 priority patent/US11049002B2/en
Priority to US16/520,615 priority patent/US10671913B2/en
Application granted granted Critical
Publication of CN108805271B publication Critical patent/CN108805271B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/30003: Arrangements for executing specific machine instructions
    • G06F 9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F 9/30025: Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/30003: Arrangements for executing specific machine instructions
    • G06F 9/30076: Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F 9/30083: Power or thermal control instructions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F 9/3802: Instruction prefetching

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides an arithmetic device including an operation unit configured to receive data and an operation instruction for a neural network operation, and to perform the neural network operation on the received neuron data and weight data according to the operation instruction. The present disclosure also provides an operation method. The arithmetic device and method reduce the overhead of storage resources and computation resources and improve operation speed.

Description

Arithmetic device and method
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular to a neural network operation device and method supporting power neuron representation.
Background
In recent years, multilayer neural networks have received wide attention from academia and industry due to their high recognition accuracy and high parallelism.
At present, the better-performing neural networks tend to be very large, which means they require large amounts of computing and storage resources. This heavy consumption of computing and storage resources slows down neural network operation and greatly raises the transmission-bandwidth requirements on the hardware and the operation unit.
Disclosure of Invention
Technical problem to be solved
In view of the above technical problems, the present disclosure provides a neural network operation device and method that support the representation of power neurons, reduce the storage-resource and computation-resource overhead of the neural network by representing neuron data as powers, and improve the operation speed of the neural network.
(II) Technical solution
According to an aspect of the present disclosure, there is provided a neural network operation device including:
the first power conversion unit is used for converting non-power data in the input data of the neural network into power data;
the operation unit is used for receiving data and instructions of the neural network operation and executing the neural network operation on the received neuron data and weight data according to the operation instructions; wherein the data received by the operation unit includes power data converted by the first power conversion unit.
Preferably, the device further comprises: a storage unit for storing data and instructions; wherein the storage unit is connected with the first power conversion unit to receive the power data.
Preferably, the device further comprises: a control unit and an output neuron cache unit; wherein
the control unit is connected with the storage unit, is used for controlling the interaction of data and instructions, receives the data and instructions sent by the storage unit, and decodes the instructions into operation instructions;
the operation unit is connected with the control unit, receives the data and operation instructions sent by the control unit, and executes the neural network operation on the received neuron data and weight data according to the operation instructions; and
the output neuron cache unit is connected with the operation unit and is used for receiving the neuron data output by the operation unit and sending it to the control unit as input data for the next layer of neural network operation.
Preferably, the control unit includes:
the data control module is connected with the storage unit and used for realizing data and instruction interaction between the storage unit and each cache module;
the instruction cache module is connected with the data control module and used for receiving the instruction sent by the data control module;
the decoding module is connected with the instruction cache module and used for reading the instruction from the instruction cache module and decoding the instruction into an operation instruction;
the input neuron cache module is connected with the data control module and is used for acquiring corresponding input neuron data from the data control module;
the weight cache module is connected with the data control module and is used for acquiring corresponding weight data from the data control module; wherein
the operation unit is respectively connected with the decoding module, the input neuron cache module and the weight cache module, receives each operation instruction, neuron data and weight data, and executes corresponding neural network operation on the received neuron data and weight data according to the operation instruction.
Preferably, the first power conversion unit is configured to convert non-power weight data in the neural network input data into power weight data.
Preferably, the device further comprises a second power conversion unit; wherein
the first power conversion unit is used for converting non-power neuron data and non-power weight data in the neural network input data into power neuron data and power weight data respectively and sending the power neuron data and the power weight data to the storage unit;
the second power conversion unit is connected with the output neuron cache unit and used for converting neuron data received by the second power conversion unit into power neuron data and sending the power neuron data to the control unit as input data of next-layer neural network operation.
Preferably, if the neural network input data is already power data, it is stored directly in the storage unit.
Preferably, the power data includes power neuron data and power weight data; wherein
the power neuron data represents the value of neuron data in the form of a power exponent, wherein the power neuron data comprises a sign bit and power bits: the sign bit represents the sign of the neuron data using one or more bits, and the power bits represent the power-order data of the neuron data using m bits, m being a positive integer greater than 1;
the power weight data represents the value of weight data in the form of a power exponent, wherein the power weight data comprises a sign bit and power bits: the sign bit represents the sign of the weight data using one or more bits, and the power bits represent the power-order data of the weight data using m bits, m being a positive integer greater than 1.
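As an illustrative sketch only, the sign-bit-plus-m-power-bits representation described above could be encoded as follows. The concrete bit layout (a two's-complement m-bit power field) and the helper names `encode_power`/`decode_power` are assumptions for illustration, not a layout specified by this disclosure:

```python
# Illustrative sketch (not the patented bit layout): encode a value as
# 1 sign bit plus an m-bit power field holding floor(log2(|value|)).
import math

def encode_power(value: float, m: int = 4) -> tuple:
    """Return (sign_bit, power_bits) for a nonzero value."""
    sign = 0 if value >= 0 else 1
    exponent = math.floor(math.log2(abs(value)))
    # clamp the exponent into the m-bit two's-complement range
    lo, hi = -(1 << (m - 1)), (1 << (m - 1)) - 1
    exponent = max(lo, min(hi, exponent))
    power_bits = exponent & ((1 << m) - 1)  # m-bit two's-complement pattern
    return sign, power_bits

def decode_power(sign: int, power_bits: int, m: int = 4) -> float:
    """Recover the represented value (-1)^sign * 2^exponent."""
    exponent = power_bits - (1 << m) if power_bits >= (1 << (m - 1)) else power_bits
    return (-1.0 if sign else 1.0) * (2.0 ** exponent)

s, p = encode_power(-6.0)            # |value| = 6 -> exponent floor(log2 6) = 2
print(s, p, decode_power(s, p))      # 1 2 -4.0
```

With m = 4 this hypothetical layout covers signed powers from 2^-8 through 2^7, which shows why a few power bits can replace a full-width weight.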
Preferably, a coding table is pre-stored in the storage unit to provide the exponent value corresponding to each power-order datum of the power neuron data and the power weight data.
Preferably, the coding table sets one or more power-order data as zero-setting power-order data, for which the corresponding power neuron data and power weight data are 0.
Preferably, the correspondence of the coding table may be an arbitrary (unordered) mapping, a positive correlation, or a negative correlation.
Preferably, the maximum power level data corresponds to power neuron data and power weight data of 0, or the minimum power level data corresponds to power neuron data and power weight data of 0.
Preferably, the correspondence of the coding table is that the highest bit of the power-order data indicates a zero flag, and the other m-1 bits correspond to the exponent value.
Preferably, the correspondence of the coding table is a positive correlation: the storage unit pre-stores an integer value x and a positive integer value y, the minimum power-order data corresponds to the exponent value x, and any one or more other power-order data correspond to power neuron data and power weight data equal to 0; here x denotes an offset value and y denotes a step size.
Preferably, the minimum power-order data corresponds to the exponent value x, the maximum power-order data corresponds to power neuron data and power weight data equal to 0, and each power-order datum other than the minimum and maximum corresponds to the exponent value (power-order data + x) × y.
Preferably, y is 1 and x equals -2^(m-1).
Preferably, the correspondence of the coding table is a negative correlation: the storage unit pre-stores an integer value x and a positive integer value y, the maximum power-order data corresponds to the exponent value x, and any one or more other power-order data correspond to power neuron data and power weight data equal to 0; here x denotes an offset value and y denotes a step size.
Preferably, the maximum power-order data corresponds to the exponent value x, the minimum power-order data corresponds to power neuron data and power weight data equal to 0, and each power-order datum other than the minimum and maximum corresponds to the exponent value (power-order data - x) × y.
Preferably, y is 1 and x equals 2^(m-1).
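A minimal sketch of such a coding-table lookup follows. The function name, the default offset x = -2^(m-1), and the choice of reserving the maximum power-order pattern for zero are illustrative assumptions consistent with the positive-correlation variant described above, not details fixed by this disclosure:

```python
# Hypothetical coding-table lookup for the positive-correlation scheme:
# exponent = (power_order + x) * y, with one reserved power-order pattern
# representing power data equal to 0.

def exponent_from_power_order(power_order, m=4, x=None, y=1, zero_pattern=None):
    """Return the exponent for an m-bit power-order pattern, or None for zero data."""
    if x is None:
        x = -(1 << (m - 1))          # assumed default offset -2^(m-1)
    if zero_pattern is None:
        zero_pattern = (1 << m) - 1  # assume the maximum pattern is reserved for 0
    if power_order == zero_pattern:
        return None                  # this pattern represents power data equal to 0
    return (power_order + x) * y

# With m = 4: pattern 0 -> exponent -8, pattern 14 -> exponent 6, pattern 15 -> zero
print(exponent_from_power_order(0), exponent_from_power_order(14),
      exponent_from_power_order(15))
```

Changing x and y here shifts and stretches the representable exponent range, which is exactly the adjustment mechanism the later "method for using" aspect describes.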
Preferably, converting the non-power neuron data and the non-power weight data into power neuron data and power weight data uses one of the following rules:

s_out = s_in
d_out+ = 2^⌊log2(d_in+)⌋

where d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data, d_in+ = d_in × s_in, d_out+ is the positive part of the output data, d_out+ = d_out × s_out, and ⌊x⌋ denotes rounding the data x down (floor); or,

s_out = s_in
d_out+ = 2^⌈log2(d_in+)⌉

where d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data, d_in+ = d_in × s_in, d_out+ is the positive part of the output data, d_out+ = d_out × s_out, and ⌈x⌉ denotes rounding the data x up (ceiling); or,

s_out = s_in
d_out+ = 2^[log2(d_in+)]

where d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data, d_in+ = d_in × s_in, d_out+ is the positive part of the output data, d_out+ = d_out × s_out, and [x] denotes rounding the data x to the nearest integer.
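The three conversion rules (round down, round up, round to nearest) can be sketched as follows. The helper name `to_power` and the handling of a zero input are illustrative assumptions; the approximation of the input by a signed power of two follows the formulas above:

```python
# Sketch of the three power-conversion rules: approximate d_in by
# s_in * 2^e, where e is floor, ceiling, or nearest-integer of log2(d_in+).
import math

def to_power(d_in: float, mode: str = "floor") -> float:
    """Convert d_in to a signed power of two under the chosen rounding rule."""
    if d_in == 0:
        return 0.0                        # zero handling is an added assumption
    s_in = 1.0 if d_in > 0 else -1.0      # sign of the input (s_out = s_in)
    d_pos = d_in * s_in                   # positive part d_in+
    e = math.log2(d_pos)
    if mode == "floor":                   # first conversion method
        e = math.floor(e)
    elif mode == "ceil":                  # second conversion method
        e = math.ceil(e)
    else:                                 # third method: round to nearest
        e = round(e)
    return s_in * (2.0 ** e)

print(to_power(10.0, "floor"), to_power(10.0, "ceil"), to_power(-10.0, "round"))
# 8.0 16.0 -8.0
```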
According to another aspect of the present disclosure, there is provided a neural network operation method, including:
acquiring an operation instruction, neuron data and power weight data; and
performing the neural network operation on the neuron data and the power weight data according to the operation instruction.
According to another aspect of the present disclosure, there is provided a neural network operation method, including:
acquiring an operation instruction, power neuron data and power weight data; and
performing the neural network operation on the power neuron data and the power weight data according to the operation instruction.
Preferably, the method further comprises: outputting the neuron data obtained after the neural network operation and using it as input data for the next layer of neural network operation; and repeating these operation steps until the last layer of the neural network is finished.
Preferably, the obtaining of the instruction, the weight data and the neuron data includes:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the neuron data and the weight data sent by the data control module. If the weight data input to the storage unit is non-power weight data, it is first converted into power weight data by the first power conversion unit and then input to the storage unit; if it is already power weight data, it is input to the storage unit directly.
Preferably, the output neuron cache unit receives the neuron data obtained after the neural network operation sent by the operation unit and sends it to the data control module as input data for the next layer of neural network operation.
Preferably, the obtaining of the instruction, the weight data and the neuron data includes:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the neuron data and the weight data sent by the data control module. If the neuron data and the weight data input to the storage unit are non-power neuron data and non-power weight data, they are first converted into power neuron data and power weight data by the first power conversion unit and then input to the storage unit; if they are already power neuron data and power weight data, they are input to the storage unit directly.
Preferably, the output neuron cache unit receives the neuron data obtained after the neural network operation sent by the operation unit; the second power conversion unit receives the neuron data sent by the output neuron cache unit, converts it into power neuron data, and sends the power neuron data to the data control module as input data for the next layer of neural network operation.
Preferably, the performing a neural network operation on the weight data and the neuron data according to the operation instruction includes:
the decoding module reads the instruction from the instruction cache module and decodes the instruction into each operation instruction;
the operation unit respectively receives the operation instruction, the neuron data and the weight data sent by the decoding module, the input neuron cache module and the weight cache module, and performs neural network operation on the neuron data and the weight data according to the operation instruction.
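To illustrate why this operation step is cheap in hardware, the following sketch (hypothetical helper name, integer neuron values assumed) shows that multiplying a neuron by a power weight needs only a bit shift and a sign flip rather than a full multiplier:

```python
# Illustrative: a (fixed-point) neuron times a power weight
# (-1)^weight_sign * 2^weight_exp reduces to a shift plus a sign flip.

def mul_neuron_power_weight(neuron: int, weight_sign: int, weight_exp: int) -> int:
    """Compute neuron * ((-1)^weight_sign * 2^weight_exp) using shifts only."""
    shifted = neuron << weight_exp if weight_exp >= 0 else neuron >> -weight_exp
    return -shifted if weight_sign else shifted

# 5 * (+2^3) = 40, and 64 * (-2^-2) = -16
print(mul_neuron_power_weight(5, 0, 3), mul_neuron_power_weight(64, 1, -2))
```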
Preferably, the range of power neuron data and power weight data that the neural network operation device can represent is adjusted by changing the integer value x and the positive integer value y pre-stored in the storage unit.
According to another aspect of the present disclosure, there is provided a method for using the neural network operation device, wherein the range of power neuron data and power weight data that the device can represent is adjusted by changing the integer value x and the positive integer value y pre-stored in the storage unit.
(III) Advantageous effects
It can be seen from the above technical solutions that the neural network operation device and method of the present disclosure have at least one of the following beneficial effects:
(1) Storing neuron data and weight data in the power data representation reduces the storage space required for network data; at the same time, this representation simplifies the multiplication of neuron and weight data, lowers the design requirements on the operation unit, and speeds up neural network operation.
(2) Converting the neuron data obtained after operation into power-represented neuron data reduces the storage-resource and computation-resource overhead of the neural network and improves its operation speed.
(3) Non-power neuron data and non-power weight data can be power-converted before being input into the neural network operation device, which further reduces the storage-resource and computation-resource overhead of the neural network and improves its operation speed.
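For the fully power-represented case behind benefits (2) and (3), multiplication reduces even further, to an addition of exponents and an XOR of sign bits. The sketch below is illustrative (hypothetical function name, nonzero operands assumed):

```python
# Sketch: the product of two power data, (-1)^sa * 2^ea and (-1)^sb * 2^eb,
# is (-1)^(sa XOR sb) * 2^(ea + eb) -- no multiplier hardware required.

def mul_power_power(sign_a: int, exp_a: int, sign_b: int, exp_b: int) -> tuple:
    """Return the (sign, exponent) pair of the product of two power data."""
    return sign_a ^ sign_b, exp_a + exp_b

s, e = mul_power_power(0, 3, 1, -1)   # (+2^3) * (-2^-1)
print(s, e)                            # 1 2, i.e. the value -2^2 = -4
```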
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale; emphasis is instead placed on illustrating the subject matter of the present disclosure.
Fig. 1 is a schematic structural diagram of a neural network computing device according to a first embodiment of the disclosure.
Fig. 2 is a schematic structural diagram of a neural network computing device according to a second embodiment of the disclosure.
Fig. 3 is a flowchart of a neural network operation method according to a third embodiment of the present disclosure.
Fig. 3.1 is a schematic diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.2 is another schematic diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.3 is another schematic diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.4 is another diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.5 is a schematic diagram of a representation method of power data according to a third embodiment of the disclosure.
Fig. 3.6 is a schematic diagram of the multiplication operation of neurons and power weights according to the third embodiment of the disclosure.
Fig. 3.7 is a schematic diagram of multiplication operation of neurons and power weights according to a third embodiment of the disclosure.
Fig. 4 is a flowchart of a neural network operation method according to a fourth embodiment of the disclosure.
Fig. 4.1 is a diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.2 is another schematic diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.3 is another diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.4 is another schematic diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.5 is a schematic diagram of a method for representing power data according to a fourth embodiment of the disclosure.
Fig. 4.6 is a schematic diagram of multiplication operation of power neurons and power weights according to a fourth embodiment of the disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It should be noted that in the drawings and description, the same reference numerals are used for similar or identical parts. Implementations not depicted or described in the drawings are of a form known to those of ordinary skill in the art. Additionally, while examples containing specific parameter values may be provided herein, the parameters need not exactly equal those values and may approximate them within acceptable tolerances or design constraints. Directional terms used in the embodiments, such as "upper", "lower", "front", "rear", "left" and "right", refer only to the orientation of the drawings; they are intended to describe rather than to limit.
First embodiment
The present disclosure provides a neural network operation device. Fig. 1 is a schematic diagram of a neural network computing device according to the present embodiment. Referring to fig. 1, the neural network operation device of the present embodiment includes:
a storage unit 1 for storing data and instructions;
the control unit is connected with the storage unit and used for controlling the interaction of data and instructions, receiving the data and the instructions sent by the storage unit and decoding the instructions into operation instructions;
the operation unit 7 is connected with the control unit, receives the data and the operation instruction sent by the control unit, and executes neural network operation on the received neuron data and the weight data according to the operation instruction;
the output neuron cache unit 8 is connected with the operation unit and is used for receiving the neuron data output by the operation unit and sending it to the control unit, so that it can serve as input data for the next layer of neural network operation; and
the power conversion unit 9 is connected with the storage unit and is used for converting non-power weight data in the neural network input data into power weight data and sending it to the storage unit; power weight data in the neural network input data is stored in the storage unit directly.
Specifically, the control unit includes:
the data control module 2 is connected with the storage unit and is used for data and instruction interaction between the storage unit and each cache module;
the instruction cache module 3 is connected with the data control module and used for receiving the instruction sent by the data control module;
the decoding module 4 is connected with the instruction cache module and used for reading the instructions from the instruction cache module and decoding the instructions into various operation instructions;
the input neuron cache module 5 is connected with the data control module and used for receiving neuron data sent by the data control module;
and the weight cache module 6 is connected with the data control module and is used for receiving the weight data sent from the data control module.
Further, the operation unit 7 is connected to the decoding module, the input neuron cache module and the weight cache module respectively, receives each operation instruction, neuron data and weight data, and executes the corresponding operation on the received neuron data and weight data according to the operation instructions. The output neuron cache unit 8 is connected with the operation unit, receives the neuron data output by the operation unit, and sends it to the data control module 2 of the control unit, where it serves as input data for the next layer of neural network operation.
The memory unit receives data and instructions from an external address space, wherein the data comprises neural network weight data, neural network input data and the like.
Further, there are many alternative ways to perform the power conversion operation. The three power conversion operations employed in this embodiment are listed below.
the first power conversion method:
sout=sin
Figure BDA0001287514900000091
wherein d isinInput data for power conversion unit, doutIs the output data of the power conversion unit, sinFor symbols of input data, soutTo output the symbols of the data, din+For positive part of the input data, din+=din×sin,dout+To output a positive part of the data, dout+=dout×sout
Figure BDA0001287514900000094
Indicating a round-down operation on data x.
The second power conversion method:
sout=sin
Figure BDA0001287514900000092
wherein d isinInput data for power conversion unit, doutIs the output data of the power conversion unit, sinFor symbols of input data, soutTo output the symbols of the data, din+For positive part of the input data, din+=din×sin,dout+To output a positive part of the data, dout+=dout×sout
Figure BDA0001287514900000095
Indicating that a rounding operation is performed on data x.
The third power conversion method:
sout=sin
Figure BDA0001287514900000093
wherein d isinInput data for power conversion unit, doutIs the output data of the power conversion unit; sinFor symbols of input data, soutIs the sign of the output data; din+For positive part of the input data, din+=din×sin,dout+To output a positive part of the data, dout+=dout×sout;[x]Indicating a rounding operation on data x.
Second embodiment
The present disclosure also provides another neural network operation device. Fig. 2 is a schematic diagram of a neural network computing device according to the present embodiment. Referring to fig. 2, the neural network operation device of the present embodiment includes:
a storage unit 101 for storing data and instructions; the memory unit receives data and instructions from an external address space, the data including neural network weight data, neural network input data, and the like.
The control unit is connected with the storage unit and used for controlling the interaction of data and instructions, receiving the data and the instructions sent by the storage unit and decoding the instructions into operation instructions;
an arithmetic unit 107, connected to the control unit, for receiving the data and the arithmetic instruction sent by the control unit, and performing neural network operation on the received weight data and neuron data according to the arithmetic instruction;
an output neuron buffer unit 108 connected to the arithmetic unit, for receiving neuron data output by the arithmetic unit and sending the neuron data to the control unit;
the power conversion unit 109 is connected with the storage unit and is used for converting the non-power neuron data and non-power weight data in the neural network input data into power neuron data and power weight data respectively and sending them to the storage unit; power neuron data and power weight data in the neural network input data are stored in the storage unit directly; and
and the power conversion unit 110 is connected to the output neuron buffer unit 108, and is configured to convert the neuron data after the neural network operation into power neuron data and send the power neuron data to the control unit.
Further, the control unit includes:
the data control module 102 is connected with the storage unit and is used for data and instruction interaction between the storage unit and each cache module;
the instruction cache module 103 is connected with the data control module and used for receiving the instruction sent by the data control module;
a decoding module 104 connected to the instruction cache module, and configured to read an instruction from the instruction cache module and decode the instruction into each operation instruction;
an input neuron cache module 105 connected to the data control module, for receiving neuron data sent by the data control module;
and the weight caching module 106 is connected with the data control module and is used for receiving the weight data sent from the data control module.
Specifically, the operation unit 107 is connected to the decoding module, the input neuron buffer module, and the weight buffer module, respectively, and is configured to receive each operation instruction, neuron data, and weight data, and execute corresponding operation on the received neuron data and weight data according to each operation instruction.
The power conversion unit 110 is connected to the data control module, and is configured to convert the neuron data after the neural network operation into power neuron data, and send the power neuron data to the data control module 102 of the control unit. The power neuron data obtained by the power conversion unit 110 can be used as input neurons of the next layer of the neural network operation.
In addition, the specific operation method of the power conversion is the same as that of the previous embodiment, and is not described herein again.
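The conversion itself amounts to rounding a value to a signed power of two. Below is a minimal Python sketch, under two stated assumptions: rounding to the nearest power of two, and an m-bit two's-complement power field with the all-ones pattern reserved for zero (the earlier embodiment defines the exact rule; this helper is illustrative only):

```python
import math

def to_power_representation(value, m=7):
    """Encode value as (sign_bit, power_bits) so that value ~ (-1)^sign * 2^exponent.

    Hypothetical sketch: |value| is rounded to the nearest power of two and the
    exponent is stored as an m-bit two's-complement field; the all-ones pattern
    is reserved to encode zero.
    """
    mask = (1 << m) - 1
    if value == 0:
        return 0, mask                       # reserved zero pattern, e.g. 1111111
    sign = 0 if value > 0 else 1
    exponent = round(math.log2(abs(value)))  # nearest power of two
    return sign, exponent & mask             # two's complement in m bits

# -0.125 = -2^-3 -> sign 1, power bits 1111101 (two's complement of -3 in 7 bits)
print(to_power_representation(-0.125))       # (1, 125)
```

This matches the 8-bit example of the embodiment: sign bit 1 with power bits 1111101 represents -2^(-3) = -0.125.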
Third embodiment
In addition, an embodiment of the present disclosure further provides a neural network operation method, and fig. 3 is a flowchart of the neural network operation method according to the embodiment. Specifically, the neural network of the embodiment of the present disclosure is a multilayer neural network, and can operate according to the operation method shown in fig. 3 for each layer of neural network, wherein input power weight data of a first layer of the neural network can be read in from an external address through a storage unit, and if the weight data read in from the external address is already power weight data, the weight data is directly transmitted to the storage unit, otherwise, the weight data is first converted into power weight data through a power conversion unit. Referring to fig. 3, the single-layer neural network operation method of the present embodiment includes:
Step S1: an instruction, neuron data and power weight data are acquired.
Wherein the step S1 includes the following substeps:
S11, inputting the instruction, the neuron data and the weight data into the storage unit; wherein the power weight data is directly input into the storage unit, and the non-power weight data is input into the storage unit after being converted by the power conversion unit;
S12, the data control module receives the instruction, the neuron data and the power weight data sent by the storage unit;
S13, the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the neuron data and the power weight data sent by the data control module and distribute them to the decoding module or the operation unit.
The power weight data represent the value of the weight data in the form of a power exponent value. Specifically, the power weight data comprise a sign bit and power bits: the sign bit represents the sign of the weight data with one or more bits, and the power bits represent the power bit data of the weight data with m bits, m being a positive integer greater than 1. An encoding table is prestored in the storage unit and provides the exponent value corresponding to each power bit data of the power weight data. The encoding table reserves one or more power bit data (i.e., zero power bit data) to specify that the corresponding power weight data is 0. That is, when the power bit data of the power weight data matches a zero power bit data in the encoding table, the power weight data is 0.
The correspondence relationship of the encoding table may be arbitrary.
For example, the correspondence of the encoding table may be out of order. As shown in fig. 3.1, for part of an encoding table with m = 5: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 3; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 1; and when the power bit data is 00100, the power weight data is 0.
The correspondence of the encoding table may also be a positive correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the minimum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the minimum power bit data corresponds to exponent value x, the maximum power bit data corresponds to power weight data 0, and every other power bit data corresponds to the exponent value (power bit data + x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals -2^(m-1). The exponent range of the values represented by this power weight data is -2^(m-1) to 2^(m-1) - 1.
In one embodiment, as shown in fig. 3.2, for part of an encoding table with m = 5, x = 0, and y = 1: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 1; 00010 corresponds to exponent value 2; 00011 corresponds to exponent value 3; and when the power bit data is 11111, the power weight data is 0. As shown in fig. 3.3, for another encoding table with m = 5, x = 0, and y = 2: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 2; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 6; and when the power bit data is 11111, the power weight data is 0.
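The positively correlated table fragments of figs. 3.2 and 3.3 can be reproduced with a small lookup helper (a sketch; the reserved zero pattern is assumed here to be the maximum power bit data, as in those figures):

```python
def decode_exponent(power_bits, x=0, y=1, m=5):
    """Exponent value under the positive-correlation scheme: (power_bits + x) * y.
    Returns None when power_bits is the reserved zero pattern (all ones here)."""
    if power_bits == (1 << m) - 1:   # e.g. 11111 for m = 5: encoded value is 0
        return None
    return (power_bits + x) * y

# fig. 3.2 (x = 0, y = 1): 00001 -> 1, 00010 -> 2
# fig. 3.3 (x = 0, y = 2): 00001 -> 2, 00010 -> 4, 00011 -> 6
print(decode_exponent(0b00011, x=0, y=2))   # 6
```

With x = -2^(m-1) and y = 1, the same helper yields the symmetric range -2^(m-1) to 2^(m-1) - 1 described above.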
The correspondence of the encoding table may instead be a negative correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the maximum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the maximum power bit data corresponds to exponent value x, the minimum power bit data corresponds to power weight data 0, and every other power bit data corresponds to the exponent value (power bit data - x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals 2^(m-1). The exponent range of the values represented by this power weight data is -2^(m-1) - 1 to 2^(m-1).
As shown in fig. 3.4, for part of an encoding table with m = 5: power bit data 11111 corresponds to exponent value 0; 11110 corresponds to exponent value 1; 11101 corresponds to exponent value 2; 11100 corresponds to exponent value 3; and when the power bit data is 00000, the power weight data is 0.
The correspondence of the encoding table may also use the highest bit of the power bit data as a zero flag, with the remaining m-1 bits giving the exponent value. When the highest bit is 0, the corresponding power weight data is 0; when the highest bit is 1, the corresponding power weight data is not 0. The convention may also be reversed: when the highest bit is 1, the power weight data is 0, and when it is 0, the power weight data is not 0. In other words, one bit of the power bits of the power weight data is set aside to indicate whether the power weight data is 0.
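Under this highest-bit convention (taking the variant in which a highest bit of 1 marks a nonzero value), decoding reduces to one test and one mask. A hypothetical sketch:

```python
def decode_msb_flag(power_bits, m=5):
    """Highest bit of the m-bit power field is a zero flag (1 = nonzero);
    the remaining m-1 bits carry the exponent value. Returns None for zero."""
    if (power_bits >> (m - 1)) & 1 == 0:
        return None                         # flagged as value 0
    return power_bits & ((1 << (m - 1)) - 1)

# m = 5: 10011 -> exponent 3; 00011 -> encoded value is 0
print(decode_msb_flag(0b10011))   # 3
```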
In one embodiment, as shown in fig. 3.5, the sign bit is 1 bit and the power bit data is 7 bits, i.e., m is 7. The encoding table specifies that the power weight data is 0 when the power bit data is 1111111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. When the sign bit of the power weight data is 0 and the power bits are 0001001, the represented value is 2^9, i.e., 512; when the sign bit is 1 and the power bits are 1111101, the represented value is -2^(-3), i.e., -0.125. Compared with floating point data, power data retain only the power bits, greatly reducing the storage space required.
This power data representation reduces the storage space required for the weight data. In the example provided in this embodiment, the power data is 8-bit data; it should be appreciated that the data length is not fixed, and different data lengths may be adopted according to the range of the weight data in different situations.
Step S2: the neural network operation is performed on the neuron data and the power weight data according to the operation instruction. Wherein the step S2 includes the following substeps:
S21, the decoding module reads the instruction from the instruction cache module and decodes it into operation instructions;
S22, the operation unit receives the operation instructions, the power weight data and the neuron data sent by the decoding module, the input neuron cache module and the weight cache module respectively, and performs the neural network operation on the neuron data and the power-represented weight data according to the operation instructions.
The multiplication of a neuron by a power weight proceeds as follows. The sign bit of the neuron data and the sign bit of the power weight data are XORed. If the correspondence of the encoding table is out of order, the encoding table is searched to find the exponent value corresponding to the power bits of the power weight data; if the correspondence is a positive correlation, the minimum exponent value of the encoding table is recorded and an addition is performed to find the exponent value; if the correspondence is a negative correlation, the maximum value of the encoding table is recorded and a subtraction is performed to find the exponent value. The exponent value is then added to the power bits (exponent field) of the neuron data, while the valid bits (significand) of the neuron data remain unchanged.
As shown in fig. 3.6, the neuron data is 16-bit floating point data with sign bit 0, power bits 10101, and valid bits 0110100000, representing the actual value 1.40625 * 2^6. The sign bit of the power weight data is 1 bit and the power bit data is 5 bits, i.e., m is 5. The encoding table specifies that the power weight data is 0 when the power bit data is 11111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. The power weight 000110 therefore represents the actual value 64, i.e., 2^6. Adding the power bits of the power weight to the power bits of the neuron gives 11011, so the actual value of the result is 1.40625 * 2^12, i.e., the product of the neuron and the power weight. This operation turns the multiplication into an addition, reducing the amount of computation required.
In another example, as shown in fig. 3.7, the neuron data is 32-bit floating point data with sign bit 1, power bits 10000011, and valid bits 10010010000000000000000, so the actual value represented by the neuron data is -1.5703125 * 2^4. The sign bit of the power weight data is 1 bit and the power bit data is 5 bits, i.e., m is 5. The encoding table specifies that the power weight data is 0 when the power bit data is 11111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. The power weight 111100 represents the actual value -2^(-4). Adding the power bits of the neuron to the power bits of the power weight gives 01111111, so the actual value of the result is 1.5703125 * 2^0, i.e., the product of the neuron and the power weight.
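The two worked examples follow the same bit-level recipe: XOR the signs and add the weight's exponent into the neuron's exponent field, leaving the significand untouched. A sketch for the 16-bit case (assuming IEEE-754 half precision, normal numbers only, and no overflow handling):

```python
import struct

def multiply_fp16_by_power(neuron_bits, weight_sign, weight_exponent):
    """Multiply an IEEE-754 half-precision value (given as its 16-bit pattern)
    by a power weight (-1)^weight_sign * 2^weight_exponent."""
    sign = (neuron_bits >> 15) & 0x1
    exponent = (neuron_bits >> 10) & 0x1F
    mantissa = neuron_bits & 0x3FF
    sign ^= weight_sign              # XOR of the sign bits
    exponent += weight_exponent      # multiplication becomes an addition
    return (sign << 15) | (exponent << 10) | mantissa  # mantissa unchanged

# fig. 3.6: neuron 0|10101|0110100000 (= 1.40625 * 2^6 = 90.0) times weight 2^6
neuron = (0b10101 << 10) | 0b0110100000
product = multiply_fp16_by_power(neuron, 0, 6)   # exponent field becomes 11011
print(struct.unpack('<e', product.to_bytes(2, 'little'))[0])   # 5760.0
```

The decoded result, 5760.0 = 1.40625 * 2^12, agrees with the worked example of fig. 3.6.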
Optionally, the method further includes step S3, outputting the neuron data after the neural network operation and using the neuron data as input data of the next layer neural network operation.
Wherein the step S3 may include the following sub-steps:
S31, the output neuron buffer unit receives the neuron data obtained after the neural network operation sent by the operation unit.
S32, the neuron data received by the output neuron buffer unit is transmitted to the data control module; the neuron data obtained by the output neuron buffer unit can serve as the input neurons of the next layer of the neural network operation, and steps S1 to S3 are repeated until the operation of the last layer of the neural network is finished.
In addition, the power neuron data obtained by the power conversion unit can serve as the input power neurons of the next layer of the neural network operation, and steps S1 to S3 are repeated until the operation of the last layer of the neural network is finished. The range of power neuron data that the neural network operation device can represent can be adjusted by changing the integer value x and the positive integer value y prestored in the storage unit.
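The per-layer repetition of steps S1 to S3, with power conversion of each layer's output, can be sketched as a simple loop (hypothetical stand-ins: `layer_op` for the operation unit, `power_convert` for power conversion unit 110):

```python
import math

def run_network(layer_ops, neurons, power_convert):
    """Repeat steps S1-S3: each layer's output, power-converted, becomes the
    input neurons of the next layer, until the last layer is finished."""
    for layer_op in layer_ops:
        neurons = power_convert(layer_op(neurons))
    return neurons

# toy usage: two layers that double each value; conversion rounds to powers of two
round_to_power = lambda xs: [2 ** round(math.log2(abs(x))) for x in xs]
out = run_network([lambda xs: [2 * x for x in xs]] * 2, [1.0, 3.0], round_to_power)
print(out)   # [4, 16]
```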
In addition, the specific operation method of the power conversion is the same as that of the previous embodiment, and is not described herein again.
Fourth embodiment
In addition, another neural network operation method is provided in the embodiments of the present disclosure, and fig. 4 is a flowchart of the neural network operation method in the embodiments.
Specifically, the neural network of the embodiment of the present disclosure is a multilayer neural network, and the operation can be performed on each layer of the neural network according to the operation method shown in fig. 4, wherein input power weight data of a first layer of the neural network can be read in from an external address through a storage unit, and if the read data of the external address is power weight data, the read data is directly transmitted to the storage unit, or else, the read data is converted into power weight data through a power conversion unit; the input power neuron data of the first layer of the neural network can be read from an external address through the storage unit, if the data read by the external address is power data, the data are directly transmitted into the storage unit, otherwise, the data are converted into the power neuron data through the power conversion unit, and the input neuron data of each layer of the neural network can be provided by the output power neuron data of one or more layers of the neural network before the layer. Referring to fig. 4, the single-layer neural network operation method of the present embodiment includes:
Step S4: an instruction, power neuron data and power weight data are acquired.
Wherein the step S4 includes the following substeps:
S41, inputting the instruction, the neuron data and the weight data into the storage unit; wherein the first power conversion unit converts the non-power neuron data and the non-power weight data into power neuron data and power weight data, which are then input into the storage unit;
S42, the data control module receives the instruction, the power neuron data and the power weight data sent by the storage unit;
S43, the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the power neuron data and the power weight data sent by the data control module and distribute them to the decoding module or the operation unit.
The power neuron data and the power weight data represent the values of the neuron data and the weight data in the form of power exponent values. Specifically, both comprise a sign bit and power bits: the sign bit represents the sign of the neuron data or the weight data with one or more bits, and the power bits represent the power bit data of the neuron data or the weight data with m bits, m being a positive integer greater than 1. An encoding table is prestored in the storage unit and provides the exponent value corresponding to each power bit data of the power neuron data and the power weight data. The encoding table reserves one or more power bit data (i.e., zero power bit data) to specify that the corresponding power neuron data or power weight data is 0. That is, when the power bit data of the power neuron data or the power weight data matches a zero power bit data in the encoding table, the power neuron data or the power weight data is 0.
The correspondence relationship of the encoding table may be arbitrary.
For example, the correspondence of the encoding table may be out of order. As shown in fig. 4.1, for part of an encoding table with m = 5: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 3; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 1; and when the power bit data is 00100, the power neuron data and the power weight data are 0.
The correspondence of the encoding table may also be a positive correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the minimum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power neuron data and power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the minimum power bit data corresponds to exponent value x, the maximum power bit data corresponds to power neuron data and power weight data 0, and every other power bit data corresponds to the exponent value (power bit data + x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals -2^(m-1). The exponent range of the values represented by the power neuron data and the power weight data is -2^(m-1) to 2^(m-1) - 1.
In one embodiment, as shown in fig. 4.2, for part of an encoding table with m = 5, x = 0, and y = 1: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 1; 00010 corresponds to exponent value 2; 00011 corresponds to exponent value 3; and when the power bit data is 11111, the power neuron data and the power weight data are 0. As shown in fig. 4.3, for another encoding table with m = 5, x = 0, and y = 2: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 2; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 6; and when the power bit data is 11111, the power neuron data and the power weight data are 0.
The correspondence of the encoding table may instead be a negative correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the maximum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power neuron data and power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the maximum power bit data corresponds to exponent value x, the minimum power bit data corresponds to power neuron data and power weight data 0, and every other power bit data corresponds to the exponent value (power bit data - x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals 2^(m-1). The exponent range of the values represented by the power neuron data and the power weight data is -2^(m-1) - 1 to 2^(m-1).
As shown in fig. 4.4, for part of an encoding table with m = 5: power bit data 11111 corresponds to exponent value 0; 11110 corresponds to exponent value 1; 11101 corresponds to exponent value 2; 11100 corresponds to exponent value 3; and when the power bit data is 00000, the power neuron data and the power weight data are 0.
The correspondence of the encoding table may also use the highest bit of the power bit data as a zero flag, with the remaining m-1 bits giving the exponent value. When the highest bit is 0, the corresponding power neuron data and power weight data are 0; when the highest bit is 1, the corresponding power neuron data and power weight data are not 0. The convention may also be reversed: when the highest bit is 1, the power neuron data and the power weight data are 0, and when it is 0, they are not 0. In other words, one bit of the power bits of the power neuron data and the power weight data is set aside to indicate whether the data is 0.
In one embodiment, as shown in fig. 4.5, the sign bit is 1 bit and the power bit data is 7 bits, i.e., m is 7. The encoding table specifies that the power neuron data and the power weight data are 0 when the power bit data is 1111111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. When the sign bit of the power neuron data or the power weight data is 0 and the power bits are 0001001, the represented value is 2^9, i.e., 512; when the sign bit is 1 and the power bits are 1111101, the represented value is -2^(-3), i.e., -0.125. Compared with floating point data, power data retain only the power bits, greatly reducing the storage space required.
This power data representation reduces the storage space required for the neuron data and the weight data. In the example provided in this embodiment, the power data is 8-bit data; it should be appreciated that the data length is not fixed, and different data lengths may be adopted according to the range of the neuron data and the weight data in different situations.
Step S5: the neural network operation is performed on the power neuron data and the power weight data according to the operation instruction. Wherein the step S5 includes the following substeps:
S51, the decoding module reads the instruction from the instruction cache module and decodes it into operation instructions;
S52, the operation unit receives the operation instructions, the power neuron data and the power weight data sent by the decoding module, the input neuron cache module and the weight cache module respectively, and performs the neural network operation on the power neuron data and the power weight data according to the operation instructions.
The multiplication of a power neuron by a power weight proceeds as follows. The sign bit of the power neuron data and the sign bit of the power weight data are XORed. If the correspondence of the encoding table is out of order, the encoding table is searched to find the exponent values corresponding to the power bits of the power neuron data and the power weight data; if the correspondence is a positive correlation, the minimum exponent value of the encoding table is recorded and additions are performed to find the exponent values; if the correspondence is a negative correlation, the maximum value of the encoding table is recorded and subtractions are performed to find the exponent values. The exponent value corresponding to the power neuron data and the exponent value corresponding to the power weight data are then added.
In a specific example, as shown in fig. 4.6, the sign bits of the power neuron data and the power weight data are 1 bit, and the power bit data is 4 bits, i.e., m is 4. The encoding table specifies that the power weight data is 0 when the power bit data is 1111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. The power neuron data 00010 represents the actual value 2^2. The power weight 00110 represents the actual value 64, i.e., 2^6. The product of the power neuron data and the power weight data is 01000, representing the actual value 2^8.
It can be seen that the multiplication of the power neuron data and the power weight is simpler and more convenient than the multiplication of floating point data and power data.
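Since both operands carry only a sign and an exponent, the whole multiplication of this embodiment collapses to one XOR and one addition. A minimal sketch (exponents given as plain integers, i.e., after decoding through the encoding table):

```python
def multiply_power(a_sign, a_exp, b_sign, b_exp):
    """Product of (-1)^a_sign * 2^a_exp and (-1)^b_sign * 2^b_exp:
    XOR the sign bits, add the exponent values."""
    return a_sign ^ b_sign, a_exp + b_exp

# fig. 4.6: 2^2 * 2^6 -> sign 0, exponent 8, i.e. 256
print(multiply_power(0, 2, 0, 6))   # (0, 8)
```

No multiplier circuit is needed at all, which is the source of the hardware savings claimed for this embodiment.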
The method of this embodiment may further include step S6, outputting the neuron data after the neural network operation as input data of the next layer neural network operation.
Wherein the step S6 includes the following substeps:
S61, the output neuron buffer unit receives the neuron data obtained after the neural network operation sent by the operation unit.
S62, the neuron data received by the output neuron buffer unit is transmitted to the data control module; the neuron data obtained by the output neuron buffer unit can serve as the input neurons of the next layer of the neural network operation, and steps S4 to S6 are repeated until the operation of the last layer of the neural network is finished.
Because the neuron data obtained after the neural network operation is itself power data, the bandwidth required to transmit it to the data control module is greatly reduced compared with floating point data; the overhead on the storage and computation resources of the neural network is therefore further reduced, and the operation speed of the neural network is improved.
In addition, the specific operation method of the power conversion is the same as that of the previous embodiment, and is not described herein again.
All of the modules of the disclosed embodiments may be hardware structures; physical implementations of the hardware structures include, but are not limited to, physical devices such as transistors, memristors, and DNA computers.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It should be noted that implementations not shown or described in the drawings or the specification take forms known to those of ordinary skill in the art and are not described in detail. Furthermore, the above definitions of the various elements and methods are not limited to the particular structures, shapes, or arrangements mentioned in the embodiments, which may be readily modified or substituted by one of ordinary skill in the art; for example:
the control unit of the present disclosure is not limited to the specific composition structure of the embodiment, and the control unit capable of implementing data and instruction interaction between the storage unit and the operation unit, which is well known to those skilled in the art, can be used to implement the present disclosure.
The above-mentioned embodiments are intended to illustrate the objects, aspects and advantages of the present disclosure in further detail, and it should be understood that the above-mentioned embodiments are only illustrative of the present disclosure and are not intended to limit the present disclosure, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (27)

1. A neural network operation device, comprising:
the first power conversion unit is used for converting non-power data in the input data of the neural network into power data;
the operation unit is used for receiving the data and the operation instruction of the neural network operation and executing the neural network operation on the received neuron data and the weight data according to the operation instruction; wherein the data received by the arithmetic unit comprises power data converted by a first power conversion unit;
the power data comprise power neuron data and power weight data; wherein,
the value of the neuron data representing neuron data is represented in the form of a power index value, wherein the neuron data comprises a sign bit and a power bit, the sign bit represents the sign of the neuron data by adopting one bit or a plurality of bits, the power bit represents the power bit data of the neuron data by adopting m bits, and m is a positive integer greater than 1;
the power weight data represent the value of the weight data in a power index value form, wherein the power weight data comprise a sign bit and a power bit, the sign bit represents the sign of the weight data in one or more bits, the power bit represents the power bit data of the weight data in m bits, and m is a positive integer greater than 1;
the storage unit of the neural network arithmetic device is prestored with a coding table which is used for providing exponent data corresponding to each exponent data of the exponent neuron data and the exponent weight data.
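Claim 1's power-data layout (one sign bit followed by m power bits, with a coding table mapping the power bits to an exponent) can be sketched as a bit-packing routine. This is an illustrative software model only; the function names and the choice of a single sign bit are assumptions, not part of the claimed hardware.

```python
def pack_power_data(sign: int, power_bits: int, m: int) -> int:
    """Pack power data as one sign bit followed by m power bits (claim 1).

    The m-bit power field holds the power-bit data, which a prestored
    coding table maps to the actual exponent value.
    """
    assert sign in (0, 1) and 0 <= power_bits < 2 ** m
    return (sign << m) | power_bits


def unpack_power_data(word: int, m: int) -> tuple:
    """Split a packed word back into (sign bit, power-bit data)."""
    return (word >> m) & 1, word & (2 ** m - 1)
```

With m = 4, for example, a value is stored in 5 bits instead of a full-width floating-point word, which is the storage saving that the power representation targets.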
2. The neural network operation device according to claim 1, further comprising: a storage unit to store data and instructions; wherein the storage unit is connected with the first power conversion unit to receive the power data.
3. The neural network operation device according to claim 2, further comprising: a control unit and an output neuron cache unit; wherein:
the control unit is connected with the storage unit, is used for controlling the interaction of data and instructions, receives the data and the instructions sent by the storage unit, and decodes the instructions into operation instructions;
the operation unit is connected with the control unit, receives the data and the operation instruction sent by the control unit, and executes neural network operation on the received neuron data and the weight data according to the operation instruction; and
the output neuron cache unit is connected with the operation unit, and is used for receiving the neuron data output by the operation unit and sending the neuron data to the control unit as input data of the next-layer neural network operation.
4. The neural network operation device according to claim 3, wherein the control unit includes:
the data control module is connected with the storage unit and used for realizing data and instruction interaction between the storage unit and each cache module;
the instruction cache module is connected with the data control module and used for receiving the instruction sent by the data control module;
the decoding module is connected with the instruction cache module and used for reading the instruction from the instruction cache module and decoding the instruction into an operation instruction;
the input neuron cache module is connected with the data control module and is used for acquiring corresponding input neuron data from the data control module;
the weight cache module is connected with the data control module and is used for acquiring corresponding weight data from the data control module; wherein:
the operation unit is respectively connected with the decoding module, the input neuron cache module and the weight cache module, receives each operation instruction, neuron data and weight data, and executes corresponding neural network operation on the received neuron data and weight data according to the operation instruction.
5. The neural network operation device according to claim 4, wherein the first power conversion unit is configured to convert non-power weight data in the input data of the neural network into power weight data.
6. The neural network operation device according to claim 4, further comprising: a second power conversion unit; wherein:
the first power conversion unit is used for converting non-power neuron data and non-power weight data in the neural network input data into power neuron data and power weight data respectively and sending the power neuron data and the power weight data to the storage unit;
the second power conversion unit is connected with the output neuron cache unit and used for converting neuron data received by the second power conversion unit into power neuron data and sending the power neuron data to the control unit as input data of next-layer neural network operation.
7. The neural network operation device according to claim 6, wherein, if the input data of the neural network are power data, they are stored directly in the neural network operation device.
8. The neural network operation device according to claim 1, wherein the coding table designates one or more power-bit data as zero-setting power-bit data, for which the corresponding power neuron data and power weight data are 0.
9. The neural network operation device according to claim 1, wherein the correspondence of the coding table is an out-of-order relationship, a positive correlation, or a negative correlation.
10. The neural network operation device according to claim 9, wherein the maximum power-bit data corresponds to power neuron data and power weight data of 0, or the minimum power-bit data corresponds to power neuron data and power weight data of 0.
11. The neural network operation device according to claim 1, wherein the correspondence of the coding table is that the highest bit of the power-bit data indicates a zero flag, and the other m-1 bits of the power-bit data correspond to the exponent value.
12. The neural network operation device according to claim 1, wherein the correspondence of the coding table is a positive correlation, the storage unit prestores an integer value x and a positive integer value y, the minimum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x denotes an offset value and y denotes a step size.
13. The neural network operation device according to claim 1, wherein the minimum power-bit data corresponds to the exponent value x, the maximum power-bit data corresponds to power neuron data and power weight data of 0, and power-bit data other than the minimum and the maximum correspond to the exponent value (power-bit data + x) × y.
14. The neural network operation device according to claim 13, wherein y is 1 and x equals -2^(m-1).
15. The neural network operation device according to claim 1, wherein the correspondence of the coding table is a negative correlation, the storage unit prestores an integer value x and a positive integer value y, the maximum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x denotes an offset value and y denotes a step size.
16. The neural network operation device according to claim 1, wherein the maximum power-bit data corresponds to the exponent value x, the minimum power-bit data corresponds to power neuron data and power weight data of 0, and power-bit data other than the minimum and the maximum correspond to the exponent value (power-bit data - x) × y.
17. The neural network operation device according to claim 16, wherein y is 1 and x equals 2^(m-1).
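Claims 12 to 14 can be read as a small decoding rule: for a positive-correlation coding table, exponent = (power-bit data + x) × y, with one code reserved for the value 0. A minimal sketch follows, assuming y = 1 and x = -2^(m-1) as in claim 14 and reserving the maximum code for zero per claim 13; the function name is hypothetical.

```python
def decode_exponent(power_bits: int, m: int, x=None, y: int = 1):
    """Map m-bit power-bit data to an exponent (positive correlation).

    Per claim 13, exponent = (power-bit data + x) * y; per claim 14,
    y = 1 and x = -2**(m-1). The maximum power-bit data is reserved to
    represent the value 0 (returned here as None).
    """
    if x is None:
        x = -2 ** (m - 1)          # offset value from claim 14
    zero_code = 2 ** m - 1         # claim 13: maximum code encodes 0
    if power_bits == zero_code:
        return None
    return (power_bits + x) * y    # claim 13: (power-bit data + x) * y
```

With m = 4 this maps codes 0..14 to exponents -8..6 and reserves code 15 for zero; changing x and y shifts and scales the representable exponent range, which is the adjustment described in claim 27.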
18. The neural network operation device according to claim 17, wherein converting the non-power neuron data and the non-power weight data into the power neuron data and the power weight data comprises:
s_out = s_in,
d_out+ = 2^⌊log2(d_in+)⌋,
wherein d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data with d_in+ = d_in × s_in, d_out+ is the positive part of the output data with d_out+ = d_out × s_out, and ⌊x⌋ denotes the floor (round-down) operation on the data x; or,
s_out = s_in,
d_out+ = 2^⌈log2(d_in+)⌉,
wherein d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data with d_in+ = d_in × s_in, d_out+ is the positive part of the output data with d_out+ = d_out × s_out, and ⌈x⌉ denotes the ceiling (round-up) operation on the data x; or,
s_out = s_in,
d_out+ = 2^[log2(d_in+)],
wherein d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data with d_in+ = d_in × s_in, d_out+ is the positive part of the output data with d_out+ = d_out × s_out, and [x] denotes the rounding operation on the data x.
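The three conversion rules of claim 18 (floor, ceiling, or rounding of log2 of the positive part of the input) can be sketched in software as follows. This is an illustrative model of the arithmetic, not the claimed conversion unit; the zero handling and function names are assumptions.

```python
import math


def to_power_data(d_in: float, mode: str = "floor"):
    """Convert non-power data d_in to power data (sign, exponent).

    The represented value is sign * 2**exponent, where the exponent is
    the floor, ceiling, or rounding of log2(d_in+), with the positive
    part d_in+ = d_in * s_in (claim 18).
    """
    s_in = 1 if d_in >= 0 else -1
    d_in_pos = d_in * s_in                  # d_in+ = d_in * s_in
    if d_in_pos == 0:
        raise ValueError("zero uses a reserved coding-table entry")
    log2 = math.log2(d_in_pos)
    if mode == "floor":
        exponent = math.floor(log2)         # d_out+ = 2**floor(log2(d_in+))
    elif mode == "ceil":
        exponent = math.ceil(log2)          # d_out+ = 2**ceil(log2(d_in+))
    else:
        exponent = round(log2)              # d_out+ = 2**round(log2(d_in+))
    return s_in, exponent


def from_power_data(sign: int, exponent: int) -> float:
    """Recover the represented value d_out = s_out * d_out+."""
    return sign * 2.0 ** exponent
```

For d_in = 10, flooring gives 2^3 = 8 and ceiling gives 2^4 = 16; the choice of rule trades the rounding direction against accuracy.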
19. A neural network operation method, comprising:
acquiring an operation instruction, neuron data and power weight data;
performing neural network operation on the neuron data and the power weight data according to the operation instruction;
the power weight data represent the value of the weight data in the form of a power exponent value, wherein the power weight data comprise a sign bit and power bits, the sign bit represents the sign of the weight data with one or more bits, the power bits represent the power-bit data of the weight data with m bits, and m is a positive integer greater than 1;
the method further comprises: prestoring a coding table for providing the exponent value corresponding to each power-bit data of the power weight data.
20. A neural network operation method, comprising:
acquiring an operation instruction, power neuron data and power weight data;
performing neural network operation on the power neuron data and the power weight data according to the operation instruction;
the power neuron data represent the value of the neuron data in the form of a power exponent value, wherein the power neuron data comprise a sign bit and power bits, the sign bit represents the sign of the neuron data with one or more bits, the power bits represent the power-bit data of the neuron data with m bits, and m is a positive integer greater than 1;
the method further comprises: prestoring a coding table for providing the exponent value corresponding to each power-bit data of the power neuron data.
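The claims do not spell out how the operation unit exploits the power format, but one well-known consequence of storing both neurons and weights as a sign and an exponent (an assumption about intent, not recited in the claims) is that multiplication reduces to a sign product and an exponent addition:

```python
def power_multiply(sign_a: int, exp_a: int, sign_b: int, exp_b: int):
    """Multiply two power-format operands (signs in {+1, -1}).

    Each operand represents sign * 2**exp, so the product's sign is the
    product of the signs and its exponent is the sum of the exponents;
    no hardware multiplier is needed, only an adder.
    """
    return sign_a * sign_b, exp_a + exp_b
```

For example, (+2^3) × (-2^2) = -2^5 is computed with a single integer addition.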
21. The neural network operation method of claim 20, further comprising: outputting neuron data after neural network operation and taking the neuron data as input data of the next layer of neural network operation; repeating the operation steps of the neural network until the last layer of operation of the neural network is finished.
22. The neural network operation method of claim 20, wherein obtaining the operation command, the weight data and the neuron data comprises:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module, and the weight cache module respectively receive the instruction, the neuron data, and the weight data sent by the data control module; if the weight data input to the storage unit are non-power weight data, they are converted into power weight data by the first power conversion unit and then input to the storage unit; if the weight data input to the storage unit are already power weight data, they are input to the storage unit directly;
the decoding module reads the instruction from the instruction cache module and decodes the instruction into each operation instruction.
23. The neural network operation method of claim 22, wherein the output neuron cache unit receives the neuron data obtained after the neural network operation from the operation unit, and sends the neuron data to the data control module as input data of the next-layer neural network operation.
24. The neural network operation method of claim 20, wherein obtaining the operation command, the weight data and the neuron data comprises:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module, and the weight cache module respectively receive the instruction, the neuron data, and the weight data sent by the data control module; if the neuron data and the weight data input to the storage unit are non-power neuron data and non-power weight data, they are converted into power neuron data and power weight data by the first power conversion unit and then input to the storage unit; if the neuron data and the weight data input to the storage unit are already power neuron data and power weight data, they are input to the storage unit directly;
the decoding module reads the instruction from the instruction cache module and decodes the instruction into each operation instruction.
25. The neural network operation method of claim 24, wherein the output neuron cache unit receives the neuron data obtained after the neural network operation from the operation unit; and the second power conversion unit receives the neuron data sent by the output neuron cache unit, converts the neuron data into power neuron data, and sends the power neuron data to the data control module as input data of the next-layer neural network operation.
26. The neural network operation method of claim 20, performing the neural network operation on the weight data and the neuron data according to the operation instruction, comprising:
the operation unit respectively receives the operation instruction, the neuron data and the weight data sent by the decoding module, the input neuron cache module and the weight cache module, and performs neural network operation on the neuron data and the weight data according to the operation instruction.
27. The neural network operation method according to any one of claims 22 to 26, wherein the range of the power neuron data and the power weight data that can be expressed by the neural network operation device is adjusted by changing an integer value x and a positive integer value y that are pre-stored in a storage unit, and the method includes:
when the correspondence of the coding table is a positive correlation, the storage unit prestores an integer value x and a positive integer value y, the minimum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x represents an offset value and y represents a step size; or,
when the correspondence of the coding table is a negative correlation, the storage unit prestores an integer value x and a positive integer value y, the maximum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x represents an offset value and y represents a step size.
CN201710312415.6A 2017-04-06 2017-05-05 Arithmetic device and method Active CN108805271B (en)

Priority Applications (16)

Application Number Priority Date Filing Date Title
CN201710312415.6A CN108805271B (en) 2017-05-05 2017-05-05 Arithmetic device and method
EP19199526.5A EP3633526A1 (en) 2017-04-06 2018-04-04 Computation device and method
CN201811423295.8A CN109409515B (en) 2017-04-06 2018-04-04 Arithmetic device and method
EP19199528.1A EP3624018B1 (en) 2017-04-06 2018-04-04 Neural network computation device and method
EP19199524.0A EP3627437B1 (en) 2017-04-06 2018-04-04 Data screening device and method
CN201811423421.XA CN109359736A (en) 2017-04-06 2018-04-04 Network processing unit and network operations method
CN201880001242.9A CN109219821B (en) 2017-04-06 2018-04-04 Arithmetic device and method
EP18780474.5A EP3579150B1 (en) 2017-04-06 2018-04-04 Operation apparatus and method for a neural network
CN201811413244.7A CN109344965A (en) 2017-04-06 2018-04-04 Arithmetic unit and method
EP19199521.6A EP3620992A1 (en) 2017-04-06 2018-04-04 Neural network processor and neural network computation method
PCT/CN2018/081929 WO2018184570A1 (en) 2017-04-06 2018-04-04 Operation apparatus and method
US16/283,711 US10896369B2 (en) 2017-04-06 2019-02-22 Power conversion in neural networks
US16/520,082 US11010338B2 (en) 2017-04-06 2019-07-23 Data screening device and method
US16/520,041 US11551067B2 (en) 2017-04-06 2019-07-23 Neural network processor and neural network computation method
US16/520,654 US11049002B2 (en) 2017-04-06 2019-07-24 Neural network computation device and method
US16/520,615 US10671913B2 (en) 2017-04-06 2019-07-24 Computation device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710312415.6A CN108805271B (en) 2017-05-05 2017-05-05 Arithmetic device and method

Publications (2)

Publication Number Publication Date
CN108805271A CN108805271A (en) 2018-11-13
CN108805271B true CN108805271B (en) 2021-03-26

Family

ID=64053718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710312415.6A Active CN108805271B (en) 2017-04-06 2017-05-05 Arithmetic device and method

Country Status (1)

Country Link
CN (1) CN108805271B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740754B (en) * 2018-12-29 2020-04-14 中科寒武纪科技股份有限公司 Neural network computing device, neural network computing method and related products
CN109978160B (en) * 2019-03-25 2021-03-02 中科寒武纪科技股份有限公司 Configuration device and method of artificial intelligence processor and related products

Citations (3)

Publication number Priority date Publication date Assignee Title
CN201726420U (en) * 2010-08-06 2011-01-26 北京国科环宇空间技术有限公司 Blind equalization device
US9435315B2 (en) * 2014-01-23 2016-09-06 Peter Andrés Kalnay Trimming right-angularly reorienting extending segmented ocean wave power extraction system
CN106066783A (en) * 2016-06-02 2016-11-02 华为技术有限公司 The neutral net forward direction arithmetic hardware structure quantified based on power weight

Non-Patent Citations (2)

Title
"Going Deeper with Embedded FPGA Platform for Convolutional Neural Network"; Jiantao Qiu et al.; Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays; 2016-02-23; 26-35 *
"Hybrid Neural Network Based on Multi-Core Systems and Its Application" (in Chinese); Yang Lei et al.; Intelligent Computer and Applications; 2011-11-30; Vol. 1, No. 4; 4-6 *


Similar Documents

Publication Publication Date Title
CN109284822B (en) Neural network operation device and method
US10726336B2 (en) Apparatus and method for compression coding for artificial neural network
CN108292222B (en) Hardware apparatus and method for data decompression
WO2019120114A1 (en) Data fixed point processing method, device, electronic apparatus and computer storage medium
WO2018193906A1 (en) Information processing method, information processing device and program
CN107340993B (en) Arithmetic device and method
CN107239826A (en) Computational methods and device in convolutional neural networks
CN107256422A (en) Data quantization methods and device
US7612694B1 (en) Efficient coding of small integer sets
KR20220097961A (en) Recurrent neural networks and systems for decoding encoded data
WO2020064093A1 (en) End-to-end learning in communication systems
CN108805271B (en) Arithmetic device and method
US20200293724A1 (en) Information conversion method and apparatus, storage medium, and electronic device
CN109389210B (en) Processing method and processing apparatus
CN111914987A (en) Data processing method and device based on neural network, equipment and readable medium
CN112955878B (en) Apparatus for implementing activation logic of neural network and method thereof
CN112101511A (en) Sparse convolutional neural network
CN114970827A (en) Arithmetic device and method
EP3912094A1 (en) Training in communication systems
CN111970007B (en) Decoding method, decoder, device and medium
CN110233627B (en) Hardware compression system and method based on running water
TW201915834A (en) Calculation device for and calculation method of performing convolution
CN114492778A (en) Operation method of neural network model, readable medium and electronic device
CN109416757B (en) Method, apparatus and computer-readable storage medium for processing numerical data
CN112364657A (en) Method, device, equipment and computer readable medium for generating text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant