CN108805271B - Arithmetic device and method - Google Patents

Arithmetic device and method

Info

Publication number
CN108805271B
Authority
CN
China
Prior art keywords
data
power
neuron
neural network
weight
Prior art date
Legal status
Active
Application number
CN201710312415.6A
Other languages
Chinese (zh)
Other versions
CN108805271A (en)
Inventor
Inventor not disclosed
Current Assignee
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date
Filing date
Publication date
Priority to CN201710312415.6A priority Critical patent/CN108805271B/en
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to CN201811413244.7A priority patent/CN109344965A/en
Priority to EP19199521.6A priority patent/EP3620992A1/en
Priority to CN201811423295.8A priority patent/CN109409515B/en
Priority to EP19199528.1A priority patent/EP3624018B1/en
Priority to EP19199524.0A priority patent/EP3627437B1/en
Priority to CN201811423421.XA priority patent/CN109359736A/en
Priority to CN201880001242.9A priority patent/CN109219821B/en
Priority to EP18780474.5A priority patent/EP3579150B1/en
Priority to PCT/CN2018/081929 priority patent/WO2018184570A1/en
Priority to EP19199526.5A priority patent/EP3633526A1/en
Publication of CN108805271A publication Critical patent/CN108805271A/en
Priority to US16/283,711 priority patent/US10896369B2/en
Priority to US16/520,082 priority patent/US11010338B2/en
Priority to US16/520,041 priority patent/US11551067B2/en
Priority to US16/520,654 priority patent/US11049002B2/en
Priority to US16/520,615 priority patent/US10671913B2/en
Application granted granted Critical
Publication of CN108805271B publication Critical patent/CN108805271B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/30003: Arrangements for executing specific machine instructions
    • G06F 9/30007: Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F 9/30025: Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/30003: Arrangements for executing specific machine instructions
    • G06F 9/30076: Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F 9/30083: Power or thermal control instructions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F 9/3802: Instruction prefetching

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides an arithmetic device including an operation unit configured to receive data and an operation instruction for a neural network operation, and to perform the neural network operation on the received neuron data and weight data according to the operation instruction. The present disclosure also provides an operation method. The arithmetic device and method reduce the overhead of storage resources and computation resources and improve operation speed.

Description

Arithmetic device and method
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular to a neural network operation device and method supporting power neuron representation.
Background
In recent years, multilayer neural networks have received wide attention from academia and industry due to their high recognition accuracy and high parallelism.
At present, the better-performing neural networks tend to be very large, which means they require large amounts of computing and storage resources. This heavy consumption of computing and storage resources slows down neural network operation and greatly raises the transmission-bandwidth requirements on the hardware and the operation unit.
Disclosure of Invention
Technical problem to be solved
In view of the above technical problems, the present disclosure provides a neural network operation device and method that support the representation of power neurons, reduce the storage-resource and computation-resource overhead of the neural network by representing neuron data as powers, and improve the operation speed of the neural network.
(II) Technical solution
According to an aspect of the present disclosure, there is provided a neural network operation device including:
the first power conversion unit is used for converting non-power data in the input data of the neural network into power data;
the operation unit is used for receiving data and instructions of the neural network operation and executing the neural network operation on the received neuron data and weight data according to the operation instructions; wherein the data received by the operation unit includes power data converted by the first power conversion unit.
Preferably, the device further comprises: a storage unit for storing data and instructions; wherein the storage unit is connected with the first power conversion unit to receive the power data.
Preferably, the device further comprises: a control unit and an output neuron cache unit; wherein
the control unit is connected with the storage unit, is used for controlling the interaction of data and instructions, receives the data and instructions sent by the storage unit, and decodes the instructions into operation instructions;
the operation unit is connected with the control unit, receives the data and operation instructions sent by the control unit, and executes the neural network operation on the received neuron data and weight data according to the operation instructions; and
the output neuron cache unit is connected with the operation unit and is used for receiving the neuron data output by the operation unit and sending it to the control unit as input data for the next layer of neural network operation.
Preferably, the control unit includes:
the data control module is connected with the storage unit and used for realizing data and instruction interaction between the storage unit and each cache module;
the instruction cache module is connected with the data control module and used for receiving the instruction sent by the data control module;
the decoding module is connected with the instruction cache module and used for reading the instruction from the instruction cache module and decoding the instruction into an operation instruction;
the input neuron cache module is connected with the data control module and is used for acquiring corresponding input neuron data from the data control module;
the weight cache module is connected with the data control module and is used for acquiring corresponding weight data from the data control module; wherein
the operation unit is respectively connected with the decoding module, the input neuron cache module and the weight cache module, receives each operation instruction, neuron data and weight data, and executes corresponding neural network operation on the received neuron data and weight data according to the operation instruction.
Preferably, the first power conversion unit is configured to convert non-power weight data in the neural network input data into power weight data.
Preferably, the device further comprises a second power conversion unit; wherein
the first power conversion unit is used for converting non-power neuron data and non-power weight data in the neural network input data into power neuron data and power weight data respectively and sending the power neuron data and the power weight data to the storage unit;
the second power conversion unit is connected with the output neuron cache unit and used for converting neuron data received by the second power conversion unit into power neuron data and sending the power neuron data to the control unit as input data of next-layer neural network operation.
Preferably, if the neural network input data is already power data, it is stored directly in the storage unit.
Preferably, the power data includes power neuron data and power weight data; wherein
the power neuron data represents the value of neuron data in the form of a power exponent, wherein the power neuron data comprises a sign bit and power bits: the sign bit represents the sign of the neuron data using one or more bits, and the power bits represent the power-order data of the neuron data using m bits, m being a positive integer greater than 1;
the power weight data represents the value of weight data in the form of a power exponent, wherein the power weight data comprises a sign bit and power bits: the sign bit represents the sign of the weight data using one or more bits, and the power bits represent the power-order data of the weight data using m bits, m being a positive integer greater than 1.
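As an illustrative sketch only, the sign-bit-plus-m-power-bits representation described above could be encoded as follows. The concrete bit layout (a two's-complement m-bit power field) and the helper names `encode_power`/`decode_power` are assumptions for illustration, not a layout specified by this disclosure:

```python
# Illustrative sketch (not the patented bit layout): encode a value as
# 1 sign bit plus an m-bit power field holding floor(log2(|value|)).
import math

def encode_power(value: float, m: int = 4) -> tuple:
    """Return (sign_bit, power_bits) for a nonzero value."""
    sign = 0 if value >= 0 else 1
    exponent = math.floor(math.log2(abs(value)))
    # clamp the exponent into the m-bit two's-complement range
    lo, hi = -(1 << (m - 1)), (1 << (m - 1)) - 1
    exponent = max(lo, min(hi, exponent))
    power_bits = exponent & ((1 << m) - 1)  # m-bit two's-complement pattern
    return sign, power_bits

def decode_power(sign: int, power_bits: int, m: int = 4) -> float:
    """Recover the represented value (-1)^sign * 2^exponent."""
    exponent = power_bits - (1 << m) if power_bits >= (1 << (m - 1)) else power_bits
    return (-1.0 if sign else 1.0) * (2.0 ** exponent)

s, p = encode_power(-6.0)            # |value| = 6 -> exponent floor(log2 6) = 2
print(s, p, decode_power(s, p))      # 1 2 -4.0
```

With m = 4 this hypothetical layout covers signed powers from 2^-8 through 2^7, which shows why a few power bits can replace a full-width weight.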
Preferably, a coding table is pre-stored in the storage unit to provide the exponent value corresponding to each power-order datum of the power neuron data and the power weight data.
Preferably, the coding table sets one or more power-order data as zero-setting power-order data, for which the corresponding power neuron data and power weight data are 0.
Preferably, the correspondence of the coding table may be an arbitrary (unordered) mapping, a positive correlation, or a negative correlation.
Preferably, the maximum power level data corresponds to power neuron data and power weight data of 0, or the minimum power level data corresponds to power neuron data and power weight data of 0.
Preferably, the correspondence of the coding table is that the highest bit of the power-order data indicates a zero flag, and the other m-1 bits correspond to the exponent value.
Preferably, the correspondence of the coding table is a positive correlation: the storage unit pre-stores an integer value x and a positive integer value y, the minimum power-order data corresponds to the exponent value x, and any one or more other power-order data correspond to power neuron data and power weight data equal to 0; here x denotes an offset value and y denotes a step size.
Preferably, the minimum power-order data corresponds to the exponent value x, the maximum power-order data corresponds to power neuron data and power weight data equal to 0, and each power-order datum other than the minimum and maximum corresponds to the exponent value (power-order data + x) × y.
Preferably, y is 1 and x equals -2^(m-1).
Preferably, the correspondence of the coding table is a negative correlation: the storage unit pre-stores an integer value x and a positive integer value y, the maximum power-order data corresponds to the exponent value x, and any one or more other power-order data correspond to power neuron data and power weight data equal to 0; here x denotes an offset value and y denotes a step size.
Preferably, the maximum power-order data corresponds to the exponent value x, the minimum power-order data corresponds to power neuron data and power weight data equal to 0, and each power-order datum other than the minimum and maximum corresponds to the exponent value (power-order data - x) × y.
Preferably, y is 1 and x equals 2^(m-1).
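A minimal sketch of such a coding-table lookup follows. The function name, the default offset x = -2^(m-1), and the choice of reserving the maximum power-order pattern for zero are illustrative assumptions consistent with the positive-correlation variant described above, not details fixed by this disclosure:

```python
# Hypothetical coding-table lookup for the positive-correlation scheme:
# exponent = (power_order + x) * y, with one reserved power-order pattern
# representing power data equal to 0.

def exponent_from_power_order(power_order, m=4, x=None, y=1, zero_pattern=None):
    """Return the exponent for an m-bit power-order pattern, or None for zero data."""
    if x is None:
        x = -(1 << (m - 1))          # assumed default offset -2^(m-1)
    if zero_pattern is None:
        zero_pattern = (1 << m) - 1  # assume the maximum pattern is reserved for 0
    if power_order == zero_pattern:
        return None                  # this pattern represents power data equal to 0
    return (power_order + x) * y

# With m = 4: pattern 0 -> exponent -8, pattern 14 -> exponent 6, pattern 15 -> zero
print(exponent_from_power_order(0), exponent_from_power_order(14),
      exponent_from_power_order(15))
```

Changing x and y here shifts and stretches the representable exponent range, which is exactly the adjustment mechanism the later "method for using" aspect describes.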
Preferably, converting the non-power neuron data and the non-power weight data into power neuron data and power weight data uses one of the following rules:

s_out = s_in
d_out+ = 2^⌊log2(d_in+)⌋

where d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data, d_in+ = d_in × s_in, d_out+ is the positive part of the output data, d_out+ = d_out × s_out, and ⌊x⌋ denotes rounding the data x down (floor); or,

s_out = s_in
d_out+ = 2^⌈log2(d_in+)⌉

where d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data, d_in+ = d_in × s_in, d_out+ is the positive part of the output data, d_out+ = d_out × s_out, and ⌈x⌉ denotes rounding the data x up (ceiling); or,

s_out = s_in
d_out+ = 2^[log2(d_in+)]

where d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data, d_in+ = d_in × s_in, d_out+ is the positive part of the output data, d_out+ = d_out × s_out, and [x] denotes rounding the data x to the nearest integer.
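The three conversion rules (round down, round up, round to nearest) can be sketched as follows. The helper name `to_power` and the handling of a zero input are illustrative assumptions; the approximation of the input by a signed power of two follows the formulas above:

```python
# Sketch of the three power-conversion rules: approximate d_in by
# s_in * 2^e, where e is floor, ceiling, or nearest-integer of log2(d_in+).
import math

def to_power(d_in: float, mode: str = "floor") -> float:
    """Convert d_in to a signed power of two under the chosen rounding rule."""
    if d_in == 0:
        return 0.0                        # zero handling is an added assumption
    s_in = 1.0 if d_in > 0 else -1.0      # sign of the input (s_out = s_in)
    d_pos = d_in * s_in                   # positive part d_in+
    e = math.log2(d_pos)
    if mode == "floor":                   # first conversion method
        e = math.floor(e)
    elif mode == "ceil":                  # second conversion method
        e = math.ceil(e)
    else:                                 # third method: round to nearest
        e = round(e)
    return s_in * (2.0 ** e)

print(to_power(10.0, "floor"), to_power(10.0, "ceil"), to_power(-10.0, "round"))
# 8.0 16.0 -8.0
```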
According to another aspect of the present disclosure, there is provided a neural network operation method, including:
acquiring an operation instruction, neuron data and power weight data; and
performing the neural network operation on the neuron data and the power weight data according to the operation instruction.
According to another aspect of the present disclosure, there is provided a neural network operation method, including:
acquiring an operation instruction, power neuron data and power weight data; and
performing the neural network operation on the power neuron data and the power weight data according to the operation instruction.
Preferably, the method further comprises: outputting the neuron data obtained after the neural network operation and using it as input data for the next layer of neural network operation; and repeating these operation steps until the last layer of the neural network is finished.
Preferably, the obtaining of the instruction, the weight data and the neuron data includes:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the neuron data and the weight data sent by the data control module. If the weight data input to the storage unit is non-power weight data, it is first converted into power weight data by the first power conversion unit and then input to the storage unit; if it is already power weight data, it is input to the storage unit directly.
Preferably, the output neuron cache unit receives the neuron data obtained after the neural network operation sent by the operation unit and sends it to the data control module as input data for the next layer of neural network operation.
Preferably, the obtaining of the instruction, the weight data and the neuron data includes:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the neuron data and the weight data sent by the data control module. If the neuron data and the weight data input to the storage unit are non-power neuron data and non-power weight data, they are first converted into power neuron data and power weight data by the first power conversion unit and then input to the storage unit; if they are already power neuron data and power weight data, they are input to the storage unit directly.
Preferably, the output neuron cache unit receives the neuron data obtained after the neural network operation sent by the operation unit; the second power conversion unit receives the neuron data sent by the output neuron cache unit, converts it into power neuron data, and sends the power neuron data to the data control module as input data for the next layer of neural network operation.
Preferably, the performing a neural network operation on the weight data and the neuron data according to the operation instruction includes:
the decoding module reads the instruction from the instruction cache module and decodes the instruction into each operation instruction;
the operation unit respectively receives the operation instruction, the neuron data and the weight data sent by the decoding module, the input neuron cache module and the weight cache module, and performs neural network operation on the neuron data and the weight data according to the operation instruction.
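To illustrate why this operation step is cheap in hardware, the following sketch (hypothetical helper name, integer neuron values assumed) shows that multiplying a neuron by a power weight needs only a bit shift and a sign flip rather than a full multiplier:

```python
# Illustrative: a (fixed-point) neuron times a power weight
# (-1)^weight_sign * 2^weight_exp reduces to a shift plus a sign flip.

def mul_neuron_power_weight(neuron: int, weight_sign: int, weight_exp: int) -> int:
    """Compute neuron * ((-1)^weight_sign * 2^weight_exp) using shifts only."""
    shifted = neuron << weight_exp if weight_exp >= 0 else neuron >> -weight_exp
    return -shifted if weight_sign else shifted

# 5 * (+2^3) = 40, and 64 * (-2^-2) = -16
print(mul_neuron_power_weight(5, 0, 3), mul_neuron_power_weight(64, 1, -2))
```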
Preferably, the range of power neuron data and power weight data that the neural network operation device can represent is adjusted by changing the integer value x and the positive integer value y pre-stored in the storage unit.
According to another aspect of the present disclosure, there is provided a method for using the neural network operation device, wherein the range of power neuron data and power weight data that the device can represent is adjusted by changing the integer value x and the positive integer value y pre-stored in the storage unit.
(III) Advantageous effects
It can be seen from the above technical solutions that the neural network operation device and method of the present disclosure have at least one of the following beneficial effects:
(1) Storing neuron data and weight data in the power data representation reduces the storage space required for network data; at the same time, this representation simplifies the multiplication of neuron and weight data, lowers the design requirements on the operation unit, and speeds up neural network operation.
(2) Converting the neuron data obtained after operation into power-represented neuron data reduces the storage-resource and computation-resource overhead of the neural network and improves its operation speed.
(3) Non-power neuron data and non-power weight data can be power-converted before being input into the neural network operation device, which further reduces the storage-resource and computation-resource overhead of the neural network and improves its operation speed.
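For the fully power-represented case behind benefits (2) and (3), multiplication reduces even further, to an addition of exponents and an XOR of sign bits. The sketch below is illustrative (hypothetical function name, nonzero operands assumed):

```python
# Sketch: the product of two power data, (-1)^sa * 2^ea and (-1)^sb * 2^eb,
# is (-1)^(sa XOR sb) * 2^(ea + eb) -- no multiplier hardware required.

def mul_power_power(sign_a: int, exp_a: int, sign_b: int, exp_b: int) -> tuple:
    """Return the (sign, exponent) pair of the product of two power data."""
    return sign_a ^ sign_b, exp_a + exp_b

s, e = mul_power_power(0, 3, 1, -1)   # (+2^3) * (-2^-1)
print(s, e)                            # 1 2, i.e. the value -2^2 = -4
```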
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale; emphasis is instead placed on illustrating the subject matter of the present disclosure.
Fig. 1 is a schematic structural diagram of a neural network computing device according to a first embodiment of the disclosure.
Fig. 2 is a schematic structural diagram of a neural network computing device according to a second embodiment of the disclosure.
Fig. 3 is a flowchart of a neural network operation method according to a third embodiment of the present disclosure.
Fig. 3.1 is a schematic diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.2 is another schematic diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.3 is another schematic diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.4 is another diagram of a coding table according to a third embodiment of the disclosure.
Fig. 3.5 is a schematic diagram of a representation method of power data according to a third embodiment of the disclosure.
Fig. 3.6 is a schematic diagram of the multiplication operation of neurons and power weights according to the third embodiment of the disclosure.
Fig. 3.7 is a schematic diagram of multiplication operation of neurons and power weights according to a third embodiment of the disclosure.
Fig. 4 is a flowchart of a neural network operation method according to a fourth embodiment of the disclosure.
Fig. 4.1 is a diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.2 is another schematic diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.3 is another diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.4 is another schematic diagram of a coding table according to a fourth embodiment of the disclosure.
Fig. 4.5 is a schematic diagram of a method for representing power data according to a fourth embodiment of the disclosure.
Fig. 4.6 is a schematic diagram of multiplication operation of power neurons and power weights according to a fourth embodiment of the disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It should be noted that in the drawings and description, the same reference numerals are used for similar or identical parts. Implementations not depicted or described in the drawings are of a form known to those of ordinary skill in the art. Additionally, while examples containing specific parameter values may be provided herein, the parameters need not exactly equal those values and may approximate them within acceptable tolerances or design constraints. Directional terms used in the embodiments, such as "upper", "lower", "front", "rear", "left" and "right", refer only to the orientation of the drawings; they are intended to describe rather than to limit.
First embodiment
The present disclosure provides a neural network operation device. Fig. 1 is a schematic diagram of a neural network computing device according to the present embodiment. Referring to fig. 1, the neural network operation device of the present embodiment includes:
a storage unit 1 for storing data and instructions;
the control unit is connected with the storage unit and used for controlling the interaction of data and instructions, receiving the data and the instructions sent by the storage unit and decoding the instructions into operation instructions;
the operation unit 7 is connected with the control unit, receives the data and the operation instruction sent by the control unit, and executes neural network operation on the received neuron data and the weight data according to the operation instruction;
the output neuron cache unit 8 is connected with the operation unit and is used for receiving the neuron data output by the operation unit and sending it to the control unit, so that it can serve as input data for the next layer of neural network operation; and
the power conversion unit 9 is connected with the storage unit and is used for converting non-power weight data in the neural network input data into power weight data and sending it to the storage unit; power weight data in the neural network input data is stored in the storage unit directly.
Specifically, the control unit includes:
the data control module 2 is connected with the storage unit and is used for data and instruction interaction between the storage unit and each cache module;
the instruction cache module 3 is connected with the data control module and used for receiving the instruction sent by the data control module;
the decoding module 4 is connected with the instruction cache module and used for reading the instructions from the instruction cache module and decoding the instructions into various operation instructions;
the input neuron cache module 5 is connected with the data control module and used for receiving neuron data sent by the data control module;
and the weight cache module 6 is connected with the data control module and is used for receiving the weight data sent from the data control module.
Further, the operation unit 7 is connected to the decoding module, the input neuron cache module and the weight cache module respectively, receives each operation instruction, neuron data and weight data, and executes the corresponding operation on the received neuron data and weight data according to the operation instructions. The output neuron cache unit 8 is connected with the operation unit, receives the neuron data output by the operation unit, and sends it to the data control module 2 of the control unit, where it serves as input data for the next layer of neural network operation.
The memory unit receives data and instructions from an external address space, wherein the data comprises neural network weight data, neural network input data and the like.
Further, there are many alternative ways to perform the power conversion operation. The three power conversion operations employed in this embodiment are listed below.
the first power conversion method:
sout=sin
Figure BDA0001287514900000091
wherein d isinInput data for power conversion unit, doutIs the output data of the power conversion unit, sinFor symbols of input data, soutTo output the symbols of the data, din+For positive part of the input data, din+=din×sin,dout+To output a positive part of the data, dout+=dout×sout
Figure BDA0001287514900000094
Indicating a round-down operation on data x.
The second power conversion method:
sout=sin
Figure BDA0001287514900000092
wherein d isinInput data for power conversion unit, doutIs the output data of the power conversion unit, sinFor symbols of input data, soutTo output the symbols of the data, din+For positive part of the input data, din+=din×sin,dout+To output a positive part of the data, dout+=dout×sout
Figure BDA0001287514900000095
Indicating that a rounding operation is performed on data x.
The third power conversion method:
sout=sin
Figure BDA0001287514900000093
wherein d isinInput data for power conversion unit, doutIs the output data of the power conversion unit; sinFor symbols of input data, soutIs the sign of the output data; din+For positive part of the input data, din+=din×sin,dout+To output a positive part of the data, dout+=dout×sout;[x]Indicating a rounding operation on data x.
Second embodiment
The present disclosure also provides another neural network operation device. Fig. 2 is a schematic diagram of a neural network computing device according to the present embodiment. Referring to fig. 2, the neural network operation device of the present embodiment includes:
a storage unit 101 for storing data and instructions; the memory unit receives data and instructions from an external address space, the data including neural network weight data, neural network input data, and the like.
The control unit is connected with the storage unit and used for controlling the interaction of data and instructions, receiving the data and the instructions sent by the storage unit and decoding the instructions into operation instructions;
an arithmetic unit 107, connected to the control unit, for receiving the data and the arithmetic instruction sent by the control unit, and performing neural network operation on the received weight data and neuron data according to the arithmetic instruction;
an output neuron buffer unit 108 connected to the arithmetic unit, for receiving neuron data output by the arithmetic unit and sending the neuron data to the control unit;
the power conversion unit 109 is connected with the storage unit and is used for converting the non-power neuron data and non-power weight data in the neural network input data into power neuron data and power weight data respectively and sending them to the storage unit; power neuron data and power weight data in the neural network input data are stored in the storage unit directly; and
and the power conversion unit 110 is connected to the output neuron buffer unit 108, and is configured to convert the neuron data after the neural network operation into power neuron data and send the power neuron data to the control unit.
Further, the control unit includes:
the data control module 102 is connected with the storage unit and is used for data and instruction interaction between the storage unit and each cache module;
the instruction cache module 103 is connected with the data control module and used for receiving the instruction sent by the data control module;
a decoding module 104 connected to the instruction cache module, and configured to read an instruction from the instruction cache module and decode the instruction into each operation instruction;
an input neuron cache module 105 connected to the data control module, for receiving neuron data sent by the data control module;
and the weight caching module 106 is connected with the data control module and is used for receiving the weight data sent from the data control module.
Specifically, the operation unit 107 is connected to the decoding module, the input neuron buffer module, and the weight buffer module, respectively, and is configured to receive each operation instruction, neuron data, and weight data, and execute corresponding operation on the received neuron data and weight data according to each operation instruction.
The power conversion unit 110 is connected to the data control module, and is configured to convert the neuron data after the neural network operation into power neuron data, and send the power neuron data to the data control module 102 of the control unit. The power neuron data obtained by the power conversion unit 110 can be used as input neurons of the next layer of the neural network operation.
In addition, the specific operation method of the power conversion is the same as that of the previous embodiment, and is not described herein again.
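The conversion itself amounts to rounding a value to a signed power of two. Below is a minimal Python sketch, under two stated assumptions: rounding to the nearest power of two, and an m-bit two's-complement power field with the all-ones pattern reserved for zero (the earlier embodiment defines the exact rule; this helper is illustrative only):

```python
import math

def to_power_representation(value, m=7):
    """Encode value as (sign_bit, power_bits) so that value ~ (-1)^sign * 2^exponent.

    Hypothetical sketch: |value| is rounded to the nearest power of two and the
    exponent is stored as an m-bit two's-complement field; the all-ones pattern
    is reserved to encode zero.
    """
    mask = (1 << m) - 1
    if value == 0:
        return 0, mask                       # reserved zero pattern, e.g. 1111111
    sign = 0 if value > 0 else 1
    exponent = round(math.log2(abs(value)))  # nearest power of two
    return sign, exponent & mask             # two's complement in m bits

# -0.125 = -2^-3 -> sign 1, power bits 1111101 (two's complement of -3 in 7 bits)
print(to_power_representation(-0.125))       # (1, 125)
```

This matches the 8-bit example of the embodiment: sign bit 1 with power bits 1111101 represents -2^(-3) = -0.125.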
Third embodiment
In addition, an embodiment of the present disclosure further provides a neural network operation method, and fig. 3 is a flowchart of the neural network operation method according to the embodiment. Specifically, the neural network of the embodiment of the present disclosure is a multilayer neural network, and can operate according to the operation method shown in fig. 3 for each layer of neural network, wherein input power weight data of a first layer of the neural network can be read in from an external address through a storage unit, and if the weight data read in from the external address is already power weight data, the weight data is directly transmitted to the storage unit, otherwise, the weight data is first converted into power weight data through a power conversion unit. Referring to fig. 3, the single-layer neural network operation method of the present embodiment includes:
Step S1: an instruction, neuron data and power weight data are acquired.
Wherein the step S1 includes the following substeps:
S11, inputting the instruction, the neuron data and the weight data into the storage unit; wherein the power weight data is directly input into the storage unit, and the non-power weight data is input into the storage unit after being converted by the power conversion unit;
S12, the data control module receives the instruction, the neuron data and the power weight data sent by the storage unit;
S13, the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the neuron data and the power weight data sent by the data control module and distribute them to the decoding module or the operation unit.
The power weight data represent the value of the weight data in the form of a power exponent value. Specifically, the power weight data comprise a sign bit and power bits: the sign bit represents the sign of the weight data with one or more bits, and the power bits represent the power bit data of the weight data with m bits, m being a positive integer greater than 1. An encoding table is prestored in the storage unit and provides the exponent value corresponding to each power bit data of the power weight data. The encoding table reserves one or more power bit data (i.e., zero power bit data) to specify that the corresponding power weight data is 0. That is, when the power bit data of the power weight data matches a zero power bit data in the encoding table, the power weight data is 0.
The correspondence relationship of the encoding table may be arbitrary.
For example, the correspondence of the encoding table may be out of order. As shown in fig. 3.1, for part of an encoding table with m = 5: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 3; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 1; and when the power bit data is 00100, the power weight data is 0.
The correspondence of the encoding table may also be a positive correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the minimum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the minimum power bit data corresponds to exponent value x, the maximum power bit data corresponds to power weight data 0, and every other power bit data corresponds to the exponent value (power bit data + x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals -2^(m-1). The exponent range of the values represented by this power weight data is -2^(m-1) to 2^(m-1) - 1.
In one embodiment, as shown in fig. 3.2, for part of an encoding table with m = 5, x = 0, and y = 1: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 1; 00010 corresponds to exponent value 2; 00011 corresponds to exponent value 3; and when the power bit data is 11111, the power weight data is 0. As shown in fig. 3.3, for another encoding table with m = 5, x = 0, and y = 2: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 2; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 6; and when the power bit data is 11111, the power weight data is 0.
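The positively correlated table fragments of figs. 3.2 and 3.3 can be reproduced with a small lookup helper (a sketch; the reserved zero pattern is assumed here to be the maximum power bit data, as in those figures):

```python
def decode_exponent(power_bits, x=0, y=1, m=5):
    """Exponent value under the positive-correlation scheme: (power_bits + x) * y.
    Returns None when power_bits is the reserved zero pattern (all ones here)."""
    if power_bits == (1 << m) - 1:   # e.g. 11111 for m = 5: encoded value is 0
        return None
    return (power_bits + x) * y

# fig. 3.2 (x = 0, y = 1): 00001 -> 1, 00010 -> 2
# fig. 3.3 (x = 0, y = 2): 00001 -> 2, 00010 -> 4, 00011 -> 6
print(decode_exponent(0b00011, x=0, y=2))   # 6
```

With x = -2^(m-1) and y = 1, the same helper yields the symmetric range -2^(m-1) to 2^(m-1) - 1 described above.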
The correspondence of the encoding table may instead be a negative correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the maximum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the maximum power bit data corresponds to exponent value x, the minimum power bit data corresponds to power weight data 0, and every other power bit data corresponds to the exponent value (power bit data - x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals 2^(m-1). The exponent range of the values represented by this power weight data is -2^(m-1) - 1 to 2^(m-1).
As shown in fig. 3.4, for part of an encoding table with m = 5: power bit data 11111 corresponds to exponent value 0; 11110 corresponds to exponent value 1; 11101 corresponds to exponent value 2; 11100 corresponds to exponent value 3; and when the power bit data is 00000, the power weight data is 0.
The correspondence of the encoding table may also use the highest bit of the power bit data as a zero flag, with the remaining m-1 bits giving the exponent value. When the highest bit is 0, the corresponding power weight data is 0; when the highest bit is 1, the corresponding power weight data is not 0. The convention may also be reversed: when the highest bit is 1, the power weight data is 0, and when it is 0, the power weight data is not 0. In other words, one bit of the power bits of the power weight data is set aside to indicate whether the power weight data is 0.
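Under this highest-bit convention (taking the variant in which a highest bit of 1 marks a nonzero value), decoding reduces to one test and one mask. A hypothetical sketch:

```python
def decode_msb_flag(power_bits, m=5):
    """Highest bit of the m-bit power field is a zero flag (1 = nonzero);
    the remaining m-1 bits carry the exponent value. Returns None for zero."""
    if (power_bits >> (m - 1)) & 1 == 0:
        return None                         # flagged as value 0
    return power_bits & ((1 << (m - 1)) - 1)

# m = 5: 10011 -> exponent 3; 00011 -> encoded value is 0
print(decode_msb_flag(0b10011))   # 3
```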
In one embodiment, as shown in fig. 3.5, the sign bit is 1 bit and the power bit data is 7 bits, i.e., m is 7. The encoding table specifies that the power weight data is 0 when the power bit data is 1111111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. When the sign bit of the power weight data is 0 and the power bits are 0001001, the represented value is 2^9, i.e., 512; when the sign bit is 1 and the power bits are 1111101, the represented value is -2^(-3), i.e., -0.125. Compared with floating point data, power data retain only the power bits, greatly reducing the storage space required.
This power data representation reduces the storage space required for the weight data. In the example provided in this embodiment, the power data is 8-bit data; it should be appreciated that the data length is not fixed, and different data lengths may be adopted according to the range of the weight data in different situations.
Step S2: the neural network operation is performed on the neuron data and the power weight data according to the operation instruction. Wherein the step S2 includes the following substeps:
S21, the decoding module reads the instruction from the instruction cache module and decodes it into operation instructions;
S22, the operation unit receives the operation instructions, the power weight data and the neuron data sent by the decoding module, the input neuron cache module and the weight cache module respectively, and performs the neural network operation on the neuron data and the power-represented weight data according to the operation instructions.
The multiplication of a neuron by a power weight proceeds as follows. The sign bit of the neuron data and the sign bit of the power weight data are XORed. If the correspondence of the encoding table is out of order, the encoding table is searched to find the exponent value corresponding to the power bits of the power weight data; if the correspondence is a positive correlation, the minimum exponent value of the encoding table is recorded and an addition is performed to find the exponent value; if the correspondence is a negative correlation, the maximum value of the encoding table is recorded and a subtraction is performed to find the exponent value. The exponent value is then added to the power bits (exponent field) of the neuron data, while the valid bits (significand) of the neuron data remain unchanged.
As shown in fig. 3.6, the neuron data is 16-bit floating point data with sign bit 0, power bits 10101, and valid bits 0110100000, representing the actual value 1.40625 * 2^6. The sign bit of the power weight data is 1 bit and the power bit data is 5 bits, i.e., m is 5. The encoding table specifies that the power weight data is 0 when the power bit data is 11111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. The power weight 000110 therefore represents the actual value 64, i.e., 2^6. Adding the power bits of the power weight to the power bits of the neuron gives 11011, so the actual value of the result is 1.40625 * 2^12, i.e., the product of the neuron and the power weight. This operation turns the multiplication into an addition, reducing the amount of computation required.
In another example, as shown in fig. 3.7, the neuron data is 32-bit floating point data with sign bit 1, power bits 10000011, and valid bits 10010010000000000000000, so the actual value represented by the neuron data is -1.5703125 * 2^4. The sign bit of the power weight data is 1 bit and the power bit data is 5 bits, i.e., m is 5. The encoding table specifies that the power weight data is 0 when the power bit data is 11111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. The power weight 111100 represents the actual value -2^(-4). Adding the power bits of the neuron to the power bits of the power weight gives 01111111, so the actual value of the result is 1.5703125 * 2^0, i.e., the product of the neuron and the power weight.
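The two worked examples follow the same bit-level recipe: XOR the signs and add the weight's exponent into the neuron's exponent field, leaving the significand untouched. A sketch for the 16-bit case (assuming IEEE-754 half precision, normal numbers only, and no overflow handling):

```python
import struct

def multiply_fp16_by_power(neuron_bits, weight_sign, weight_exponent):
    """Multiply an IEEE-754 half-precision value (given as its 16-bit pattern)
    by a power weight (-1)^weight_sign * 2^weight_exponent."""
    sign = (neuron_bits >> 15) & 0x1
    exponent = (neuron_bits >> 10) & 0x1F
    mantissa = neuron_bits & 0x3FF
    sign ^= weight_sign              # XOR of the sign bits
    exponent += weight_exponent      # multiplication becomes an addition
    return (sign << 15) | (exponent << 10) | mantissa  # mantissa unchanged

# fig. 3.6: neuron 0|10101|0110100000 (= 1.40625 * 2^6 = 90.0) times weight 2^6
neuron = (0b10101 << 10) | 0b0110100000
product = multiply_fp16_by_power(neuron, 0, 6)   # exponent field becomes 11011
print(struct.unpack('<e', product.to_bytes(2, 'little'))[0])   # 5760.0
```

The decoded result, 5760.0 = 1.40625 * 2^12, agrees with the worked example of fig. 3.6.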
Optionally, the method further includes step S3, outputting the neuron data after the neural network operation and using the neuron data as input data of the next layer neural network operation.
Wherein the step S3 may include the following sub-steps:
S31, the output neuron buffer unit receives the neuron data obtained after the neural network operation sent by the operation unit.
S32, the neuron data received by the output neuron buffer unit is transmitted to the data control module; the neuron data obtained by the output neuron buffer unit can serve as the input neurons of the next layer of the neural network operation, and steps S1 to S3 are repeated until the operation of the last layer of the neural network is finished.
In addition, the power neuron data obtained by the power conversion unit can serve as the input power neurons of the next layer of the neural network operation, and steps S1 to S3 are repeated until the operation of the last layer of the neural network is finished. The range of power neuron data that the neural network operation device can represent can be adjusted by changing the integer value x and the positive integer value y prestored in the storage unit.
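The per-layer repetition of steps S1 to S3, with power conversion of each layer's output, can be sketched as a simple loop (hypothetical stand-ins: `layer_op` for the operation unit, `power_convert` for power conversion unit 110):

```python
import math

def run_network(layer_ops, neurons, power_convert):
    """Repeat steps S1-S3: each layer's output, power-converted, becomes the
    input neurons of the next layer, until the last layer is finished."""
    for layer_op in layer_ops:
        neurons = power_convert(layer_op(neurons))
    return neurons

# toy usage: two layers that double each value; conversion rounds to powers of two
round_to_power = lambda xs: [2 ** round(math.log2(abs(x))) for x in xs]
out = run_network([lambda xs: [2 * x for x in xs]] * 2, [1.0, 3.0], round_to_power)
print(out)   # [4, 16]
```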
In addition, the specific operation method of the power conversion is the same as that of the previous embodiment, and is not described herein again.
Fourth embodiment
In addition, another neural network operation method is provided in the embodiments of the present disclosure, and fig. 4 is a flowchart of the neural network operation method in the embodiments.
Specifically, the neural network of the embodiment of the present disclosure is a multilayer neural network, and the operation can be performed on each layer of the neural network according to the operation method shown in fig. 4, wherein input power weight data of a first layer of the neural network can be read in from an external address through a storage unit, and if the read data of the external address is power weight data, the read data is directly transmitted to the storage unit, or else, the read data is converted into power weight data through a power conversion unit; the input power neuron data of the first layer of the neural network can be read from an external address through the storage unit, if the data read by the external address is power data, the data are directly transmitted into the storage unit, otherwise, the data are converted into the power neuron data through the power conversion unit, and the input neuron data of each layer of the neural network can be provided by the output power neuron data of one or more layers of the neural network before the layer. Referring to fig. 4, the single-layer neural network operation method of the present embodiment includes:
Step S4: an instruction, power neuron data and power weight data are acquired.
Wherein the step S4 includes the following substeps:
S41, inputting the instruction, the neuron data and the weight data into the storage unit; wherein the first power conversion unit converts the non-power neuron data and the non-power weight data into power neuron data and power weight data, which are then input into the storage unit;
S42, the data control module receives the instruction, the power neuron data and the power weight data sent by the storage unit;
S43, the instruction cache module, the input neuron cache module and the weight cache module respectively receive the instruction, the power neuron data and the power weight data sent by the data control module and distribute them to the decoding module or the operation unit.
The power neuron data and the power weight data represent the values of the neuron data and the weight data in the form of power exponent values. Specifically, both comprise a sign bit and power bits: the sign bit represents the sign of the neuron data or the weight data with one or more bits, and the power bits represent the power bit data of the neuron data or the weight data with m bits, m being a positive integer greater than 1. An encoding table is prestored in the storage unit and provides the exponent value corresponding to each power bit data of the power neuron data and the power weight data. The encoding table reserves one or more power bit data (i.e., zero power bit data) to specify that the corresponding power neuron data or power weight data is 0. That is, when the power bit data of the power neuron data or the power weight data matches a zero power bit data in the encoding table, the power neuron data or the power weight data is 0.
The correspondence relationship of the encoding table may be arbitrary.
For example, the correspondence of the encoding table may be out of order. As shown in fig. 4.1, for part of an encoding table with m = 5: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 3; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 1; and when the power bit data is 00100, the power neuron data and the power weight data are 0.
The correspondence of the encoding table may also be a positive correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the minimum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power neuron data and power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the minimum power bit data corresponds to exponent value x, the maximum power bit data corresponds to power neuron data and power weight data 0, and every other power bit data corresponds to the exponent value (power bit data + x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals -2^(m-1). The exponent range of the values represented by the power neuron data and the power weight data is -2^(m-1) to 2^(m-1) - 1.
In one embodiment, as shown in fig. 4.2, for part of an encoding table with m = 5, x = 0, and y = 1: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 1; 00010 corresponds to exponent value 2; 00011 corresponds to exponent value 3; and when the power bit data is 11111, the power neuron data and the power weight data are 0. As shown in fig. 4.3, for another encoding table with m = 5, x = 0, and y = 2: power bit data 00000 corresponds to exponent value 0; 00001 corresponds to exponent value 2; 00010 corresponds to exponent value 4; 00011 corresponds to exponent value 6; and when the power bit data is 11111, the power neuron data and the power weight data are 0.
The correspondence of the encoding table may instead be a negative correlation. In this case an integer value x and a positive integer value y are prestored in the storage unit: the maximum power bit data corresponds to exponent value x, and any one or more other power bit data correspond to power neuron data and power weight data 0; x denotes an offset value and y denotes a step size. In one embodiment, the maximum power bit data corresponds to exponent value x, the minimum power bit data corresponds to power neuron data and power weight data 0, and every other power bit data corresponds to the exponent value (power bit data - x) * y. By presetting different x and y, and by changing the values of x and y, the range of the power representation becomes configurable and can be adapted to application scenarios requiring different value ranges. The neural network operation device therefore has a wider application range and more flexible use, and can be adjusted according to user requirements.
In one embodiment, y is 1 and x equals 2^(m-1). The exponent range of the values represented by the power neuron data and the power weight data is -2^(m-1) - 1 to 2^(m-1).
As shown in fig. 4.4, for part of an encoding table with m = 5: power bit data 11111 corresponds to exponent value 0; 11110 corresponds to exponent value 1; 11101 corresponds to exponent value 2; 11100 corresponds to exponent value 3; and when the power bit data is 00000, the power neuron data and the power weight data are 0.
The correspondence of the encoding table may also use the highest bit of the power bit data as a zero flag, with the remaining m-1 bits giving the exponent value. When the highest bit is 0, the corresponding power neuron data and power weight data are 0; when the highest bit is 1, the corresponding power neuron data and power weight data are not 0. The convention may also be reversed: when the highest bit is 1, the power neuron data and the power weight data are 0, and when it is 0, they are not 0. In other words, one bit of the power bits of the power neuron data and the power weight data is set aside to indicate whether the data is 0.
In one embodiment, as shown in fig. 4.5, the sign bit is 1 bit and the power bit data is 7 bits, i.e., m is 7. The encoding table specifies that the power neuron data and the power weight data are 0 when the power bit data is 1111111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. When the sign bit of the power neuron data or the power weight data is 0 and the power bits are 0001001, the represented value is 2^9, i.e., 512; when the sign bit is 1 and the power bits are 1111101, the represented value is -2^(-3), i.e., -0.125. Compared with floating point data, power data retain only the power bits, greatly reducing the storage space required.
This power data representation reduces the storage space required for the neuron data and the weight data. In the example provided in this embodiment, the power data is 8-bit data; it should be appreciated that the data length is not fixed, and different data lengths may be adopted according to the range of the neuron data and the weight data in different situations.
Step S5: the neural network operation is performed on the power neuron data and the power weight data according to the operation instruction. Wherein the step S5 includes the following substeps:
S51, the decoding module reads the instruction from the instruction cache module and decodes it into operation instructions;
S52, the operation unit receives the operation instructions, the power neuron data and the power weight data sent by the decoding module, the input neuron cache module and the weight cache module respectively, and performs the neural network operation on the power neuron data and the power weight data according to the operation instructions.
The multiplication of a power neuron by a power weight proceeds as follows. The sign bit of the power neuron data and the sign bit of the power weight data are XORed. If the correspondence of the encoding table is out of order, the encoding table is searched to find the exponent values corresponding to the power bits of the power neuron data and the power weight data; if the correspondence is a positive correlation, the minimum exponent value of the encoding table is recorded and additions are performed to find the exponent values; if the correspondence is a negative correlation, the maximum value of the encoding table is recorded and subtractions are performed to find the exponent values. The exponent value corresponding to the power neuron data and the exponent value corresponding to the power weight data are then added.
In a specific example, as shown in fig. 4.6, the sign bits of the power neuron data and the power weight data are 1 bit, and the power bit data is 4 bits, i.e., m is 4. The encoding table specifies that the power weight data is 0 when the power bit data is 1111; for any other value, the power bit data corresponds to the binary two's complement of the exponent. The power neuron data 00010 represents the actual value 2^2. The power weight 00110 represents the actual value 64, i.e., 2^6. The product of the power neuron data and the power weight data is 01000, representing the actual value 2^8.
It can be seen that the multiplication of the power neuron data and the power weight is simpler and more convenient than the multiplication of floating point data and power data.
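Since both operands carry only a sign and an exponent, the whole multiplication of this embodiment collapses to one XOR and one addition. A minimal sketch (exponents given as plain integers, i.e., after decoding through the encoding table):

```python
def multiply_power(a_sign, a_exp, b_sign, b_exp):
    """Product of (-1)^a_sign * 2^a_exp and (-1)^b_sign * 2^b_exp:
    XOR the sign bits, add the exponent values."""
    return a_sign ^ b_sign, a_exp + b_exp

# fig. 4.6: 2^2 * 2^6 -> sign 0, exponent 8, i.e. 256
print(multiply_power(0, 2, 0, 6))   # (0, 8)
```

No multiplier circuit is needed at all, which is the source of the hardware savings claimed for this embodiment.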
The method of this embodiment may further include step S6, outputting the neuron data after the neural network operation as input data of the next layer neural network operation.
Wherein the step S6 includes the following substeps:
S61, the output neuron buffer unit receives the neuron data obtained after the neural network operation sent by the operation unit.
S62, the neuron data received by the output neuron buffer unit is transmitted to the data control module; the neuron data obtained by the output neuron buffer unit can serve as the input neurons of the next layer of the neural network operation, and steps S4 to S6 are repeated until the operation of the last layer of the neural network is finished.
Because the neuron data obtained after the neural network operation is itself power data, the bandwidth required to transmit it to the data control module is greatly reduced compared with floating point data; the overhead on the storage and computation resources of the neural network is therefore further reduced, and the operation speed of the neural network is improved.
In addition, the specific operation method of the power conversion is the same as that of the previous embodiment, and is not described herein again.
All of the modules of the disclosed embodiments may be hardware structures; physical implementations of the hardware structures include, but are not limited to, physical devices such as transistors, memristors, and DNA computers.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It should be noted that implementations not shown or described in the drawings or the specification take forms known to those of ordinary skill in the art and are not described in detail. Furthermore, the above definitions of the various elements and methods are not limited to the particular structures, shapes, or arrangements mentioned in the embodiments, which may be readily modified or substituted by one of ordinary skill in the art; for example:
the control unit of the present disclosure is not limited to the specific composition structure of the embodiment, and the control unit capable of implementing data and instruction interaction between the storage unit and the operation unit, which is well known to those skilled in the art, can be used to implement the present disclosure.
The above-mentioned embodiments are intended to illustrate the objects, aspects and advantages of the present disclosure in further detail, and it should be understood that the above-mentioned embodiments are only illustrative of the present disclosure and are not intended to limit the present disclosure, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (27)

1. A neural network operation device, comprising:
the first power conversion unit is used for converting non-power data in the input data of the neural network into power data;
the operation unit is used for receiving the data and the operation instruction of the neural network operation and executing the neural network operation on the received neuron data and the weight data according to the operation instruction; wherein the data received by the arithmetic unit comprises power data converted by a first power conversion unit;
the power data comprise power neuron data and power weight data; wherein,
the value of the neuron data representing neuron data is represented in the form of a power index value, wherein the neuron data comprises a sign bit and a power bit, the sign bit represents the sign of the neuron data by adopting one bit or a plurality of bits, the power bit represents the power bit data of the neuron data by adopting m bits, and m is a positive integer greater than 1;
the power weight data represent the value of the weight data in a power index value form, wherein the power weight data comprise a sign bit and a power bit, the sign bit represents the sign of the weight data in one or more bits, the power bit represents the power bit data of the weight data in m bits, and m is a positive integer greater than 1;
the storage unit of the neural network arithmetic device is prestored with a coding table which is used for providing exponent data corresponding to each exponent data of the exponent neuron data and the exponent weight data.
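Claim 1's power-data layout (one sign bit followed by m power bits, with a coding table mapping the power bits to an exponent) can be sketched as a bit-packing routine. This is an illustrative software model only; the function names and the choice of a single sign bit are assumptions, not part of the claimed hardware.

```python
def pack_power_data(sign: int, power_bits: int, m: int) -> int:
    """Pack power data as one sign bit followed by m power bits (claim 1).

    The m-bit power field holds the power-bit data, which a prestored
    coding table maps to the actual exponent value.
    """
    assert sign in (0, 1) and 0 <= power_bits < 2 ** m
    return (sign << m) | power_bits


def unpack_power_data(word: int, m: int) -> tuple:
    """Split a packed word back into (sign bit, power-bit data)."""
    return (word >> m) & 1, word & (2 ** m - 1)
```

With m = 4, for example, a value is stored in 5 bits instead of a full-width floating-point word, which is the storage saving that the power representation targets.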
2. The neural network operation device according to claim 1, further comprising: a storage unit to store data and instructions; wherein the storage unit is connected with the first power conversion unit to receive the power data.
3. The neural network operation device according to claim 2, further comprising: a control unit and an output neuron cache unit; wherein:
the control unit is connected with the storage unit, is used for controlling the interaction of data and instructions, receives the data and the instructions sent by the storage unit, and decodes the instructions into operation instructions;
the operation unit is connected with the control unit, receives the data and the operation instruction sent by the control unit, and executes neural network operation on the received neuron data and the weight data according to the operation instruction; and
the output neuron cache unit is connected with the operation unit, and is used for receiving the neuron data output by the operation unit and sending the neuron data to the control unit as input data of the next-layer neural network operation.
4. The neural network operation device according to claim 3, wherein the control unit includes:
the data control module is connected with the storage unit and used for realizing data and instruction interaction between the storage unit and each cache module;
the instruction cache module is connected with the data control module and used for receiving the instruction sent by the data control module;
the decoding module is connected with the instruction cache module and used for reading the instruction from the instruction cache module and decoding the instruction into an operation instruction;
the input neuron cache module is connected with the data control module and is used for acquiring corresponding input neuron data from the data control module;
the weight cache module is connected with the data control module and is used for acquiring corresponding weight data from the data control module; wherein:
the operation unit is respectively connected with the decoding module, the input neuron cache module and the weight cache module, receives each operation instruction, neuron data and weight data, and executes corresponding neural network operation on the received neuron data and weight data according to the operation instruction.
5. The neural network operation device according to claim 4, wherein the first power conversion unit is configured to convert non-power weight data in the input data of the neural network into power weight data.
6. The neural network operation device according to claim 4, further comprising: a second power conversion unit; wherein:
the first power conversion unit is used for converting non-power neuron data and non-power weight data in the neural network input data into power neuron data and power weight data respectively and sending the power neuron data and the power weight data to the storage unit;
the second power conversion unit is connected with the output neuron cache unit and used for converting neuron data received by the second power conversion unit into power neuron data and sending the power neuron data to the control unit as input data of next-layer neural network operation.
7. The neural network operation device according to claim 6, wherein, if the input data of the neural network are power data, they are stored directly in the neural network operation device.
8. The neural network operation device according to claim 1, wherein the coding table designates one or more power-bit data as zero-setting power-bit data, for which the corresponding power neuron data and power weight data are 0.
9. The neural network operation device according to claim 1, wherein the correspondence of the coding table is an out-of-order relationship, a positive correlation, or a negative correlation.
10. The neural network operation device according to claim 9, wherein the maximum power-bit data corresponds to power neuron data and power weight data of 0, or the minimum power-bit data corresponds to power neuron data and power weight data of 0.
11. The neural network operation device according to claim 1, wherein the correspondence of the coding table is that the highest bit of the power-bit data indicates a zero flag, and the other m-1 bits of the power-bit data correspond to the exponent value.
12. The neural network operation device according to claim 1, wherein the correspondence of the coding table is a positive correlation, the storage unit prestores an integer value x and a positive integer value y, the minimum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x denotes an offset value and y denotes a step size.
13. The neural network operation device according to claim 1, wherein the minimum power-bit data corresponds to the exponent value x, the maximum power-bit data corresponds to power neuron data and power weight data of 0, and power-bit data other than the minimum and the maximum correspond to the exponent value (power-bit data + x) × y.
14. The neural network operation device according to claim 13, wherein y is 1 and x equals -2^(m-1).
15. The neural network operation device according to claim 1, wherein the correspondence of the coding table is a negative correlation, the storage unit prestores an integer value x and a positive integer value y, the maximum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x denotes an offset value and y denotes a step size.
16. The neural network operation device according to claim 1, wherein the maximum power-bit data corresponds to the exponent value x, the minimum power-bit data corresponds to power neuron data and power weight data of 0, and power-bit data other than the minimum and the maximum correspond to the exponent value (power-bit data - x) × y.
17. The neural network operation device according to claim 16, wherein y is 1 and x equals 2^(m-1).
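Claims 12 to 14 can be read as a small decoding rule: for a positive-correlation coding table, exponent = (power-bit data + x) × y, with one code reserved for the value 0. A minimal sketch follows, assuming y = 1 and x = -2^(m-1) as in claim 14 and reserving the maximum code for zero per claim 13; the function name is hypothetical.

```python
def decode_exponent(power_bits: int, m: int, x=None, y: int = 1):
    """Map m-bit power-bit data to an exponent (positive correlation).

    Per claim 13, exponent = (power-bit data + x) * y; per claim 14,
    y = 1 and x = -2**(m-1). The maximum power-bit data is reserved to
    represent the value 0 (returned here as None).
    """
    if x is None:
        x = -2 ** (m - 1)          # offset value from claim 14
    zero_code = 2 ** m - 1         # claim 13: maximum code encodes 0
    if power_bits == zero_code:
        return None
    return (power_bits + x) * y    # claim 13: (power-bit data + x) * y
```

With m = 4 this maps codes 0..14 to exponents -8..6 and reserves code 15 for zero; changing x and y shifts and scales the representable exponent range, which is the adjustment described in claim 27.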
18. The neural network operation device according to claim 17, wherein converting the non-power neuron data and the non-power weight data into the power neuron data and the power weight data comprises:
s_out = s_in,
d_out+ = 2^⌊log2(d_in+)⌋,
wherein d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data with d_in+ = d_in × s_in, d_out+ is the positive part of the output data with d_out+ = d_out × s_out, and ⌊x⌋ denotes the floor (round-down) operation on the data x; or,
s_out = s_in,
d_out+ = 2^⌈log2(d_in+)⌉,
wherein d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data with d_in+ = d_in × s_in, d_out+ is the positive part of the output data with d_out+ = d_out × s_out, and ⌈x⌉ denotes the ceiling (round-up) operation on the data x; or,
s_out = s_in,
d_out+ = 2^[log2(d_in+)],
wherein d_in is the input data of the power conversion unit, d_out is the output data of the power conversion unit, s_in is the sign of the input data, s_out is the sign of the output data, d_in+ is the positive part of the input data with d_in+ = d_in × s_in, d_out+ is the positive part of the output data with d_out+ = d_out × s_out, and [x] denotes the rounding operation on the data x.
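The three conversion rules of claim 18 (floor, ceiling, or rounding of log2 of the positive part of the input) can be sketched in software as follows. This is an illustrative model of the arithmetic, not the claimed conversion unit; the zero handling and function names are assumptions.

```python
import math


def to_power_data(d_in: float, mode: str = "floor"):
    """Convert non-power data d_in to power data (sign, exponent).

    The represented value is sign * 2**exponent, where the exponent is
    the floor, ceiling, or rounding of log2(d_in+), with the positive
    part d_in+ = d_in * s_in (claim 18).
    """
    s_in = 1 if d_in >= 0 else -1
    d_in_pos = d_in * s_in                  # d_in+ = d_in * s_in
    if d_in_pos == 0:
        raise ValueError("zero uses a reserved coding-table entry")
    log2 = math.log2(d_in_pos)
    if mode == "floor":
        exponent = math.floor(log2)         # d_out+ = 2**floor(log2(d_in+))
    elif mode == "ceil":
        exponent = math.ceil(log2)          # d_out+ = 2**ceil(log2(d_in+))
    else:
        exponent = round(log2)              # d_out+ = 2**round(log2(d_in+))
    return s_in, exponent


def from_power_data(sign: int, exponent: int) -> float:
    """Recover the represented value d_out = s_out * d_out+."""
    return sign * 2.0 ** exponent
```

For d_in = 10, flooring gives 2^3 = 8 and ceiling gives 2^4 = 16; the choice of rule trades the rounding direction against accuracy.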
19. A neural network operation method, comprising:
acquiring an operation instruction, neuron data and power weight data;
performing neural network operation on the neuron data and the power weight data according to the operation instruction;
the power weight data represent the value of the weight data in the form of a power exponent value, wherein the power weight data comprise a sign bit and power bits, the sign bit represents the sign of the weight data with one or more bits, the power bits represent the power-bit data of the weight data with m bits, and m is a positive integer greater than 1;
the method further comprises: prestoring a coding table for providing the exponent value corresponding to each power-bit data of the power weight data.
20. A neural network operation method, comprising:
acquiring an operation instruction, power neuron data and power weight data;
performing neural network operation on the power neuron data and the power weight data according to the operation instruction;
the power neuron data represent the value of the neuron data in the form of a power exponent value, wherein the power neuron data comprise a sign bit and power bits, the sign bit represents the sign of the neuron data with one or more bits, the power bits represent the power-bit data of the neuron data with m bits, and m is a positive integer greater than 1;
the method further comprises: prestoring a coding table for providing the exponent value corresponding to each power-bit data of the power neuron data.
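The claims do not spell out how the operation unit exploits the power format, but one well-known consequence of storing both neurons and weights as a sign and an exponent (an assumption about intent, not recited in the claims) is that multiplication reduces to a sign product and an exponent addition:

```python
def power_multiply(sign_a: int, exp_a: int, sign_b: int, exp_b: int):
    """Multiply two power-format operands (signs in {+1, -1}).

    Each operand represents sign * 2**exp, so the product's sign is the
    product of the signs and its exponent is the sum of the exponents;
    no hardware multiplier is needed, only an adder.
    """
    return sign_a * sign_b, exp_a + exp_b
```

For example, (+2^3) × (-2^2) = -2^5 is computed with a single integer addition.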
21. The neural network operation method of claim 20, further comprising: outputting neuron data after neural network operation and taking the neuron data as input data of the next layer of neural network operation; repeating the operation steps of the neural network until the last layer of operation of the neural network is finished.
22. The neural network operation method of claim 20, wherein obtaining the operation command, the weight data and the neuron data comprises:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module, and the weight cache module respectively receive the instruction, the neuron data, and the weight data sent by the data control module; if the weight data input to the storage unit are non-power weight data, they are converted into power weight data by the first power conversion unit and then input to the storage unit; if the weight data input to the storage unit are already power weight data, they are input to the storage unit directly;
the decoding module reads the instruction from the instruction cache module and decodes the instruction into each operation instruction.
23. The neural network operation method of claim 22, wherein the output neuron cache unit receives the neuron data obtained after the neural network operation from the operation unit, and sends the neuron data to the data control module as input data of the next-layer neural network operation.
24. The neural network operation method of claim 20, wherein obtaining the operation command, the weight data and the neuron data comprises:
inputting the instruction, the neuron data and the weight data into a storage unit;
the data control module receives the instruction, the neuron data and the weight data sent by the storage unit;
the instruction cache module, the input neuron cache module, and the weight cache module respectively receive the instruction, the neuron data, and the weight data sent by the data control module; if the neuron data and the weight data input to the storage unit are non-power neuron data and non-power weight data, they are converted into power neuron data and power weight data by the first power conversion unit and then input to the storage unit; if the neuron data and the weight data input to the storage unit are already power neuron data and power weight data, they are input to the storage unit directly;
the decoding module reads the instruction from the instruction cache module and decodes the instruction into each operation instruction.
25. The neural network operation method of claim 24, wherein the output neuron cache unit receives the neuron data obtained after the neural network operation from the operation unit; and the second power conversion unit receives the neuron data sent by the output neuron cache unit, converts the neuron data into power neuron data, and sends the power neuron data to the data control module as input data of the next-layer neural network operation.
26. The neural network operation method of claim 20, performing the neural network operation on the weight data and the neuron data according to the operation instruction, comprising:
the operation unit respectively receives the operation instruction, the neuron data and the weight data sent by the decoding module, the input neuron cache module and the weight cache module, and performs neural network operation on the neuron data and the weight data according to the operation instruction.
27. The neural network operation method according to any one of claims 22 to 26, wherein the range of the power neuron data and the power weight data that can be expressed by the neural network operation device is adjusted by changing an integer value x and a positive integer value y that are pre-stored in a storage unit, and the method includes:
when the correspondence of the coding table is a positive correlation, the storage unit prestores an integer value x and a positive integer value y, the minimum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x represents an offset value and y represents a step size; or,
when the correspondence of the coding table is a negative correlation, the storage unit prestores an integer value x and a positive integer value y, the maximum power-bit data corresponds to the exponent value x, and any one or more other power-bit data correspond to power neuron data and power weight data of 0; wherein x represents an offset value and y represents a step size.
CN201710312415.6A 2017-04-06 2017-05-05 Arithmetic device and method Active CN108805271B (en)

Priority Applications (16)

Application Number Priority Date Filing Date Title
CN201710312415.6A CN108805271B (en) 2017-05-05 2017-05-05 Arithmetic device and method
EP19199526.5A EP3633526A1 (en) 2017-04-06 2018-04-04 Computation device and method
CN201811423295.8A CN109409515B (en) 2017-04-06 2018-04-04 Arithmetic device and method
EP19199528.1A EP3624018B1 (en) 2017-04-06 2018-04-04 Neural network computation device and method
EP19199524.0A EP3627437B1 (en) 2017-04-06 2018-04-04 Data screening device and method
CN201811423421.XA CN109359736A (en) 2017-04-06 2018-04-04 Network processing unit and network operations method
CN201880001242.9A CN109219821B (en) 2017-04-06 2018-04-04 Arithmetic device and method
EP18780474.5A EP3579150B1 (en) 2017-04-06 2018-04-04 Operation apparatus and method for a neural network
CN201811413244.7A CN109344965A (en) 2017-04-06 2018-04-04 Arithmetic unit and method
EP19199521.6A EP3620992A1 (en) 2017-04-06 2018-04-04 Neural network processor and neural network computation method
PCT/CN2018/081929 WO2018184570A1 (en) 2017-04-06 2018-04-04 Operation apparatus and method
US16/283,711 US10896369B2 (en) 2017-04-06 2019-02-22 Power conversion in neural networks
US16/520,082 US11010338B2 (en) 2017-04-06 2019-07-23 Data screening device and method
US16/520,041 US11551067B2 (en) 2017-04-06 2019-07-23 Neural network processor and neural network computation method
US16/520,654 US11049002B2 (en) 2017-04-06 2019-07-24 Neural network computation device and method
US16/520,615 US10671913B2 (en) 2017-04-06 2019-07-24 Computation device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710312415.6A CN108805271B (en) 2017-05-05 2017-05-05 Arithmetic device and method

Publications (2)

Publication Number Publication Date
CN108805271A CN108805271A (en) 2018-11-13
CN108805271B true CN108805271B (en) 2021-03-26

Family

ID=64053718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710312415.6A Active CN108805271B (en) 2017-04-06 2017-05-05 Arithmetic device and method

Country Status (1)

Country Link
CN (1) CN108805271B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740754B (en) * 2018-12-29 2020-04-14 中科寒武纪科技股份有限公司 Neural network computing device, neural network computing method and related products
CN109978160B (en) * 2019-03-25 2021-03-02 中科寒武纪科技股份有限公司 Configuration device and method of artificial intelligence processor and related products

Citations (3)

Publication number Priority date Publication date Assignee Title
CN201726420U (en) * 2010-08-06 2011-01-26 北京国科环宇空间技术有限公司 Blind equalization device
US9435315B2 (en) * 2014-01-23 2016-09-06 Peter Andrés Kalnay Trimming right-angularly reorienting extending segmented ocean wave power extraction system
CN106066783A (en) * 2016-06-02 2016-11-02 华为技术有限公司 The neutral net forward direction arithmetic hardware structure quantified based on power weight

Non-Patent Citations (2)

Title
"Going Deeper with Embedded FPGA Platform for Convolutional Neural Network"; Jiantao Qiu et al.; Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays; 2016-02-23; 26-35 *
"Hybrid Neural Network Based on Multi-Core Systems and Its Application" (in Chinese); Yang Lei et al.; Intelligent Computer and Applications; 2011-11-30; Vol. 1, No. 4; 4-6 *


Similar Documents

Publication Publication Date Title
CN109284822B (en) Neural network operation device and method
US10726336B2 (en) Apparatus and method for compression coding for artificial neural network
CN108292222B (en) Hardware apparatus and method for data decompression
WO2019120114A1 (en) Data fixed point processing method, device, electronic apparatus and computer storage medium
WO2018193906A1 (en) Information processing method, information processing device and program
CN107340993B (en) Arithmetic device and method
CN107239826A (en) Computational methods and device in convolutional neural networks
CN107256422A (en) Data quantization methods and device
US7612694B1 (en) Efficient coding of small integer sets
KR20220097961A (en) Recurrent neural networks and systems for decoding encoded data
WO2020064093A1 (en) End-to-end learning in communication systems
CN108805271B (en) Arithmetic device and method
US20200293724A1 (en) Information conversion method and apparatus, storage medium, and electronic device
CN109389210B (en) Processing method and processing apparatus
CN111914987A (en) Data processing method and device based on neural network, equipment and readable medium
CN112955878B (en) Apparatus for implementing activation logic of neural network and method thereof
CN112101511A (en) Sparse convolutional neural network
CN114970827A (en) Arithmetic device and method
EP3912094A1 (en) Training in communication systems
CN111970007B (en) Decoding method, decoder, device and medium
CN110233627B (en) Hardware compression system and method based on running water
TW201915834A (en) Calculation device for and calculation method of performing convolution
CN114492778A (en) Operation method of neural network model, readable medium and electronic device
CN109416757B (en) Method, apparatus and computer-readable storage medium for processing numerical data
CN112364657A (en) Method, device, equipment and computer readable medium for generating text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant