WO2021163866A1 - Neural network weight matrix adjustment method, writing control method, and related device - Google Patents


Info

Publication number
WO2021163866A1
Authority
WO
WIPO (PCT)
Prior art keywords
weight
neural network
array
memory cell
weight matrix
Prior art date
Application number
PCT/CN2020/075648
Other languages
French (fr)
Chinese (zh)
Inventor
王绍迪 (Wang Shaodi)
Original Assignee
杭州知存智能科技有限公司 (Hangzhou Zhicun Intelligent Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州知存智能科技有限公司 (Hangzhou Zhicun Intelligent Technology Co., Ltd.)
Priority to PCT/CN2020/075648 priority Critical patent/WO2021163866A1/en
Publication of WO2021163866A1 publication Critical patent/WO2021163866A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode

Definitions

  • the present invention relates to the field of artificial intelligence technology, in particular to a neural network weight matrix adjustment method, a writing control method and related devices.
  • computing-in-memory (CIM) chips have received widespread attention.
  • reducing the data transmission volume and transmission distance can lower power consumption and improve performance at the same time.
  • the in-memory computing chip is suitable for neural network computing scenarios.
  • the signals to be processed are input in parallel; based on Ohm's law and Kirchhoff's law, the vector-matrix multiply-accumulate operation between the signals to be processed and the corresponding weights is performed directly in the memory cell array, and the output voltage/current signal of the memory cell array is quantized by an ADC (analog-to-digital converter) and used as the output result.
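As a rough illustration (not taken from the patent), the parallel multiply-accumulate described above can be modeled in a few lines of Python; the weight values below are invented for illustration and are not the cell values of the figures:

```python
def column_outputs(inputs, weights):
    """Each cell contributes current I = G * V (Ohm's law) and the currents
    on a shared column line sum (Kirchhoff's current law), so every column
    yields the dot product of the row inputs with its stored weights."""
    n_cols = len(weights[0])
    return [sum(v * row[c] for v, row in zip(inputs, weights))
            for c in range(n_cols)]

# Illustrative 3x3 array with the row inputs 7, 5, 3 used in the figures;
# the weight values here are made up, not the patent's.
print(column_outputs([7, 5, 3], [[1, 0, 30],
                                 [0, 1, 20],
                                 [0, 0, 1]]))   # -> [7, 5, 313]
```

Note how one column output (7) is tiny while another (313) is huge, which is exactly the uneven-distribution problem the patent addresses.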
  • when the weight distribution of the neural network algorithm is too small or too large, or the input signal is too small or too large, the analog voltage/current output by the memory cell array may fall below the lower limit or above the upper limit of the ADC range (see Figure 1 or Figure 2, where each circle represents a memory cell, the number inside it represents the pre-stored weight value, and the horizontal arrows represent the row inputs, i.e. the input signals; in Figure 1 the input of the first row is 7, the input of the second row is 5, the input of the third row is 3, and the downward arrows indicate the outputs: the output of the first column is 7, the output of the second column is 7, and the output of the third column is 248; the output signal of each column is fed to an ADC, which converts the column's analog output into a digital signal for subsequent use; Figure 1 shows the case where the output voltage/current of the first and second columns is too small and falls below the lower limit of the ADC, while Figure 2 shows the case where the output voltage/current of all three columns of memory cells is too large and exceeds the upper limit of the ADC range). An ADC usually has the highest quantization accuracy near the middle of its range and poorer accuracy near both ends; when the ADC input falls below the lower limit or above the upper limit, the corresponding output is simply truncated to the minimum or maximum value, which reduces the accuracy of the operation.
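The truncation behavior can be sketched with a toy ADC model; the range and resolution below are assumptions for illustration, not values from the patent:

```python
def adc_quantize(x, lo, hi, bits=8):
    """Clip x to the ADC range [lo, hi], then map it linearly onto
    2**bits - 1 quantization levels; out-of-range inputs are simply
    truncated to the minimum or maximum code."""
    x = min(max(x, lo), hi)
    return round((x - lo) / (hi - lo) * (2 ** bits - 1))

# With an assumed range of [10, 200], the column outputs 7 and 248 from
# Figures 1-2 both clip, so their codes carry no information.
print(adc_quantize(7, 10, 200))    # -> 0
print(adc_quantize(248, 10, 200))  # -> 255
```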
  • the present invention provides a neural network weight matrix adjustment method, a write control method, related devices, electronic equipment, and computer-readable storage media, which can at least partially solve the problems in the prior art.
  • a method for adjusting the weight matrix of a neural network including:
  • wherein the first constant and the second constant are both greater than one.
  • the method further includes:
  • the first weight array is the weight array remaining after truncating, from each weight value, the bits that exceed the third preset threshold, and is used for storage in a memory cell array;
  • the second weight array is the weight array formed from the truncated bits that exceed the third preset threshold, and is used for storage in another memory cell array or for input to an arithmetic operation circuit.
  • the method for adjusting the weight matrix of the neural network also includes:
  • the processed weight array is used for storage in a memory cell array.
  • the method for adjusting the weight matrix of the neural network also includes:
  • the output of the ADC following the memory cell array corresponding to the first weight array is combined with the output of the ADC following the memory cell array corresponding to the second weight array; or
  • the output of the ADC following the memory cell array corresponding to the first weight array is combined with the output of the arithmetic operation circuit.
  • dividing all the weight values in the neural network weight matrix by a second constant includes:
  • the third weight array is used for storage in a memory cell array; the fourth weight array is used for storage in another memory cell array or for input to an arithmetic operation circuit.
  • the method for adjusting the weight matrix of the neural network also includes:
  • the output of the ADC following the memory cell array corresponding to the third weight array is combined with the output of the ADC following the memory cell array corresponding to the fourth weight array; or
  • the output of the ADC following the memory cell array corresponding to the third weight array is combined with the output of the arithmetic operation circuit.
  • a method for writing a neural network weight matrix including:
  • according to a data adjustment instruction, the shift register is controlled to shift the weight values input to it, and the shifted weight values, the overflow bits, and the addresses of the weight values in the neural network weight matrix are stored in a buffer, where the data adjustment instruction includes a shift direction and a number of shift bits;
  • the data adjustment instruction is generated when the weight distribution of the neural network weight matrix is uneven; the shift register is connected to a write module, and the write module is connected to another memory cell array and writes the data shifted out by the shift register into that memory cell array.
  • the neural network weight matrix writing control method further includes:
  • the ADC output result of the memory cell array is combined with the ADC output result of the other memory cell array.
  • a neural network weight matrix adjustment device including:
  • a first judgment module, which judges whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;
  • a weight amplification module, which, if the weight distribution of the neural network weight matrix is lower than the first preset threshold, multiplies all the weight values in the neural network weight matrix by a first constant;
  • a second judgment module, which, if the weight distribution of the neural network weight matrix is not lower than the first preset threshold, judges whether the weight distribution is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
  • a weight reduction module, which, if the weight distribution of the neural network weight matrix is higher than the second preset threshold, divides all the weight values in the neural network weight matrix by a second constant;
  • wherein the first constant and the second constant are both greater than one.
  • an electronic device including a memory, a processor, and a computer program stored in the memory and capable of running on the processor.
  • the processor implements the steps of the neural network weight matrix adjustment method described above when executing the program.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the above-mentioned neural network weight matrix adjustment method are realized.
  • the neural network weight matrix adjustment method, writing control method, related devices, electronic equipment, and computer-readable storage medium provided by the present invention process the trained neural network weight matrix before neural network calculations are performed with an in-memory computing chip.
  • The method includes: judging whether the weight distribution of the neural network weight matrix is lower than a first preset threshold; if so, multiplying all the weight values in the neural network weight matrix by a first constant; if not, judging whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold; if the weight distribution is higher than the second preset threshold, dividing all the weight values in the neural network weight matrix by a second constant; wherein the first constant and the second constant are both greater than one.
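The branching described above can be condensed into a minimal Python sketch. The names `low_thr`/`high_thr` stand in for the first and second preset thresholds and `n`/`m` for the first and second constants; taking the "weight distribution" as the mean absolute weight is one of the statistics the text mentions, and all defaults here are assumptions:

```python
def adjust_weights(matrix, low_thr, high_thr, n=8, m=8):
    """Sketch of the claimed adjustment: amplify a too-small matrix,
    reduce a too-large one, leave an in-range matrix unchanged."""
    flat = [abs(w) for row in matrix for w in row]
    dist = sum(flat) / len(flat)            # mean absolute weight
    if dist < low_thr:                      # weights too small: amplify
        return [[w * n for w in row] for row in matrix]
    if dist > high_thr:                     # weights too large: reduce
        return [[w / m for w in row] for row in matrix]
    return matrix                           # already in range: store as-is
```

For example, `adjust_weights([[1, 1], [1, 1]], 4, 100)` returns `[[8, 8], [8, 8]]`, while a matrix whose mean lies between the thresholds is returned unchanged.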
  • in this way, when the processed weight matrix is stored in the memory cell array, the signal obtained from the in-memory calculation falls within the effective range of the ADC (the ADC is placed after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the accuracy of the calculation.
  • Figure 1 shows a case where, during a matrix operation in the memory cell array of an in-memory computing chip, the output current/voltage of a memory cell column is so small that it falls below the lower limit of the ADC range;
  • Figure 2 shows a case where, during a matrix operation in the memory cell array of an in-memory computing chip, the output current/voltage of a memory cell column is so large that it exceeds the upper limit of the ADC range;
  • Fig. 3 shows an application scenario of a neural network weight matrix adjustment method provided by an embodiment of the present invention
  • FIG. 4 is a first flowchart of a method for adjusting a weight matrix of a neural network in an embodiment of the present invention
  • FIG. 5 shows a schematic diagram of a method for adjusting a weight matrix of a neural network provided by an embodiment of the present invention
  • FIG. 6 shows a situation in which, after adjustment by the neural network weight matrix adjustment method provided by the embodiment of the present invention, a weight value in the weight array exceeds the bit-width limit;
  • FIG. 7 is a second schematic flowchart of a method for adjusting a neural network weight matrix in an embodiment of the present invention.
  • FIG. 8 shows a schematic diagram of dividing the weight array after amplifying the weight array in the embodiment of the present invention
  • FIG. 9 shows another schematic diagram of dividing the weight array after enlarging the weight array in the embodiment of the present invention.
  • FIG. 10 shows the specific steps of step S400 in the embodiment of the present invention.
  • FIG. 11 shows a schematic diagram of dividing the weight array after shrinking the weight array in an embodiment of the present invention
  • Fig. 12 is a structural block diagram of a neural network weight matrix adjustment device in an embodiment of the present invention.
  • FIG. 13 shows an application scenario of a neural network weight matrix writing control method provided by an embodiment of the present invention
  • FIG. 14 is a schematic flowchart of a method for writing a neural network weight matrix in an embodiment of the present invention
  • Fig. 15 is a structural diagram of an electronic device according to an embodiment of the present invention.
  • the embodiments of the present invention can be provided as a method, a system, or a computer program product. Therefore, the present invention may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • Fig. 3 shows an application scenario of the neural network weight matrix adjustment method provided by an embodiment of the present invention. In this scenario, the compiling software 1 is connected to the programming circuit in the in-memory computing chip 2 and is used to carry out the neural network weight matrix adjustment method;
  • the adjusted neural network weight matrix is written into the memory cell array in the in-memory computing chip 2 through the programming circuit.
  • in the application stage, the input data stream is transmitted to the memory cell array after certain preprocessing and undergoes the neural network operation with the neural network weight matrix pre-written in the memory cell array.
  • the output data stream of the memory cell array is converted into a digital signal by the ADC module, and the operation result is output.
  • the compilation software may be existing or yet-to-be-developed compilation processing software, or a programmed computer program, and it may be executed in a computer device, in a processing chip, or in a mobile portable device; the embodiment of the present invention places no restriction on this.
  • a programming circuit can be provided in the in-memory computing chip.
  • for applications that do not need to adjust the neural network weight matrix, the programming circuit can be omitted in order to reduce chip overhead; instead, the final adjusted neural network weight matrix is written into the in-memory computing chip in advance, at the factory, through a programming device.
  • FIG. 4 is a first flowchart of a method for adjusting a weight matrix of a neural network in an embodiment of the present invention. As shown in FIG. 4, the method for adjusting a weight matrix of a neural network may include the following contents:
  • Step S100 Determine whether the weight distribution of the neural network weight matrix is lower than a first preset threshold
  • If yes, go to step S200; if not, go to step S300;
  • the neural network weight matrix is a neural network weight matrix trained in the neural network training stage, and the weight distribution can be a statistical index such as a mean value or a probability distribution, which can be specifically set by the designer according to the actual ADC range.
  • the first preset threshold is set by the designer according to specific statistical requirements and hardware requirements.
  • Step S200 Multiply all the weight values in the neural network weight matrix by a first constant
  • the first constant N can usually be set to an integer power of 2, so that the multiplication is equivalent to a left shift.
  • for example, the original neural network weight matrix is multiplied by 8, that is, each binary weight value is shifted left by 3 bits, to obtain the amplified weight matrix.
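As a one-line sanity check (plain Python arithmetic, nothing patent-specific), multiplying by 8 and shifting left by 3 bits agree:

```python
w = 0b00000101            # an 8-bit weight with value 5
amplified = w << 3        # equivalent to w * 8
print(amplified == w * 8, amplified)  # -> True 40
```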
  • Step S300 Determine whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
  • If yes, perform step S400; if not, perform step S500;
  • the second preset threshold is set by the designer according to specific statistical requirements and hardware requirements.
  • Step S400 Divide all the weight values in the neural network weight matrix by a second constant
  • the first constant and the second constant are both greater than 1, and they may be the same or different. It is worth noting that multiplying by 2 and dividing by 1/2 yield the same result; therefore, whether the claims are expressed in terms of multiplication or division does not limit the scope of protection of the present invention, and identical or equivalent calculation processes are all included in the scope of protection of the present invention.
  • the analog voltage/current output by the memory cell array may exceed the quantization upper limit of the ADC.
  • the second constant M is usually an integer power of 2, so that the division is equivalent to a right shift.
  • Step S500 Use the neural network weight matrix to store in a memory cell array.
  • the neural network weight matrix in this step refers to the neural network weight matrix of step S100 (when its weight distribution is neither lower than the first preset threshold nor higher than the second preset threshold), the weight matrix obtained after the processing of step S200, or the weight matrix obtained after the processing of step S400.
  • the weight matrix is stored in a memory cell array; in the application stage, an input data stream is fed to the memory cell array, the input data stream and the weight array undergo an analog vector-matrix multiplication, and the result is transmitted, in the form of the output analog voltage/current of the memory cell array, to the ADC after the array, which converts the analog voltage/current into a digital signal.
  • the neural network weight matrix adjustment method scales up or scales down a neural network weight matrix whose overall distribution is uneven, so that when the processed weight matrix is stored in the memory cell array, the signal obtained from the in-memory calculation is within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the calculation accuracy.
  • in step S200, all the weight values in the neural network weight matrix are multiplied by a first constant to increase the network weights. Ideally, all the weight values in the amplified weight array meet the requirements; however, those skilled in the art can understand that for an array with a relatively uneven weight distribution, some larger weight values in the amplified array may exceed the upper limit of the bit-width (also referred to as an overflow). For example, referring to Figure 6, assuming the weight precision is 8 bits, after multiplying by the constant 8 (a left shift of 3 bits), the weight value in the third row and second column exceeds the bit-width limit.
  • the neural network weight matrix adjustment method provided by the embodiment of the present invention may further include:
  • Step S600 Determine whether the number of digits of each weight value in the processed weight matrix exceeds a third preset threshold
  • If yes, perform step S700; if not, perform step S500;
  • the third preset threshold may be the accuracy of the algorithm, such as 8 bits, 16 bits, etc., which is set by the designer according to specific statistical requirements and hardware requirements.
  • Step S700: truncate the bits of each weight value that exceed the third preset threshold to obtain a first weight array and a second weight array;
  • the first weight array (also referred to as the standard matrix) is the weight array remaining after the bits of each weight value that exceed the third preset threshold are truncated, and is stored in a memory cell array of the in-memory computing chip, where it takes part in the analog vector-matrix multiplication;
  • the second weight array is the weight array formed from the truncated bits of each weight value that exceed the third preset threshold, and is stored in another memory cell array or input to an arithmetic operation circuit. It is worth noting that the second weight array may be a sparse matrix or an ordinary weight matrix.
  • if the second weight array is a sparse matrix, it can be input to an arithmetic operation circuit to perform the operation: one input of the operation is an element of the sparse matrix, and the other input is the datum in the aforementioned input data stream that corresponds to that element.
  • the arithmetic operation circuit can be a conventional digital circuit, such as a multiplier, or a CPU.
  • the sparse matrix can be stored in a memory first and then transferred from the memory to the CPU to perform the multiplication operations.
  • alternatively, the second weight array can be stored in another memory cell array, and that memory cell array performs an analog vector-matrix multiplication between the second weight array and the input data stream.
  • Fig. 8 shows a schematic diagram of splitting the weight array after amplifying it; as shown in Fig. 8, corresponding to the overflow situation of Fig. 6, the part above 8 bits is truncated and placed into a new weight matrix, which is a sparse matrix.
  • the standard matrix is still processed by the in-memory computing array (vector-matrix multiplication), while the sparse-matrix part can be processed by conventional digital circuits, and the combination of the two results is the final output.
  • Figure 8 is a step-by-step illustration of the principle of this embodiment. In an actual program, referring to Figure 9, the overflowed bits can be stored directly into the other matrix while the weights are being amplified, with no need to separate the amplification step from the splitting step.
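A minimal sketch of this split, assuming 8-bit weight precision and a left shift of 3 as in Figure 6; the function name and data layout are mine, not the patent's:

```python
BITS = 8  # assumed third preset threshold: the weight bit-width

def split_amplified(matrix, shift=3):
    """Amplify each weight by 2**shift, keep the low BITS bits as the
    standard matrix, and collect the truncated high bits in a second,
    typically sparse, overflow matrix."""
    std, ovf = [], []
    for row in matrix:
        v_row = [w << shift for w in row]
        std.append([v & ((1 << BITS) - 1) for v in v_row])  # low bits
        ovf.append([v >> BITS for v in v_row])              # carried-out bits
    return std, ovf

std, ovf = split_amplified([[5, 40], [1, 0]])
# 40 << 3 = 320 = 1 * 256 + 64: 64 stays in the standard matrix and the
# overflow matrix records the single carried-out bit.
print(std)  # -> [[40, 64], [8, 0]]
print(ovf)  # -> [[0, 1], [0, 0]]
```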
  • the adjustment method may further include:
  • the output of the ADC following the memory cell array corresponding to the first weight array is combined with the output of the ADC following the memory cell array corresponding to the second weight array; or
  • the output of the ADC following the memory cell array corresponding to the first weight array is combined with the output of the arithmetic operation circuit.
  • the combination may superimpose the results of the first weight array and the second weight array, and may additionally scale the superimposed result down by a certain factor on top of the superposition; the choice can be made according to parameters such as the quantization accuracy of the subsequent circuit.
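One way such a combination could look, assuming a bit-split at 8 bits and an amplification constant of 8; the positional weighting `2**bits` and the scale-back are assumptions consistent with that split, not a formula stated in the patent:

```python
def combine(std_out, ovf_out, bits=8, scale=8):
    """Superimpose the two partial column results, giving the overflow
    result its positional weight 2**bits, then undo the amplification
    by 'scale' so the sum matches the unamplified computation."""
    return [(s + (o << bits)) / scale for s, o in zip(std_out, ovf_out)]

# A column whose amplified output split into 64 (standard) and 1 (overflow)
# recombines to (64 + 256) / 8 = 40.
print(combine([64], [1]))  # -> [40.0]
```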
  • in step S400, all the weight values in the neural network weight matrix are divided by a second constant to reduce the network weights. Ideally, all the weight values in the reduced weight array meet the requirements; however, those skilled in the art can understand that for an array with a relatively uneven weight distribution, some weight values in the reduced array may overflow after being shifted to the right.
  • this step S400 may include the following:
  • Step S410 Divide each weight value by the second constant to obtain a third weight array
  • Step S420 Save the overflow bits obtained by dividing each weight value by the second constant as a fourth weight array
  • the third weight array is used for storage in a memory cell array to perform the analog vector-matrix multiplication; the fourth weight array is used for storage in another memory cell array or for input to an arithmetic operation circuit.
  • dividing each weight value by the second constant is equivalent to a right shift, and the shifted-out (overflow) bits are saved as the fourth weight array.
  • FIG. 11 shows a schematic diagram of splitting the weight array after reducing it. As shown in FIG. 11, dividing each weight value in the original neural network weight matrix by 8 is equivalent to shifting each binary weight value 3 bits to the right, yielding a standard matrix; the overflow bits are saved as an overflow matrix. Since neither the standard matrix nor the overflow matrix obtained after the shift in the figure is a sparse matrix, two memory cell arrays can be used to process the standard matrix and the overflow matrix separately; the input data streams of the two memory cell arrays are the same, and after the outputs of the two memory cell arrays are combined, they are input to the ADC for conversion.
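The Figure 11 split can be sketched the same way, with the roles of the high and low bits swapped; the names are mine, and the shift amount of 3 follows the figure's divide-by-8 example:

```python
def split_reduced(matrix, shift=3):
    """Divide each weight by 2**shift via a right shift: the quotients form
    the standard matrix and the shifted-out low bits form the overflow
    matrix, so standard * 2**shift + overflow recovers each weight."""
    std = [[w >> shift for w in row] for row in matrix]
    ovf = [[w & ((1 << shift) - 1) for w in row] for row in matrix]
    return std, ovf

std, ovf = split_reduced([[45, 200], [7, 64]])
print(std)  # -> [[5, 25], [0, 8]]
print(ovf)  # -> [[5, 0], [7, 0]]
```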
  • if the weight distribution of a matrix obtained in this way is still not appropriate, the adjustment method provided by the embodiments of the present invention can be applied to it again, repeatedly, until the weight distribution of the matrix is appropriate; for example, if the weights of the new matrix obtained after adjustment are too small, it can be multiplied by a constant to reach a proper distribution.
  • the neural network weight matrix adjustment method further includes:
  • the output of the ADC following the memory cell array corresponding to the third weight array is combined with the output of the ADC following the memory cell array corresponding to the fourth weight array; or
  • the output of the ADC following the memory cell array corresponding to the third weight array is combined with the output of the arithmetic operation circuit.
  • the combination may superimpose the results of the third weight array and the fourth weight array, and may additionally scale the superimposed result up by a certain factor on top of the superposition; the choice can be made according to parameters such as the quantization accuracy of the subsequent circuit.
  • in summary, the neural network weight matrix adjustment method scales up or scales down a neural network weight matrix whose overall distribution is uneven, so that when the processed weight matrix is stored in the memory cell array, the signal obtained from the in-memory calculation is within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the calculation accuracy.
  • an embodiment of the present application also provides a neural network weight matrix adjustment device, which can be used to implement the method described in the foregoing embodiment, as described in the following embodiment. Since the principle of the neural network weight matrix adjustment device to solve the problem is similar to the above method, the implementation of the neural network weight matrix adjustment device can refer to the implementation of the above method, and the repetition will not be repeated.
  • the term "unit" or "module" can be a combination of software and/or hardware that implements a predetermined function.
  • although the devices described in the following embodiments are preferably implemented in software, implementation in hardware or in a combination of software and hardware is also possible and contemplated.
  • FIG. 12 is a structural block diagram of a neural network weight matrix adjustment device in an embodiment of the present invention. As shown in FIG. 12, the neural network weight matrix adjustment device may include: a first judgment module 10, a weight amplification module 20, a second judgment module 30, and a weight reduction module 40.
  • the first judgment module 10 judges whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;
  • the weight amplification module 20, if the weight distribution of the neural network weight matrix is lower than the first preset threshold, multiplies all the weight values in the neural network weight matrix by a first constant;
  • the second judgment module 30, if the weight distribution of the neural network weight matrix is not lower than the first preset threshold, judges whether the weight distribution is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
  • the weight reduction module 40, if the weight distribution of the neural network weight matrix is higher than the second preset threshold, divides all the weight values in the neural network weight matrix by a second constant;
  • wherein the first constant and the second constant are both greater than one.
  • FIG. 13 shows an application scenario of the neural network weight matrix writing control method provided by an embodiment of the present invention. As shown in FIG. 13, in this scenario,
  • the weight to be written is input to the shift register in the in-memory computing chip 2'; after the shift register shifts the weight, the write module writes it into the corresponding memory cell of the memory cell array, and the neural network is then used to process the input.
  • in the application stage, the input data stream is transmitted to the memory cell array after certain preprocessing and undergoes the neural network operation with the neural network weight matrix pre-written in the memory cell array; the output data stream of the memory cell array is converted into a digital signal by the ADC module, and the calculation result is output.
  • FIG. 13 only exemplarily lists a few circuit modules in the in-memory computing chip 2'; in a specific implementation, the in-memory computing chip 2' may also be provided with registers, post-processing modules, and other related functional circuits.
  • each memory cell in the memory cell array can be implemented by a programmable semiconductor device, such as a floating-gate MOS transistor.
  • the shift register is also connected to an external buffer 1' for buffering data.
  • the control module executes the neural network weight matrix writing control method provided by the embodiment of the present invention.
  • the neural network weight matrix writing control method may include the following content:
  • Step S1000: according to the data adjustment instruction, control the shift register to shift the weight values input to it, and store the shifted weight values, the overflow bits, and the addresses of the weight values in the neural network weight matrix into the buffer;
  • the data adjustment instruction includes the shift direction and the number of shift bits.
  • before writing, it is judged whether the weight distribution of the neural network weight matrix is lower than a first preset threshold or higher than a second preset threshold, and if so, a data adjustment instruction is generated.
  • the neural network weight matrix is a neural network weight matrix trained in the neural network training stage, and the weight distribution can be a statistical indicator such as a mean value or a probability distribution.
  • the first preset threshold and the second preset threshold are set by designers according to specific statistical requirements and hardware requirements, where the second preset threshold is greater than the first preset threshold.
• if the weight distribution is low, the shift direction in the data adjustment instruction is to the left, increasing the network weights; the increased weight matrix is then written into the memory cell array, so that the output analog current of the memory cell array is multiplied accordingly (doubled for a one-bit shift).
• the number of shift bits is set by the designer according to the specific statistical requirements and hardware requirements. If the weight distribution of the neural network weight matrix is high, the shift direction in the data adjustment instruction is to the right, reducing the network weights; the reduced weights bring the analog voltage/current output of the memory cell array down into the appropriate range of the ADC, and the number of shift bits is likewise set by the designer according to the specific statistical requirements and hardware requirements.
• the array obtained after the shift operation is used as the standard matrix, and the neural network operation is performed by the memory cell array; the bits that overflow after the shift, together with the address of each weight value in the neural network weight matrix, are saved as an array in the buffer. This array can be a sparse matrix or a normal weight array;
• Step S2000: store the data in the buffer into a memory cell array, or input it into an arithmetic operation circuit. A sparse matrix is input into an arithmetic operation circuit to perform the operation; a normal weight array is written into a memory cell array for processing.
• the data adjustment instruction is generated when the weight distribution of the neural network weight matrix is uneven. The shift register is connected to a write module, and the write module is connected to another memory cell array, for writing the shifted data of the shift register into that other memory cell array.
• the neural network weight matrix writing control method uses a shift register to scale up or down the unevenly distributed neural network weight matrix during the weight writing process, so that the signal obtained after in-memory computation on the memory cell array storing the processed weight matrix falls within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the calculation accuracy.
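As a sketch of Step S1000 above, the fragment below models a left shift (weight amplification) within a fixed storage-cell bit width, capturing the overflowed bits with their addresses as a sparse array — the data that Step S2000 would later route to another memory cell array or an arithmetic operation circuit. The 8-bit cell width and the flat address scheme are illustrative assumptions, not values taken from the patent.

```python
def shift_and_capture(weights, shift_bits, cell_width=8):
    """Left-shift each weight within a cell_width-bit storage cell.

    Bits pushed out of the cell are recorded together with the weight's
    address, mimicking the buffer contents of Step S1000.
    """
    mask = (1 << cell_width) - 1
    shifted, overflow = [], []
    for addr, w in enumerate(weights):
        v = w << shift_bits
        shifted.append(v & mask)       # value written to the main cell array
        if v >> cell_width:            # non-zero overflow -> sparse entry
            overflow.append((addr, v >> cell_width))
    return shifted, overflow

shifted, overflow = shift_and_capture([3, 130, 7], shift_bits=1)
print(shifted)    # [6, 4, 14]
print(overflow)   # [(1, 1)] -- only the weight 130 overflowed its cell
```

A right shift for weight reduction would follow the same pattern, with the bits shifted out at the low end saved as the overflow.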
  • the neural network weight matrix writing control method further includes:
  • the ADC output result of the memory cell array is combined with the ADC output result of the other memory cell array.
• the combination method can superimpose the results of two memory cell arrays, or of a memory cell array and an arithmetic operation circuit; the superimposed result can also be further scaled down or up by a certain multiple, which can be selected according to parameters such as the quantization accuracy of the subsequent circuit.
  • the device explained in the above embodiments may be implemented by a computer chip or entity, or implemented by a product with a certain function.
  • a typical implementation device is an electronic device.
• the electronic device may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.
• the electronic device specifically includes a memory, a processor, and a computer program stored in the memory and runnable on the processor, and the processor implements the following steps when executing the program: determining whether the weight distribution of the neural network weight matrix is lower than a first preset threshold; if so, multiplying each weight value in the matrix by a first constant; if not, determining whether the weight distribution is higher than a second preset threshold, the second preset threshold being greater than the first; and if so, dividing each weight value by a second constant, wherein the first constant and the second constant are both greater than one.
• the electronic device provided by the embodiments of the present invention can be used for neural network weight matrix adjustment: by scaling up or down a neural network weight matrix whose overall distribution is uneven, the signal obtained after in-memory computation on the memory cell array storing the processed weight matrix falls within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the calculation accuracy.
  • FIG. 15 shows a schematic structural diagram of an electronic device 600 suitable for implementing the embodiments of the present application.
• the electronic device 600 includes a central processing unit (CPU)/MCU 601, which can execute various appropriate tasks and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage part 608 into a random access memory (RAM) 603.
• in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
• the following components are connected to the I/O interface 605: an input part 606 including a keyboard, a mouse, etc.; an output part 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, etc.; a storage part 608 including a hard disk, etc.; and a communication part 609 including a network interface card such as a LAN card, a modem, etc. The communication part 609 performs communication processing via a network such as the Internet.
• a drive 610 is also connected to the I/O interface 605 as needed.
• a removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 610 as needed, so that the computer program read from it can be installed into the storage part 608 as needed.
  • the process described above with reference to the flowchart can be implemented as a computer software program.
• the embodiment of the present invention includes a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the following steps are implemented: determining whether the weight distribution of the neural network weight matrix is lower than a first preset threshold; if so, multiplying each weight value in the matrix by a first constant; if not, determining whether the weight distribution is higher than a second preset threshold; and if so, dividing each weight value by a second constant, wherein the first constant and the second constant are both greater than one.
• the computer-readable storage medium provided by the embodiments of the present invention can be used for neural network weight matrix adjustment, so that the signal obtained after in-memory computation on the memory cell array storing the processed weight matrix falls within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the calculation accuracy.
  • the computer program may be downloaded and installed from the network through the communication part 609, and/or installed from the removable medium 611.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
• Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by computing devices. According to the definition herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
• These computer program instructions can also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
• These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • this application can be provided as a method, a system, or a computer program product. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • This application may be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • This application can also be practiced in distributed computing environments. In these distributed computing environments, tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in local and remote computer storage media including storage devices.

Abstract

A neural network weight matrix adjustment method, a writing control method, and a related device, suitable for processing a neural network weight matrix before performing neural network operation by a computing-in-memory chip. The method comprises: determining whether weight distribution of a neural network weight matrix is lower than a first preset threshold; if yes, multiplying each weight value in the neural network weight matrix by a first constant; if not, determining whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, wherein the second preset threshold is higher than the first preset threshold; and if the weight distribution of the neural network weight matrix is higher than the second preset threshold, dividing each weight value in the neural network weight matrix by a second constant, wherein the first constant and the second constant are both greater than 1. Thus, a signal obtained after computing-in-memory is performed in a memory cell array storing the processed weight matrix is within the effective range of an ADC. The computing accuracy is improved.

Description

Neural network weight matrix adjustment method, writing control method, and related device

Technical field
The present invention relates to the field of artificial intelligence technology, and in particular to a neural network weight matrix adjustment method, a writing control method, and related devices.
Background
In order to overcome the bottleneck of the traditional von Neumann computing architecture, computing-in-memory (CIM) chips have received widespread attention. Their basic idea is to perform logic computation directly in the memory, thereby reducing the volume and distance of data transfers between memory and processor, lowering power consumption while improving performance.
Owing to the integrated storage-and-computation characteristic of in-memory computing chips, they are well suited to neural network computing scenarios. The weight matrix of a trained neural network algorithm is written in advance into the memory cell array of the in-memory computing chip; the signals to be processed are input in parallel, and, based on Ohm's law and Kirchhoff's law, the vector-matrix multiply-accumulate operation between the signals and the corresponding weights is performed directly in the memory cell array. The output voltage/current signal of the memory cell array is quantized by an ADC (analog-to-digital converter) and used as the output result.
In practical applications, if the weight distribution of the neural network algorithm is too small or too large (see Fig. 1 or Fig. 2, where each circle represents a memory cell, the number inside it represents the pre-stored weight value, and the horizontal direction represents the row inputs, i.e. the input signals: in Fig. 1 the input of the first row is 7, that of the second row is 5, and that of the third row is 3; the downward arrows represent the outputs, e.g. the output of the first column is 7, that of the second column is 7, and that of the third column is 248; the output signal of each column is fed to an ADC that converts the column's analog output into a digital signal for subsequent use; Fig. 1 shows the case where the output voltage/current of the first and second columns is too small and falls below the ADC's lower range limit, while Fig. 2 shows the case where the output voltage/current of all three columns of memory cells is too large and exceeds the ADC's upper range limit), or if the input signals are too small or too large, the analog voltage/current output of the memory cell array may be too small or too large and exceed the lower or upper range limit of the ADC. An ADC usually has the highest quantization accuracy for mid-range values and poorer accuracy near the two ends of its range; when the ADC input exceeds the lower or upper range limit, the corresponding output is directly truncated to the minimum or maximum value, thereby reducing the operation accuracy.
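The truncation effect described above can be illustrated with a small numerical sketch. The weight matrix below is hypothetical, chosen only so that the column sums reproduce the outputs 7, 7, and 248 quoted for Fig. 1; the ADC range limits (16 and 240) are likewise illustrative assumptions, not values from the patent.

```python
import numpy as np

def adc_quantize(column_current, adc_min=16, adc_max=240):
    # Model an ADC whose input is truncated to the minimum or maximum
    # value when it falls outside the effective range.
    return int(np.clip(column_current, adc_min, adc_max))

inputs = np.array([7, 5, 3])          # row inputs, as in Fig. 1
weights = np.array([[1, 1, 29],       # hypothetical stored weight values
                    [0, 0, 9],
                    [0, 0, 0]])

column_currents = (inputs @ weights).tolist()   # analog MAC per column
quantized = [adc_quantize(c) for c in column_currents]
print(column_currents)   # [7, 7, 248]
print(quantized)         # [16, 16, 240] -- every column is truncated
```

Both small columns are clipped up to the lower limit and the large column is clipped down to the upper limit, which is exactly the accuracy loss the invention sets out to avoid.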
Summary of the invention
In view of the problems in the prior art, the present invention provides a neural network weight matrix adjustment method, a writing control method, and related devices, electronic equipment, and computer-readable storage media, which can at least partially solve the problems existing in the prior art.
In order to achieve the above objectives, the present invention adopts the following technical solutions:
In a first aspect, a neural network weight matrix adjustment method is provided, including:
determining whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;

if so, multiplying each weight value in the neural network weight matrix by a first constant;

if not, determining whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;

if the weight distribution of the neural network weight matrix is higher than the second preset threshold, dividing each weight value in the neural network weight matrix by a second constant;

wherein the first constant and the second constant are both greater than 1.
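The first-aspect method can be sketched as follows. Using the mean as the weight-distribution statistic is one of the options the text mentions; all numeric values below are illustrative choices, since the text only requires the statistic, two thresholds, and two constants greater than 1.

```python
import numpy as np

def adjust_weight_matrix(w, low_th, high_th, c1, c2):
    """Rescale the whole matrix when its weight-distribution statistic
    falls below the first threshold or above the second threshold."""
    assert c1 > 1 and c2 > 1 and high_th > low_th
    stat = w.mean()                # distribution statistic (here: mean)
    if stat < low_th:
        return w * c1              # weights too small: amplify
    if stat > high_th:
        return w / c2              # weights too large: reduce
    return w                       # distribution already adequate

small = np.array([[1.0, 2.0], [1.0, 2.0]])   # mean 1.5, below low_th
out = adjust_weight_matrix(small, low_th=4.0, high_th=64.0, c1=4.0, c2=4.0)
print(out)   # every weight multiplied by c1 = 4
```

Scaling the whole matrix by a constant preserves the relative weight values, so the network's output is simply scaled by the same factor and can be rescaled after quantization.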
Further, after multiplying each weight value in the neural network weight matrix by the first constant, the method further includes:

determining whether the number of bits of each weight value in the processed weight matrix exceeds a third preset threshold;

if so, truncating the bits of each weight value that exceed the third preset threshold to obtain a first weight array and a second weight array;

wherein the first weight array is the array remaining after the bits exceeding the third preset threshold are truncated from each weight value, and is used for storing into a memory cell array; the second weight array is the array formed by the truncated bits that exceed the third preset threshold, and is used for storing into another memory cell array or for input into an arithmetic operation circuit.
Further, the neural network weight matrix adjustment method further includes:

if the number of bits of none of the weight values in the processed weight matrix exceeds the third preset threshold, using the processed weight array for storage into a memory cell array.
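A sketch of the overflow handling just described, assuming the "third preset threshold" corresponds to the bit width of a storage cell (8 bits here, an assumption for illustration): weights that fit are stored as-is; otherwise each weight is split into a low part (the first weight array) and a truncated high part (the second weight array).

```python
def split_on_overflow(weights, cell_width=8):
    """If no weight exceeds the cell bit width, store the array directly;
    otherwise split into a first array (low bits, main cell array) and a
    second array (truncated high bits, for another array or an
    arithmetic operation circuit)."""
    mask = (1 << cell_width) - 1
    if all(w <= mask for w in weights):
        return weights, None       # nothing overflows: store as-is
    first = [w & mask for w in weights]        # low bits
    second = [w >> cell_width for w in weights]  # truncated high bits
    return first, second

first, second = split_on_overflow([300, 40, 260])
print(first)    # [44, 40, 4]
print(second)   # [1, 0, 1] -- each original value is second*2**8 + first
```

The split is lossless: the original weight is recovered as `second[i] * 2**cell_width + first[i]`, which is what the later result-combination step exploits.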
Further, the neural network weight matrix adjustment method further includes:

when the second weight array is stored into another memory cell array, combining the ADC output result of the memory cell array corresponding to the first weight array with the ADC output result of the memory cell array corresponding to the second weight array;

when the second weight array is input into the arithmetic operation circuit, combining the ADC output result of the memory cell array corresponding to the first weight array with the output result of the arithmetic operation circuit.
Further, dividing each weight value in the neural network weight matrix by a second constant includes:

dividing each weight value by the second constant to obtain a third weight array;

saving the bits that overflow after each weight value is divided by the second constant as a fourth weight array;

wherein the third weight array is used for storing into a memory cell array, and the fourth weight array is used for storing into another memory cell array or for input into an arithmetic operation circuit.
Further, the neural network weight matrix adjustment method further includes:

when the fourth weight array is stored into another memory cell array, combining the ADC output result of the memory cell array corresponding to the third weight array with the ADC output result of the memory cell array corresponding to the fourth weight array;

when the fourth weight array is input into the arithmetic operation circuit, combining the ADC output result of the memory cell array corresponding to the third weight array with the output result of the arithmetic operation circuit.
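One plausible reading of the result combination: when a weight has been split into low and high bit parts, the partial result for the high-bit array must be weighted by 2**cell_width before superposition, and the superimposed result may then be scaled to suit the quantization precision of the following circuit. The scaling scheme below is an assumption for illustration, not a scheme fixed by the text.

```python
def combine_adc_results(low_result, high_result, cell_width=8, rescale=1.0):
    # Superimpose the two partial results, weighting the high-bit part
    # by 2**cell_width, then apply an optional final scale factor.
    return (low_result + (high_result << cell_width)) * rescale

# With a weight split as 300 = 1*2**8 + 44 and an input of 1, the two
# partial multiply-accumulate results recombine to the unsplit value.
combined = combine_adc_results(low_result=44, high_result=1)
print(combined)   # 300.0
```

The `rescale` factor corresponds to the text's option of scaling the superimposed result up or down by a certain multiple for the subsequent circuit.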
In a second aspect, a neural network weight matrix writing control method is provided, including:

controlling, according to a data adjustment instruction, a shift register to shift each weight value input to it, and storing, in a buffer, the bits that overflow after the shift together with the address of the weight value in the neural network weight matrix, where the data adjustment instruction includes a shift direction and a number of shift bits;

storing the data in the buffer into a memory cell array or inputting it into an arithmetic operation circuit;

wherein the data adjustment instruction is generated when the weight distribution of the neural network weight matrix is uneven; the shift register is connected to a write module, and the write module is connected to another memory cell array for writing the shifted data of the shift register into that other memory cell array.
Further, the neural network weight matrix writing control method further includes:

when the data in the buffer is input into the arithmetic operation circuit, combining the ADC output result of the other memory cell array with the output result of the arithmetic operation circuit;

when the data in the buffer is stored into a memory cell array, combining the ADC output result of that memory cell array with the ADC output result of the other memory cell array.
In a third aspect, a neural network weight matrix adjustment device is provided, including:

a first judgment module, which determines whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;

a weight amplification module, which, if the weight distribution of the neural network weight matrix is lower than the first preset threshold, multiplies each weight value in the neural network weight matrix by a first constant;

a second judgment module, which, if the weight distribution of the neural network weight matrix is not lower than the first preset threshold, determines whether the weight distribution is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;

a weight reduction module, which, if the weight distribution of the neural network weight matrix is higher than the second preset threshold, divides each weight value in the neural network weight matrix by a second constant;

wherein the first constant and the second constant are both greater than 1.
In a fourth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, the processor implementing the steps of the above neural network weight matrix adjustment method when executing the program.
In a fifth aspect, a computer-readable storage medium is provided, on which a computer program is stored, the computer program, when executed by a processor, implementing the steps of the above neural network weight matrix adjustment method.
The neural network weight matrix adjustment method, writing control method, and related devices, electronic equipment, and computer-readable storage media provided by the present invention are suitable for processing a trained neural network weight matrix before a neural network operation is performed by an in-memory computing chip. The method includes: determining whether the weight distribution of the neural network weight matrix is lower than a first preset threshold; if so, multiplying each weight value in the matrix by a first constant; if not, determining whether the weight distribution is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold; and, if the weight distribution is higher than the second preset threshold, dividing each weight value in the matrix by a second constant, where the first constant and the second constant are both greater than 1. By scaling up or down a neural network weight matrix whose overall distribution is uneven, the signal obtained after in-memory computation on the memory cell array storing the processed weight matrix falls within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving the operation accuracy.
In order to make the above and other objectives, features, and advantages of the present invention more comprehensible, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Brief description of the drawings
In order to describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work. In the drawings:
Fig. 1 shows the case where, when a matrix operation is performed in the memory cell array of an in-memory computing chip, the output current/voltage of a memory cell column is too small and falls below the lower range limit of the ADC;

Fig. 2 shows the case where, when a matrix operation is performed in the memory cell array of an in-memory computing chip, the output current/voltage of a memory cell column is too large and exceeds the upper range limit of the ADC;

Fig. 3 shows an application scenario of the neural network weight matrix adjustment method provided by an embodiment of the present invention;

Fig. 4 is a first schematic flowchart of the neural network weight matrix adjustment method in an embodiment of the present invention;

Fig. 5 shows a schematic diagram of the principle of the neural network weight matrix adjustment method provided by an embodiment of the present invention;

Fig. 6 shows the case where a weight array adjusted by the neural network weight matrix adjustment method provided by an embodiment of the present invention contains weight values whose number of bits overflows;

Fig. 7 is a second schematic flowchart of the neural network weight matrix adjustment method in an embodiment of the present invention;

Fig. 8 shows one schematic diagram of splitting the weight array after amplifying it in an embodiment of the present invention;

Fig. 9 shows another schematic diagram of splitting the weight array after amplifying it in an embodiment of the present invention;

Fig. 10 shows the specific steps of step S400 in an embodiment of the present invention;

Fig. 11 shows a schematic diagram of splitting the weight array after reducing it in an embodiment of the present invention;

Fig. 12 is a structural block diagram of the neural network weight matrix adjustment device in an embodiment of the present invention;

Fig. 13 shows an application scenario of the neural network weight matrix writing control method provided by an embodiment of the present invention;

Fig. 14 is a schematic flowchart of the neural network weight matrix writing control method in an embodiment of the present invention;

Fig. 15 is a structural diagram of an electronic device according to an embodiment of the present invention.
Detailed description of the embodiments
In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative work shall fall within the protection scope of the present application.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
It should be noted that the terms "including" and "having" in the specification and claims of the present application and in the above drawings, as well as any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 3 shows an application scenario of the neural network weight matrix adjustment method provided by an embodiment of the present invention. As shown in Fig. 3, compiling software 1 is connected to the programming circuit in computing-in-memory chip 2 and is used to write the neural network weight matrix, after it has been adjusted by the neural network weight matrix adjustment method provided by the embodiment of the present invention, into the memory cell array in computing-in-memory chip 2 through the programming circuit. When the neural network is applied to process input data, the input data stream is transmitted to the memory cell array after certain preprocessing, where it undergoes neural network operations with the neural network weight matrix pre-written into the memory cell array; the output data stream of the memory cell array is converted into a digital signal by the ADC module, and the operation result is output.
It is worth noting that Fig. 3 only exemplarily lists a few circuit modules in computing-in-memory chip 2. Those skilled in the art will understand that computing-in-memory chip 2 may also be provided with related functional circuits such as registers and post-processing modules. In addition, those skilled in the art will understand that the programming circuit may also be called a write module or a read/write module, and is used to program, i.e., write data into, each memory cell in the memory cell array (each memory cell may be implemented with a programmable semiconductor device, such as a floating-gate MOS transistor).
The compiling software may be compilation processing software that has already been developed or has yet to be developed, or it may be a programmed computer program; it may be executed on a computer device, on a processing chip, or on a mobile portable device, and the embodiments of the present invention place no restriction on this.
In addition, for a computing-in-memory chip intended for applications requiring repeated programming, a programming circuit may be provided inside the chip; for applications that do not require adjustment of the neural network weight matrix, in order to reduce the chip area, the programming circuit may be omitted, and the adjusted final neural network weight matrix is instead written into the computing-in-memory chip in advance at the factory by a programming device.
Fig. 4 is a first flowchart of the neural network weight matrix adjustment method in an embodiment of the present invention. As shown in Fig. 4, the neural network weight matrix adjustment method may include the following steps:
Step S100: determine whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;
if yes, perform step S200; if no, perform step S300.
Here, the neural network weight matrix is a weight matrix that has been trained in the neural network training stage, and the weight distribution may be a statistical index such as a mean value or a probability distribution, which can be set by the designer according to the actual range of the ADC. The first preset threshold is set by the designer according to specific statistical requirements and hardware requirements.
Step S200: multiply all weight values in the neural network weight matrix by a first constant;
that is, multiply all weight values in the neural network weight matrix by a constant N to increase the network weights. After the increased weight matrix is written into the memory cell array, then during application, for the same input data stream, the output analog current of the memory cell array is increased by the same factor.
The constant N is usually set to an integer power of 2, in which case the multiplication is equivalent to a bit shift.
For example, referring to Fig. 5, for a binary weight matrix, the original neural network weight matrix is multiplied by 8, i.e., each binary number is shifted left by 3 bits, to obtain the amplified weight matrix.
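The multiply-by-shift equivalence in the Fig. 5 example can be sketched in a few lines. This is a minimal illustration assuming integer weights held in a NumPy array; the function name and the ×8 factor are illustrative, not part of the chip's actual programming flow.

```python
import numpy as np

def amplify_weights(weights, factor=8):
    """Multiply every weight by a power-of-two factor.

    For integer weights this is identical to a left shift:
    w * 8 == w << 3, so the compiling software can realize
    the amplification as a pure bit shift.
    """
    assert factor & (factor - 1) == 0, "factor must be a power of 2"
    shift = factor.bit_length() - 1   # 8 -> 3
    return weights << shift           # same result as weights * factor

w = np.array([[1, 3], [2, 5]], dtype=np.int32)
amplified = amplify_weights(w, factor=8)
# each entry is eight times the original value
```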
Step S300: determine whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
if yes, perform step S400; if no, perform step S500.
The second preset threshold is set by the designer according to specific statistical requirements and hardware requirements.
Step S400: divide all weight values in the neural network weight matrix by a second constant;
here, the first constant and the second constant are both greater than 1, and the first constant and the second constant may be the same or different. It is worth noting that multiplying by 2 and dividing by 1/2 yield the same result; therefore, the expressions "multiply" and "divide" do not limit the scope of protection claimed by the present invention, and any identical or equivalent operation shall be included in the scope of protection of the present invention.
That is, if the weight distribution is higher than the second preset threshold, then after the weight matrix is written into the memory cell array, the analog voltage/current output by the memory cell array in the application stage may exceed the quantization upper limit of the ADC. In this case, all weights are divided by a constant M (M is usually an integer power of 2, equivalent to a right shift) to reduce the network weights, so that in the application stage, for the same input data stream, the reduced network weights bring the analog voltage/current output by the memory cell array down into a suitable range of the ADC.
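The decision flow of steps S100 through S500 can be sketched as follows. As a stated assumption, the mean absolute weight is used here as the "weight distribution" statistic (the text allows any statistical index), and the threshold values and the scaling constants n and m are purely illustrative.

```python
import numpy as np

def adjust_weight_matrix(w, low_thresh, high_thresh, n=8, m=8):
    """Sketch of the Fig. 4 flow (steps S100-S500).

    low_thresh/high_thresh play the roles of the first and second
    preset thresholds; n and m are the first and second constants.
    """
    stat = np.abs(w).mean()
    if stat < low_thresh:       # S100 -> S200: weights too small, amplify
        return w * n
    if stat > high_thresh:      # S300 -> S400: weights too large, reduce
        return w / m
    return w                    # S500: write into the array as-is

w = np.array([[1.0, 2.0], [3.0, 2.0]])              # mean |w| = 2.0
amplified = adjust_weight_matrix(w, 4.0, 100.0)     # 2.0 < 4.0 -> * 8
reduced = adjust_weight_matrix(w, 0.5, 1.0)         # 2.0 > 1.0 -> / 8
unchanged = adjust_weight_matrix(w, 0.5, 100.0)     # in range -> unchanged
```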
Step S500: store the neural network weight matrix into a memory cell array.
The neural network weight matrix in this step refers to the neural network weight matrix of step S100 (if its weight distribution is neither lower than the first preset threshold nor higher than the second preset threshold), or the weight matrix obtained after processing in step S200, or the weight matrix obtained after processing in step S400.
Specifically, the weight matrix is stored in a memory cell array. In the application stage, an input data stream is fed into the memory cell array, where it undergoes an analog vector-matrix multiplication with the weight array; the operation result is transmitted, in the form of the analog voltage/current output by the memory cell array, to the ADC behind the memory cell array, which converts the analog voltage/current into a digital signal.
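The effect the adjustment is protecting against can be seen in a small behavioral model of one array-plus-ADC stage. The linear ADC transfer function, its bit width, and its saturation range below are illustrative assumptions; a real converter's characteristics would come from the hardware design.

```python
import numpy as np

def in_memory_layer(x, w, adc_max, adc_bits=8):
    """Behavioral model: memory cell array followed by an ADC.

    The array computes the analog vector-matrix product x @ w; the
    ADC quantizes each column output to adc_bits levels, clipping
    anything above adc_max. Clipping (saturation) and coarse codes
    are exactly the precision losses the weight scaling avoids.
    """
    analog = x @ w                                # analog VMM in the array
    levels = 2 ** adc_bits - 1
    clipped = np.clip(analog, 0, adc_max)         # ADC saturation
    return np.round(clipped / adc_max * levels)   # digital output codes

x = np.array([1.0, 2.0])
w = np.array([[0.1, 3.0], [0.2, 4.0]])
codes = in_memory_layer(x, w, adc_max=10.0)
# first column (analog 0.5) uses only a few of the 255 codes, so its
# result is coarsely quantized; second column (analog 11.0) saturates
```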
In summary, the neural network weight matrix adjustment method provided by the embodiment of the present invention scales up or scales down, by a fixed factor, a neural network weight matrix whose overall distribution is uneven, so that after the processed weight matrix is stored in the memory cell array and in-memory computation is performed, the resulting signal lies within the effective range of the ADC (the ADC is arranged behind the memory cell array and is used to convert the output of each memory cell column into a digital signal), thereby improving the operation precision.
It is worth noting that, for step S200 in which all weight values in the neural network weight matrix are multiplied by a first constant to increase the network weights, ideally all weight values in the increased weight array satisfy the requirements. However, those skilled in the art will understand that for an array whose weight distribution is relatively uneven, some of the larger weight values in the increased weight array may exceed the upper bit limit (this can also be described as overflow bits existing after the shift). For example, referring to Fig. 6, assuming that the weight precision is 8 bits, after multiplication by the constant 8 (a left shift of 3 bits), the weight value in the third row and second column will exceed the upper bit limit. For this case, referring to Fig. 7, the neural network weight matrix adjustment method provided by the embodiment of the present invention may further include:
Step S600: determine whether the number of bits of each weight value in the processed weight matrix exceeds a third preset threshold;
if yes, perform step S700; if no, perform step S500.
The third preset threshold may be the precision in bits of the algorithm, such as 8 bits or 16 bits, and is set by the designer according to specific statistical requirements and hardware requirements.
Step S700: truncate, from each weight value, the bits exceeding the third preset threshold, to obtain a first weight array and a second weight array;
here, the first weight array (which may also be called the standard matrix) is the weight array remaining after the bits of each weight value that exceed the third preset threshold have been truncated; it is stored in the memory cell array of the computing-in-memory chip to perform analog vector-matrix multiplication. The second weight array is the weight array formed by the truncated bits of each weight value that exceed the third preset threshold; it is stored in another memory cell array or input to an arithmetic operation circuit. It is worth noting that the second weight array may be a sparse matrix or an ordinary weight matrix.
For a sparse matrix, the sparse matrix may be input into an arithmetic operation circuit to perform the operation; one input of the operation is an element of the sparse matrix, and the other input is the input data corresponding to the weight value to which that element corresponds in the above input data stream.
For example, the arithmetic operation circuit may be a conventional digital circuit such as a multiplier, or it may be a CPU; specifically, the sparse matrix may first be stored in a memory and then loaded from the memory into the CPU for multiplication.
When the second weight array is an ordinary weight array, it may be stored in another memory cell array, and the other memory cell array performs analog vector-matrix multiplication on the second weight array and the input data stream.
It is worth noting that when performing the truncation, the addresses of the data need to be recorded correspondingly, so that when the truncated matrix is operated on, the corresponding input data can be selected.
Fig. 8 shows a schematic diagram of splitting the weight array after amplifying it in an embodiment of the present invention. As shown in Fig. 8, corresponding to the bit-overflow situation shown in Fig. 6, the portion above 8 bits is truncated and placed into a new weight matrix, which is a sparse matrix. The standard matrix is still processed in the computing-in-memory manner (vector-matrix multiplication), while the sparse matrix portion can be processed by a conventional digital circuit; finally, combining the two gives the final output.
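The split and recombination of Fig. 8 can be sketched numerically. This is an illustrative model only: the 8-bit width matches the Fig. 6 example, and the recombination by summing the two paths' partial products (with the overflow path weighted by 2^bits) is one possible realization of "combining the two".

```python
import numpy as np

def split_on_overflow(w, bits=8):
    """Split an amplified weight matrix as in Fig. 8.

    `standard` keeps the low `bits` bits of each weight and stays in
    the memory cell array; `overflow` holds the truncated high bits
    (shifted back down) and is typically sparse. By construction,
    w == standard + overflow * 2**bits.
    """
    mask = (1 << bits) - 1
    standard = w & mask     # processed by in-memory VMM
    overflow = w >> bits    # processed by a digital circuit / CPU
    return standard, overflow

w = np.array([[300, 7], [12, 513]])   # two entries exceed 8 bits
standard, overflow = split_on_overflow(w)

# recombination: both paths see the same input data stream, and the
# overflow path's partial product is scaled by 2**bits before summing
x = np.array([1, 2])
combined = x @ standard + (x @ overflow) * 256
assert (combined == x @ w).all()      # split + recombine is lossless
```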
Fig. 8 explains the principle of the embodiment of the present invention step by step. In actual implementation, in order to effectively simplify the procedure, referring to Fig. 9, the overflow bits may be stored directly into another matrix while the weights are being increased, without distinguishing between the amplification step and the splitting step.
In an optional embodiment, the adjustment method may further include:
when the second weight array is stored in another memory cell array, combining the ADC output result behind the memory cell array corresponding to the first weight array with the ADC output result behind the memory cell array corresponding to the second weight array;
when the second weight array is input to the arithmetic operation circuit, combining the ADC output result behind the memory cell array corresponding to the first weight array with the output result of the arithmetic operation circuit.
Those skilled in the art will understand that the combination may be a superposition of the results of the first weight array and the second weight array, or the superposed result may additionally be reduced by a certain factor; the specific choice may be made according to parameters such as the quantization precision of the subsequent circuit.
In an optional embodiment, for step S400 in which all weight values in the neural network weight matrix are divided by a second constant to reduce the network weights, ideally all weight values in the reduced weight array satisfy the requirements. However, those skilled in the art will understand that for an array whose weight distribution is relatively uneven, some weight values in the reduced weight array may produce overflow bits after being shifted right. For this case, referring to Fig. 10, step S400 may include the following:
Step S410: divide each weight value by the second constant to obtain a third weight array;
Step S420: save the overflow bits produced by dividing each weight value by the second constant as a fourth weight array;
here, the third weight array is stored in a memory cell array to perform analog vector-matrix multiplication, and the fourth weight array is stored in another memory cell array or input to an arithmetic operation circuit. For details of the operation on the fourth weight array, refer to the operation on the second weight array, which will not be repeated here.
Specifically, dividing each weight value by the second constant is equivalent to shifting it right, and the overflow bits are saved as the fourth weight array.
It is worth noting that when performing the shift, or when saving the overflow bits as a matrix, the addresses of the data (i.e., their addresses in the matrix) need to be recorded correspondingly, so that when the truncated matrix is operated on, the corresponding input data can be selected.
Fig. 11 shows a schematic diagram of splitting the weight array after reducing it in an embodiment of the present invention. As shown in Fig. 11, dividing each weight value in the original neural network weight matrix by 8 is equivalent to shifting each binary weight value right by 3 bits, which yields a standard matrix; the overflow bits are saved as an overflow matrix. Since neither the standard matrix nor the overflow matrix obtained after the shift in the figure is a sparse matrix, two memory cell arrays may be used to process the standard matrix and the overflow matrix respectively; the input data streams of the two memory cell arrays are identical, and after the outputs of the two memory cell arrays are combined, they are input to the ADC for conversion.
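The right-shift split of Fig. 11 can be sketched the same way as the amplification case. As an illustrative assumption, the shift amount 3 (division by 8) follows the figure's example, and lossless recombination is shown by weighting the standard path by 2^shift; in hardware the two arrays' analog outputs would be combined before or at the ADC.

```python
import numpy as np

def split_on_right_shift(w, shift=3):
    """Split a weight matrix reduced by a right shift, as in Fig. 11.

    `standard` holds w >> shift (w divided by 8); `remainder` holds
    the bits shifted out (the 'overflow matrix'). By construction,
    w == standard * 2**shift + remainder.
    """
    standard = w >> shift
    remainder = w & ((1 << shift) - 1)
    return standard, remainder

w = np.array([[25, 9], [14, 40]])
standard, remainder = split_on_right_shift(w)

# both memory cell arrays see the same input data stream
x = np.array([2, 1])
combined = (x @ standard) * 8 + x @ remainder
assert (combined == x @ w).all()   # no information is lost in the split
```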
It is worth noting that if the weight distribution of a new matrix produced by the adjustment (e.g., the standard matrix, sparse matrix, or overflow matrix shown in Fig. 8, Fig. 9, or Fig. 11) is still uneven, the adjustment method provided by the embodiments of the present invention may be applied again until the weight distribution of the matrix is suitable; for example, if the weights of a newly produced matrix are too small, it may be multiplied by a further constant to adjust it to a suitable distribution.
In a further embodiment, the neural network weight matrix adjustment method further includes:
when the fourth weight array is stored in another memory cell array, combining the ADC output result behind the memory cell array corresponding to the third weight array with the ADC output result behind the memory cell array corresponding to the fourth weight array;
when the fourth weight array is input to the arithmetic operation circuit, combining the ADC output result behind the memory cell array corresponding to the third weight array with the output result of the arithmetic operation circuit.
Those skilled in the art will understand that the combination may be a superposition of the results of the third weight array and the fourth weight array, or the superposed result may additionally be enlarged by a certain factor; the specific choice may be made according to parameters such as the quantization precision of the subsequent circuit.
In summary, the neural network weight matrix adjustment method provided by the embodiment of the present invention scales up or scales down, by a fixed factor, a neural network weight matrix whose overall distribution is uneven, so that after the processed weight matrix is stored in the memory cell array and in-memory computation is performed, the resulting signal lies within the effective range of the ADC (the ADC is arranged behind the memory cell array and is used to convert the output of each memory cell column into a digital signal), thereby improving the operation precision.
Based on the same inventive concept, an embodiment of the present application also provides a neural network weight matrix adjustment device, which can be used to implement the method described in the above embodiments, as described in the following embodiment. Since the principle by which the neural network weight matrix adjustment device solves the problem is similar to the above method, the implementation of the device may refer to the implementation of the above method, and repeated points will not be described again. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and conceived.
Fig. 12 is a structural block diagram of the neural network weight matrix adjustment device in an embodiment of the present invention. As shown in Fig. 12, the device may include: a first judgment module 10, a weight amplification module 20, a second judgment module 30, and a weight reduction module 40.
The first judgment module 10 determines whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;
the weight amplification module 20 multiplies all weight values in the neural network weight matrix by a first constant if the weight distribution of the neural network weight matrix is lower than the first preset threshold;
the second judgment module 30 determines, if the weight distribution of the neural network weight matrix is not lower than the first preset threshold, whether the weight distribution is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
the weight reduction module 40 divides all weight values in the neural network weight matrix by a second constant if the weight distribution of the neural network weight matrix is higher than the second preset threshold;
where the first constant and the second constant are both greater than 1.
An embodiment of the present invention further provides a neural network weight matrix writing control method. Fig. 13 shows an application scenario of the neural network weight matrix writing control method provided by an embodiment of the present invention. As shown in Fig. 13, in this scenario, the weights to be written are input to the shift register in computing-in-memory chip 2'; after the shift register shifts the weights, they are written by the write module into the corresponding memory cells of the memory cell array. When the neural network is applied to process input data, the input data stream is transmitted to the memory cell array after certain preprocessing, where it undergoes neural network operations with the neural network weight matrix pre-written into the memory cell array; the output data stream of the memory cell array is converted into a digital signal by the ADC module, and the operation result is output.
It is worth noting that Fig. 13 only exemplarily lists a few circuit modules in computing-in-memory chip 2'. Those skilled in the art will understand that computing-in-memory chip 2' may also be provided with related functional circuits such as registers and post-processing modules. In addition, each memory cell in the memory cell array may be implemented with a programmable semiconductor device, such as a floating-gate MOS transistor.
In addition, the shift register is also connected to an off-chip buffer 1' for buffering data.
The control module executes the neural network weight matrix writing control method provided by the embodiment of the present invention. Referring to Fig. 14, the neural network weight matrix writing control method may include the following:
Step S1000: control, according to a data adjustment instruction, the shift register to perform a shift operation on each weight value input to it, and store, into a buffer, the bits that overflow after the shift register shifts an input weight value, together with the address of that weight value in the neural network weight matrix;
where the data adjustment instruction includes: a shift direction and a number of shift bits.
Specifically, it may be determined in advance whether the weight distribution of the neural network weight matrix is lower than a first preset threshold or higher than a second preset threshold, and if so, the data adjustment instruction is generated.
In addition, the neural network weight matrix is a weight matrix trained in the neural network training stage, and the weight distribution may be a statistical index such as a mean value or a probability distribution. The first preset threshold and the second preset threshold are set by the designer according to specific statistical requirements and hardware requirements, where the second preset threshold is greater than the first preset threshold.
If the weight distribution of the neural network weight matrix is lower than the first preset threshold, the shift direction in the data adjustment instruction is leftward, so as to increase the network weights; after the increased weight matrix is written into the memory cell array, for the same input data stream during application, the output analog current of the memory cell array is increased by the same factor, and the number of shift bits is set by the designer according to specific statistical requirements and hardware requirements. If the weight distribution of the neural network weight matrix is higher than the second preset threshold, the shift direction in the data adjustment instruction is rightward, so as to reduce the network weights; then, in the application stage, for the same input data stream, the reduced network weights bring the analog voltage/current output by the memory cell array down into a suitable range of the ADC, and the number of shift bits is likewise set by the designer according to specific statistical requirements and hardware requirements.
It is worth noting that the array obtained after the shift operation serves as the standard matrix, on which the memory cell array performs the neural network operation; the bits that overflow after the shift, together with the addresses of the corresponding weight values in the neural network weight matrix, are saved as a separate array in the buffer, and this array may be a sparse matrix or an ordinary weight array.
Step S2000: store the data in the buffer into a memory cell array or input it to an arithmetic operation circuit;
specifically, when the data in the buffer is a sparse matrix, the sparse matrix is input to an arithmetic operation circuit to perform the operation; when the data in the buffer is an ordinary weight matrix, it is input to a memory cell array for processing. For details, refer to the processing of the second weight array, which will not be repeated here.
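The per-weight behavior of step S1000 can be modeled as follows. The register width, the direction encoding, and the (overflow value, address) record format are illustrative assumptions; the text only requires that the overflow bits and the weight's address in the matrix be buffered.

```python
def shift_and_capture(weight, direction, shift, bits=8, addr=None):
    """Model of the step-S1000 shift register behavior.

    A weight is shifted left or right by `shift` bits inside a
    `bits`-wide register; the bits that fall off either end are
    returned together with the weight's address in the matrix so
    they can be stored in the buffer.
    """
    mask = (1 << bits) - 1
    if direction == "left":
        shifted = (weight << shift) & mask
        overflow = weight >> (bits - shift)       # bits pushed out the top
    else:
        shifted = weight >> shift
        overflow = weight & ((1 << shift) - 1)    # bits pushed out the bottom
    record = (overflow, addr) if overflow else None
    return shifted, record

# left shift by 3 in an 8-bit register: 0b01100001 loses its top 3 bits
shifted, rec = shift_and_capture(0b01100001, "left", 3, addr=(2, 1))
assert shifted == 0b00001000
assert rec == (0b011, (2, 1))   # overflow bits plus matrix address
```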
Here, the data adjustment instruction is generated when the weight distribution of the neural network weight matrix is uneven; the shift register is connected to a write module, and the write module is connected to another memory cell array and is used to write the data shifted by the shift register into the other memory cell array.
The neural network weight matrix writing control method provided by the embodiment of the present invention uses a shift register to scale up or scale down, by a fixed factor during the weight writing process, a neural network weight matrix whose overall distribution is uneven, so that after the processed weight matrix is stored in the memory cell array and in-memory computation is performed, the resulting signal lies within the effective range of the ADC (the ADC is arranged behind the memory cell array and is used to convert the output of each memory cell column into a digital signal), thereby improving the operation precision.
In an optional embodiment, the neural network weight matrix writing control method further includes:
when the data in the buffer is input to the arithmetic operation circuit, combining the ADC output result behind the other memory cell array with the output result of the arithmetic operation circuit;
when the data in the buffer is stored in a memory cell array, combining the ADC output result behind that memory cell array with the ADC output result behind the other memory cell array.
Those skilled in the art will understand that the combination may be a superposition of the results of the two memory cell arrays, or of one memory cell array and one arithmetic operation circuit, and the superposed result may additionally be reduced or enlarged by a certain factor; the specific choice may be made according to parameters such as the quantization precision of the subsequent circuit.
The apparatus set forth in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is an electronic device; specifically, the electronic device may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or any combination of these devices.
In a typical example, the electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the following steps when executing the program:
determining whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;
if so, multiplying all weight values in the neural network weight matrix by a first constant;
if not, determining whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
if the weight distribution of the neural network weight matrix is higher than the second preset threshold, dividing all weight values in the neural network weight matrix by a second constant;
where the first constant and the second constant are both greater than 1.
As can be seen from the above description, the electronic device provided by the embodiments of the present invention can be used for neural network weight matrix adjustment: a weight matrix whose overall distribution is uneven is scaled up or down by a constant factor so that the signal obtained by in-memory computation on the memory cell array storing the processed weight matrix falls within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving computation accuracy.
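The steps above can be sketched as follows. The particular distribution statistic (mean absolute weight) and the threshold and constant values are illustrative assumptions, since the embodiments leave them open:

```python
def adjust_weight_matrix(w, low_thresh=0.1, high_thresh=1.0, c1=4.0, c2=4.0):
    """Scale an unevenly distributed weight matrix so that in-memory
    computation on the stored matrix stays within the ADC's effective range.
    Returns the adjusted matrix and the applied scale factor, so that a
    later stage can undo the scaling."""
    flat = [abs(v) for row in w for v in row]
    spread = sum(flat) / len(flat)          # assumed distribution statistic
    if spread < low_thresh:                 # distribution too low: amplify
        return [[v * c1 for v in row] for row in w], c1
    if spread > high_thresh:                # distribution too high: shrink
        return [[v / c2 for v in row] for row in w], 1.0 / c2
    return w, 1.0                           # already within range
```

A downstream stage would divide the in-memory computation result by the returned factor to recover the unscaled product.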
Reference is now made to FIG. 15, which shows a schematic structural diagram of an electronic device 600 suitable for implementing the embodiments of the present application.
As shown in FIG. 15, the electronic device 600 includes a central processing unit (CPU)/MCU 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to an embodiment of the present invention, the process described above with reference to the flowchart can be implemented as a computer software program. For example, an embodiment of the present invention includes a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
determining whether the weight distribution of the neural network weight matrix is lower than a first preset threshold;
if so, multiplying all weight values in the neural network weight matrix by a first constant;
if not, determining whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, where the second preset threshold is greater than the first preset threshold;
if the weight distribution of the neural network weight matrix is higher than the second preset threshold, dividing all weight values in the neural network weight matrix by a second constant;
where the first constant and the second constant are both greater than 1.
As can be seen from the above description, the computer-readable storage medium provided by the embodiments of the present invention can be used for neural network weight matrix adjustment: a weight matrix whose overall distribution is uneven is scaled up or down by a constant factor so that the signal obtained by in-memory computation on the memory cell array storing the processed weight matrix falls within the effective range of the ADC (the ADC is arranged after the memory cell array and converts the output of each memory cell column into a digital signal), thereby improving computation accuracy.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
For convenience of description, the above apparatus is described with its functions divided into various units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are executed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
It should also be noted that the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform specific tasks or implement specific abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief, and for relevant details reference may be made to the description of the method embodiments.
The above descriptions are merely embodiments of the present application and are not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (11)

  1. A neural network weight matrix adjustment method, characterized by comprising:
    determining whether the weight distribution of a neural network weight matrix is lower than a first preset threshold;
    if so, multiplying all weight values in the neural network weight matrix by a first constant;
    if not, determining whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, wherein the second preset threshold is greater than the first preset threshold;
    if the weight distribution of the neural network weight matrix is higher than the second preset threshold, dividing all weight values in the neural network weight matrix by a second constant;
    wherein the first constant and the second constant are both greater than 1.
  2. The neural network weight matrix adjustment method according to claim 1, characterized in that, after multiplying all weight values in the neural network weight matrix by the first constant, the method further comprises:
    determining whether the number of bits of each weight value in the processed weight matrix exceeds a third preset threshold;
    if so, truncating, for each weight value, the bits that exceed the third preset threshold to obtain a first weight array and a second weight array;
    wherein the first weight array is the weight array remaining after the bits of each weight value exceeding the third preset threshold are truncated off, and is to be stored in a memory cell array; the second weight array is the weight array formed by the truncated-off bits of each weight value exceeding the third preset threshold, and is to be stored in another memory cell array or input to an arithmetic operation circuit.
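An illustrative software sketch of this bit-splitting, outside the claims themselves; the bit-count threshold `keep_bits` and the low/high split convention are assumptions:

```python
def split_weight_array(weights, keep_bits=4):
    """Split each weight into a low part (first weight array, stored in one
    memory cell array) and a high part (second weight array, stored in another
    array or fed to an arithmetic operation circuit)."""
    low_mask = (1 << keep_bits) - 1
    first = [w & low_mask for w in weights]     # bits remaining after truncation
    second = [w >> keep_bits for w in weights]  # truncated-off excess bits
    return first, second
```

Recombination later re-weights `second` by 2**keep_bits, which is what the ADC-output combination described for the two arrays achieves.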
  3. The neural network weight matrix adjustment method according to claim 2, further comprising:
    if the number of bits of each weight value in the processed weight matrix does not exceed the third preset threshold, storing the processed weight array in a memory cell array.
  4. The neural network weight matrix adjustment method according to claim 2, further comprising:
    when the second weight array is stored in another memory cell array, combining the ADC output of the memory cell array corresponding to the first weight array with the ADC output of the memory cell array corresponding to the second weight array;
    when the second weight array is input to the arithmetic operation circuit, combining the ADC output of the memory cell array corresponding to the first weight array with the output of the arithmetic operation circuit.
  5. The neural network weight matrix adjustment method according to claim 1, characterized in that dividing all weight values in the neural network weight matrix by a second constant comprises:
    dividing each weight value by the second constant to obtain a third weight array;
    saving the overflow bits produced by dividing each weight value by the second constant as a fourth weight array;
    wherein the third weight array is to be stored in a memory cell array, and the fourth weight array is to be stored in another memory cell array or input to an arithmetic operation circuit.
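An illustrative sketch of this division step, outside the claims themselves; interpreting the "overflow bits" of the division as the integer remainder is an assumption:

```python
def divide_weight_array(weights, c2=4):
    """Divide each weight by the second constant: the quotients form the third
    weight array and the remainders form the fourth weight array, so that
    c2 * third + fourth reconstructs the original weights exactly."""
    third = [w // c2 for w in weights]
    fourth = [w % c2 for w in weights]
    return third, fourth
```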
  6. The neural network weight matrix adjustment method according to claim 5, further comprising:
    when the fourth weight array is stored in another memory cell array, combining the ADC output of the memory cell array corresponding to the third weight array with the ADC output of the memory cell array corresponding to the fourth weight array;
    when the fourth weight array is input to the arithmetic operation circuit, combining the ADC output of the memory cell array corresponding to the third weight array with the output of the arithmetic operation circuit.
  7. A neural network weight matrix writing control method, characterized by comprising:
    controlling, according to a data adjustment instruction, a shift register to perform a shift operation on each weight value input to it, and storing in a buffer the bits that overflow when the shift register shifts an input weight value, together with the address of that weight value in the neural network weight matrix, wherein the data adjustment instruction comprises a shift direction and a number of shift bits;
    storing the data in the buffer into a memory cell array, or inputting it to an arithmetic operation circuit;
    wherein the data adjustment instruction is generated when the weight distribution of the neural network weight matrix is uneven; the shift register is connected to a write module; and the write module is connected to another memory cell array and is used to write the shifted data from the shift register into the other memory cell array.
  8. The neural network weight matrix writing control method according to claim 7, further comprising:
    when the data in the buffer is input to the arithmetic operation circuit, combining the ADC output of the other memory cell array with the output of the arithmetic operation circuit;
    when the data in the buffer is stored in a memory cell array, combining the ADC output of that memory cell array with the ADC output of the other memory cell array.
  9. A neural network weight matrix adjustment apparatus, characterized by comprising:
    a first judgment module, configured to determine whether the weight distribution of a neural network weight matrix is lower than a first preset threshold;
    a weight amplification module, configured to multiply all weight values in the neural network weight matrix by a first constant if the weight distribution of the neural network weight matrix is lower than the first preset threshold;
    a second judgment module, configured to determine, if the weight distribution of the neural network weight matrix is not lower than the first preset threshold, whether the weight distribution of the neural network weight matrix is higher than a second preset threshold, wherein the second preset threshold is greater than the first preset threshold;
    a weight reduction module, configured to divide all weight values in the neural network weight matrix by a second constant if the weight distribution of the neural network weight matrix is higher than the second preset threshold;
    wherein the first constant and the second constant are both greater than 1.
  10. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the neural network weight matrix adjustment method according to any one of claims 1 to 6 when executing the program.
  11. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the neural network weight matrix adjustment method according to any one of claims 1 to 6 when executed by a processor.
PCT/CN2020/075648, filed 2020-02-18: Neural network weight matrix adjustment method, writing control method, and related device (WO2021163866A1)
Publication: WO2021163866A1. Family ID: 77390305.


