CN111160540A - Neural network parameter storage method and device - Google Patents


Info

Publication number
CN111160540A
Authority
CN
China
Prior art keywords
neural network
parameters
storing
effective
network
Prior art date
Legal status
Pending
Application number
CN201911220315.6A
Other languages
Chinese (zh)
Inventor
张树华
仝杰
雷煜卿
张鋆
李荡
张明皓
王兰若
Current Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI filed Critical State Grid Corp of China SGCC

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation using electronic means
    • G06N3/08: Learning methods
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/60: Memory management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for storing neural network parameters, comprising the following steps: optimizing the network scale of the neural network; and storing the parameters of the neural network as cached difference values. This solves the prior-art problem of excessive chip resource consumption and improves the processing efficiency of the neural network.

Description

Neural network parameter storage method and device
Technical Field
The present application relates to artificial intelligence technology, and in particular to a method for storing neural network parameters; it also relates to a device for storing neural network parameters.
Background
In defect identification for power transmission, transformation, and distribution equipment, multispectral images (visible light, infrared, and so on) of operating equipment are collected by inspection devices such as transmission-line unmanned aerial vehicles, mobile surveillance balls at construction sites, maintenance personnel's mobile operation terminals, substation inspection robots, and fixed binocular cameras. Processing these images in real time with deep-learning algorithms makes it possible to detect abnormal operating states, visible defects, and latent faults of the equipment.
Deep learning is an important branch of artificial intelligence that gives computers the ability to learn without being explicitly programmed. A program is trained to learn some intelligent behavior, after which it can complete the task on its own. The advantage of an efficient machine learning algorithm is apparent: through training alone it can solve each new problem in a given domain, rather than requiring every new problem to be programmed specifically.
A neural network is a computational model that typically requires a large number of interconnected nodes (also called 'neurons') and has two characteristics: each neuron processes the weighted input values from neighboring neurons through an output function (also called an excitation function), and the strength of the information transfer between neurons is defined by a weight value that the algorithm learns and adjusts by itself.
A classical neural network model occupies hundreds of megabytes of memory. In chip design, storing deep neural network parameters consumes a large amount of space and read time. Reducing the storage space and read time according to the characteristics of the parameters speeds up neural network computation and lowers the algorithm's resource demands on the chip.
Disclosure of Invention
The present application provides a method for storing neural network parameters that solves the prior-art problem of excessive chip resource consumption.
The application provides a storage method of neural network parameters, which comprises the following steps:
optimizing the network scale of the neural network;
and storing the parameters of the neural network as cached difference values, thereby improving the processing efficiency of the neural network.
Preferably, optimizing the network scale of the neural network includes:
removing weights whose values in the neural network parameters are smaller than a preset weight threshold;
the remaining parameters then serve as the effective parameters of the neural network.
Preferably, storing the parameters of the neural network as cached difference values specifically includes:
adding, in the memory, a difference value between successive entries for storing the effective parameters of the neural network.
Preferably, the difference value is related to the burst read-write length of the cache; when the number of stored effective parameters of the neural network exceeds that burst length, the difference value restarts from the initial value.
The present application further provides a device for storing neural network parameters, comprising:
the optimization unit is used for optimizing the network scale of the neural network;
and the storage unit is used for storing the parameters of the neural network by adopting the cache difference value, so that the processing efficiency of the parameters of the neural network is improved.
Preferably, the optimization unit includes:
the removing subunit, configured to remove weights whose values in the neural network parameters are smaller than a preset weight threshold;
and the effective parameter determining subunit, configured to determine the remaining parameters as the effective parameters of the neural network.
Preferably, the storage unit includes:
and the effective parameter storage subunit, configured to add, in the memory, a difference value between successive entries and to store the effective parameters of the neural network.
The present application provides a method for storing neural network parameters in which the network scale of the neural network is optimized and the parameters are stored as cached difference values, solving the prior-art problem of excessive chip resource consumption and improving the processing efficiency of the neural network.
Drawings
FIG. 1 is a schematic flow chart of a method for storing neural network parameters provided in the present application;
FIG. 2 is a schematic diagram of an artificial neural network neuron architecture to which the present application relates;
FIG. 3 is a schematic illustration of neural network lightweighting to which the present application relates;
FIG. 4 is a schematic diagram of neural network data screening to which the present application relates;
FIG. 5 is a schematic diagram of a parameter storage manner of a neural network related to the present application;
fig. 6 is a schematic diagram of a storage device for neural network parameters provided in the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
Fig. 1 is a schematic flow chart of a method for storing neural network parameters, and the method provided in the embodiment of the present application is described in detail below with reference to fig. 1.
Step S101, optimizing the network scale of the neural network.
A neural network is a computational model that typically requires a large number of interconnected nodes (also called 'neurons', as shown in fig. 2) and has two characteristics: each neuron processes the weighted input values from neighboring neurons through an output function (also called an excitation function), and the strength of the information transfer between neurons is defined by a weight value that the algorithm learns and adjusts by itself. On this basis, the computational model relies on a large amount of data for training. In fig. 2, a1, a2, … an are the features output by the previous layer, w1, w2, … wn are the feature weights, and b is the bias; the inner product of the feature vector and the weight vector is passed through a nonlinear function to produce the output.
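As an illustrative sketch (not part of the claimed method), the neuron computation described above can be written as follows; the tanh activation is used here only as an example of a nonlinear excitation function:

```python
import numpy as np

def neuron_output(a, w, b, activation=np.tanh):
    """One neuron: inner product of the previous layer's features a
    with their weights w, plus the bias b, passed through a nonlinear
    output (excitation) function."""
    return activation(np.dot(w, a) + b)

# a1..an: features from the previous layer; w1..wn: weights; b: bias
a = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
b = 0.05
y = neuron_output(a, w, b)  # tanh(0.05 - 0.4 - 0.4 + 0.05) = tanh(-0.7)
```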
                        AlexNet   VGG16   Inception-v3
Model memory (MB)        >200     >500      90-100
Parameters (million)       60      138       23.2
Computation (million)     720    15300       5000
A classical neural network model occupies hundreds of MB of memory. As the table above shows, in chip design the storage of deep neural network parameters occupies a large amount of space and read time; both must be reduced according to the characteristics of the parameters in order to accelerate neural network computation.
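The memory figures in the table are consistent with a simple back-of-the-envelope check: with 32-bit (4-byte) parameters, footprint is roughly parameter count times 4 bytes. A short sketch:

```python
# Rough memory footprint: params (millions) x 4 bytes per 32-bit
# parameter gives megabytes, matching the table above.
params_millions = {"AlexNet": 60, "VGG16": 138, "Inception-v3": 23.2}
footprint_mb = {name: p * 4 for name, p in params_millions.items()}
# AlexNet ~240 MB (>200), VGG16 ~552 MB (>500), Inception-v3 ~93 MB (90-100)
```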
In a deep neural network, although some network models generate many weight coefficients, a portion of the parameters contribute very little. These can be pruned without loss of precision, optimizing the network scale: as shown in fig. 3, weights whose values are smaller than a preset weight threshold are removed from the neural network parameters, and the remaining parameters serve as the effective parameters of the neural network.
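A minimal sketch of this magnitude-based pruning step (the threshold value here is an arbitrary example, not one specified by the application):

```python
import numpy as np

def prune(weights, threshold):
    """Remove (zero out) weights whose magnitude falls below the
    preset threshold; the surviving weights are the effective
    parameters of the network."""
    mask = np.abs(weights) >= threshold
    return weights * mask, int(mask.sum())

w = np.array([0.02, -0.75, 0.001, 0.4, -0.03])
pruned, n_effective = prune(w, threshold=0.05)
# only -0.75 and 0.4 survive: 2 effective parameters out of 5
```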
Step S102, storing the parameters of the neural network as cached difference values to improve the processing efficiency of the neural network.
Fig. 4 shows the mapping between data and weights; the conventional approach is sequential storage and sequential reading. If registers are used for data storage, then once the network parameters reach the millions, register storage occupies a large area and unduly affects chip timing. A new storage scheme is therefore needed to reduce chip area and power consumption.
In storing the network parameters, the buffer (buf) storage of fig. 5 is modified: a difference buffer diff_buf between successive values is added in the memory for storing the effective parameters of the neural network.
The difference value (diff_buf) is related to the burst read-write length (burst_length) of the cache; when the number of stored effective parameters of the neural network exceeds burst_length, the difference value restarts from the initial value.
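One plausible reading of this diff_buf scheme (the application does not spell out the exact encoding, so this is an assumption): within each burst of burst_length values the first is stored as-is (the initial value) and the rest as differences from the previous value, with encoding restarting at every burst boundary.

```python
def encode_diff(params, burst_length):
    """Store effective parameters as differences (diff_buf): the first
    value of each burst is kept as-is (the initial value); later values
    are stored as the difference from the previous value. Encoding
    restarts once burst_length entries have been written."""
    out = []
    for i, p in enumerate(params):
        out.append(p if i % burst_length == 0 else p - params[i - 1])
    return out

def decode_diff(diffs, burst_length):
    """Inverse of encode_diff: accumulate differences within a burst."""
    out = []
    for i, d in enumerate(diffs):
        out.append(d if i % burst_length == 0 else out[-1] + d)
    return out

encoded = encode_diff([10, 12, 11, 30, 31, 29], burst_length=3)
# encoded -> [10, 2, -1, 30, 1, -2]
```

Because neighboring parameters are often close in value, the differences need fewer bits than the raw values, which is where the storage saving would come from.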
Based on the same inventive concept, the present application also provides a storage device 600 for neural network parameters, as shown in fig. 6, including:
an optimizing unit 610, configured to optimize a network scale of the neural network;
the storage unit 620 is configured to store the parameters of the neural network by using the cache difference, so as to improve the processing efficiency of the parameters of the neural network.
The optimization unit comprises:
the removing subunit, configured to remove weights whose values in the neural network parameters are smaller than a preset weight threshold;
and the effective parameter determining subunit, configured to determine the remaining parameters as the effective parameters of the neural network.
The storage unit includes:
and the effective parameter storage subunit, configured to add, in the memory, a difference value between successive entries and to store the effective parameters of the neural network.
The present application provides a method for storing neural network parameters in which the network scale of the neural network is optimized and the parameters are stored as cached difference values, solving the prior-art problem of excessive chip resource consumption and improving the processing efficiency of the neural network.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will understand that various changes may be made and equivalents substituted for elements thereof without departing from its spirit and scope.

Claims (7)

1. A method for storing neural network parameters, comprising:
optimizing the network scale of the neural network;
and storing the parameters of the neural network as cached difference values, thereby improving the processing efficiency of the neural network.
2. The method of claim 1, wherein optimizing the network size of the neural network comprises:
removing weights whose values in the neural network parameters are smaller than a preset weight threshold;
the remaining parameters serving as the effective parameters of the neural network.
3. The method according to claim 1, wherein the parameters of the neural network are stored using the buffered difference values, specifically comprising:
adding, in the memory, a difference value between successive entries for storing effective parameters of the neural network.
4. The method of claim 3, wherein the difference is related to a connection read/write length of the buffer; and when the number of the stored effective parameters of the neural network exceeds the connection read-write length of the cache, starting from the initial value.
5. An apparatus for storing neural network parameters, comprising:
the optimization unit is used for optimizing the network scale of the neural network;
and the storage unit is used for storing the parameters of the neural network by adopting the cache difference value, so that the processing efficiency of the parameters of the neural network is improved.
6. The apparatus of claim 5, wherein the optimization unit comprises:
the removing subunit is used for removing the weight of which the weight value in the neural network parameters is smaller than a preset weight threshold;
and the effective parameter determining subunit is used for determining the residual parameters as effective parameters of the neural network.
7. The apparatus of claim 5, wherein the storage unit comprises:
and the effective parameter storage subunit, configured to add, in the memory, a difference value between successive entries and to store the effective parameters of the neural network.
CN201911220315.6A 2019-12-03 2019-12-03 Neural network parameter storage method and device Pending CN111160540A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911220315.6A CN111160540A (en) 2019-12-03 2019-12-03 Neural network parameter storage method and device


Publications (1)

Publication Number Publication Date
CN111160540A (en) 2020-05-15

Family

ID=70555684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911220315.6A Pending CN111160540A (en) 2019-12-03 2019-12-03 Neural network parameter storage method and device

Country Status (1)

Country Link
CN (1) CN111160540A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112436600A (en) * 2020-10-29 2021-03-02 山东理工大学 Power equipment monitoring system based on flight inspection mode
CN113378835A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Labeling model training method, sample labeling method and related device

Citations (2)

Publication number Priority date Publication date Assignee Title
CN108122031A (en) * 2017-12-20 2018-06-05 杭州国芯科技股份有限公司 A kind of neutral net accelerator architecture of low-power consumption
CN108764471A (en) * 2018-05-17 2018-11-06 西安电子科技大学 The neural network cross-layer pruning method of feature based redundancy analysis



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination