CN111160540A - Neural network parameter storage method and device - Google Patents


Info

Publication number
CN111160540A
Authority
CN
China
Prior art keywords
neural network
parameters
storing
effective
network
Prior art date
Legal status
Pending
Application number
CN201911220315.6A
Other languages
Chinese (zh)
Inventor
张树华
仝杰
雷煜卿
张鋆
李荡
张明皓
王兰若
Current Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI filed Critical State Grid Corp of China SGCC

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation using electronic means
    • G06N3/08: Learning methods
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/60: Memory management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for storing neural network parameters, comprising the following steps: optimizing the network scale of the neural network; and storing the parameters of the neural network as cached difference values. This solves the prior-art problem of excessive chip resource consumption and improves the processing efficiency of the neural network.

Description

Neural network parameter storage method and device
Technical Field
The present application relates to artificial intelligence technology, and in particular to a method for storing neural network parameters; it also relates to a device for storing neural network parameters.
Background
In defect identification for power transmission, transformation, and distribution equipment, multispectral images (visible light, infrared, and so on) of operating equipment are collected by inspection devices such as transmission-line unmanned aerial vehicles, mobile surveillance balls at construction sites, maintenance personnel's mobile operation terminals, substation inspection robots, and fixed binocular cameras. Processing these images in real time with deep-learning algorithms makes it possible to detect abnormal operating states, visible defects, and latent faults of the equipment.
Deep learning is an important branch of artificial intelligence that gives computers the ability to learn without being explicitly programmed. A program is trained to learn some intelligent behavior, after which it can complete the task on its own. The advantage of an efficient machine learning algorithm is apparent: through training alone it can solve each new problem in a given domain, rather than requiring every new problem to be programmed specifically.
A neural network is a computational model that typically requires a large number of interconnected nodes (also called 'neurons') and has two characteristics: each neuron processes the weighted input values from neighboring neurons through an output function (also called an excitation function), and the strength of the information transfer between neurons is defined by a weight value that the algorithm learns and adjusts by itself.
A classical neural network model occupies hundreds of megabytes of memory. In chip design, storing deep neural network parameters consumes a large amount of space and read time. Reducing the storage space and read time according to the characteristics of the parameters speeds up neural network computation and lowers the algorithm's resource demands on the chip.
Disclosure of Invention
The present application provides a method for storing neural network parameters that solves the prior-art problem of excessive chip resource consumption.
The application provides a storage method of neural network parameters, which comprises the following steps:
optimizing the network scale of the neural network;
and storing the parameters of the neural network as cached difference values, thereby improving the processing efficiency of the neural network.
Preferably, optimizing the network scale of the neural network includes:
removing weights whose values in the neural network parameters are smaller than a preset weight threshold;
the remaining parameters then serve as the effective parameters of the neural network.
Preferably, storing the parameters of the neural network as cached difference values specifically includes:
adding, in the memory, a difference value between successive entries for storing the effective parameters of the neural network.
Preferably, the difference value is related to the burst read-write length of the cache; when the number of stored effective parameters of the neural network exceeds that burst length, the difference value restarts from the initial value.
The present application further provides a device for storing neural network parameters, comprising:
the optimization unit is used for optimizing the network scale of the neural network;
and the storage unit is used for storing the parameters of the neural network by adopting the cache difference value, so that the processing efficiency of the parameters of the neural network is improved.
Preferably, the optimization unit includes:
the removing subunit, configured to remove weights whose values in the neural network parameters are smaller than a preset weight threshold;
and the effective parameter determining subunit, configured to determine the remaining parameters as the effective parameters of the neural network.
Preferably, the storage unit includes:
and the effective parameter storage subunit, configured to add, in the memory, a difference value between successive entries and to store the effective parameters of the neural network.
The present application provides a method for storing neural network parameters in which the network scale of the neural network is optimized and the parameters are stored as cached difference values, solving the prior-art problem of excessive chip resource consumption and improving the processing efficiency of the neural network.
Drawings
FIG. 1 is a schematic flow chart of a method for storing neural network parameters provided in the present application;
FIG. 2 is a schematic diagram of an artificial neural network neuron architecture to which the present application relates;
FIG. 3 is a schematic illustration of neural network lightweighting to which the present application relates;
FIG. 4 is a schematic diagram of neural network data screening to which the present application relates;
FIG. 5 is a schematic diagram of a parameter storage manner of a neural network related to the present application;
fig. 6 is a schematic diagram of a storage device for neural network parameters provided in the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
Fig. 1 is a schematic flow chart of a method for storing neural network parameters, and the method provided in the embodiment of the present application is described in detail below with reference to fig. 1.
Step S101, optimizing the network scale of the neural network.
A neural network is a computational model that typically requires a large number of interconnected nodes (also called 'neurons', as shown in fig. 2) and has two characteristics: each neuron processes the weighted input values from neighboring neurons through an output function (also called an excitation function), and the strength of the information transfer between neurons is defined by a weight value that the algorithm learns and adjusts by itself. On this basis, the computational model relies on a large amount of data for training. In fig. 2, a1, a2, … an are the features output by the previous layer, w1, w2, … wn are the feature weights, and b is the bias; the inner product of the feature vector and the weight vector is passed through a nonlinear function to produce the output.
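As an illustrative sketch (not part of the claimed method), the neuron computation described above can be written as follows; the tanh activation is used here only as an example of a nonlinear excitation function:

```python
import numpy as np

def neuron_output(a, w, b, activation=np.tanh):
    """One neuron: inner product of the previous layer's features a
    with their weights w, plus the bias b, passed through a nonlinear
    output (excitation) function."""
    return activation(np.dot(w, a) + b)

# a1..an: features from the previous layer; w1..wn: weights; b: bias
a = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
b = 0.05
y = neuron_output(a, w, b)  # tanh(0.05 - 0.4 - 0.4 + 0.05) = tanh(-0.7)
```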
                        AlexNet   VGG16   Inception-v3
Model memory (MB)        >200     >500      90-100
Parameters (million)       60      138       23.2
Computation (million)     720    15300       5000
A classical neural network model occupies hundreds of MB of memory. As the table above shows, in chip design the storage of deep neural network parameters occupies a large amount of space and read time; both must be reduced according to the characteristics of the parameters in order to accelerate neural network computation.
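The memory figures in the table are consistent with a simple back-of-the-envelope check: with 32-bit (4-byte) parameters, footprint is roughly parameter count times 4 bytes. A short sketch:

```python
# Rough memory footprint: params (millions) x 4 bytes per 32-bit
# parameter gives megabytes, matching the table above.
params_millions = {"AlexNet": 60, "VGG16": 138, "Inception-v3": 23.2}
footprint_mb = {name: p * 4 for name, p in params_millions.items()}
# AlexNet ~240 MB (>200), VGG16 ~552 MB (>500), Inception-v3 ~93 MB (90-100)
```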
In a deep neural network, although some network models generate many weight coefficients, a portion of the parameters contribute very little. These can be pruned without loss of precision, optimizing the network scale: as shown in fig. 3, weights whose values are smaller than a preset weight threshold are removed from the neural network parameters, and the remaining parameters serve as the effective parameters of the neural network.
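A minimal sketch of this magnitude-based pruning step (the threshold value here is an arbitrary example, not one specified by the application):

```python
import numpy as np

def prune(weights, threshold):
    """Remove (zero out) weights whose magnitude falls below the
    preset threshold; the surviving weights are the effective
    parameters of the network."""
    mask = np.abs(weights) >= threshold
    return weights * mask, int(mask.sum())

w = np.array([0.02, -0.75, 0.001, 0.4, -0.03])
pruned, n_effective = prune(w, threshold=0.05)
# only -0.75 and 0.4 survive: 2 effective parameters out of 5
```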
Step S102, storing the parameters of the neural network as cached difference values to improve the processing efficiency of the neural network.
Fig. 4 shows the mapping between data and weights; the conventional approach is sequential storage and sequential reading. If registers are used for data storage, then once the network parameters reach the millions, register storage occupies a large area and unduly affects chip timing. A new storage scheme is therefore needed to reduce chip area and power consumption.
In storing the network parameters, the buffer (buf) storage of fig. 5 is modified: a difference buffer diff_buf between successive values is added in the memory for storing the effective parameters of the neural network.
The difference value (diff_buf) is related to the burst read-write length (burst_length) of the cache; when the number of stored effective parameters of the neural network exceeds burst_length, the difference value restarts from the initial value.
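One plausible reading of this diff_buf scheme (the application does not spell out the exact encoding, so this is an assumption): within each burst of burst_length values the first is stored as-is (the initial value) and the rest as differences from the previous value, with encoding restarting at every burst boundary.

```python
def encode_diff(params, burst_length):
    """Store effective parameters as differences (diff_buf): the first
    value of each burst is kept as-is (the initial value); later values
    are stored as the difference from the previous value. Encoding
    restarts once burst_length entries have been written."""
    out = []
    for i, p in enumerate(params):
        out.append(p if i % burst_length == 0 else p - params[i - 1])
    return out

def decode_diff(diffs, burst_length):
    """Inverse of encode_diff: accumulate differences within a burst."""
    out = []
    for i, d in enumerate(diffs):
        out.append(d if i % burst_length == 0 else out[-1] + d)
    return out

encoded = encode_diff([10, 12, 11, 30, 31, 29], burst_length=3)
# encoded -> [10, 2, -1, 30, 1, -2]
```

Because neighboring parameters are often close in value, the differences need fewer bits than the raw values, which is where the storage saving would come from.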
Based on the same inventive concept, the present application also provides a storage device 600 for neural network parameters, as shown in fig. 6, including:
an optimizing unit 610, configured to optimize a network scale of the neural network;
the storage unit 620 is configured to store the parameters of the neural network by using the cache difference, so as to improve the processing efficiency of the parameters of the neural network.
The optimization unit comprises:
the removing subunit, configured to remove weights whose values in the neural network parameters are smaller than a preset weight threshold;
and the effective parameter determining subunit, configured to determine the remaining parameters as the effective parameters of the neural network.
The storage unit includes:
and the effective parameter storage subunit, configured to add, in the memory, a difference value between successive entries and to store the effective parameters of the neural network.
The present application provides a method for storing neural network parameters in which the network scale of the neural network is optimized and the parameters are stored as cached difference values, solving the prior-art problem of excessive chip resource consumption and improving the processing efficiency of the neural network.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will understand that various changes may be made and equivalents substituted for elements thereof without departing from its spirit and scope.

Claims (7)

1. A method for storing neural network parameters, comprising:
optimizing the network scale of the neural network;
and storing the parameters of the neural network as cached difference values, thereby improving the processing efficiency of the neural network.
2. The method of claim 1, wherein optimizing the network size of the neural network comprises:
removing weights whose values in the neural network parameters are smaller than a preset weight threshold;
the remaining parameters serving as the effective parameters of the neural network.
3. The method according to claim 1, wherein the parameters of the neural network are stored using the buffered difference values, specifically comprising:
adding, in the memory, a difference value between successive entries for storing effective parameters of the neural network.
4. The method of claim 3, wherein the difference is related to a connection read/write length of the buffer; and when the number of the stored effective parameters of the neural network exceeds the connection read-write length of the cache, starting from the initial value.
5. An apparatus for storing neural network parameters, comprising:
the optimization unit is used for optimizing the network scale of the neural network;
and the storage unit is used for storing the parameters of the neural network by adopting the cache difference value, so that the processing efficiency of the parameters of the neural network is improved.
6. The apparatus of claim 5, wherein the optimization unit comprises:
the removing subunit is used for removing the weight of which the weight value in the neural network parameters is smaller than a preset weight threshold;
and the effective parameter determining subunit is used for determining the residual parameters as effective parameters of the neural network.
7. The apparatus of claim 5, wherein the storage unit comprises:
and the effective parameter storage subunit, configured to add, in the memory, a difference value between successive entries and to store the effective parameters of the neural network.
CN201911220315.6A 2019-12-03 2019-12-03 Neural network parameter storage method and device Pending CN111160540A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911220315.6A CN111160540A (en) 2019-12-03 2019-12-03 Neural network parameter storage method and device


Publications (1)

Publication Number Publication Date
CN111160540A (en) 2020-05-15

Family

ID=70555684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911220315.6A Pending CN111160540A (en) 2019-12-03 2019-12-03 Neural network parameter storage method and device

Country Status (1)

Country Link
CN (1) CN111160540A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112436600A (en) * 2020-10-29 2021-03-02 山东理工大学 Power equipment monitoring system based on flight inspection mode
CN113378835A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Labeling model training method, sample labeling method and related device

Citations (2)

Publication number Priority date Publication date Assignee Title
CN108122031A (en) * 2017-12-20 2018-06-05 杭州国芯科技股份有限公司 A kind of neutral net accelerator architecture of low-power consumption
CN108764471A (en) * 2018-05-17 2018-11-06 西安电子科技大学 The neural network cross-layer pruning method of feature based redundancy analysis



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination