CN114372572A - Residual error neural network compression method based on activation information - Google Patents

Residual error neural network compression method based on activation information

Info

Publication number
CN114372572A
CN114372572A (application CN202210045279.XA)
Authority
CN
China
Prior art keywords
residual
degree
gradient
calculating
hidden layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210045279.XA
Other languages
Chinese (zh)
Inventor
秦国庆 (Qin Guoqing)
夏应林 (Xia Yinglin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin Xuyuguan Technology Co.,Ltd.
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN202210045279.XA priority Critical patent/CN114372572A/en
Publication of CN114372572A publication Critical patent/CN114372572A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, in particular to a residual neural network compression method based on activation information. The method trains a residual neural network to determine the weights of each layer in the network; obtains the overall gradient anomaly degree of each hidden layer from the activation information, and from it calculates the necessity of performing a residual operation between each hidden-layer association combination; and compresses and simplifies the residual neural network according to the removal reasonableness of each residual operation combination derived from that necessity. By evaluating the necessity of residual operations and the reasonableness of removing them, residual operations with little influence are removed from the neural network, so that the network is compressed, its demands on the storage space and computing performance of hardware devices are reduced, and it can be deployed on low-power equipment.

Description

Residual error neural network compression method based on activation information
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a residual error neural network compression method based on activation information.
Background
To improve a network's feature-extraction capability and inference accuracy, so that the loss function converges faster and to a lower value, the number of layers of the neural network is usually increased. In practice, however, increasing the number of layers is accompanied by vanishing or exploding gradients. This problem is currently addressed by residual operations inside the neural network.
However, some residual operations in current networks are not actually effective, yet they still consume computation. Therefore, to run such a network on low-power devices, the residual operations that have little influence on the network need to be removed so that the network can be compressed.
Disclosure of Invention
In order to solve the above technical problem, an object of the present invention is to provide a residual neural network compression method based on activation information, and the adopted technical solution is specifically as follows:
training a residual error neural network to determine the weight of each layer in the network;
acquiring activation information of each neuron in the current hidden layer based on the weight, calculating the gradient abnormal degree of each neuron by using the activation information, and calculating the overall gradient abnormal degree of the current hidden layer according to the gradient abnormal degree of each neuron; respectively calculating the necessity of performing residual operation between each hidden layer association combination based on the overall gradient abnormal degree;
calculating the preference degree of each residual operation according to the necessity of the residual operation, calculating the average preference degree of each residual operation combination by using the preference degree of each residual operation, and sequencing a plurality of residual operation combinations from large to small based on the average preference degree to obtain a combination sequence; and sequentially removing one residual operation combination in the combination sequence, calculating the reasonable removal degree of the residual operation combination according to the loss function of the residual neural network, and compressing and simplifying the residual neural network according to the reasonable removal degree.
Further, the method for obtaining activation information of each neuron in the current hidden layer based on the weight includes:
and carrying out weighted summation on the input value of the previous layer of the current hidden layer and the corresponding weight value of the previous layer, and substituting the summation result into an activation function formula to obtain the activation information of the corresponding neuron in the current hidden layer.
Further, the method for calculating the abnormal gradient degree of each neuron by using the activation information comprises the following steps:
and deriving the activation information to obtain a derivative, setting the maximum gradient of the activation function, calculating the gradient abnormal degree of the corresponding neuron by combining the derivative and the maximum gradient, wherein the derivative and the gradient abnormal degree are in a negative correlation relationship.
Further, the method for calculating the overall gradient anomaly degree of the current hidden layer from the gradient anomaly degree of each neuron comprises the following steps:
setting an abnormal degree threshold, when the abnormal degree of the gradient is greater than or equal to the abnormal degree threshold, considering the gradient of the neuron to be normal, counting a first number of the neurons corresponding to the normal gradient in the current hidden layer, calculating a ratio between the first number and the total number of the neurons in the current hidden layer, and taking the ratio as the whole abnormal degree of the gradient of the current hidden layer.
Further, the method for calculating the necessity of the residual error operation includes:
and carrying out weighted summation on the overall gradient abnormal degree of each hidden layer in the hidden layer association combination, further calculating an abnormal degree average value, taking the abnormal degree average value as the necessity of carrying out residual error operation on the corresponding hidden layer association combination, wherein the weight corresponding to each hidden layer is set according to the sequence of the layer number of the hidden layer.
Further, the method for calculating the preference degree of each residual operation according to the necessity of the residual operation comprises:
sequencing each residual operation from large to small according to the importance of the residual operation, and numbering;
and calculating the ratio between the total number of residual operations and the corresponding number of the current residual operations, and calculating the preference degree of the current residual operations by combining the ratio and the necessity thereof.
Further, the method for obtaining the reasonable degree of removal of the residual operation combination includes:
calculating a difference value between loss function values of the corresponding residual error neural networks before and after the residual error operation combination is removed, and obtaining the reasonable removal degree corresponding to the residual error operation combination according to the difference value, wherein the difference value and the reasonable removal degree are in a positive correlation relationship.
Further, the method for compressing and simplifying the residual neural network according to the reasonable degree of removal comprises the following steps:
and sequentially calculating the removal reasonable degree of each residual operation combination according to the combination sequence, and removing the corresponding residual operation combination and the subsequent residual operation combinations when the removal reasonable degree is greater than a reasonable threshold value.
Further, the calculation formula of the number of residual operation combinations is:
C(SN, s) = SN!/(s!(SN - s)!)
wherein SN is the total number of the residual operations; s is the number of residual operations contained in the combination of residual operations.
The embodiment of the invention has at least the following beneficial effects: by evaluating the necessity of residual operations and the reasonableness of removing them, residual operations with little influence are removed from the neural network, so that the network is compressed, its demands on the storage space and computing performance of hardware devices are reduced, and it can be deployed on low-power equipment.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a flowchart illustrating steps of a residual neural network compression method based on activation information according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a residual error operation according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a residual neural network ResNet-152 according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means and effects adopted by the present invention to achieve its intended purpose, the residual neural network compression method based on activation information, its specific implementation, structure, features and effects are described in detail below in conjunction with the accompanying drawings and the preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following describes a specific scheme of the residual neural network compression method based on activation information in detail with reference to the accompanying drawings.
Referring to fig. 1, a flowchart illustrating steps of a residual neural network compression method based on activation information according to an embodiment of the present invention is shown, where the method includes the following steps:
and S001, training a residual error neural network to determine the weight of each layer in the network.
Specifically, referring to fig. 2, a residual operation is an operation adopted to avoid the vanishing-gradient or exploding-gradient problem in a deep (many-layer) neural network. Feature extraction (an operation such as convolution) applied to data X yields F(X); to prevent abnormalities in the extracted information F(X) from harming subsequent feature extraction, the extracted data must carry no less information than the original data X, so the original data is linearly superimposed after feature extraction, i.e., H(X) = F(X) + X.
A residual neural network is a network in which, after each feature extraction (residual operation), the data before extraction is linearly superimposed on the result. Referring to fig. 3, which shows a schematic structural diagram of the residual neural network ResNet-152, the network has 152 convolutional layers in total and belongs to the deep neural networks; to prevent the gradient problem, a residual operation is performed after every two convolution operations (Conv), shown as the black curves spanning every two convolutional layers in the figure.
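As an illustration of the residual operation H(X) = F(X) + X, the following is a minimal sketch of a two-convolution residual block; PyTorch, the layer sizes and the class name are assumptions for illustration, since the patent does not prescribe an implementation:

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Two convolutions F(X) followed by the skip addition H(X) = F(X) + X."""
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.act = nn.Sigmoid()  # the embodiment uses the sigmoid activation

        def forward(self, x):
            f = self.act(self.conv1(x))
            f = self.conv2(f)
            return self.act(f + x)  # linear superposition of the original data X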
Training a residual error neural network by using the collected data set, wherein the specific training process comprises the following steps:
(1) The acquired data are divided in a ratio of 8:2, with 80% of the data used as the training set of the network and 20% used as the test set.
(2) The network obtains its initial internal weight parameters randomly, computes an inference result with these random parameters, and then adjusts the internal weight parameters by back-propagating the difference between the inference value and the label value.
(3) The activation function of the network adopts a sigmoid function.
(4) The loss functions of the network can be roughly divided into two types according to different task types, cross entropy loss functions are adopted for classification tasks, and mean square error loss functions are adopted for regression tasks.
(5) When the number of rounds of network training reaches a set stopping condition or the loss function converges to a set value, the network training can be stopped, and the network training is completed at the moment.
It should be noted that, in the trained residual error neural network, the weights of all layers in the network are determined and are fixed values.
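A hedged sketch of training steps (1) to (5) follows; the optimiser, batch size, learning rate and stopping values are illustrative assumptions rather than values prescribed by the embodiment:

    import torch
    from torch.utils.data import DataLoader, random_split

    def train_residual_network(model, dataset, task="classification",
                               max_epochs=100, loss_target=1e-3, lr=1e-3):
        # (1) split the collected data 8:2 into a training set and a test set
        n_train = int(0.8 * len(dataset))
        train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
        loader = DataLoader(train_set, batch_size=32, shuffle=True)

        # (4) cross-entropy loss for classification, mean squared error for regression
        criterion = (torch.nn.CrossEntropyLoss() if task == "classification"
                     else torch.nn.MSELoss())
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)

        # (2) weights start from random initialisation; the loss between the
        # inference result and the label is back-propagated to adjust them
        for epoch in range(max_epochs):
            epoch_loss = 0.0
            for x, y in loader:
                optimizer.zero_grad()
                loss = criterion(model(x), y)
                loss.backward()
                optimizer.step()
                epoch_loss += loss.item()
            # (5) stop when the round limit is reached or the loss converges
            if epoch_loss / len(loader) < loss_target:
                break
        return model, test_set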
S002, acquiring activation information of each neuron in the current hidden layer based on the weight, calculating the gradient abnormal degree of each neuron by using the activation information, and calculating the overall gradient abnormal degree of the current hidden layer according to the gradient abnormal degree of each neuron; and respectively calculating the necessity of performing residual operation between each implicit layer association combination based on the overall gradient abnormal degree.
Specifically, because vanishing or exploding gradients are mostly caused by the gradient of the activation function being too small, the test-set data are input one by one into the trained residual neural network to obtain the activation information of each hidden layer. The activation information is obtained as follows:
(1) and carrying out weighted summation on the input value of the previous layer of the current hidden layer and the corresponding weight value of the previous layer, and substituting the summation result into an activation function formula to obtain the activation information of the corresponding neuron in the current hidden layer.
Specifically, assuming that the summation result of any neuron in the current hidden layer is x, the summation result is substituted into an activation function formula to obtain activation information sg of the neuron, where the activation information sg is:
sg(x) = 1/(1 + e^(-x))
(2) and deriving the activation information to obtain a derivative, setting the maximum gradient of the activation function, calculating the gradient abnormal degree of the corresponding neuron by combining the derivative and the maximum gradient, wherein the derivative and the gradient abnormal degree are in a negative correlation relationship.
Specifically, the derivative formula is:
sd(x) = sg(x)(1 - sg(x)) = e^(-x)/(1 + e^(-x))^2
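Written out as plain functions, the activation information sg and its derivative sd used below are (a minimal sketch; the function names follow the text):

    import numpy as np

    def sg(x):
        """Activation information: sigmoid of the weighted sum x."""
        return 1.0 / (1.0 + np.exp(-x))

    def sd(x):
        """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
        s = sg(x)
        return s * (1.0 - s)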
the purpose of calculating the derivative of the activation information corresponding to each neuron is to perform a residual operation more necessary if the total number of derivatives of the activation function in the current hidden layer is too small.
And calculating the gradient abnormal degree corresponding to each neuron according to the derivative, wherein the calculation formula is as follows:
pb(x) = 1 - e^((sd(x) - yk)/yk)
where pb(x) is the gradient abnormality degree; yk is the maximum gradient of the activation function, set to 0.25 in the embodiment of the invention; since the values of sd(x) are all smaller than yk, (sd(x) - yk)/yk lies in the range [-1, 0].
It should be noted that the smaller sd(x) is (tending to 0), the more sd(x) - yk tends to -yk, so e^((sd(x) - yk)/yk) tends to e^(-1); the smaller this value, the larger the gradient abnormality degree pb. Conversely, the larger sd(x) is (tending to yk), the more sd(x) - yk tends to 0, so e^((sd(x) - yk)/yk) tends to e^0; the closer this value is to 1, the smaller the gradient abnormality degree pb.
(3) Setting an abnormal degree threshold, when the abnormal degree of the gradient is greater than or equal to the abnormal degree threshold, considering the gradient of the neuron to be normal, counting a first number of the neurons corresponding to the normal gradient in the current hidden layer, calculating a ratio between the first number and the total number of the neurons in the current hidden layer, and taking the ratio as the overall abnormal degree of the gradient of the current hidden layer.
Specifically, an abnormal degree threshold nk is set, when the abnormal degree of the gradient is greater than the abnormal degree threshold, the gradient of the corresponding neuron is normal, otherwise, when the abnormal degree of the gradient is less than the abnormal degree threshold, the gradient of the corresponding neuron is abnormal, and then the judgment formula is:
the gradient of the neuron is normal if pb(x) >= nk, and abnormal if pb(x) < nk
preferably, in the embodiment of the present invention, the threshold value of the degree of abnormality is an empirical value, and nk is 0.7.
Counting a first quantity S of neurons corresponding to normal gradient in each hidden layer, substituting the first quantity S into a calculation formula of the overall gradient abnormal degree to obtain the overall gradient abnormal degree of the corresponding hidden layer, wherein the calculation formula of the overall gradient abnormal degree is as follows:
yc=S/I
wherein yc is the overall gradient anomaly; i is the total number of neurons contained in the hidden layer.
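Continuing the sketch above (reusing sd and numpy as np), the per-neuron gradient abnormality, the threshold decision with nk = 0.7 and the overall gradient anomaly yc = S/I of a hidden layer could look as follows; the function names are illustrative:

    def gradient_abnormality(x, yk=0.25):
        """pb(x) = 1 - e^((sd(x) - yk)/yk) for the weighted-sum input x of one neuron."""
        return 1.0 - np.exp((sd(x) - yk) / yk)

    def overall_gradient_anomaly(layer_inputs, yk=0.25, nk=0.7):
        """yc = S/I, where S counts the neurons whose gradient abnormality pb(x)
        reaches the threshold nk and I is the total number of neurons in the layer."""
        pb_values = [gradient_abnormality(x, yk) for x in layer_inputs]
        S = sum(1 for p in pb_values if p >= nk)  # first quantity S
        I = len(layer_inputs)                     # total number of neurons
        return S / I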
Further, since a residual operation spans hidden layers and the number of layers spanned is variable, the necessity of a residual operation over an associated group of N hidden layers must take the span into account. The necessity is therefore calculated as follows: the overall gradient anomaly degrees of the hidden layers in the hidden-layer association combination are weighted and summed, and the resulting anomaly average is taken as the necessity of performing a residual operation on that combination, where the weight of each hidden layer is set according to the order of its layer number.
As an example, the calculation formula of the necessity of the residual operation in the embodiment of the present invention is:
[formula given only as an image in the original: By is a weighted average of the overall gradient anomaly yc over the N hidden layers in the combination, with weights set by the layer numbers n]
where By is the necessity of the residual operation; N is the number of hidden layers contained in the hidden-layer association combination; n is the layer number of each hidden layer in the combination.
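Because the necessity formula above appears only as an image, the following sketch assumes one concrete reading of the description: the overall gradient anomaly of each hidden layer is weighted by its layer number n and the weighted sum is averaged over the N layers of the combination. The weighting is an assumption, not the patent's exact formula:

    def residual_necessity(layer_numbers, yc_values):
        """By for one hidden-layer association combination.
        layer_numbers: layer number n of each hidden layer in the combination.
        yc_values: overall gradient anomaly yc of each of those layers.
        Assumes weights equal to the layer numbers (an interpretation of the text)."""
        N = len(layer_numbers)
        weighted_sum = sum(n * yc for n, yc in zip(layer_numbers, yc_values))
        return weighted_sum / N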
Step S003, calculating the preference degree of each residual operation according to the necessity of the residual operation, calculating the average preference degree of each residual operation combination by using the preference degree of each residual operation, and sequencing a plurality of residual operation combinations from large to small based on the average preference degree to obtain a combination sequence; and sequentially removing a residual operation combination in the combination sequence, calculating the reasonable removal degree of the residual operation combination according to the loss function of the residual neural network, and compressing and simplifying the residual neural network according to the reasonable removal degree.
Specifically, after the necessity of each residual operation is obtained, combinations of residual operations to be removed must be selected according to that necessity. Because the neural network has a serial structure, later residual operations are affected by earlier ones, so the embodiment of the present invention computes a removal preference degree from both the position and the necessity of each residual operation; this yields preference degrees for the various combinations and allows them to be verified in order later. The processing is as follows:
(1) Sort the residual operations by importance from largest to smallest and number them accordingly; then compute the ratio of the total number of residual operations to the number of the current residual operation, and compute the preference degree of the current residual operation from this ratio and its necessity.
Specifically, the calculation formula of the preferred degree is as follows:
[formula given only as an image in the original: yx_s combines the ratio SN/s with the necessity By_s of the s-th residual operation]
where yx_s is the preference degree of the s-th residual operation; SN is the total number of residual operations; By_s is the necessity of the s-th residual operation.
(2) Obtaining the number of residual operation combinations according to the total number SN of the residual operations, wherein the calculation formula of the number of the residual operation combinations is as follows:
C(SN, s) = SN!/(s!(SN - s)!)
wherein SN is the total number of the residual operations; s is the number of residual operations contained in the combination of residual operations.
(3) And calculating the average preference degree of each residual operation combination according to the preference degree of each residual operation in the residual operation combinations, and sequencing all the residual operation combinations from large to small according to the average preference degree to obtain a combination sequence.
Specifically, the calculation formula of the average preference degree is as follows:
xz = (1/k) Σ yx_s, summed over the residual operations contained in the combination
wherein xz is the average preference degree of the residual error operation combination; k is the number of residual operations included in the combination of residual operations.
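A sketch of steps (1) to (3) above follows. Since the preference-degree formula appears only as an image, the combination of the ratio SN/s with the necessity By_s is assumed here to be a product, and the enumeration considers all fixed-size subsets of residual operations; both are assumptions for illustration:

    from itertools import combinations

    def preference_degrees(necessities):
        """yx_s for residual operations already sorted by importance, largest first.
        necessities[s-1] is By_s of the s-th operation.
        Assumes yx_s = (SN / s) * By_s as the combination of ratio and necessity."""
        SN = len(necessities)
        return [(SN / s) * by for s, by in enumerate(necessities, start=1)]

    def ranked_combinations(preferences, size):
        """Enumerate all combinations of `size` residual operations, compute the
        average preference xz of each, and sort them from large to small."""
        ranked = []
        for combo in combinations(range(len(preferences)), size):
            xz = sum(preferences[i] for i in combo) / len(combo)
            ranked.append((xz, combo))
        ranked.sort(key=lambda item: item[0], reverse=True)
        return ranked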
Further, to verify that removing a given combination of residual operations is reasonable, i.e., that it does not have an excessive influence on the neural network, the removal reasonableness of the combination must be calculated on the verification set. After the combination is removed, the degree to which the network output on the verification data changes between the networks before and after removal is compared; the smaller the change, the more reasonable the removal. The removal reasonableness of a residual operation combination is obtained as follows: for the residual neural network, the difference between the network inference value and the label value is measured by the cross-entropy loss function. When the structure of the network changes, the inference value changes while the label value does not, so the loss function value changes accordingly; the influence of removing a residual operation combination can therefore be judged simply by comparing the change in the loss function.
Specifically, calculating a difference value between the loss function values of the corresponding residual error neural networks before and after the residual error operation combination is removed, obtaining a removal reasonable degree of the corresponding residual error operation combination according to the difference value, wherein the difference value and the removal reasonable degree are in a positive correlation relationship, and then the calculation formula for removing the reasonable degree is as follows:
[formula given only as an image in the original: po is computed from the loss values es_q and es_p and the convergence value min(es_q); a larger loss difference gives a larger po]
where po is the removal reasonableness; es_q is the loss function value obtained before the residual operation combination is removed; es_p is the loss function value obtained after the corresponding residual operation combination is removed; min(es_q) is the convergence value of the loss function of the residual neural network before removal.
A reasonableness threshold is set, and the removal reasonableness of each residual operation combination is calculated in turn following the combination sequence; when the removal reasonableness exceeds the threshold, the corresponding residual operation combination and the residual operation combinations after it are removed, which completes the compression and simplification of the residual neural network.
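Finally, a hedged sketch of the compression step. The removal-reasonableness formula appears only as an image, so the relative loss change (es_p - es_q)/min(es_q) is used as an assumed stand-in; evaluate_loss, remove_operations, es_q_min and the threshold value are illustrative parameters, not names from the patent:

    def removal_reasonableness(es_q, es_p, es_q_min):
        """Assumed form of po: loss change caused by the removal, normalised by the
        convergence value of the pre-removal loss function."""
        return (es_p - es_q) / es_q_min

    def compress_network(network, combo_sequence, evaluate_loss, remove_operations,
                         es_q_min, threshold=0.05):
        """Walk the combination sequence; when the removal reasonableness of a
        combination exceeds the threshold, remove it together with the combinations
        after it, as the embodiment describes."""
        es_q = evaluate_loss(network)  # verification-set loss before any removal
        for idx, combo in enumerate(combo_sequence):
            candidate = remove_operations(network, combo)  # network without this combination
            es_p = evaluate_loss(candidate)
            po = removal_reasonableness(es_q, es_p, es_q_min)
            if po > threshold:
                for later_combo in combo_sequence[idx:]:
                    network = remove_operations(network, later_combo)
                break
        return network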
In summary, the embodiment of the present invention provides a residual neural network compression method based on activation information, which trains a residual neural network to determine the weights of each layer in the network; obtains the overall gradient anomaly degree of each hidden layer from the activation information and from it calculates the necessity of performing a residual operation between each hidden-layer association combination; and compresses and simplifies the residual neural network according to the removal reasonableness of each residual operation combination derived from that necessity. By evaluating the necessity of residual operations and the reasonableness of removing them, residual operations with little influence are removed from the neural network, so that the network is compressed, its demands on the storage space and computing performance of hardware devices are reduced, and it can be deployed on low-power equipment.
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (9)

1. A residual error neural network compression method based on activation information is characterized by comprising the following steps:
training a residual error neural network to determine the weight of each layer in the network;
acquiring activation information of each neuron in the current hidden layer based on the weight, calculating the gradient abnormal degree of each neuron by using the activation information, and calculating the overall gradient abnormal degree of the current hidden layer according to the gradient abnormal degree of each neuron; respectively calculating the necessity of performing residual operation between each hidden layer association combination based on the overall gradient abnormal degree;
calculating the preference degree of each residual operation according to the necessity of the residual operation, calculating the average preference degree of each residual operation combination by using the preference degree of each residual operation, and sequencing a plurality of residual operation combinations from large to small based on the average preference degree to obtain a combination sequence; and sequentially removing one residual operation combination in the combination sequence, calculating the reasonable removal degree of the residual operation combination according to the loss function of the residual neural network, and compressing and simplifying the residual neural network according to the reasonable removal degree.
2. The method of claim 1, wherein the method for obtaining activation information of each neuron in a current hidden layer based on the weight comprises:
and carrying out weighted summation on the input value of the previous layer of the current hidden layer and the corresponding weight value of the previous layer, and substituting the summation result into an activation function formula to obtain the activation information of the corresponding neuron in the current hidden layer.
3. The method of claim 2, wherein the method of using the activation information to calculate the degree of gradient abnormality for each neuron comprises:
and deriving the activation information to obtain a derivative, setting the maximum gradient of the activation function, calculating the gradient abnormal degree of the corresponding neuron by combining the derivative and the maximum gradient, wherein the derivative and the gradient abnormal degree are in a negative correlation relationship.
4. The method of claim 1, wherein said method of calculating an overall gradient anomaly degree for a current hidden layer from said gradient anomaly degree for each neuron comprises:
setting an abnormal degree threshold, when the abnormal degree of the gradient is greater than or equal to the abnormal degree threshold, considering the gradient of the neuron to be normal, counting a first number of the neurons corresponding to the normal gradient in the current hidden layer, calculating a ratio between the first number and the total number of the neurons in the current hidden layer, and taking the ratio as the whole abnormal degree of the gradient of the current hidden layer.
5. The method of claim 1, wherein the calculation of the necessity of the residual operation comprises:
and carrying out weighted summation on the overall gradient abnormal degree of each hidden layer in the hidden layer association combination, further calculating an abnormal degree average value, taking the abnormal degree average value as the necessity of carrying out residual error operation on the corresponding hidden layer association combination, wherein the weight corresponding to each hidden layer is set according to the sequence of the layer number of the hidden layer.
6. The method of claim 1, wherein the method of calculating a degree of preference for each residual operation based on the necessity of the residual operation comprises:
sequencing each residual operation from large to small according to the importance of the residual operation, and numbering;
and calculating the ratio between the total number of residual operations and the corresponding number of the current residual operations, and calculating the preference degree of the current residual operations by combining the ratio and the necessity thereof.
7. The method of claim 1, wherein the method for obtaining a reasonable degree of elimination of the residual operation combination comprises:
calculating a difference value between loss function values of the corresponding residual error neural networks before and after the residual error operation combination is removed, and obtaining the reasonable removal degree corresponding to the residual error operation combination according to the difference value, wherein the difference value and the reasonable removal degree are in a positive correlation relationship.
8. The method of claim 1, wherein the method of compression reduction of the residual neural network by a reasonable degree of said removing comprises:
and sequentially calculating the removal reasonable degree of each residual operation combination according to the combination sequence, and removing the corresponding residual operation combination and the subsequent residual operation combinations when the removal reasonable degree is greater than a reasonable threshold value.
9. The method of claim 1, wherein the number of combinations of residual operations is calculated by the formula:
C(SN, s) = SN!/(s!(SN - s)!)
wherein SN is the total number of the residual operations; s is the number of residual operations contained in the combination of residual operations.
CN202210045279.XA 2022-01-15 2022-01-15 Residual error neural network compression method based on activation information Pending CN114372572A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210045279.XA CN114372572A (en) 2022-01-15 2022-01-15 Residual error neural network compression method based on activation information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210045279.XA CN114372572A (en) 2022-01-15 2022-01-15 Residual error neural network compression method based on activation information

Publications (1)

Publication Number Publication Date
CN114372572A true CN114372572A (en) 2022-04-19

Family

ID=81144303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210045279.XA Pending CN114372572A (en) 2022-01-15 2022-01-15 Residual error neural network compression method based on activation information

Country Status (1)

Country Link
CN (1) CN114372572A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230526

Address after: Zone C, No. 56 Hangzhou Road, Kuancheng District, Changchun City, Jilin Province, 130000 DS99-3-D538-928

Applicant after: Jilin Xuyuguan Technology Co.,Ltd.

Address before: 100084 Tsinghua Yuan, Beijing, Haidian District

Applicant before: Qin Guoqing