CN112819252A - Convolutional neural network model construction method

Info

Publication number: CN112819252A
Application number: CN202110224274.9A
Authority: CN
Inventors: 聂鼎; 宋忧乐; 范黎涛; 施冬明
Original assignee: Electric Power Research Institute of Yunnan Power Grid Co Ltd
Current assignee: Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority date: 2021-03-01
Filing date: 2021-03-01
Publication date: 2021-05-18

Abstract

The invention relates to a convolutional neural network model construction method, which comprises the following steps: collecting data related to the power distribution network lines; preprocessing the data related to the power distribution network lines; constructing a convolutional neural network model, and training the convolutional neural network model based on the preprocessed data related to the power distribution network lines; and carrying out lightweight compression processing on the trained convolutional neural network model. The method can compress and accelerate a conventional convolutional neural network model while leaving its accuracy essentially unaffected, which shows in three ways: the model is smaller and therefore easier to transmit and deploy; the model runs faster; and the model occupies fewer computing resources at run time.

Description

Convolutional neural network model construction method

Technical Field

The application relates to the technical field of power distribution network fault prediction, in particular to a convolutional neural network model construction method.

Background

With the advent of the 5G era, more and more data and information are collected and processed by intelligent terminals to support increasingly intelligent applications. Among these, the ability to process and recognize image data determines the upper limit of image-based applications.

Research shows that deep learning, particularly the convolutional neural network method, achieves better performance in the field of image recognition. Because deep learning is a self-adaptive method, a large amount of manual feature engineering is not required, which reduces the manual workload and avoids the poor model performance caused by improper feature selection. As the name suggests, a convolutional neural network is a type of neural network that mainly uses convolutional layers for feature extraction and result calculation; a convolutional layer is essentially a combination of feature filters that extract different features from the image.

An analysis of existing research on convolutional neural networks shows that, although these models achieve performance close to that of humans in the field of image recognition, one of their greatest drawbacks is resource consumption. Specifically, existing convolutional neural network models are often several hundred megabytes in size, and training must be carried out on high-performance GPUs or TPUs; even so, training time is measured in days or weeks, which is too long.

A huge model not only occupies more resources for storage and transmission; if its computation time is long, it also cannot meet real-time requirements, which greatly reduces its usefulness in some scenarios.

Disclosure of Invention

The application provides a convolutional neural network model construction method, which aims to solve the problems that the model in the prior art not only occupies more resources in the aspects of storage and transmission, but also has a longer calculation period and cannot meet the real-time requirement.

The technical scheme adopted by the application is as follows:

the invention provides a convolutional neural network model construction method, which comprises the following steps:

collecting data related to the power distribution network lines;

preprocessing data related to the power distribution network line;

constructing a convolutional neural network model, and training the convolutional neural network model based on the preprocessed data related to the power distribution network line;

and carrying out lightweight compression processing on the trained convolutional neural network model.

Further, after performing a lightweight compression process on the trained model, the method further includes:

and performing maximum pooling on the pooling layer segments of the convolutional neural network model based on the preprocessed power distribution network line related data.

Further, the segments are a line operation fault stage and a line operation normal stage.

Further, after performing maximum pooling processing on the pooling layer segment of the convolutional neural network model based on the preprocessed power distribution network line-related data, the method further includes:

and performing secondary training on the convolutional neural network model after the maximum pooling by using the preprocessed data related to the power distribution network lines.

Further, the data related to the distribution network line includes:

the network structure of the power distribution network line, historical operation data and historical fault overhaul data.

Further, the method for performing lightweight compression processing on the trained model comprises the following steps:

and carrying out low-precision lightweight processing and/or model pruning compression processing on the trained model.

The technical scheme of the application has the following beneficial effects:

the invention discloses a convolutional neural network model construction method, which comprises the following steps: collecting data related to the power distribution network lines; preprocessing data related to the power distribution network line; constructing a convolutional neural network model, and training the convolutional neural network model based on the preprocessed data related to the power distribution network line; and carrying out lightweight compression processing on the trained convolutional neural network model.

Because the lightweight technology is adopted, the size of the convolutional neural network model is reduced, the response time is shortened, and the resource consumption of the model is reduced; meanwhile, the adopted segmented maximum pooling technology can capture topological feature information among the devices of the grid structure, so that the fault prediction effect is improved.

Drawings

In order to more clearly explain the technical solution of the present application, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious to those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flow chart of a convolutional neural network model construction method according to an embodiment of the present invention;

FIG. 2 is a diagram illustrating a typical convolutional neural network structure in a convolutional neural network model construction method according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following examples do not represent all embodiments consistent with the present application; they are merely examples of systems and methods consistent with certain aspects of the application, as recited in the claims.

According to the background art, the huge model in the prior art not only occupies more resources in the aspects of storage and transmission, but also cannot meet the real-time requirement if the calculation period of the model is long, and the application effect of the model in some scenes is greatly reduced. Therefore, the research on the technology for carrying out lightweight processing on the model can reduce the resource occupation and improve the response speed on the premise of ensuring a certain accuracy rate, and has very important significance.

See FIG. 1 and FIG. 2.

The application provides a convolutional neural network model construction method, which comprises the following steps:

S01: collecting data related to the power distribution network lines;

wherein the data related to the power distribution network lines includes: the network structure of the power distribution network line, historical operation data, and historical fault overhaul data.
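As an illustration only, the three kinds of line-related data named above could be held in a single record per line. The sketch below is an assumption: the class name `LineRecord`, its field names, and the idea of indexing faults by time step are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LineRecord:
    """Hypothetical container for the three kinds of line-related data."""
    line_id: str
    # Network structure: edges between device identifiers on the line.
    topology: List[Tuple[str, str]] = field(default_factory=list)
    # Historical operation data: one dict of measurements per time step.
    operation_history: List[Dict[str, float]] = field(default_factory=list)
    # Historical fault overhaul data: time-step indices with a recorded fault.
    fault_steps: List[int] = field(default_factory=list)

    def fault_mask(self) -> List[bool]:
        """Return True for each time step that falls in a recorded fault."""
        faults = set(self.fault_steps)
        return [i in faults for i in range(len(self.operation_history))]
```

A record with three time steps and a fault recorded at step 1 would yield the mask `[False, True, False]`, which is one possible way to mark fault and normal periods for later steps.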

Power distribution network fault prediction mainly targets trend faults with a long development period, faults with a cumulative effect, faults that follow a certain statistical pattern, and the like. Taking the current state of the equipment or system as the starting point, it predicts the future state of the equipment or system by combining the structural characteristics, operating parameters, meteorological factors, environmental conditions, historical data, and the like of the known object under test. The risk to the power distribution network can thus be predicted, and maintenance or planning can be carried out in advance, which guarantees stable and safe operation of the power distribution network.

S02: preprocessing data related to the power distribution network line;

The goal of preprocessing is to make the collected data more regular so that the model can use it more easily. Common preprocessing steps include: handling missing values, by filling or deleting them; handling abnormal values, by identifying, deleting, or modifying them; and handling inconsistent values, by normalizing them, for example converting measurements to a common unit when their units differ.

Preprocessed data makes it easier for the model to identify the features that distinguish different types of samples.
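The three preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the application's procedure: the function names, the choice of median filling, the clipping range, and the use of voltage units are all assumptions.

```python
from statistics import median
from typing import List, Optional


def fill_missing(values: List[Optional[float]]) -> List[float]:
    """Fill missing readings (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values: List[float], low: float, high: float) -> List[float]:
    """Modify abnormal values by clipping them to an assumed valid range."""
    return [min(max(v, low), high) for v in values]


def to_kilovolts(value: float, unit: str) -> float:
    """Unify inconsistent units by converting a voltage reading to kV."""
    divisors = {"V": 1000.0, "kV": 1.0}  # assumed set of units
    return value / divisors[unit]
```

For example, `fill_missing([1.0, None, 3.0])` returns `[1.0, 2.0, 3.0]`, and `clip_outliers([-5.0, 10.0, 95.0], 0.0, 12.0)` returns `[0.0, 10.0, 12.0]`. Deleting records instead of filling or clipping them is an equally valid choice under the description above.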

S03: constructing a convolutional neural network model, and training the convolutional neural network model based on the preprocessed data related to the power distribution network line;

FIG. 2 is a diagram of a basic convolutional neural network model architecture. As can be seen from the figure, the convolutional neural network model comprises a convolutional layer, a pooling layer, a nonlinear unit and a fully connected layer. The main role of each layer is as follows:

1) Convolutional layer: the core of a convolutional neural network, also called the feature extraction layer, mainly used to extract features from images. It consists of a set of convolution kernels whose weights are learned and updated automatically according to an objective function.

2) Pooling layer: also called the down-sampling layer. It generally performs a dimension reduction operation between two consecutive convolutional layers, which effectively reduces the number of model parameters and alleviates over-fitting of the network. Common types include the max pooling layer (Max Pool) and the average pooling layer (Average Pool).

3) Nonlinear unit: consists of a nonlinear activation function. By their characteristics, these functions are divided into saturating nonlinear activation functions, such as the Sigmoid and Tanh functions, and non-saturating nonlinear activation functions, such as the ReLU and Leaky ReLU functions. The nonlinear unit applies a nonlinear mapping to the output of the convolutional layer, so that the neural network can approximate any nonlinear function arbitrarily closely, which improves the feature expression capability of the model.

4) Fully connected layer: generally placed at the tail of the neural network and used for classification; it converts the two-dimensional output of the convolutional layers into a one-dimensional vector.
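The four layer types listed above can be illustrated with a tiny forward pass. The sketch below is an assumption made for brevity: it works on a one-dimensional signal rather than an image, uses a single kernel and a single output neuron, and all function names are hypothetical.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float]) -> List[float]:
    """Convolutional layer: valid-mode sliding dot product with one kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(xs: List[float]) -> List[float]:
    """Nonlinear unit: non-saturating ReLU activation."""
    return [max(0.0, x) for x in xs]


def max_pool(xs: List[float], size: int) -> List[float]:
    """Pooling layer: non-overlapping max pooling; a partial tail is dropped."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]


def dense(xs: List[float], weights: List[float], bias: float) -> float:
    """Fully connected layer with a single output neuron."""
    return sum(x * w for x, w in zip(xs, weights)) + bias
```

With the signal `[1.0, 2.0, 4.0, 3.0, 1.0]` and kernel `[1.0, -1.0]`, `conv1d` gives `[-1.0, -2.0, 1.0, 2.0]`, `relu` gives `[0.0, 0.0, 1.0, 2.0]`, and `max_pool` with size 2 gives `[0.0, 2.0]`, which the fully connected layer then reduces to a single score.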

S04: carrying out lightweight compression processing on the trained convolutional neural network model;

The neural network model constructed in S03 tends to have many layers, that is, great depth. As a result, training and applying the model not only consumes huge resources but also makes it difficult to meet timeliness requirements. Therefore, it is necessary to "slim down" the model through lightweight processing.

Namely, the trained model is subjected to low-precision lightweight processing and/or model pruning compression processing. In short, either or both of the two may be selected; both are relatively mature methods.

1) Low precision and light weight

A model is essentially a set of variables combined according to a certain structure, and low-precision lightweight processing operates on those variables. For example, each convolution kernel may be stored with 16-bit or even 8-bit floating point precision instead of 64-bit precision, which reduces the space occupied by each variable to 1/4 or even 1/8. Of course, lowering the precision of the variables introduces errors, so careful analysis is needed to decide which variables receive low-precision processing. Because the total number of parameters in a convolutional neural network is huge, the size of the model is generally reduced significantly after low-precision processing.
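The storage arithmetic above can be checked with a short sketch that uses only the Python standard library. The function names are hypothetical. The 16-bit case uses the IEEE half-precision format available through `struct`; for the 8-bit case the sketch substitutes symmetric linear quantization to signed 8-bit integers, which is a swapped-in technique assumed here because the application does not specify an 8-bit floating point format.

```python
import struct
from typing import List, Tuple


def pack_weights(weights: List[float], fmt: str) -> bytes:
    """Serialize weights: fmt 'd' is 64-bit float, 'e' is 16-bit float."""
    return struct.pack(f"<{len(weights)}{fmt}", *weights)


def unpack_weights(blob: bytes, fmt: str) -> List[float]:
    """Inverse of pack_weights for the same format character."""
    count = len(blob) // struct.calcsize(f"<{fmt}")
    return list(struct.unpack(f"<{count}{fmt}", blob))


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Assumed 8-bit scheme: symmetric linear quantization plus one scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    return [round(w / scale) for w in weights], scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate weights from 8-bit codes."""
    return [c * scale for c in codes]
```

Packing the kernel `[0.5, -0.25, 0.125, 1.0]` with `"d"` takes 32 bytes and with `"e"` takes 8 bytes, a ratio of 1/4; one byte per 8-bit code gives the 1/8 figure. These particular values survive the 16-bit round trip exactly, but in general the round trip introduces an error, which is why the choice of variables to compress needs analysis.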

2) Model pruning

Model pruning is an effective model compression method. By pruning unimportant neurons, filters or channels, the parameters and the computational load of the model can be compressed efficiently. The invention adopts a model pruning method based on a sparse convolutional neural network: sparse regularization training is performed first, so that some of the network's parameters tend toward or equal 0 during training, yielding a deep neural network model with sparse weights; the model is then pruned by removing the sparse filters and channels; finally, the model is fine-tuned to recover its accuracy.
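The first two steps of the pruning procedure described above can be sketched as follows. The application does not name the regularizer or the pruning criterion, so the L1 penalty and the mean-absolute-weight threshold used here are assumptions, as are the function names; the fine-tuning step is omitted.

```python
from typing import List

Filter = List[float]


def l1_penalty(filters: List[Filter], strength: float) -> float:
    """Assumed sparsity term added to the training loss (L1 norm of weights)."""
    return strength * sum(abs(w) for f in filters for w in f)


def select_filters(filters: List[Filter], threshold: float) -> List[int]:
    """Return indices of filters to keep: mean |weight| above the threshold."""
    keep = []
    for idx, f in enumerate(filters):
        if sum(abs(w) for w in f) / len(f) > threshold:
            keep.append(idx)
    return keep


def prune(filters: List[Filter], threshold: float) -> List[Filter]:
    """Drop the filters that sparse training has driven toward zero."""
    return [filters[i] for i in select_filters(filters, threshold)]
```

For the filters `[[0.8, -0.6], [0.001, 0.0], [0.0, 0.0], [-0.3, 0.4]]` and a threshold of `0.01`, `select_filters` returns `[0, 3]`, so half of the filters are removed before fine-tuning.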

S05: performing maximum pooling on the pooling layer segments of the convolutional neural network model based on the preprocessed power distribution network line-related data;

The segments are a line operation fault stage and a line operation normal stage. For example, if one month of operation data is collected for a certain line and the line experiences a fault on two of those days, the data collected during that period is fault data, and the data from the remaining time is normal data.

Because the features captured by the conventional max-pooling layer or average pooling layer are too coarse, the present invention employs a segmented max-pooling layer as the pooling layer in the model. The data in the training set (the preprocessed data related to the distribution network line) is divided into two sections (a line operation fault stage and a line operation normal stage), and then a maximum pooling method is applied to each section respectively.
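The segmented maximum pooling described above can be sketched as follows, assuming a one-dimensional feature sequence with a Boolean mask marking the fault stage. The function name, the mask representation, and the value returned for an empty segment are assumptions of this sketch.

```python
from typing import List, Sequence


def segmented_max_pool(features: Sequence[float],
                       is_fault: Sequence[bool],
                       empty_value: float = 0.0) -> List[float]:
    """Max-pool the fault segment and the normal segment separately.

    Returns [max over fault steps, max over normal steps]. A segment with
    no time steps yields `empty_value` (an assumption of this sketch).
    """
    if len(features) != len(is_fault):
        raise ValueError("features and is_fault must have the same length")
    fault = [x for x, f in zip(features, is_fault) if f]
    normal = [x for x, f in zip(features, is_fault) if not f]
    return [max(fault) if fault else empty_value,
            max(normal) if normal else empty_value]
```

For the features `[0.2, 0.9, 0.4, 0.1, 0.3]` with the mask `[False, True, True, False, False]`, the result is `[0.9, 0.3]`. A conventional max pooling over the whole sequence would return only `0.9`, discarding the strongest response from the normal stage.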

S06: and performing secondary training on the convolutional neural network model after the maximum pooling by using the preprocessed data related to the power distribution network lines.

The purpose of the secondary training is to make the segmented maximum pooling layer take effect. In fact, the model is already a complete fault prediction model without this layer, but adding the pooling layer and performing maximum pooling in segments can still improve the result; the purpose of adding the maximum pooling layer is to improve the accuracy of the prediction. After the maximum pooling layer is added, the weights of the model still need to be adjusted according to the fault condition, so secondary training is needed.

From the above, the present invention provides a convolutional neural network construction method based on a lightweight technique and segmented maximum pooling, comprising: (1) compressing the size of the convolutional neural network model by adopting a model lightweight technology; and (2) further improving the effect of the model on the power distribution network line fault prediction problem by adopting a segmented maximum pooling method. That is, because the lightweight technology is adopted, the size of the convolutional neural network model is reduced, the response time is shortened, and the resource consumption of the model is reduced; meanwhile, the adopted segmented maximum pooling technology can capture topological feature information among the devices of the grid structure, so that the fault prediction effect is improved.

The method can compress and accelerate a conventional convolutional neural network model while leaving its accuracy essentially unaffected, which shows in three ways: the model is smaller and therefore easier to transmit and deploy; the model runs faster; and the model occupies fewer computing resources at run time.

The embodiments provided in the present application are only a few examples of the general concept of the present application, and do not limit the scope of the present application. Any other embodiment that a person skilled in the art derives from the scheme of the present application without inventive effort falls within the scope of protection of the present application.

Claims

1. A convolutional neural network model construction method is characterized by comprising the following steps:

collecting data related to the power distribution network lines;

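preprocessing data related to the power distribution network line;

constructing a convolutional neural network model, and training the convolutional neural network model based on the preprocessed data related to the power distribution network line;

and carrying out lightweight compression processing on the trained convolutional neural network model.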

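2. The convolutional neural network model building method according to claim 1, further comprising, after performing a lightweight compression process on the trained model:

performing maximum pooling on the pooling layer segments of the convolutional neural network model based on the preprocessed power distribution network line related data.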

3. The convolutional neural network model building method of claim 2, wherein the segment is divided into a line operation fault phase and a line operation normal phase.

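4. The convolutional neural network model building method as claimed in claim 2 or 3, wherein after performing maximum pooling on the pooling layer segment of the convolutional neural network model based on the preprocessed power distribution network line-related data, further comprising:

performing secondary training on the convolutional neural network model after the maximum pooling by using the preprocessed data related to the power distribution network lines.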

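5. The convolutional neural network model building method of claim 1, wherein the power distribution network line-related data comprises:

the network structure of the power distribution network line, historical operation data and historical fault overhaul data.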

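6. The convolutional neural network model building method according to claim 1, wherein performing a lightweight compression process on the trained model includes:

carrying out low-precision lightweight processing and/or model pruning compression processing on the trained model.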