CN113591773A - Power distribution room object detection method, device and equipment based on convolutional neural network - Google Patents

Power distribution room object detection method, device and equipment based on convolutional neural network

Info

Publication number
CN113591773A
CN113591773A
Authority
CN
China
Prior art keywords
image
neural network
convolutional neural
classification
level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110914881.8A
Other languages
Chinese (zh)
Other versions
CN113591773B (en)
Inventor
程津
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Zhongdian Smart Technology Co ltd
Original Assignee
Wuhan Zhongdian Smart Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Zhongdian Smart Technology Co ltd
Priority to CN202110914881.8A
Publication of CN113591773A
Application granted
Publication of CN113591773B
Active legal status
Anticipated expiration legal status

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02EREDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70Smart grids as climate change mitigation technology in the energy generation sector
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a power distribution room object detection method based on a convolutional neural network, which comprises the following steps: constructing an image data set of detected target objects, wherein the image data set comprises a labeled first image and an unlabeled second image; constructing a network structure of a convolutional neural network model; alternately inputting the first image and the second image into the convolutional neural network model for training, and continuously optimizing the network structure parameters of the convolutional neural network model; and detecting target objects in the power distribution room by using the optimized convolutional neural network model. By constructing the convolutional neural network model and alternately training it on the labeled first image and the unlabeled second image, the second image does not need to be labeled, which reduces the time, labor and material costs of labeling a large number of images while still ensuring the accuracy of object detection in the power distribution room.

Description

Power distribution room object detection method, device and equipment based on convolutional neural network
Technical Field
The invention belongs to the technical field of object detection, and particularly relates to a method, a device and equipment for detecting objects in a power distribution room based on a convolutional neural network.
Background
In recent years, with the development of deep learning technology, many target detection approaches for power distribution rooms have been studied. These algorithms fit a model by combining a large number of object annotations on power distribution room camera images with deep learning, so as to assist object inspection. However, during automatic inspection of targets in a power distribution room, the scenes vary widely and the marginal distributions of daytime and nighttime images differ greatly, so the existing deep-learning-based methods for detecting target objects in a power distribution room have the following defects:
(1) Image heterogeneity is severe: under the influence of illumination and focal length in a power distribution room, images often have different marginal distributions, which directly reduces the generalization performance of the model, i.e., similar objects are detected as different classes under different marginal distributions.
(2) Improving the detection quality of nighttime images requires adding a large number of labels to the nighttime images, so the cost of image labeling is high.
Disclosure of Invention
The invention aims to provide a power distribution room object detection method, device and equipment based on a convolutional neural network, which are used for solving at least one technical problem in the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for detecting objects in a power distribution room based on a convolutional neural network, including:
constructing an image data set of the detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image;
constructing a network structure of a convolutional neural network model;
inputting the first image and the second image into the convolutional neural network model alternately for training, and continuously optimizing the network structure parameters of the convolutional neural network model;
and detecting the target object of the power distribution room by using the optimized convolutional neural network model.
In one possible design, the convolutional neural network model includes: the system comprises an encoder, a region generation network, an example-level coordinate regressor, an example-level classifier, an image-level discriminator, an example-level discriminator, an image-level category regularization module and a classification consistency regularization module.
In one possible design, the alternately inputting the first image and the second image into the convolutional neural network model for training, and continuously optimizing the network structure parameters of the convolutional neural network model includes:
alternately inputting the first image and the second image into the encoder, and extracting a first feature G_E(x_i^s) and a second feature G_E(x_i^t), wherein x_i^s represents the i-th first image and x_i^t represents the i-th second image;
inputting the first feature G_E(x_i^s) and the second feature G_E(x_i^t) into the region generation network to obtain a first candidate frame R(G_E(x_i^s)) and a second candidate frame R(G_E(x_i^t));
respectively performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the example-level coordinate regressor and the example-level classifier, to obtain a first positioning result ŷ_pos^s, a first classification result ŷ_class^s, a second positioning result ŷ_pos^t and a second classification result ŷ_class^t, and calculating a first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s;
using the image-level discriminator to classify and discriminate the first feature G_E(x_i^s) and the second feature G_E(x_i^t), and calculating a second loss function of the image-level discriminator;
on the basis of the first feature G_E(x_i^s), predicting a multi-label classification result C(G_E(x_i^s)) of the first image by using the image-level classification regularization module;
using the classification consistency regularization module to perform a consistency judgment between the first classification result ŷ_class^s and the multi-label classification result C(G_E(x_i^s)), so as to obtain a classification consistency weight matrix λ;
using the instance-level discriminator to classify and discriminate the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)), and calculating a third loss function of the instance-level discriminator based on the classification consistency weight matrix λ;
calculating a total loss function of the convolutional neural network model according to the first loss function, the second loss function and the third loss function;
and continuously optimizing the network structure parameters of the convolutional neural network model by using the total loss function.
In one possible design, before respectively performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) with the example-level coordinate regressor and the example-level classifier, the method further comprises:
sorting the region confidences of the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) respectively by using a non-maximum suppression method;
respectively selecting the first T candidate frames as the final first candidate frame R_f(G_E(x_i^s)) and the final second candidate frame R_f(G_E(x_i^t)).
In one possible design, calculating the first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s comprises:
strongly supervising the first positioning result ŷ_pos^s and the first classification result ŷ_class^s with a cross-entropy loss, and calculating a strong supervision loss function according to the following formulas:

L_{s\_class} = -\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k=1}^{C} y\_class_{ijk} \log \hat{y}\_class_{ijk}; (1)

L_{s\_pos} = -\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k=1}^{K} y\_pos_{ijk} \log \hat{y}\_pos_{ijk}; (2)

L_s = L_{s\_class} + L_{s\_pos}; (3)

wherein L_s is the first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s, N represents the total number of first images, i indexes the i-th first image, M represents the number of labels contained in the i-th first image, j indexes the j-th label in the i-th first image, C represents the total number of prediction categories, K represents the number of coordinates of the current label frame, k indexes the current k-th category or k-th label coordinate, and y_class and y_pos represent the label values of the current label;
the closer L_s_class and L_s_pos are to 0, the more accurate the detection classification result and the positioning result.
In one possible design, the second loss function of the image-level discriminator is calculated as follows:

L_{image} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \log D_{image}(G_E(x_i^s)) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{image}(G_E(x_i^t))); (4)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_image(G_E(x_i^s)) and D_image(G_E(x_i^t)) represent the image-level discriminator outputs for the first image and the second image, D_image represents the image-level discriminator, and L_image represents the loss function of the image-level discriminator; when the discriminator output approaches 0.5, the image-level discriminator can no longer distinguish the first feature G_E(x_i^s) from the second feature G_E(x_i^t).
In one possible design, using the classification consistency regularization module to perform the consistency judgment between the first classification result ŷ_class^s and the multi-label classification result to obtain the classification consistency weight matrix λ comprises:
using the classification consistency regularization module, with the first classification result ŷ_class^s as a basis, traversing the multi-label classification result C(G_E(x_i^s)) and searching whether the multi-label classification result C(G_E(x_i^s)) contains the first classification result ŷ_class^s;
and obtaining the classification consistency weight matrix λ according to the search result.
In one possible design, the third loss function of the example-level discriminator is calculated based on the classification consistency weight matrix λ as follows:

L_{instance} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \lambda_i \log D_{instance}(R(G_E(x_i^s))) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{instance}(R(G_E(x_i^t)))); (5)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_instance(R(G_E(x_i^s))) and D_instance(R(G_E(x_i^t))) represent the example-level discriminator outputs for the first candidate frame and the second candidate frame, D_instance represents the example-level discriminator, λ_i is the classification consistency weight associated with the i-th first image, and L_instance represents the loss function of the example-level discriminator; when the discriminator output approaches 0.5, the example-level discriminator can no longer distinguish the first candidate frame feature R(G_E(x_i^s)) from the second candidate frame feature R(G_E(x_i^t)).
In one possible design, the first image is a distribution room daytime image and the second image is a distribution room nighttime image.
In a second aspect, the present invention provides a power distribution room object detection apparatus based on a convolutional neural network, the apparatus comprising:
the data set construction module is used for constructing an image data set of the detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image;
the neural network model building module is used for building a network structure of the convolutional neural network model;
the training module is used for inputting the first image and the second image into the convolutional neural network model alternately for training and optimizing the network structure parameters of the convolutional neural network model continuously;
and the detection module is used for detecting the target object of the power distribution room by using the optimized convolutional neural network model.
In one possible design, the convolutional neural network model includes: the system comprises an encoder, a region generation network, an example-level coordinate regressor, an example-level classifier, an image-level discriminator, an example-level discriminator, an image-level category regularization module and a classification consistency regularization module.
In one possible design, the training module specifically includes:
a feature extraction unit for alternately inputting the first image and the second image into the encoder and extracting a first feature G_E(x_i^s) and a second feature G_E(x_i^t), wherein x_i^s represents the i-th first image and x_i^t represents the i-th second image;
a candidate frame extracting unit for inputting the first feature G_E(x_i^s) and the second feature G_E(x_i^t) into the region generation network to obtain a first candidate frame R(G_E(x_i^s)) and a second candidate frame R(G_E(x_i^t));
a coordinate regression and classification unit for respectively performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the example-level coordinate regressor and the example-level classifier, to obtain a first positioning result ŷ_pos^s, a first classification result ŷ_class^s, a second positioning result ŷ_pos^t and a second classification result ŷ_class^t, and for calculating a first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s;
a first classification discrimination unit for classifying and discriminating the first feature G_E(x_i^s) and the second feature G_E(x_i^t) by using the image-level discriminator, and for calculating a second loss function of the image-level discriminator;
a classification result prediction unit for predicting, on the basis of the first feature G_E(x_i^s), a multi-label classification result C(G_E(x_i^s)) of the first image by using the image-level classification regularization module;
a classification result judgment unit for performing, by using the classification consistency regularization module, a consistency judgment between the first classification result ŷ_class^s and the multi-label classification result C(G_E(x_i^s)) to obtain a classification consistency weight matrix λ;
a second classification discrimination unit for classifying and discriminating the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the instance-level discriminator, and for calculating a third loss function of the instance-level discriminator based on the classification consistency weight matrix λ;
a loss function calculation unit, configured to calculate a total loss function of the convolutional neural network model according to the first loss function, the second loss function, and the third loss function;
and the model optimization unit is used for continuously optimizing the network structure parameters of the convolutional neural network model by utilizing the total loss function.
In one possible design, before the example-level coordinate regressor and the example-level classifier respectively perform coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)), the training module is further configured to:
sort the region confidences of the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) respectively by using a non-maximum suppression method;
respectively select the first T candidate frames as the final first candidate frame R_f(G_E(x_i^s)) and the final second candidate frame R_f(G_E(x_i^t)).
In one possible design, when calculating the first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s, the coordinate regression and classification unit is specifically configured to:
strongly supervise the first positioning result ŷ_pos^s and the first classification result ŷ_class^s with a cross-entropy loss, and calculate a strong supervision loss function according to the following formulas:

L_{s\_class} = -\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k=1}^{C} y\_class_{ijk} \log \hat{y}\_class_{ijk}; (1)

L_{s\_pos} = -\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k=1}^{K} y\_pos_{ijk} \log \hat{y}\_pos_{ijk}; (2)

L_s = L_{s\_class} + L_{s\_pos}; (3)

wherein L_s is the first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s, N represents the total number of first images, i indexes the i-th first image, M represents the number of labels contained in the i-th first image, j indexes the j-th label in the i-th first image, C represents the total number of prediction categories, K represents the number of coordinates of the current label frame, k indexes the current k-th category or k-th label coordinate, and y_class and y_pos represent the label values of the current label; the closer L_s_class and L_s_pos are to 0, the more accurate the detection classification result and the positioning result.
In one possible design, when calculating the second loss function of the image-level discriminator, the first classification discrimination unit uses the following calculation formula:

L_{image} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \log D_{image}(G_E(x_i^s)) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{image}(G_E(x_i^t))); (4)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_image(G_E(x_i^s)) and D_image(G_E(x_i^t)) represent the image-level discriminator outputs for the first image and the second image, D_image represents the image-level discriminator, and L_image represents the loss function of the image-level discriminator; when the discriminator output approaches 0.5, the image-level discriminator can no longer distinguish the first feature G_E(x_i^s) from the second feature G_E(x_i^t).
In one possible design, when the classification consistency regularization module is used to perform the consistency judgment between the first classification result ŷ_class^s and the multi-label classification result to obtain the classification consistency weight matrix λ, the classification result judgment unit is specifically configured to:
use the classification consistency regularization module, with the first classification result ŷ_class^s as a basis, to traverse the multi-label classification result C(G_E(x_i^s)) and search whether the multi-label classification result C(G_E(x_i^s)) contains the first classification result ŷ_class^s;
and obtain the classification consistency weight matrix λ according to the search result.
In one possible design, the third loss function of the example-level discriminator is calculated based on the classification consistency weight matrix λ as follows:

L_{instance} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \lambda_i \log D_{instance}(R(G_E(x_i^s))) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{instance}(R(G_E(x_i^t)))); (5)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_instance(R(G_E(x_i^s))) and D_instance(R(G_E(x_i^t))) represent the example-level discriminator outputs for the first candidate frame and the second candidate frame, D_instance represents the example-level discriminator, λ_i is the classification consistency weight associated with the i-th first image, and L_instance represents the loss function of the example-level discriminator; when the discriminator output approaches 0.5, the example-level discriminator can no longer distinguish the first candidate frame feature R(G_E(x_i^s)) from the second candidate frame feature R(G_E(x_i^t)).
In one possible design, the first image is a distribution room daytime image and the second image is a distribution room nighttime image.
In a third aspect, the present invention provides a computer device, comprising a memory, a processor and a transceiver, which are communicatively connected in sequence, wherein the memory is used for storing a computer program, the transceiver is used for sending and receiving messages, and the processor is used for reading the computer program and executing the convolutional neural network-based object detection method for an electric distribution room as described in any one of the possible designs of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon instructions which, when executed on a computer, perform a method for detecting objects in a power distribution room based on a convolutional neural network as set forth in any one of the possible designs of the first aspect.
In a fifth aspect, the present invention provides a computer program product containing instructions which, when run on a computer, cause the computer to perform a method of convolutional neural network-based object detection in a power distribution room as described in any one of the possible designs of the first aspect.
Beneficial effects:
1. the method comprises the steps of constructing an image data set of a detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image; constructing a network structure of a convolutional neural network model; then, the first image and the second image are alternately input into the convolutional neural network model for training, and the network structure parameters of the convolutional neural network model are continuously optimized; and finally, detecting the target object of the power distribution room by using the optimized convolutional neural network model. According to the method, the convolutional neural network model is constructed, and the labeled first image and the unlabeled second image are alternately input into the convolutional neural network model for training, so that the second image does not need to be labeled, and time cost, labor cost and material cost for labeling a large number of images are reduced; and meanwhile, the accuracy and robustness of object detection of the distribution room can be ensured.
2. The method uses the image-level discriminator and the example-level discriminator of an adversarial network to classify the first and second features and the first and second candidate frames respectively, so as to extract consistent features; these consistent features are used in subsequent model training, providing more accurate image features for the neural network model and improving its applicability. Secondly, the weak localization property of the multi-label classifier in the image-level classification regularization module encourages the neural network model to focus on the important regions containing the main targets, improving training efficiency. Finally, after the image-level multi-label classification result is obtained, the classification consistency regularization module computes the consistency between the image-level prediction and the instance-level prediction, further constraining the target detection classification result and improving the model's ability to detect difficult samples.
Drawings
Fig. 1 is a flowchart of a convolutional neural network-based object detection method for a power distribution room in the present embodiment;
fig. 2 is a network configuration block diagram of a neural network model in the present embodiment;
fig. 3 is a block diagram of the structure of the distribution room object detection apparatus based on the convolutional neural network in this embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments in the present description, belong to the protection scope of the present invention.
Example one
As shown in fig. 1-2, in a first aspect, the present embodiment provides a method for detecting objects in a power distribution room based on a convolutional neural network, including but not limited to the following steps S101 to S104:
s101, constructing an image data set of a detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image;
The first image is a daytime image of the power distribution room and the second image is a nighttime image of the power distribution room; objects in the distribution room, such as distribution cabinets, workers' clothes and workers' safety helmets, are labeled in the daytime image.
S102, constructing a network structure of a convolutional neural network model;
it should be noted that the convolutional neural network model includes, but is not limited to, an encoder, a region generation network, an example-level coordinate regressor, an example-level classifier, an image-level discriminator, an example-level discriminator, an image-level category regularization module, and a classification consistency regularization module.
The encoder is composed of N coding blocks; preferably, the encoder is composed of 5 coding blocks, and each coding block may include one 3 × 3 convolutional layer, one batch normalization layer, one ReLU activation layer and one pooling layer. The region generation network is composed of a series of convolutional layers. The example-level coordinate regressor and the example-level classifier are composed of 3 fully connected layers. The image-level discriminator and the example-level discriminator are respectively composed of three convolutional layers, a max pooling layer, a fully connected layer and a sigmoid activation layer. The image-level classification regularization module is composed of three convolutional layers, a max pooling layer, a fully connected layer and a softmax activation layer.
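As a concrete illustration of this architecture, the following sketch assembles an encoder from five such coding blocks, assuming PyTorch. It is only a minimal sketch: the class names, the 2 × 2 pooling stride, and the default channel widths (the patent states only some of the per-scale widths, so the remaining widths are assumptions) are illustrative rather than part of the patent.

```python
import torch
import torch.nn as nn

class CodingBlock(nn.Module):
    """One coding block: 3x3 convolution -> batch normalization -> ReLU -> pooling."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),  # halves the spatial resolution
        )

    def forward(self, x):
        return self.layers(x)

class Encoder(nn.Module):
    """Encoder G_E built from five stacked coding blocks."""
    def __init__(self, channels=(3, 128, 256, 256, 512, 1024)):
        super().__init__()
        self.blocks = nn.Sequential(
            *[CodingBlock(channels[i], channels[i + 1]) for i in range(len(channels) - 1)]
        )

    def forward(self, x):
        # A 512x512x3 input is reduced to a 16x16x1024 feature map after five blocks.
        return self.blocks(x)

features = Encoder()(torch.randn(1, 3, 512, 512))  # -> torch.Size([1, 1024, 16, 16])
```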
S103, alternately inputting the first image and the second image into the convolutional neural network model for training, and continuously optimizing network structure parameters of the convolutional neural network model;
in step S103, alternately inputting the first image and the second image into the convolutional neural network model for training, and continuously optimizing network structure parameters of the convolutional neural network model, including:
Step S1031: alternately inputting the first image and the second image into the encoder, and extracting a first feature G_E(x_i^s) and a second feature G_E(x_i^t), wherein x_i^s represents the i-th first image and x_i^t represents the i-th second image.
For example, when the encoder comprises 5 encoding blocks and the size of the input image is 512 × 512 × 3: the image is processed by convolution, nonlinear activation and down-sampling in the first encoding block to obtain a first scale feature F1_1 of size 256 × 256 × 128, where 256 × 256 is the spatial size and 128 is the number of channels; the first scale feature F1_1 is processed by convolution, nonlinear activation and down-sampling in the second encoding block to obtain a second scale feature F1_2 of spatial size 128 × 128; the second scale feature F1_2 is processed by convolution, nonlinear activation and down-sampling in the third encoding block to obtain a third scale feature F1_3 of size 64 × 64 × 256; the third scale feature F1_3 is processed by convolution, nonlinear activation and down-sampling in the fourth encoding block to obtain a fourth scale feature F1_4 of size 32 × 32 × 512; and the fourth scale feature F1_4 is processed by convolution, nonlinear activation and down-sampling in the fifth encoding block to obtain a fifth scale feature F1_5 of size 16 × 16 × 1024.
The scale of the m-th scale feature extracted by the m-th coding block is larger than the scale of the (m−1)-th scale feature extracted by the (m−1)-th coding block; the M-th scale is the largest of the multiple scales; M is a positive integer greater than or equal to 2, and m satisfies 2 ≤ m ≤ M. Preferably, M may be 5.
Step S1032: inputting the first feature G_E(x_i^s) and the second feature G_E(x_i^t) into the region generation network to obtain a first candidate frame R(G_E(x_i^s)) and a second candidate frame R(G_E(x_i^t)).
The first feature G_E(x_i^s) or the second feature G_E(x_i^t) is first convolved with a 3 × 3 kernel to obtain a 256 × 16 × 16 feature map, which is then convolved twice with 1 × 1 kernels to obtain an 18 × 16 × 16 feature map and a 36 × 16 × 16 feature map respectively; each candidate frame thus carries 2 score features and 4 coordinate features, where the two score features express whether the candidate frame is a target or not a target, and the four coordinates are the four coordinate features of the candidate frame.
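A minimal sketch of such a region generation head is given below, assuming PyTorch. The 9 anchors per location are inferred from the 18-channel score map (2 scores × 9) and the 36-channel coordinate map (4 coordinates × 9); the intermediate 256-channel width follows the description above, while the class and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class RegionProposalHead(nn.Module):
    """Region generation network head: one shared 3x3 convolution followed by two
    parallel 1x1 convolutions producing objectness scores and box coordinates."""
    def __init__(self, in_channels=1024, num_anchors=9):
        super().__init__()
        self.shared = nn.Conv2d(in_channels, 256, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.score = nn.Conv2d(256, 2 * num_anchors, kernel_size=1)  # target / not target
        self.coord = nn.Conv2d(256, 4 * num_anchors, kernel_size=1)  # box coordinates

    def forward(self, feature):
        x = self.relu(self.shared(feature))
        return self.score(x), self.coord(x)

# A 16x16x1024 encoder feature yields an 18x16x16 score map and a 36x16x16 coordinate map.
scores, coords = RegionProposalHead()(torch.randn(1, 1024, 16, 16))
```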
Step S1033: respectively performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the example-level coordinate regressor and the example-level classifier, to obtain a first positioning result ŷ_pos^s, a first classification result ŷ_class^s, a second positioning result ŷ_pos^t and a second classification result ŷ_class^t, and calculating a first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s.
wherein, before step S1033, the method further comprises:
sorting the region confidences of the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) respectively by using a non-maximum suppression method;
respectively selecting the first T candidate frames as the final first candidate frame R_f(G_E(x_i^s)) and the final second candidate frame R_f(G_E(x_i^t)), where T is a hyper-parameter that needs to be set manually; in this example, T is preferably 10.
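A sketch of this candidate selection step, assuming PyTorch and torchvision, is shown below; the IoU threshold of 0.7 and the (x1, y1, x2, y2) box format are assumptions, while T = 10 follows the preferred value above.

```python
import torch
from torchvision.ops import nms

def select_final_candidates(boxes, scores, iou_threshold=0.7, top_t=10):
    """Suppress overlapping candidate frames with NMS, then keep the first T
    frames ranked by region confidence as the final candidate frames R_f."""
    keep = nms(boxes, scores, iou_threshold)  # indices sorted by decreasing score
    return boxes[keep[:top_t]], scores[keep[:top_t]]

# Example: 200 raw proposals in (x1, y1, x2, y2) format reduced to at most T = 10.
raw_boxes = torch.rand(200, 4) * 256
raw_boxes[:, 2:] += raw_boxes[:, :2]  # guarantee x2 > x1 and y2 > y1
final_boxes, final_scores = select_final_candidates(raw_boxes, torch.rand(200))
```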
Wherein, in step S1033, the first positioning result is calculated
Figure BDA0003205259550000137
And first classification result
Figure BDA0003205259550000138
Comprises:
aligning the first positioning result with a cross-entropy penalty
Figure BDA0003205259550000139
And first classification result
Figure BDA00032052595500001310
And carrying out strong supervision, and calculating a strong supervision loss function, wherein the calculation formula is as follows:
Figure BDA0003205259550000141
Figure BDA0003205259550000142
Ls=Ls_class+Ls_pos; (3)
wherein L issIs the first positioning result
Figure BDA0003205259550000143
And first classification result
Figure BDA0003205259550000144
N represents the total number of the first images, i represents the number of the ith first image, M represents the number of labels contained in the ith first image, j represents the jth label in the ith first image, C represents the total number of prediction categories, K represents the number of coordinates of the current label frame, K represents the current kth category or kth label coordinate, y _ class and y _ pos and represents the label value of the current label; wherein L iss_classAnd Ls_posThe more towards 0, the more accurate the detection classification result and the positioning result.
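The sketch below, assuming PyTorch, computes this supervised loss for the kept candidate frames of a labeled first image. The classification term is the cross-entropy of formula (1); for the positioning term the patent also describes a cross-entropy-style supervision, and a smooth L1 loss is used here as a common stand-in for box-coordinate regression.

```python
import torch
import torch.nn.functional as F

def strong_supervision_loss(class_logits, class_labels, pos_pred, pos_labels):
    """L_s = L_s_class + L_s_pos for the labeled (first) images.

    class_logits: (num_boxes, C) predicted class scores of the kept candidate frames
    class_labels: (num_boxes,)   ground-truth class index of each frame
    pos_pred:     (num_boxes, K) predicted box coordinates, K = 4
    pos_labels:   (num_boxes, K) ground-truth box coordinates
    """
    l_s_class = F.cross_entropy(class_logits, class_labels)
    # Stand-in for the positioning supervision described by formula (2).
    l_s_pos = F.smooth_l1_loss(pos_pred, pos_labels)
    return l_s_class + l_s_pos
```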
Step S1034, utilizing the image-level discriminator to discriminate the first characteristic G of the first characteristicE(xi s) And a second feature GE(xi t) Classifying and judging the first characteristics GE(xi s) And a second feature GE(xi t) And calculating a second loss function of the image-level discriminator;
wherein the first feature G is aligned using the image-level discriminatorE(xi s) And a second feature GE(xi t) ) classification, including in particular:
based on the antagonistic learning mechanism, the first feature GE(xi s) And a second feature GE(xi t) Is inputted to theIn the image discriminator, a first feature consistent confidence coefficient and a second feature consistent confidence coefficient are respectively obtained; wherein the label of the first feature is 1 and the label of the second feature is 0.
Wherein the final goal of the counterlearning mechanism is when the image-level discriminator cannot discriminate the first feature GE(xi s) And a second feature GE(xi t) Valid classification) indicates that the distributions of the two features are very similar at this point, and thus the feature is referred to as a consistent feature.
Wherein, a second loss function of the image-level discriminator is calculated, and the calculation formula is as follows:
Figure BDA0003205259550000145
Figure BDA0003205259550000151
wherein n ist,nsRespectively representing a first total number of image samples and a second total number of image samples,
Figure BDA0003205259550000152
an example level discriminator representing a first image is shown,
Figure BDA0003205259550000153
an example level discriminator representing a second image, DimageA representation of the image-level discriminator,
Figure BDA0003205259550000154
represents the loss function of the image-level discriminator when
Figure BDA0003205259550000155
When the image level is close to 0.5, it indicates that the image level discriminator cannot distinguish the first feature GE(xi s) And the second feature GE(xi t)。
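A minimal PyTorch sketch of the image-level discriminator and its loss is given below. The channel widths, the adaptive max pooling and the exact loss call are assumptions; the binary cross-entropy with labels 1 (first image) and 0 (second image) follows the convention above. In practice the adversarial effect on the encoder is usually obtained with a gradient reversal layer or an alternating update, which is not shown here.

```python
import torch
import torch.nn as nn

class ImageLevelDiscriminator(nn.Module):
    """Image-level discriminator: three convolutions, max pooling, a fully
    connected layer and a sigmoid, giving one day/night confidence per image."""
    def __init__(self, in_channels=1024):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveMaxPool2d(1),
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, feature_map):
        return self.classifier(self.features(feature_map))

def image_level_loss(d_image, first_features, second_features):
    """Second loss function: BCE with label 1 for first-image features, 0 for second."""
    bce = nn.BCELoss()
    p_first = d_image(first_features)    # encoder features G_E(x_i^s)
    p_second = d_image(second_features)  # encoder features G_E(x_i^t)
    return bce(p_first, torch.ones_like(p_first)) + bce(p_second, torch.zeros_like(p_second))
```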
Step S1035: on the basis of the first feature G_E(x_i^s), predicting a multi-label classification result C(G_E(x_i^s)) of the first image by using the image-level classification regularization module.
For a labeled first image, the image may contain categories including, but not limited to: work clothes, wearing a safety helmet, and not wearing a safety helmet. After prediction by the image-level classification regularization module, the multi-label classification result C(G_E(x_i^s)) of the first image includes, but is not limited to, these categories. By exploiting the weak localization property of the multi-label classifier, the image-level classification regularization module encourages the neural network model to focus only on the important regions containing the main target objects in the power distribution room, improving classification efficiency.
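A sketch of such an image-level classification regularization module is shown below, assuming PyTorch. The patent describes a softmax activation for this module; the sketch deliberately uses a per-class sigmoid instead, which is the usual choice when several categories (work clothes, helmet worn, helmet not worn) can be present in the same image, and the channel widths and class names are assumptions.

```python
import torch
import torch.nn as nn

class ImageLevelClassificationRegularizer(nn.Module):
    """Predicts which object categories are present anywhere in the first image,
    from the same encoder feature map used by the detector."""
    def __init__(self, in_channels=1024, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveMaxPool2d(1),
        )
        # Per-class sigmoid so that several categories can be active at once.
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(64, num_classes), nn.Sigmoid())

    def forward(self, feature_map):
        return self.classifier(self.features(feature_map))  # C(G_E(x_i^s)), shape (batch, num_classes)
```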
Step S1036: using the classification consistency regularization module to perform a consistency judgment between the first classification result ŷ_class^s and the multi-label classification result C(G_E(x_i^s)) to obtain the classification consistency weight matrix λ.
Step S1036 specifically includes:
using the classification consistency regularization module, with the first classification result ŷ_class^s as a basis, traversing the multi-label classification result C(G_E(x_i^s)) and searching whether the multi-label classification result C(G_E(x_i^s)) contains the first classification result ŷ_class^s; if the category is contained, its weight is 1, otherwise its weight is 3; the classification consistency weight matrix λ is then obtained from the search results.
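This consistency check can be sketched as follows, assuming PyTorch; the function name and tensor layouts are illustrative, while the weights 1 (consistent) and 3 (inconsistent) follow the description above.

```python
import torch

def classification_consistency_weights(instance_classes, image_level_result,
                                        inconsistent_weight=3.0):
    """Build the classification consistency weight matrix lambda.

    instance_classes:   (num_boxes,) predicted class index of each first-image candidate frame
    image_level_result: (C,) multi-label image-level prediction, > 0.5 where a class is present
    Returns weight 1 for boxes whose class appears in the image-level result, 3 otherwise.
    """
    consistent = image_level_result[instance_classes] > 0.5
    return torch.where(
        consistent,
        torch.ones(instance_classes.shape[0]),
        torch.full((instance_classes.shape[0],), inconsistent_weight),
    )

# Example: boxes predicted as classes [0, 2, 1] against an image-level result
# that only contains classes 0 and 1 -> weights [1, 3, 1].
lam = classification_consistency_weights(torch.tensor([0, 2, 1]), torch.tensor([0.9, 0.8, 0.1]))
```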
Step S1037: using the instance-level discriminator to classify and discriminate the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)), and calculating a third loss function of the instance-level discriminator based on the classification consistency weight matrix λ.
Classifying the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) with the instance-level discriminator specifically includes:
based on the adversarial learning mechanism, inputting the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) into the instance-level discriminator to obtain a first consistency confidence and a second consistency confidence respectively, wherein the label of the first candidate frame is 1 and the label of the second candidate frame is 0.
The final goal of the adversarial learning mechanism is that, when the instance-level discriminator can no longer effectively distinguish the first candidate frame R(G_E(x_i^s)) from the second candidate frame R(G_E(x_i^t)), the distributions of the two features are very similar, and the features are therefore referred to as consistent features.
In one possible design, the third loss function of the example-level discriminator is calculated based on the classification consistency weight matrix λ as follows:

L_{instance} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \lambda_i \log D_{instance}(R(G_E(x_i^s))) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{instance}(R(G_E(x_i^t)))); (5)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_instance(R(G_E(x_i^s))) and D_instance(R(G_E(x_i^t))) represent the example-level discriminator outputs for the first candidate frame and the second candidate frame, D_instance represents the example-level discriminator, λ_i is the classification consistency weight associated with the i-th first image, and L_instance represents the loss function of the example-level discriminator; when the discriminator output approaches 0.5, the example-level discriminator can no longer distinguish the first candidate frame feature R(G_E(x_i^s)) from the second candidate frame feature R(G_E(x_i^t)).
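A sketch of this weighted adversarial loss over candidate-frame features is given below, assuming PyTorch. Applying the consistency weights λ to the first-image (labeled, daytime) terms is the reading adopted here and is an assumption; the binary cross-entropy with labels 1 and 0 follows the label convention above.

```python
import torch
import torch.nn.functional as F

def instance_level_loss(first_probs, second_probs, lam):
    """Third loss function: adversarial BCE over candidate frames, with the
    first-image terms weighted by the classification consistency matrix lambda.

    first_probs:  (num_boxes_s,) instance-level discriminator outputs for first-image frames (label 1)
    second_probs: (num_boxes_t,) instance-level discriminator outputs for second-image frames (label 0)
    lam:          (num_boxes_s,) classification consistency weights for the first-image frames
    """
    loss_first = F.binary_cross_entropy(first_probs, torch.ones_like(first_probs), weight=lam)
    loss_second = F.binary_cross_entropy(second_probs, torch.zeros_like(second_probs))
    return loss_first + loss_second
```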
Step S1038, calculating a total loss function of the convolutional neural network model according to the first loss function, the second loss function and the third loss function;
The total loss function is calculated as follows:

L_{total} = L_s + L_{image} + L_{instance}
and S1038, continuously optimizing network structure parameters of the convolutional neural network model by using the total loss function.
And S104, detecting the target object of the power distribution room by using the optimized convolutional neural network model.
After the application of the embodiment, it is found that when the convolutional neural network model is used for detecting the target object in the distribution room, compared with a method for predicting the object in the night image by directly using the distribution room daytime image training model, when the neural network model of the embodiment is used for predicting the object in the night image, the average accuracy and the recall rate of object detection can be improved by about 10%.
In a second aspect, as shown in fig. 3, the present invention provides a convolutional neural network-based object detection apparatus for a power distribution room, the apparatus comprising:
the data set construction module is used for constructing an image data set of the detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image;
the neural network model building module is used for building a network structure of the convolutional neural network model;
the training module is used for inputting the first image and the second image into the convolutional neural network model alternately for training and optimizing the network structure parameters of the convolutional neural network model continuously;
and the detection module is used for detecting the target object of the power distribution room by using the optimized convolutional neural network model.
In one possible design, the convolutional neural network model includes: the system comprises an encoder, a region generation network, an example-level coordinate regressor, an example-level classifier, an image-level discriminator, an example-level discriminator, an image-level category regularization module and a classification consistency regularization module.
In one possible design, the training module specifically includes:
a feature extraction unit for alternately inputting the first image and the second image into the encoder and extracting a first feature G_E(x_i^s) and a second feature G_E(x_i^t), wherein x_i^s represents the i-th first image and x_i^t represents the i-th second image;
a candidate frame extracting unit for inputting the first feature G_E(x_i^s) and the second feature G_E(x_i^t) into the region generation network to obtain a first candidate frame R(G_E(x_i^s)) and a second candidate frame R(G_E(x_i^t));
a coordinate regression and classification unit for respectively performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the example-level coordinate regressor and the example-level classifier, to obtain a first positioning result ŷ_pos^s, a first classification result ŷ_class^s, a second positioning result ŷ_pos^t and a second classification result ŷ_class^t, and for calculating a first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s;
a first classification discrimination unit for classifying and discriminating the first feature G_E(x_i^s) and the second feature G_E(x_i^t) by using the image-level discriminator, and for calculating a second loss function of the image-level discriminator;
a classification result prediction unit for predicting, on the basis of the first feature G_E(x_i^s), a multi-label classification result C(G_E(x_i^s)) of the first image by using the image-level classification regularization module;
a classification result judgment unit for performing, by using the classification consistency regularization module, a consistency judgment between the first classification result ŷ_class^s and the multi-label classification result C(G_E(x_i^s)) to obtain a classification consistency weight matrix λ;
a second classification discrimination unit for classifying and discriminating the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the instance-level discriminator, and for calculating a third loss function of the instance-level discriminator based on the classification consistency weight matrix λ;
a loss function calculation unit, configured to calculate a total loss function of the convolutional neural network model according to the first loss function, the second loss function, and the third loss function;
and the model optimization unit is used for continuously optimizing the network structure parameters of the convolutional neural network model by utilizing the total loss function.
In one possible design, before the example-level coordinate regressor and the example-level classifier respectively perform coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)), the training module is further configured to:
sort the region confidences of the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) respectively by using a non-maximum suppression method;
respectively select the first T candidate frames as the final first candidate frame R_f(G_E(x_i^s)) and the final second candidate frame R_f(G_E(x_i^t)).
In one possible design, when calculating the first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s, the coordinate regression and classification unit is specifically configured to:
strongly supervise the first positioning result ŷ_pos^s and the first classification result ŷ_class^s with a cross-entropy loss, and calculate a strong supervision loss function according to the following formulas:

L_{s\_class} = -\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k=1}^{C} y\_class_{ijk} \log \hat{y}\_class_{ijk}; (1)

L_{s\_pos} = -\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{k=1}^{K} y\_pos_{ijk} \log \hat{y}\_pos_{ijk}; (2)

L_s = L_{s\_class} + L_{s\_pos}; (3)

wherein L_s is the first loss function of the first positioning result ŷ_pos^s and the first classification result ŷ_class^s, N represents the total number of first images, i indexes the i-th first image, M represents the number of labels contained in the i-th first image, j indexes the j-th label in the i-th first image, C represents the total number of prediction categories, K represents the number of coordinates of the current label frame, k indexes the current k-th category or k-th label coordinate, and y_class and y_pos represent the label values of the current label; the closer L_s_class and L_s_pos are to 0, the more accurate the detection classification result and the positioning result.
In one possible design, when calculating the second loss function of the image-level discriminator, the first classification discrimination unit uses the following calculation formula:

L_{image} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \log D_{image}(G_E(x_i^s)) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{image}(G_E(x_i^t))); (4)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_image(G_E(x_i^s)) and D_image(G_E(x_i^t)) represent the image-level discriminator outputs for the first image and the second image, D_image represents the image-level discriminator, and L_image represents the loss function of the image-level discriminator; when the discriminator output approaches 0.5, the image-level discriminator can no longer distinguish the first feature G_E(x_i^s) from the second feature G_E(x_i^t).
In one possible design, when the classification consistency regularization module is used to perform the consistency judgment between the first classification result ŷ_class^s and the multi-label classification result to obtain the classification consistency weight matrix λ, the classification result judgment unit is specifically configured to:
use the classification consistency regularization module, with the first classification result ŷ_class^s as a basis, to traverse the multi-label classification result C(G_E(x_i^s)) and search whether the multi-label classification result C(G_E(x_i^s)) contains the first classification result ŷ_class^s;
and obtain the classification consistency weight matrix λ according to the search result.
In one possible design, when calculating the third loss function of the example-level discriminator based on the classification consistency weight matrix λ, the second classification discrimination unit uses the following calculation formula:

L_{instance} = -\frac{1}{n_s}\sum_{i=1}^{n_s} \lambda_i \log D_{instance}(R(G_E(x_i^s))) - \frac{1}{n_t}\sum_{i=1}^{n_t} \log(1 - D_{instance}(R(G_E(x_i^t)))); (5)

wherein n_s and n_t respectively represent the total numbers of first-image samples and second-image samples, D_instance(R(G_E(x_i^s))) and D_instance(R(G_E(x_i^t))) represent the example-level discriminator outputs for the first candidate frame and the second candidate frame, D_instance represents the example-level discriminator, λ_i is the classification consistency weight associated with the i-th first image, and L_instance represents the loss function of the example-level discriminator; when the discriminator output approaches 0.5, the example-level discriminator can no longer distinguish the first candidate frame feature R(G_E(x_i^s)) from the second candidate frame feature R(G_E(x_i^t)).
In one possible design, the first image is a distribution room daytime image and the second image is a distribution room nighttime image.
In a third aspect, the present invention provides a computer device, comprising a memory, a processor and a transceiver, which are communicatively connected in sequence, wherein the memory is used for storing a computer program, the transceiver is used for sending and receiving messages, and the processor is used for reading the computer program and executing the convolutional neural network-based object detection method for an electric distribution room as described in any one of the possible designs of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon instructions which, when executed on a computer, perform a method for detecting objects in a power distribution room based on a convolutional neural network as set forth in any one of the possible designs of the first aspect.
In a fifth aspect, the present invention provides a computer program product containing instructions which, when run on a computer, cause the computer to perform a method of convolutional neural network-based object detection in a power distribution room as described in any one of the possible designs of the first aspect.
Finally, it should be noted that: the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method for detecting objects in a power distribution room based on a convolutional neural network is characterized by comprising the following steps:
constructing an image data set of the detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image;
constructing a network structure of a convolutional neural network model;
inputting the first image and the second image into the convolutional neural network model alternately for training, and continuously optimizing the network structure parameters of the convolutional neural network model;
and detecting the target object of the power distribution room by using the optimized convolutional neural network model.
2. The convolutional neural network-based power distribution room object detection method as claimed in claim 1, wherein the convolutional neural network model comprises: an encoder, a region generation network, an instance-level coordinate regressor, an instance-level classifier, an image-level discriminator, an instance-level discriminator, an image-level classification regularization module and a classification consistency regularization module.
3. The convolutional neural network-based power distribution room object detection method as claimed in claim 2, wherein the alternately inputting the first image and the second image into the convolutional neural network model for training and continuously optimizing the network structure parameters of the convolutional neural network model comprises:
alternately inputting the first image and the second image into the encoder, and extracting a first feature G_E(x_i^s) and a second feature G_E(x_i^t); wherein x_i^s represents the ith first image and x_i^t represents the ith second image;
inputting the first feature G_E(x_i^s) and the second feature G_E(x_i^t) into the region generation network to obtain a first candidate frame R(G_E(x_i^s)) and a second candidate frame R(G_E(x_i^t));
separately performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the instance-level coordinate regressor and the instance-level classifier, to obtain a first positioning result, a first classification result, a second positioning result and a second classification result, and calculating a first loss function of the first positioning result and the first classification result;
performing classification discrimination on the first feature G_E(x_i^s) and the second feature G_E(x_i^t) by using the image-level discriminator, and calculating a second loss function of the image-level discriminator;
predicting a multi-label classification result C(G_E(x_i^s)) of the first image on the basis of the first feature G_E(x_i^s) by using the image-level classification regularization module;
performing consistency judgment between the first classification result and the multi-label classification result C(G_E(x_i^s)) by using the classification consistency regularization module, to obtain a classification consistency weight matrix λ;
performing classification discrimination on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the instance-level discriminator, and calculating a third loss function of the instance-level discriminator based on the classification consistency weight matrix λ;
calculating a total loss function of the convolutional neural network model according to the first loss function, the second loss function and the third loss function;
and continuously optimizing the network structure parameters of the convolutional neural network model by using the total loss function.
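By way of illustration only, the following minimal sketch shows one way the three loss terms of claim 3 could be combined into a total loss. The weighting coefficients w_img and w_ins, and all names below, are hypothetical and not specified by the claim.

```python
import torch

def total_loss(loss_supervised, loss_image_adv, loss_instance_adv,
               w_img=1.0, w_ins=1.0):
    """Hypothetical combination of the three loss terms from claim 3.

    The claim states only that the total loss is computed from the first,
    second and third loss functions; the simple weighted sum below is an
    assumption.
    """
    return loss_supervised + w_img * loss_image_adv + w_ins * loss_instance_adv

# toy usage with scalar tensors standing in for the three terms
l1 = torch.tensor(0.8)   # first loss: supervised detection loss on the labeled first image
l2 = torch.tensor(0.3)   # second loss: image-level discriminator loss
l3 = torch.tensor(0.25)  # third loss: lambda-weighted instance-level discriminator loss
print(total_loss(l1, l2, l3))
```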
4. The convolutional neural network-based power distribution room object detection method of claim 3, wherein before the separately performing coordinate regression and classification on the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) by using the instance-level coordinate regressor and the instance-level classifier, the method further comprises:
sorting the region confidences of the first candidate frame R(G_E(x_i^s)) and the second candidate frame R(G_E(x_i^t)) respectively by using a non-maximum suppression method;
and respectively selecting the first T candidate frames as a final first candidate frame R_f(G_E(x_i^s)) and a final second candidate frame R_f(G_E(x_i^t)).
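By way of illustration only, a minimal sketch of the selection step in claim 4 using torchvision's non-maximum suppression. The IoU threshold and the value of T are assumptions; the claim states only that the first T frames are kept.

```python
import torch
from torchvision.ops import nms

def top_t_candidates(boxes, scores, iou_thresh=0.7, t=128):
    """Hypothetical sketch of claim 4: suppress overlapping proposals, then
    keep the T highest-confidence candidate frames.

    boxes:  (n, 4) proposals in (x1, y1, x2, y2) format
    scores: (n,)   region confidences from the region generation network
    """
    keep = nms(boxes, scores, iou_thresh)    # kept indices, already sorted by decreasing score
    keep = keep[:t]                          # first T candidate frames
    return boxes[keep], scores[keep]

# toy usage
boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.], [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
print(top_t_candidates(boxes, scores, t=2))
```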
5. The convolutional neural network-based power distribution room object detection method of claim 3, wherein the calculating the first loss function of the first positioning result and the first classification result comprises:
performing strong supervision on the first positioning result and the first classification result with a cross-entropy penalty, and calculating a strong supervision loss function, wherein L_s_class is given by formula (1) and L_s_pos by formula (2) (both given as images in the original filing), and

L_s = L_s_class + L_s_pos;    (3)

wherein L_s is the first loss function of the first positioning result and the first classification result, N represents the total number of first images, i denotes the ith first image, M represents the number of labels contained in the ith first image, j denotes the jth label in the ith first image, C represents the total number of prediction categories, K represents the number of coordinates of the current label frame, k denotes the current kth category or kth label coordinate, and y_class and y_pos represent the label values of the current label;
wherein the closer L_s_class and L_s_pos are to 0, the more accurate the detection classification result and the positioning result.
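By way of illustration only, a minimal sketch of a strong-supervision loss of the form L_s = L_s_class + L_s_pos. Since formulas (1) and (2) are reproduced only as images, the concrete cross-entropy and smooth-L1 terms below, and all names, are assumptions.

```python
import torch
import torch.nn.functional as F

def strong_supervision_loss(cls_logits, cls_targets, box_preds, box_targets):
    """Hypothetical sketch of L_s = L_s_class + L_s_pos from claim 5.

    cls_logits:  (n_boxes, C) class scores for the labeled (first) image
    cls_targets: (n_boxes,)   ground-truth class indices
    box_preds:   (n_boxes, K) predicted box coordinates (K = 4)
    box_targets: (n_boxes, K) ground-truth box coordinates
    """
    l_class = F.cross_entropy(cls_logits, cls_targets)   # L_s_class, stands in for eq. (1)
    l_pos = F.smooth_l1_loss(box_preds, box_targets)     # L_s_pos, assumed form of eq. (2)
    return l_class + l_pos                               # L_s, eq. (3)

# toy usage
logits = torch.randn(3, 6)
labels = torch.tensor([2, 0, 5])
pred_boxes = torch.rand(3, 4)
gt_boxes = torch.rand(3, 4)
print(strong_supervision_loss(logits, labels, pred_boxes, gt_boxes))
```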
6. The convolutional neural network-based power distribution room object detection method of claim 3, wherein the second loss function of the image-level discriminator is calculated by a formula (given as an image in the original filing) in which n_t and n_s respectively represent the total number of first image samples and the total number of second image samples, D_image(G_E(x_i^s)) represents the output of the image-level discriminator for the first image, D_image(G_E(x_i^t)) represents the output of the image-level discriminator for the second image, and D_image represents the image-level discriminator; the formula defines the loss function of the image-level discriminator, and when the output of D_image approaches 0.5, the image-level discriminator cannot distinguish the first feature G_E(x_i^s) from the second feature G_E(x_i^t).
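By way of illustration only, a minimal sketch of a binary cross-entropy image-level domain discriminator loss consistent with the statement that outputs near 0.5 indicate indistinguishable domains. The exact formula in the filing is not reproduced, so the form and all names below are assumptions.

```python
import torch
import torch.nn.functional as F

def image_adv_loss(d_src, d_tgt):
    """Hypothetical image-level discriminator loss for claim 6.

    d_src: (n_s,) image-level discriminator outputs in (0, 1) for first images
    d_tgt: (n_t,) image-level discriminator outputs in (0, 1) for second images

    First (source) images are labelled 0 and second (target) images 1 here.
    """
    loss_src = F.binary_cross_entropy(d_src, torch.zeros_like(d_src))
    loss_tgt = F.binary_cross_entropy(d_tgt, torch.ones_like(d_tgt))
    return loss_src + loss_tgt

# toy usage: outputs near 0.5 mean the two domains are essentially indistinguishable
print(image_adv_loss(torch.tensor([0.45, 0.55]), torch.tensor([0.5, 0.52])))
```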
7. The convolutional neural network-based power distribution room object detection method of claim 3, wherein the performing consistency judgment between the first classification result and the multi-label classification result by using the classification consistency regularization module to obtain the classification consistency weight matrix λ comprises:
traversing the multi-label classification result C(G_E(x_i^s)) by using the classification consistency regularization module with the first classification result as a basis, searching whether the multi-label classification result C(G_E(x_i^s)) contains the first classification result, and obtaining the classification consistency weight matrix λ according to the search result.
8. The convolutional neural network-based power distribution room object detection method of claim 3, wherein the third loss function of the instance-level discriminator is calculated based on the classification consistency weight matrix λ by a calculation formula (given as an image in the original filing) in which n_t and n_s respectively represent the total number of first image samples and the total number of second image samples, D_instance(R(G_E(x_i^s))) represents the output of the instance-level discriminator for the first image, D_instance(R(G_E(x_i^t))) represents the output of the instance-level discriminator for the second image, and D_instance represents the instance-level discriminator; the formula defines the loss function of the instance-level discriminator, and when the output of D_instance approaches 0.5, the instance-level discriminator can no longer distinguish the first candidate frame feature R(G_E(x_i^s)) from the second candidate frame feature R(G_E(x_i^t)).
9. The method of claim 1, wherein the first image is a power distribution room daytime image and the second image is a power distribution room nighttime image.
10. A power distribution room object detection apparatus based on a convolutional neural network, the apparatus comprising:
the data set construction module is used for constructing an image data set of the detected target object; wherein the image dataset comprises a labeled first image and an unlabeled second image;
the neural network model building module is used for building a network structure of the convolutional neural network model;
the training module is used for inputting the first image and the second image into the convolutional neural network model alternately for training and optimizing the network structure parameters of the convolutional neural network model continuously;
and the detection module is used for detecting the target object of the power distribution room by using the optimized convolutional neural network model.
11. A computer device comprising a memory, a processor and a transceiver communicatively connected in sequence, wherein the memory is used for storing a computer program, the transceiver is used for transmitting and receiving messages, and the processor is used for reading the computer program and executing the method for detecting objects in the power distribution room based on the convolutional neural network as claimed in any one of claims 1 to 9.
CN202110914881.8A 2021-08-10 2021-08-10 Distribution room object detection method, device and equipment based on convolutional neural network Active CN113591773B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110914881.8A CN113591773B (en) 2021-08-10 2021-08-10 Distribution room object detection method, device and equipment based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110914881.8A CN113591773B (en) 2021-08-10 2021-08-10 Distribution room object detection method, device and equipment based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN113591773A true CN113591773A (en) 2021-11-02
CN113591773B CN113591773B (en) 2024-03-08

Family

ID=78256921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110914881.8A Active CN113591773B (en) 2021-08-10 2021-08-10 Distribution room object detection method, device and equipment based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN113591773B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109614512A (en) * 2018-11-29 2019-04-12 亿嘉和科技股份有限公司 A kind of power equipment search method based on deep learning
US20200257984A1 (en) * 2019-02-12 2020-08-13 D-Wave Systems Inc. Systems and methods for domain adaptation
CN112132042A (en) * 2020-09-24 2020-12-25 西安电子科技大学 SAR image target detection method based on anti-domain adaptation
CN113158943A (en) * 2021-04-29 2021-07-23 杭州电子科技大学 Cross-domain infrared target detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Chang-Dong Xu et al., "Exploring Categorical Regularization for Domain Adaptive Object Detection", arXiv:2003.09152v1 [cs.CV], pages 2-7 *
Wang Ludi et al., "Object Detection in Complex Backgrounds Based on Domain-Adaptive Faster RCNN", Aerospace Control, vol. 38, no. 1, page 66 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114066884A (en) * 2022-01-11 2022-02-18 南京医科大学眼科医院 Retinal blood vessel segmentation method and device, electronic device and storage medium

Also Published As

Publication number Publication date
CN113591773B (en) 2024-03-08

Similar Documents

Publication Publication Date Title
CN108108657B (en) Method for correcting locality sensitive Hash vehicle retrieval based on multitask deep learning
CN113011319B (en) Multi-scale fire target identification method and system
CN108171136A (en) A kind of multitask bayonet vehicle is to scheme to search the system and method for figure
Ding et al. Vehicle and parking space detection based on improved yolo network model
CN112861635B (en) Fire disaster and smoke real-time detection method based on deep learning
CN108960124B (en) Image processing method and device for pedestrian re-identification
CN110298265A (en) Specific objective detection method in a kind of elevator based on YOLO neural network
CN109919223B (en) Target detection method and device based on deep neural network
CN114694178A (en) Method and system for monitoring safety helmet in power operation based on fast-RCNN algorithm
CN112241692B (en) Channel foreign matter intelligent detection and classification method based on aerial image super-pixel texture
CN112163572A (en) Method and device for identifying object
CN108664875A (en) Underground belt-conveying monitoring method based on image recognition
Rentao et al. Indoor smoking behavior detection based on yolov3-tiny
CN111145222A (en) Fire detection method combining smoke movement trend and textural features
CN111680759B (en) Power grid inspection insulator detection classification method
CN115512387A (en) Construction site safety helmet wearing detection method based on improved YOLOV5 model
CN115131747A (en) Knowledge distillation-based power transmission channel engineering vehicle target detection method and system
CN114140750A (en) Filling station safety helmet wearing real-time detection method based on YOLOv4-Tiny
CN113591773A (en) Power distribution room object detection method, device and equipment based on convolutional neural network
CN113379603B (en) Ship target detection method based on deep learning
CN113936299A (en) Method for detecting dangerous area in construction site
CN117523437A (en) Real-time risk identification method for substation near-electricity operation site
CN116205905B (en) Power distribution network construction safety and quality image detection method and system based on mobile terminal
CN115240394B (en) Method and system for monitoring and early warning water level of accident oil pool of transformer substation
CN110765900A (en) DSSD-based automatic illegal building detection method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant