CN116596895A - Substation equipment image defect identification method and system - Google Patents


Info

Publication number
CN116596895A
Authority
CN
China
Prior art keywords
result
training
training set
cutting
recognition
Prior art date
Legal status
Pending
Application number
CN202310581658.5A
Other languages
Chinese (zh)
Inventor
陈明芽
赵利博
Current Assignee
Huayan Zhike Hangzhou Information Technology Co ltd
Original Assignee
Huayan Zhike Hangzhou Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Huayan Zhike Hangzhou Information Technology Co ltd
Priority to CN202310581658.5A
Publication of CN116596895A
Legal status: Pending

Classifications

    • G06T 7/0004: Image analysis; inspection of images, e.g. flaw detection; industrial image inspection
    • G06N 3/0464: Computing arrangements based on biological models; neural networks; convolutional networks [CNN, ConvNet]
    • G06N 3/08: Neural networks; learning methods
    • G06T 7/11: Segmentation; edge detection; region-based segmentation
    • G06V 10/774: Image or video recognition or understanding using machine learning; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06T 2207/20081: Indexing scheme for image analysis; training; learning
    • Y04S 10/50: Systems or methods supporting power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The application discloses a method and system for identifying image defects of power transformation equipment, in the technical field of hardware-fitting defect detection for such equipment. The method comprises: dividing an unmanned aerial vehicle (UAV) inspection image set into a first training set and a second training set; detecting each sample in the first training set with a first recognition model to obtain a first recognition result; generating a candidate position region for the sample from the first recognition result and cropping it; applying sliding-window cropping to the cropped region to obtain a plurality of cropped sub-images; detecting, with a second recognition model, the cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result; and merging and outputting the first and second recognition results. Detection with a hierarchical recognition model greatly improves detection precision, and the sliding-window-based network training used at the second level resolves the shortage of training samples.

Description

Substation equipment image defect identification method and system
Technical Field
The application relates to the technical field of hardware-fitting defect detection for power transformation equipment, and in particular to a method and system for identifying image defects of power transformation equipment.
Background
Power transformation equipment on transmission lines must be inspected frequently, and inspection statistics show that defects of small hardware fittings account for more than 60% of all defects. At present, defect identification of connecting fittings and small fittings in power transformation equipment is mainly realized by image recognition technology.
In the prior art, when deep learning neural networks are used to identify connecting fittings and fine fittings, training samples are often insufficient and detection accuracy is low.
In view of this, the present application has been made.
Disclosure of Invention
The application aims to provide a method and system for identifying image defects of power transformation equipment that detect with a hierarchical recognition model, which greatly improves detection precision, and that adopt a sliding-window-cropping-based network training mode at the second level, which resolves the shortage of training samples and further improves the detection precision of the model.
Embodiments of the present application are implemented as follows:
In a first aspect, a method for identifying image defects of power transformation equipment includes the following steps: acquiring an unmanned aerial vehicle inspection image set and dividing it into a first training set and a second training set, where the first training set is the original inspection image set and the second training set is an image set obtained by randomly sampling the first training set; detecting each sample in the first training set with a first recognition model to obtain a first recognition result, where the first recognition result contains the position information of at least one first target, a first target being a component of the power transformation equipment to be inspected, and the first recognition model is a model pre-trained on the first training set; merging the position information of the first targets to generate a candidate position region of the sample, and cropping the candidate position region to obtain a cropping result map; applying sliding-window cropping to each cropping result map to obtain a plurality of cropped sub-images, and adding the cropped sub-images to the second training set to obtain an enhanced training subset; detecting, with a second recognition model, the cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result, where the second recognition model is a model pre-trained on the enhanced training subset; and merging and outputting the first recognition result and the second recognition result.
In an optional implementation, detecting the cropped sub-images corresponding to the samples in the first training set with the second recognition model further includes the following steps: determining the original inspection picture represented by the sample and all of its corresponding cropped sub-images, and recording the frame coordinate position of each cropped sub-image within the original inspection picture; obtaining the detection-frame coordinate position of each cropped sub-image, i.e. the coordinate position of the detection frame in the second recognition result; and obtaining the detection result of the cropped sub-image on the original inspection picture from the frame coordinate position and the detection-frame coordinate position, this detection result serving as the adjusted second recognition result.
In an alternative embodiment, the top-left corner coordinates of the frame coordinate position and of the detection-frame coordinate position are added to obtain the detection result of the cropped sub-image on the original inspection picture.
In an alternative embodiment, a post-processing step follows the merging of the first recognition result and the second recognition result, where the post-processing is one of non-maximum suppression, large-scale non-maximum suppression, or non-maximum merging.
In an alternative embodiment, the second training set is an image set obtained by randomly sampling the first training set at a first ratio, the first ratio being 0.125, 0.25, or 0.5.
In an alternative embodiment, sliding-window cropping of each cropping result map comprises the following steps: determining the size of the sliding window, and cropping each cropping result map from left to right and from top to bottom with the sliding window.
In alternative embodiments, the sliding step size of the sliding window is 0.125, 0.25, or 0.5 of the sliding window size.
In an alternative embodiment, the PPYOLOE algorithm is employed in the second recognition model, where the layer attention in the head network of the algorithm is replaced with an ESE block.
In an alternative embodiment, in the first recognition model the feature extraction network adopts a ResNeXt101 module, and the enhanced training subset is input simultaneously to the ResNeXt101 module and to the CSPRepResNet module of the PPYOLOE algorithm to obtain a first feature Feature_res and a second feature Feature_csp; based on the first recognition result, the coordinate information of Feature_res and Feature_csp is marked to obtain a first marked feature Feature_res-obj and a second marked feature Feature_csp-obj; Feature_res-obj and Feature_csp-obj are then feature-fused, and the fusion result is input into the CSPPAN module for PPYOLOE network training.
In a second aspect, an image defect recognition system for power transformation equipment includes:

a first acquisition module, configured to acquire an unmanned aerial vehicle inspection image set and divide it into a first training set and a second training set, where the first training set is the original inspection image set and the second training set is an image set obtained by randomly sampling the first training set;

a first detection module, configured to detect each sample in the first training set with the first recognition model to obtain a first recognition result, where the first recognition result contains the position information of at least one first target, a first target being a component of the power transformation equipment to be inspected, and the first recognition model is a model pre-trained on the first training set;

a first cropping module, configured to merge the position information of the first targets to generate a candidate position region of the sample, and crop the candidate position region to obtain a cropping result map;

a second cropping module, configured to apply sliding-window cropping to each cropping result map to obtain a plurality of cropped sub-images, and add the cropped sub-images to the second training set to obtain an enhanced training subset;

a second detection module, configured to detect, with a second recognition model, the cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result, where the second recognition model is a model pre-trained on the enhanced training subset;

and a first output module, configured to merge the first recognition result and the second recognition result and output the merged result.
The embodiment of the application has the beneficial effects that:
according to the power transformation equipment image defect identification method and system provided by the embodiment of the application, the first identification model and the second identification model are utilized to respectively identify the connecting hardware fitting and the small hardware fitting in the inspection image, the candidate positions containing the connecting hardware fitting and the small hardware fitting are extracted in the first identification model, the extracted high-precision position frame is further identified by utilizing the second identification model, the defect type and the coordinate position of the small hardware fitting are predicted, the precision of the whole identification result can be greatly improved, and the extraction loss of effective information of the image is reduced; meanwhile, the problem of insufficient samples of the neural network during training can be solved by combining a sliding window cutting mode in the stage of the second recognition model, and the detection precision can be further improved when the obtained cutting subgraph is detected.
In general, compared with a single-level model, the method and system provided by the embodiments of the application detect with a two-level recognition model, which improves detection precision to a greater extent; at the same time, the sliding-window-cropping-based network training mode adopted at the second level, together with the improved network algorithm, resolves the shortage of samples and increases running speed.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating main steps of a defect identifying method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a defect identification method according to an embodiment of the present application;
FIG. 3 is an exemplary block diagram of a defect identification system provided by an embodiment of the present application;
fig. 4 is a diagram of a defect recognition result of a small hardware tool according to an embodiment of the present application.
Reference numerals: 700 - defect recognition system; 710 - first acquisition module; 720 - first detection module; 730 - first cropping module; 740 - second cropping module; 750 - second detection module; 760 - first output module.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It is to be understood that the terms "system," "apparatus," and/or "module" as used herein are intended to be one way of distinguishing between different components, elements, parts, portions, or assemblies of different levels. However, if other words can achieve the same purpose, the words can be replaced by other expressions.
As used herein and in the claims, the terms "a," "an," and/or "the" are not specific to the singular and may include the plural unless the context clearly dictates otherwise. Generally, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
A flowchart is used in the present application to describe the operations performed by a system according to embodiments of the present application. It should be appreciated that the preceding or following operations are not necessarily performed in order precisely. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to or removed from these processes.
Examples
Referring to fig. 1 and 2, the method for identifying image defects of a power transformation device according to the present embodiment includes the following steps:
s100: acquiring an unmanned aerial vehicle inspection image set, dividing the unmanned aerial vehicle inspection image set into a first training set and a second training set, wherein the first training set is an original inspection image set, and the second training set is an image set obtained by randomly sampling the first training set; the method comprises the steps of taking a patrol image set shot during patrol of the unmanned aerial vehicle as a first training set, taking a part of the patrol image set as a second training set according to random sampling from the first training set, and respectively training neural networks of different levels by the first training set.
The second training set is an image set obtained by randomly sampling the first training set at a first ratio, where the first ratio is 0.125, 0.25, or 0.5; that is, the sampled set may occupy 0.125, 0.25, or 0.5 of the original inspection image set. The size of the second training set is determined by the required number of samples: if the number of original samples is small, the ratio can be enlarged to 0.5, and if the number of original samples is relatively sufficient, 0.25 can be selected. Of course, in other embodiments other ratios, such as 0.3 or 0.16, may be chosen, so that the number of added samples matches the actual situation.
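For illustration only (the patent gives no code), constructing the second training set by random sampling at the first ratio might look like the following Python sketch; the function name and the representation of the training set as a list of image paths are assumptions:

```python
import random

def sample_second_training_set(first_training_set, first_ratio=0.25, seed=0):
    """Randomly sample the second training set from the first training set.

    `first_training_set` is assumed to be a list of image paths; the ratio
    values 0.125 / 0.25 / 0.5 follow the embodiment described above.
    """
    rng = random.Random(seed)
    k = max(1, int(len(first_training_set) * first_ratio))
    return rng.sample(first_training_set, k)
```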
S200: detect each sample in the first training set with a first recognition model to obtain a first recognition result. The first recognition model serves as the first layer of the two-level model and performs preliminary detection on the original inspection image set, detecting the connecting-fitting regions to which fine fittings may be attached; at least one such region is found, i.e. the first recognition result contains the position information of at least one first target, where a first target is a component of the power transformation equipment to be inspected (a connecting fitting).
The first recognition model is a model pre-trained on the first training set. The algorithm adopted in training is mainly a two-stage algorithm (characterized by first generating region proposal boxes and then classifying the generated proposal boxes), such as Fast R-CNN, Faster R-CNN, R-FCN, or Cascade R-CNN. In this embodiment, the Faster R-CNN algorithm is used.
The Faster R-CNN algorithm first proposed the Region Proposal Network (RPN), abandoning the traditional sliding-window and selective-search methods of generating candidate boxes; generating candidate boxes directly with the RPN greatly improves the accuracy of the target detection algorithm. Faster R-CNN is a two-stage target detector whose feature extraction backbone is ResNeXt101 and whose neck is a Feature Pyramid Network (FPN).
The loss of Faster R-CNN is divided into the RPN loss and the Fast R-CNN loss, and both include a classification loss (Cls Loss) and a regression loss (Bbox Regression Loss), i.e.

$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\frac{1}{N_{reg}}\sum_i p_i^{*}L_{reg}(t_i,t_i^{*})\tag{1-1}$$

In formula (1-1): $i$ is the index of an anchor in each batch; $p_i$ is the predicted probability that the anchor contains a target; the ground-truth label $p_i^{*}=1$ when the anchor contains a target and $p_i^{*}=0$ otherwise; $t_i$ denotes the 4 parameters of the predicted bounding box; $t_i^{*}$ denotes the coordinate parameters of the ground-truth bounding box corresponding to an anchor containing a target; the factor $p_i^{*}$ in $p_i^{*}L_{reg}$ means that the regression loss is counted only for anchors containing a target; $\{p_i\}$ is the output of the classification layer and $\{t_i\}$ the output of the regression layer; the two terms are normalized by $N_{cls}$ and $N_{reg}$ and balanced by the weight $\lambda$. Formula (1-1) thus comprises a classification part and a regression part, where the classification loss is:

$$L_{cls}(p_i,p_i^{*})=-\log\bigl[p_i^{*}p_i+(1-p_i^{*})(1-p_i)\bigr]\tag{1-2}$$

The regression loss function is:

$$L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})\tag{1-3}$$

The parameter $R$ in formula (1-3) is the smooth L1 function, i.e.

$$\operatorname{smooth}_{L1}(x)=\begin{cases}0.5x^{2}, & |x|<1\\ |x|-0.5, & \text{otherwise}\end{cases}\tag{1-4}$$

For the regression of the bounding box, the following parameterization is adopted:

$$t_x=\frac{x-x_a}{w_a},\quad t_y=\frac{y-y_a}{h_a},\quad t_w=\log\frac{w}{w_a},\quad t_h=\log\frac{h}{h_a};\qquad t_x^{*}=\frac{x^{*}-x_a}{w_a},\quad t_y^{*}=\frac{y^{*}-y_a}{h_a},\quad t_w^{*}=\log\frac{w^{*}}{w_a},\quad t_h^{*}=\log\frac{h^{*}}{h_a}\tag{1-5}$$

In formula (1-5): $x$ and $y$ are the center coordinates of the bounding box; $w$ is the width of the bounding box; $h$ is the height of the bounding box; and $x$, $x_a$, $x^{*}$ refer respectively to the predicted bounding box, the anchor box, and the ground-truth bounding box (and likewise for $y$, $w$, $h$).
With the above scheme, the Faster R-CNN algorithm extracts the position of at least one effective component (connecting fitting) from the distribution-network UAV inspection image, after which step S300 is performed: merge the position information of the first targets to generate a candidate position region of the sample, and crop the candidate position region to obtain a cropping result map. In this step, the position information of the first targets is combined by picture stitching to obtain a candidate position region of the sample, i.e. a region of the original image awaiting further recognition by the second-level network (the two-level recognition mode). The input for this further recognition is the cropping result map obtained by cropping the picture to the candidate position region; the purpose of cropping is to reduce noise in the input and ensure accuracy in the second recognition stage.
Before the further recognition, considering that the UAV inspection images are in short supply and a neural network trained on them is therefore prone to insufficient precision, step S400 is performed: apply sliding-window cropping to each cropping result map to obtain a plurality of cropped sub-images, and add the cropped sub-images to the second training set to obtain an enhanced training subset. This step amplifies the samples: each cropping result map obtained from a sample is further cropped with a sliding window, yielding several cropped sub-images per map, and adding all cropped sub-images to the second training set greatly increases its number of samples.
Sliding-window cropping slides a rectangular window over a picture and can cut it into many smaller sub-images. In this embodiment, sliding-window cropping of each cropping result map includes the following steps: determine the size of the sliding window, which may be 0.125, 0.25, or 0.5 of the short side of the cropping result map, with a sliding step of 0.125, 0.25, or 0.5 of the window size; crop each cropping result map with this window from left to right and from top to bottom to obtain its cropped sub-images; and add the cropped sub-images to the second training set to construct the enhanced training subset. A sketch of this procedure is given below.
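The following Python sketch is an illustration under the stated parameter choices, not the patented implementation; it crops one cropping result map left-to-right and top-to-bottom while recording each sub-image's frame coordinate position:

```python
def sliding_window_crop(image, win_ratio=0.25, step_ratio=0.25):
    """Sliding-window cropping of one cropping result map.

    The window side is a fraction (0.125/0.25/0.5) of the image's short side
    and the stride a fraction (0.125/0.25/0.5) of the window size, as in the
    embodiment. `image` is assumed to be an H x W x C numpy array. Yields
    (sub_image, (x0, y0, x1, y1)) pairs; the coordinate tuple is the frame
    coordinate position of the sub-image in the source picture.
    """
    h, w = image.shape[:2]
    win = max(1, int(min(h, w) * win_ratio))
    step = max(1, int(win * step_ratio))
    for y0 in range(0, h - win + 1, step):       # top to bottom
        for x0 in range(0, w - win + 1, step):   # left to right
            yield image[y0:y0 + win, x0:x0 + win], (x0, y0, x0 + win, y0 + win)
```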
It should be noted that the samples in the enhanced second training set include both original inspection images and result maps extracted by the first-level recognition. When the neural network is trained on this set, the lower input noise allows higher recognition accuracy, ensuring the precision of the second-level model and a more accurate final detection result.
S500: detect, with a second recognition model, the cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result, which mainly gives the types and number of fine-fitting defects. The second recognition model is a model pre-trained on the enhanced training subset; in this embodiment, the algorithm used to train it is the PPYOLOE algorithm.
Compared with earlier YOLO variants, PPYOLOE has a scalable backbone and neck: CSPRepResNet is designed as the backbone, the neck adopts the newly designed CSPPAN structure, and both are built on the proposed CSPRepResStage. The new backbone and neck enhance the model's representation capability while increasing inference speed, and the model size can be configured flexibly through a width multiplier and a depth multiplier.
■ TAL (Task Alignment Learning)
To further improve the accuracy of the model, the dynamic label-assignment strategy TAL from TOOD is selected. TAL considers classification and regression simultaneously, so that the matching result achieves optimal classification and localization accuracy at the same time.
■ Efficient Task-aligned head
For the detection head, improvements are made on the basis of the T-head of TOOD.
First, an ESE block replaces the time-consuming layer attention of the original design, increasing speed while keeping precision unchanged. Second, because the T-head uses deformable-convolution operators, which are unfriendly to hardware deployment, the classification branch replaces the cls-align module with a shortcut, and the regression branch replaces the reg-align module containing deformable convolution with an integral layer; these two improvements make the head more efficient, concise, and easy to deploy. With this scheme, the running speed of the whole second-level network is increased. An illustrative sketch of an ESE-style block follows.
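As an illustrative sketch only (the patent does not disclose code, and PyTorch is used here purely for brevity), an ESE-style channel-attention block of the general kind that replaces layer attention can be written as:

```python
import torch
import torch.nn as nn

class ESEBlock(nn.Module):
    """Effective Squeeze-and-Excitation: one 1x1 convolution acting as a
    channel-attention gate. A sketch of the general technique, not the
    patented head implementation."""

    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # global average pool -> 1x1 conv -> sigmoid gate per channel
        weight = torch.sigmoid(self.fc(x.mean(dim=(2, 3), keepdim=True)))
        return x * weight  # reweight channels; spatial shape unchanged
```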
S600: merge the first recognition result and the second recognition result and output the merged result. Because the original input sets corresponding to the two results may repeat or overlap (the second training set is a part of the first), the first and second recognition results may also repeat or overlap, so they only need to be output after merging.
The above scheme accounts for the shortage of original UAV inspection samples. In the two-level recognition mode, a randomly extracted part of the original inspection images is used as input for further second-stage training; the two-level high-precision mode as a whole gives the second level lower input noise, further improving recognition precision while also alleviating sample shortage. The method is particularly suitable for defect detection of power transformation equipment: since the UAV patrols at altitude, the captured images are high-definition pictures within which the defects of fine hardware fittings must be localized and identified, i.e. a scenario of recognizing small targets within a designated target, so recognition precision is higher while the collected samples are fully utilized.
As noted above, because sliding-window cropping is adopted and the sliding step of each window is a fixed proportion of the window size, each cropped sub-image overlaps its neighbors; when a sub-image detection is restored to the original large image, the overlap requires a coordinate offset to obtain an accurate result. Accordingly, in some embodiments, step S500, detecting the cropped sub-images corresponding to the samples in the first training set with the second recognition model, further includes the following steps:
S510: determine the original inspection picture represented by the sample and all of its cropped sub-images, and record the frame coordinate position of each cropped sub-image in the original inspection picture;
S520: obtain the detection-frame coordinate position of each cropped sub-image, i.e. the coordinate position of the detection frame in the second recognition result;
S530: obtain the detection result of the cropped sub-image on the original inspection picture from the frame coordinate position and the detection-frame coordinate position; for example, in this embodiment, the top-left corner coordinates of the frame coordinate position and of the detection-frame coordinate position are added to obtain the detection result on the original inspection picture, which serves as the adjusted second recognition result.
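A minimal sketch of S510-S530 (the names and tuple layout are assumptions) maps a detection box from sub-image coordinates back onto the original inspection picture by adding the sub-image's top-left frame coordinates:

```python
def map_to_original(det_box, frame_pos):
    """Restore a sub-image detection to original-picture coordinates.

    `det_box` is (x0, y0, x1, y1) in sub-image coordinates; `frame_pos` is
    the frame coordinate position recorded at cropping time, whose first two
    entries are the sub-image's top-left corner in the original picture.
    """
    shift_x, shift_y = frame_pos[0], frame_pos[1]
    x0, y0, x1, y1 = det_box
    return (x0 + shift_x, y0 + shift_y, x1 + shift_x, y1 + shift_y)
```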
With this scheme, the offset is corrected by adding the frame coordinate position and the detection-frame coordinate position, giving a more accurate detection result. In addition, after the first recognition result and the second recognition result are merged, a post-processing step follows, the post-processing being one of non-maximum suppression, large-scale non-maximum suppression, or non-maximum merging, which yields the final detection result.
In some embodiments, so that the target information detected in the recognition stage of the first recognition model can provide more regional context around small-fitting defect targets and thus more effectively assist their detection, the first recognition result is recorded and the ResNeXt101 module used as the feature extraction network of the first recognition model is retained for feature extraction in the second recognition model. Specifically, when training the second recognition model, the training data are input simultaneously into the first-stage feature extraction network ResNeXt101 and the PPYOLOE feature extraction network CSPRepResNet, producing feature series denoted Feature_res and Feature_csp. The coordinate information of the target is obtained by detecting the current training image in the first recognition model, and from these coordinates the corresponding positions of the features extracted in ResNeXt101 and CSPRepResNet are taken, denoted Feature_res-obj and Feature_csp-obj; the features acquired at this point carry the first-stage target information. Feature_res-obj and Feature_csp-obj are then feature-fused, e.g. given weights 0.6 and 0.4 respectively, to obtain a new feature map denoted Feature_res-csp-obj, and the fused feature map is input into the subsequent CSPPAN structure to train the PPYOLOE network. A sketch of the weighted fusion appears below.
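A minimal sketch of the weighted fusion, assuming the two marked feature maps have already been brought to the same shape (an alignment step the text does not specify):

```python
import torch

def fuse_marked_features(feat_res_obj: torch.Tensor,
                         feat_csp_obj: torch.Tensor,
                         w_res: float = 0.6,
                         w_csp: float = 0.4) -> torch.Tensor:
    """Weighted sum: Feature_res-csp-obj = 0.6 * Feature_res-obj
    + 0.4 * Feature_csp-obj, per the example weights above."""
    return w_res * feat_res_obj + w_csp * feat_csp_obj
```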
In summary, to further improve the network's ability to detect small-fitting defects and to resolve the sample shortage during training, the method provides a sliding-window-cropping-based network training fine-tuning and inference detection module built on image processing techniques, which cuts the original high-resolution inspection images captured by the UAV into small sub-images for training fine-tuning and inference detection. As shown in fig. 2, this module is mainly divided into two parts.
Specifically, the first part is the training fine-tuning module. During training, the objective function of the network is first optimized with the constructed original high-resolution large-image training set; training stops when the number of training iterations reaches a preset count N1, giving a pre-trained detection model, and a portion of pictures is randomly sampled from the large-image training set to build a fine-tuning training subset. Each picture in the subset is then cropped by a sliding window, in turn from left to right and from top to bottom. Adding the cropped sub-images to the fine-tuning subset constructs the enhanced training subset. With the enhanced training subset, the pre-trained model obtained on the original large images is reused: the learning rate is adjusted and training of the network objective continues, stopping when a preset iteration count N2 is reached, which yields the final optimized detection model. A sketch of this two-phase schedule is given below.
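The two-phase schedule might be sketched as follows; `train_one_iteration` is a hypothetical helper standing in for one optimizer step of the detection network, and the learning-rate values are placeholders:

```python
def train_with_finetuning(model, large_image_set, enhanced_subset,
                          n1: int, n2: int,
                          base_lr: float = 1e-3, finetune_lr: float = 1e-4):
    """Sketch of the training fine-tuning module: N1 iterations on the
    original high-resolution set, then N2 fine-tuning iterations on the
    enhanced training subset at an adjusted learning rate."""
    for _ in range(n1):   # pre-training phase on original large images
        train_one_iteration(model, large_image_set, lr=base_lr)
    for _ in range(n2):   # fine-tuning phase on the enhanced subset
        train_one_iteration(model, enhanced_subset, lr=finetune_lr)
    return model
```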
The second part is the inference detection module. During detection, inference is performed with the trained optimal model on both the original large inspection picture and the sub-images produced by sliding-window cropping. In cropped-sub-image detection, the top-left and bottom-right coordinates of each cropped sub-image on the original large image are recorded at cropping time; for example, with n cropped sub-images after cropping, the list of frame coordinate positions of all n sub-images is [[shift_x_1l, shift_y_1l, shift_x_1b, shift_y_1b], ..., [shift_x_nl, shift_y_nl, shift_x_nb, shift_y_nb]]. For the detection result of each cropped sub-image, the detection-frame coordinate position [x_rl, y_rl, x_rb, y_rb] and the frame coordinate position [shift_x_rl, shift_y_rl, shift_x_rb, shift_y_rb] of that sub-image are used to adjust the detected result to a position box relative to the original image; adding the detection frame to the top-left frame coordinates of the cropped sub-image gives

box = [x_rl + shift_x_rl, y_rl + shift_y_rl, x_rb + shift_x_rl, y_rb + shift_y_rl]

which is the detection result on the original large image. Finally, the large-image detection results and the sub-image detection results are merged and post-processed, e.g. with Non-Maximum Suppression (NMS), Large Scale Non-Maximum Suppression (LSNMS), or Non-Maximum Merge (NMM), to filter or merge the overlapping detection frames and obtain the final detection result.
In summary, for defect detection of connecting fittings and small fittings of power transformation equipment, the two-level model used by the method clearly improves recall and precision over a one-stage single-model algorithm. The original inspection images in the adopted sample set, taken across different regions, voltages, and shooting conditions, are variously clear and bright, blurred, dim, or shot at long range. Fig. 4 shows a small-fitting defect recognition result; it can be seen that the two-level recognition model provided in this embodiment effectively detects the defect region from the original inspection image and remains superior for large scenes and small targets. Image-brightness enhancement was added during model training, improving recognition of low-brightness images; however, because some images are of low shooting quality and blurred, neither the model nor even the human eye can accurately judge whether pins are present in those targets, and the model tends to recognize blurred targets as having missing pins.
Referring to fig. 3, this embodiment further provides a power transformation equipment image defect recognition system 700, described here mainly by dividing it into functional modules according to the method embodiment above. For example, each function may be its own module, or two or more functions may be integrated into one processing module; the integrated modules may be implemented in hardware or as software functional modules. Note that the division into modules in the present application is illustrative, being merely a logical division of functions, and other divisions may be used in practice. The system 700 may include a first acquisition module 710, a first detection module 720, a first cropping module 730, a second cropping module 740, a second detection module 750, and a first output module 760. The functions of the unit modules are explained below.
A first acquisition module 710, configured to acquire an unmanned aerial vehicle inspection image set and divide it into a first training set and a second training set, where the first training set is the original inspection image set and the second training set is an image set obtained by randomly sampling the first training set;

a first detection module 720, configured to detect each sample in the first training set with a first recognition model to obtain a first recognition result, where the first recognition result contains the position information of at least one first target, a first target being a component of the power transformation equipment to be inspected, and the first recognition model is a model pre-trained on the first training set;

a first cropping module 730, configured to merge the position information of the first targets to generate a candidate position region of the sample, and crop the candidate position region to obtain a cropping result map;

a second cropping module 740, configured to apply sliding-window cropping to each cropping result map to obtain a plurality of cropped sub-images, and add the cropped sub-images to the second training set to obtain an enhanced training subset;

a second detection module 750, configured to detect, with a second recognition model, the cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result, where the second recognition model is a model pre-trained on the enhanced training subset; in some embodiments, the second detection module 750 is further configured to determine the original inspection picture represented by the sample and all of its cropped sub-images, record the frame coordinate position of each cropped sub-image in the original inspection picture, obtain the detection-frame coordinate position of each cropped sub-image (i.e. the coordinate position of the detection frame in the second recognition result), and obtain the detection result of the cropped sub-image on the original inspection picture from the frame coordinate position and the detection-frame coordinate position, as the adjusted second recognition result;

and a first output module 760, configured to merge the first recognition result and the second recognition result and output the merged result.
In the above embodiments, implementation may be wholly or partly in software, hardware, firmware, or any combination thereof. When software is used, implementation may be wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wired or wireless means (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center containing an integration of one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims and the equivalents thereof, the present application is also intended to include such modifications and variations.

Claims (10)

1. A method for identifying image defects of power transformation equipment, characterized by comprising the following steps:

acquiring an unmanned aerial vehicle inspection image set and dividing it into a first training set and a second training set, wherein the first training set is the original inspection image set and the second training set is an image set obtained by randomly sampling the first training set;

detecting each sample in the first training set by using a first recognition model to obtain a first recognition result, wherein the first recognition result comprises position information of at least one first target, the first target being a component of the power transformation equipment to be inspected; the first recognition model is a model pre-trained on the first training set;

merging the position information of the at least one first target to generate a candidate position region of the sample, and cropping the candidate position region to obtain a cropping result map;

performing sliding-window cropping on each cropping result map to obtain a plurality of cropped sub-images, and adding the plurality of cropped sub-images to the second training set to obtain an enhanced training subset;

detecting, by using a second recognition model, the plurality of cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result, wherein the second recognition model is a model pre-trained on the enhanced training subset;

and merging the first recognition result and the second recognition result and outputting the merged result.
2. The method for identifying image defects of power transformation equipment according to claim 1, further comprising, after detecting the plurality of cropped sub-images corresponding to the samples in the first training set by using the second recognition model, the following steps:

determining the original inspection picture represented by the sample and all of its corresponding cropped sub-images, and recording the frame coordinate position of each cropped sub-image in the original inspection picture;

obtaining the detection-frame coordinate position of the cropped sub-image, the detection-frame coordinate position being the coordinate position of the detection frame in the second recognition result;

and obtaining a detection result of the cropped sub-image on the original inspection picture based on the frame coordinate position and the detection-frame coordinate position, the detection result serving as the adjusted second recognition result.
3. The method for identifying image defects of power transformation equipment according to claim 2, wherein the top-left corner coordinates of the frame coordinate position and of the detection-frame coordinate position are added to obtain the detection result of the cropped sub-image on the original inspection picture.
4. The method for identifying image defects of power transformation equipment according to claim 1, further comprising a post-processing step after the first recognition result and the second recognition result are merged, wherein the post-processing is one of non-maximum suppression, large-scale non-maximum suppression, or non-maximum merging.
5. The method for identifying image defects of power transformation equipment according to claim 1, wherein the second training set is an image set obtained by randomly sampling the first training set at a first ratio, the first ratio being 0.125, 0.25, or 0.5.
6. The method for identifying image defects of power transformation equipment according to claim 1 or 5, wherein the sliding-window cropping of each cropping result map comprises the following steps: determining the size of a sliding window, and cropping each cropping result map from left to right and from top to bottom by using the sliding window.
7. The method for identifying image defects of power transformation equipment according to claim 6, wherein the sliding step of the sliding window is 0.125, 0.25, or 0.5 of the sliding window size.
8. The method for identifying image defects of power transformation equipment according to claim 1, wherein a PPYOLOE algorithm is adopted in the second recognition model, and wherein the layer attention in the head network of the algorithm is replaced by an ESE block.
9. The method for identifying image defects of power transformation equipment according to claim 8, wherein, in the first recognition model, the feature extraction network adopts a ResNeXt101 module, and the enhanced training subset is input simultaneously to the ResNeXt101 module and to a CSPRepResNet module in the PPYOLOE algorithm to obtain a first feature Feature_res and a second feature Feature_csp;

marking, based on the first recognition result, the coordinate information of the first feature Feature_res and the second feature Feature_csp to obtain a first marked feature Feature_res-obj and a second marked feature Feature_csp-obj;

and performing feature fusion on the first marked feature Feature_res-obj and the second marked feature Feature_csp-obj, and inputting the obtained fusion result into a CSPPAN module for PPYOLOE network training.
10. An image defect recognition system for power transformation equipment, characterized by comprising:

a first acquisition module, configured to acquire an unmanned aerial vehicle inspection image set and divide it into a first training set and a second training set, wherein the first training set is the original inspection image set and the second training set is an image set obtained by randomly sampling the first training set;

a first detection module, configured to detect each sample in the first training set by using a first recognition model to obtain a first recognition result, wherein the first recognition result comprises position information of at least one first target, the first target being a component of the power transformation equipment to be inspected; the first recognition model is a model pre-trained on the first training set;

a first cropping module, configured to merge the position information of the at least one first target to generate a candidate position region of the sample, and crop the candidate position region to obtain a cropping result map;

a second cropping module, configured to perform sliding-window cropping on each cropping result map to obtain a plurality of cropped sub-images, and add the plurality of cropped sub-images to the second training set to obtain an enhanced training subset;

a second detection module, configured to detect, by using a second recognition model, the plurality of cropped sub-images corresponding to the sample represented by the first recognition result to obtain a second recognition result, wherein the second recognition model is a model pre-trained on the enhanced training subset;

and a first output module, configured to merge the first recognition result and the second recognition result and output the merged result.
CN202310581658.5A 2023-05-22 2023-05-22 Substation equipment image defect identification method and system Pending CN116596895A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310581658.5A CN116596895A (en) 2023-05-22 2023-05-22 Substation equipment image defect identification method and system


Publications (1)

Publication Number Publication Date
CN116596895A (en) 2023-08-15

Family

ID=87605878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310581658.5A Pending CN116596895A (en) 2023-05-22 2023-05-22 Substation equipment image defect identification method and system

Country Status (1)

Country Link
CN (1) CN116596895A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116883391A (en) * 2023-09-05 2023-10-13 中国科学技术大学 Two-stage distribution line defect detection method based on multi-scale sliding window
CN116883391B (en) * 2023-09-05 2023-12-19 中国科学技术大学 Two-stage distribution line defect detection method based on multi-scale sliding window


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination