CN113763326B - Pantograph detection method based on Mask Scoring R-CNN network - Google Patents
Pantograph detection method based on Mask Scoring R-CNN network
- Publication number: CN113763326B
- Application number: CN202110890405.7A
- Authority: CN (China)
- Prior art keywords: pantograph, mask, network, training, detection
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/0004: Industrial image inspection
- G06F18/2415: Classification techniques based on parametric or probabilistic models
- G06N3/045: Neural networks; combinations of networks
- G06N3/08: Neural network learning methods
- G06T5/77: Retouching; inpainting; scratch removal
- G06T7/11: Region-based segmentation
- G06T7/62: Analysis of geometric attributes of area, perimeter, diameter or volume
- G06T2207/10004: Still image; photographic image
- G06T2207/20192: Edge enhancement; edge preservation
- G06T2207/30108: Industrial image inspection
- G06T2207/30164: Workpiece; machine component
- Y02T10/40: Engine management systems
Abstract
The invention discloses a pantograph detection method based on a Mask Scoring R-CNN network, comprising the following steps: S1, collecting pantograph-catenary infrared data, preprocessing the data, and dividing them into a training sample set and a test sample set; S2, constructing a pantograph detection network that extracts multi-scale pantograph feature maps with a backbone network, obtains the pantograph's classification information, position coordinates and coarse-grained segmentation result through a prediction head and a mask head, and applies a proposed edge-repair method to refine the coarse-grained segmentation result; S3, loading the training sample set into the pantograph detection network, training iteratively and tuning parameters to obtain a high-quality pantograph detection model; S4, loading the trained model, feeding it the test sample set, and evaluating the pantograph detection and segmentation results. The method achieves high accuracy and strong robustness, requires no additional expensive auxiliary equipment, and can greatly reduce detection cost.
Description
Technical Field
The invention relates to the technical field of computer digital image processing and pattern recognition, and in particular to a pantograph detection method based on a Mask Scoring R-CNN network.
Background
With the rapid development of electrified railways in China, typified by high-speed rail, higher requirements have been placed on the safety of traction power supply systems. The pantograph slide plate is the only component of an electric locomotive in contact with the catenary and the most important current-collecting device in the locomotive's entire power supply system; its condition directly determines whether the locomotive can run safely and stably. During operation, however, the slide plate wears through continuous contact with the catenary. If the wear becomes severe, the pantograph may strike hard points on the catenary power line, causing it to shake, deform or even detach, which leads to locomotive faults, delaying trains in minor cases and, in severe ones, causing major railway accidents with casualties and property losses. Timely and accurate detection and identification of the pantograph is therefore essential for ensuring its safe operation and avoiding accidents. Research on pantograph detection, however, remains relatively scarce, partly because little relevant data is available and partly because no high-quality algorithm provides technical support. Pantograph detection thus remains a current technical challenge.
At present there are three main approaches to pantograph detection: ground-based online detection, manual roof inspection, and on-board equipment, each with certain limitations. Ground-based online detection can only measure the thickness of the pantograph slide plate, so its function is single and its applicability limited. Manual roof inspection can be performed only when a train is in the depot and the catenary is de-energized; it consumes manpower and resources and is inefficient. On-board detection requires equipping every locomotive, which is costly and unsuitable for large-scale deployment. In recent years, with improvements in monitoring-equipment imaging technology and advances in related algorithms, deep learning has achieved excellent results in the field of object detection. Deep-learning object detectors fall into two families: one-stage detectors such as YOLO and SSD, which are fast but comparatively less accurate, and two-stage detectors such as R-CNN and Fast R-CNN, which are slower but more accurate. Although these methods handle simple, isolated objects well, their detection performance is not outstanding for objects in a complex background such as a pantograph.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a pantograph detection method based on a Mask Scoring R-CNN network.
The technical solution adopted to solve the technical problem is as follows:
The invention provides a pantograph detection method based on a Mask Scoring R-CNN network, comprising the following steps:
S1, collect pantograph-catenary infrared data, preprocess the data, divide it into a training sample set and a test sample set, and construct a pantograph target database;
S2, construct a pantograph detection network: extract multi-scale pantograph feature maps with a backbone network, obtain the pantograph's classification information, position coordinates and coarse-grained segmentation result through a prediction head and a mask head, and apply a proposed edge-repair method to refine the coarse-grained segmentation result;
S3, load the training sample set into the pantograph detection network, train iteratively and tune parameters to obtain a high-quality pantograph detection model;
S4, load the trained model, input the test sample set into it, and evaluate the pantograph detection and segmentation results.
Further, the step S1 of the present invention specifically includes:
S11, acquire pantograph-catenary infrared data with an infrared camera, then decode the data and classify it by scene;
S12, preprocess the data, allocate a training sample set and a test sample set, and construct a target database.
Further, the step S11 of the present invention specifically includes:
An infrared camera installed in the on-board catenary running-state detection device in front of the pantograph records pantograph-catenary infrared video, which is decoded into 10000 frames. The collected pantograph-catenary infrared images cover scenes including mountains, bridges, tunnels and iron bridges, and weather conditions including rain, heavy fog, direct sunlight and heavy snow, satisfying the diversity required of pantograph detection data; all images are stored in a pantograph-catenary infrared image folder under a specified path.
Further, the step S12 of the present invention specifically includes:
Count the horizontal and vertical coordinates of the four corner points of the pantograph calibration frame in the 10000 pantograph-catenary infrared images and take the union of the abscissa and ordinate values respectively; the maximum area so formed is the range of the pantograph. To allow for pantograph fluctuation during travel and to improve robustness, expand the upper and lower boundaries of the calibration frame corresponding to this range by a certain pixel value, then cut fixed-size sub-images from the original images to build a pantograph template database. The 10000 images in the template database are annotated with the open-source deep learning labelling tool LabelMe; 6000 images are allocated as the training set and the remaining 4000 as the test set, with no image repeated between training and test data. The pantograph target database is thus constructed.
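The coordinate-union and fixed-size cropping described above can be sketched roughly as follows. The helper names and the padding amount are illustrative assumptions; only the 194 x 36 template size comes from the text.

```python
import numpy as np

def union_crop_region(boxes, pad_y=6):
    """Union of all calibration-box coordinates, with the top and bottom
    edges padded by a few pixels (pad_y is an assumed value) to tolerate
    pantograph fluctuation.  `boxes` is an (N, 4) array of [x1, y1, x2, y2]."""
    boxes = np.asarray(boxes)
    x1 = boxes[:, 0].min()
    y1 = max(boxes[:, 1].min() - pad_y, 0)
    x2 = boxes[:, 2].max()
    y2 = boxes[:, 3].max() + pad_y
    return int(x1), int(y1), int(x2), int(y2)

def crop_fixed(image, region, out_w=194, out_h=36):
    """Cut a fixed-size sub-image (194 x 36, the patent's template size)
    starting at the region's top-left corner; `image` is an (H, W) array."""
    x1, y1, _, _ = region
    return image[y1:y1 + out_h, x1:x1 + out_w]
```

Applied to every frame, this yields the template database images from which LabelMe annotations are then made.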
Further, the construction of the pantograph detection network in step S2 specifically comprises:
S21, the Backbone uses a ResNet101 residual network and a feature pyramid network (FPN) to extract multi-scale pantograph feature maps;
S22, a region proposal network (RPN) generates candidate regions (Proposals); the detection head (RCNN Head) performs a region-of-interest alignment (RoIAlign) operation on each candidate region generated by the RPN;
S23, through a series of convolution layers and fully connected layers, classify the pantograph targets in the candidate regions and perform bounding-box regression on the RoIAlign-ed feature maps;
the Mask Head obtains instance pixel-level features, realizes up-sampling of the feature map and change of channel dimension through 6 convolution layers, and generates the corresponding predicted mask;
the mask intersection-over-union head (MaskIoU Head) takes the predicted mask and the RoIAlign-ed features as input, down-samples the feature map through 4 convolution layers followed by 3 fully connected layers, and in the last layer outputs mask IoU scores for C categories, which are multiplied by the category scores from the RCNN Head to obtain more accurate mask scores;
S24, the mask edge repair head computes the intersection-over-union between the coarse-grained mask produced by the Mask Head and the ground-truth masks, matches the best pantograph boundary in the training set, and obtains a refined repair of the pantograph boundary.
Further, the step S21 of the present invention specifically includes:
ResNet101 and FPN are adopted as the backbone network to extract multi-scale pantograph features. ResNet101 is a bottom-up structure that outputs four levels of feature maps through the residual network, defined as C2, C3, C4 and C5. The FPN is a top-down structure that, combined with lateral connections to each ResNet101 output level, produces the four fused feature maps P2, P3, P4 and P5.
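The FPN top-down fusion just described can be illustrated with a minimal numpy sketch, assuming the lateral 1x1 projections to a common channel count have already been applied (function names are mine, not from the patent):

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(laterals):
    """Fuse lateral maps [C2', C3', C4', C5'] (finest first, each already
    projected to a common channel count) into [P2, P3, P4, P5]: start
    from the coarsest map and repeatedly upsample-and-add."""
    p = laterals[-1]          # P5 is the projected C5
    outs = [p]
    for lat in reversed(laterals[:-1]):
        p = lat + upsample2x(p)   # lateral connection + top-down signal
        outs.append(p)
    return outs[::-1]         # reorder to [P2, P3, P4, P5]
```

In the real network each P-level would additionally pass through a 3x3 smoothing convolution; that step is omitted here for brevity.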
the step S22 specifically includes:
each level of feature map output by the backbone passes through one 3x3 convolution to generate two branches: one branch distinguishes positive and negative preset anchor boxes through a Reshape-Softmax-Reshape operation, and the other obtains the anchor-box offsets through a 1x1 convolution, thereby yielding the candidate regions;
RoIAlign is applied to the Proposals generated by the RPN to align the candidate regions with the feature maps, mainly using a bilinear interpolation algorithm to expand the feature map, followed by a max-pooling operation that adjusts the Proposals to a uniform size;
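The bilinear interpolation that RoIAlign relies on, instead of quantising coordinates to the integer grid, can be sketched as follows (function name and array layout are my own):

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Sample a (H, W) feature map at a fractional coordinate (y, x)
    by bilinear interpolation: a weighted blend of the four surrounding
    integer-grid values.  Valid for interior points (y, x not on the
    last row/column)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * fmap[y0, x0]
            + (1 - wy) * wx * fmap[y0, x1]
            + wy * (1 - wx) * fmap[y1, x0]
            + wy * wx * fmap[y1, x1])
```

RoIAlign samples several such fractional points per output bin and then pools them, which avoids the misalignment introduced by RoIPool's coordinate rounding.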
the RCNN Head maps the RoIAlign-ed feature map to 1024 dimensions through a fully connected layer, then classifies the features and performs regression to output the pantograph category and coordinate information;
the step S23 specifically includes:
The Mask Head uses the RoIAlign-ed features to realize up-sampling of the feature map and change of channel dimension through 6 convolution operations and generates the corresponding predicted mask. Meanwhile, the predicted mask and the original RoIAlign-ed features serve as the MaskIoU Head input, which down-samples the feature map through 4 convolution layers followed by 3 fully connected layers and in the last layer outputs mask IoU scores for C categories. The predicted mask IoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, a segmentation score representing the mask accuracy, calculated as:
S_MS = S_MaskIoU · S_Class
the step S24 specifically includes:
The refined edge-repair method computes the intersection-over-union between the coarse-grained pantograph mask region and the pantograph regions annotated in the training set to find the best-matching pantograph region, introduces this IoU as a new level into the learning and inference of pantograph segmentation, and uses the matched region to perform refined boundary repair on the pantograph extraction result. The annotated pantograph regions are the pantograph target database images, and the pantograph edge-repair intersection-over-union is calculated as:
IoU = area(S ∩ P) / area(S ∪ P), where S denotes the ground-truth pantograph annotation in the training data, P denotes the coarse-grained pantograph segmentation result, area(S ∩ P) denotes the area of their intersection, and area(S ∪ P) the area of their union.
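The intersection-over-union in this formula can be computed directly on binary masks; a small sketch (names assumed):

```python
import numpy as np

def mask_iou(gt_mask, pred_mask):
    """Intersection-over-union between a ground-truth pantograph mask S
    and a coarse-grained predicted mask P, both boolean (H, W) arrays:
    |S ∩ P| / |S ∪ P|.  Used here both for the MaskIoU score and for
    matching the best-fitting boundary during edge repair."""
    gt = np.asarray(gt_mask, bool)
    pred = np.asarray(pred_mask, bool)
    inter = np.logical_and(gt, pred).sum()
    union = np.logical_or(gt, pred).sum()
    return inter / union if union else 0.0
```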
Further, the step S3 of the present invention specifically includes:
Input the training sample set database obtained in step S1 into the pantograph detection network established in step S2 for training, to obtain a high-quality pantograph detection model.
Load the 6000 training sample pictures (resolution 194 x 36) of the dataset into the network, initialize the network parameters from a model pre-trained on the MS-COCO dataset, and pre-train the backbone network used to extract target features on the dataset.
Apply a non-maximum suppression algorithm to the anchor boxes generated by the RPN to retain the top-100 scoring candidate window regions, and normalize all candidate windows to a specific size with the RoIAlign module for spatial-size alignment.
Classify and regress the targets in the candidate regions with a series of convolution and fully connected layers, and perform pixel-level coarse-grained mask segmentation on them.
Train the MaskIoU Head using the RPN Proposals with IoU greater than 0.5 as training samples, to obtain real mask scores.
Compute the intersection-over-union between the coarse-grained mask segmentation region and the annotated mask regions in the training set, obtain the best-matching pantograph region, and use it to repair the pantograph segmentation boundary.
The whole process is trained with three losses, classification loss, detection-frame regression loss and mask loss, combined as:
L = L_Class + L_Box + L_Mask
where the classification loss L_Class is the logarithmic loss over the two categories, target and non-target, for which cross-entropy is selected: L_Class = (1/T_Class) Σ_i L_cls(k_i, k_i*), with k_i the predicted probability that anchor box i is a target, k_i* = 0 indicating a negative-label anchor box, k_i* = 1 a positive-label anchor box, and T_Class a normalization parameter;
the bounding-box regression loss is L_Box = Σ_i k_i* · smooth_L1(n_i - n_i*), where n_i denotes the predicted anchor-box offset and n_i* the ground-truth label offset. smooth_L1 is used as the bounding-box regression loss because it slows the error growth rate and reduces the penalty caused by large errors; it is expressed as the piecewise function:
smooth_L1(x) = 0.5(σx)^2 if |x| < 1/σ^2, and |x| - 0.5/σ^2 otherwise, where σ is used to control the smoothness of the transition region;
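The piecewise function can be written directly; the default sigma = 1 is an assumption for illustration:

```python
import numpy as np

def smooth_l1(x, sigma=1.0):
    """Smooth-L1 loss: quadratic 0.5*(sigma*x)^2 for small errors
    (|x| < 1/sigma^2), linear |x| - 0.5/sigma^2 for large ones, so large
    regression errors are penalised less steeply than under L2."""
    x = np.asarray(x, float)
    quad = np.abs(x) < 1.0 / sigma**2
    return np.where(quad, 0.5 * (sigma * x)**2, np.abs(x) - 0.5 / sigma**2)
```

The two branches meet with matching value and slope at |x| = 1/sigma^2, which is what makes the loss smooth.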
The mask loss function L_Mask regresses, with a deep neural network, the intersection-over-union score between the segmented mask and the ground-truth mask, using an L2 loss for the IoU-score regression.
In training, the number of network classes is set to 2, covering the pantograph target and the background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
Further, the step S4 of the present invention specifically includes:
The pantograph segmentation result is evaluated using precision, recall and the mean average precision:
P_re = T_P / (T_P + F_P), R_ec = T_P / (T_P + F_N), where T_P is the number of correctly detected positive examples, F_P the number of false positives, F_N the number of false negatives, P_re the precision, R_ec the recall, and mAP the mean average precision. In the test stage, a detection is considered successful when the overlap between the prediction frame and the calibration frame exceeds 90% of the annotated bounding frame.
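The precision/recall computation and the 90%-overlap success criterion can be sketched as (function names are mine):

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from counts of correct detections (T_P),
    false positives (F_P) and missed targets (F_N)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def detection_success(iou, thr=0.9):
    """A prediction counts as successful when its overlap with the
    calibration frame exceeds 90%, per the test protocol above."""
    return iou > thr
```

mAP then summarises precision over recall levels (the area under the precision-recall curve), averaged over classes; with a single pantograph class it reduces to the AP of that class.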
The invention further provides a pantograph detection system based on a Mask Scoring R-CNN network, comprising the following modules:
a data preprocessing module for collecting pantograph-catenary infrared data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
a pantograph detection network construction module for constructing the pantograph detection network, extracting multi-scale pantograph feature maps with the backbone network, obtaining the pantograph's classification information, position coordinates and coarse-grained segmentation result through the prediction head and mask head, and applying the proposed edge-repair method to refine the coarse-grained segmentation result;
a network training module for loading the training sample set into the pantograph detection network, training iteratively and tuning parameters to obtain a high-quality pantograph detection model;
a network evaluation module for loading the trained model, inputting the test sample set, and evaluating the pantograph segmentation result.
The invention further provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above pantograph detection method based on a Mask Scoring R-CNN network.
The beneficial effects of the invention are as follows: the pantograph detection method based on a Mask Scoring R-CNN network constructs an infrared pantograph dataset covering complex scenes, with 8 scene types such as no obstacle, bridges, supporting wires and viaducts, and 9 weather conditions such as rain, heavy fog, direct sunlight and heavy snow, totalling 10000 infrared images. This provides data support for research on pantograph detection and promotes the development of China's railway industry.
Relying only on an online photographing system installed in front of the pantograph, the invention can automatically monitor the pantograph state in real time. The AP value reaches 93.26%, the segmentation accuracy is high, and the average detection speed on a GPU is 0.302 s per image; no other expensive auxiliary equipment is required, so detection cost can be greatly reduced.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
Fig. 1 is a schematic flow chart of the pantograph detection method based on a Mask Scoring R-CNN network in an embodiment of the invention.
Fig. 2 shows partial-scene pantograph-catenary infrared images captured by the infrared camera in the on-board catenary running-state detection device (3C) in an embodiment of the invention, where (a) is a bridge-scene image, (b) a station-scene image, (c) a mountain-scene image, and (d) a bridge-scene image.
Fig. 3 is a flow chart of the construction of the pantograph target database in an embodiment of the invention.
Fig. 4 shows the operating interface of the image annotation tool LabelMe in an embodiment of the invention, where (a) is the LabelMe labelling interface and (b) a LabelMe image labelling example.
Fig. 5 is the basic structure diagram of the pantograph detection method based on a Mask Scoring R-CNN network in an embodiment of the invention.
Fig. 6 is a flow chart of the modules of the network in an embodiment of the invention.
Fig. 7 is a schematic diagram of refined pantograph edge repair in an embodiment of the invention.
Fig. 8 shows pantograph detection results in an embodiment of the invention: (a) detection on a bridge-scene pantograph-catenary infrared image, (b) on a station-scene image, (c) on a mountain-scene image, and (d) on a bridge-scene image.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the invention provides a pantograph detection method based on a Mask Scoring R-CNN network, which, as shown in Fig. 1, specifically comprises the following steps:
S1, collect pantograph-catenary infrared data, preprocess the data, divide it into a training sample set and a test sample set, and construct a pantograph target database;
S2, construct a pantograph detection network: extract multi-scale pantograph feature maps with a backbone network, obtain the pantograph's classification information, position coordinates and coarse-grained segmentation result through a prediction head and a mask head, and apply the proposed edge-repair technique to refine the coarse-grained segmentation result;
S3, load the data into the network, train iteratively and tune parameters to obtain a high-quality pantograph detection model;
S4, load the high-quality model, run the pantograph extraction test, and evaluate the pantograph detection and segmentation results.
Preferably, in the step S1, the method specifically includes:
S11, acquire the data with the corresponding infrared camera, then decode the data and classify it by scene;
S12, preprocess the data, allocate a training set and a test set, and construct a target database.
The embodiment constructs a multi-condition infrared pantograph database under complex scenes, unlike common single, simple pantograph databases: it contains 10000 pantograph images across 8 different scenes and 9 different weather conditions, which greatly benefits research on pantograph detection and promotes the development of China's high-speed rail industry.
Preferably, the step S11 specifically includes:
the online shooting system is an infrared camera shooting device of the existing mature high-speed rail vehicle-mounted contact net running state monitoring device 3C system. When the high-speed rail starts to run, the shooting device starts to work; when the high-speed rail stops running, the shooting device also stops working. According to the invention, 3C infrared cameras are utilized to shoot infrared video of the pantograph net under different scenes during the D2236 train traveling, 10000 frames of pantograph images are obtained through decoding, 8 complex scenes such as mountain bodies, bridges, tunnels and viaducts, 9 complex weather such as overcast, heavy fog, sunlight irradiation and heavy snow are included, and the diversity of pantograph detection data is satisfied. The infrared camera shoots a part of the infrared image of the scene bow net as shown in figure 2.
Preferably, the step S12 specifically includes:
The abscissas and ordinates of the four coordinate points of the pantograph calibration frames in the 10000 pantograph-catenary infrared images are counted, and the unions of the abscissa values and of the ordinate values are taken respectively to obtain the maximum region covering the pantograph. To allow for the pantograph rising and falling during travel and to improve robustness, the upper and lower boundaries of the calibration frame corresponding to this range are expanded by a certain pixel value, sub-images of fixed size are cut from the original images, and the pantograph template database is constructed.
The 10000 frames of high-speed rail infrared pantograph-catenary images are processed one by one; the original image size is 320 × 240, and the size of the pantograph target-region images and pantograph mark images is 194 × 36. The 10000 images in the pantograph template database are manually marked with LabelMe; 6000 images are reasonably distributed across the scenes as the training set, the remaining 4000 images serve as the test set, and the pantograph target database is thus constructed. The image labeling interface is shown in fig. 4 (a). The marked content is a polygonal curve circumscribing the edge of the pantograph slide plate, with the label train_bow_1, as shown in fig. 4 (b).
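The union-and-expand cropping described above can be sketched as follows (a minimal numpy sketch; the function name, the expansion value and the x1, y1, x2, y2 coordinate layout are illustrative assumptions, not the patent's code):

```python
import numpy as np

def union_crop_box(boxes, expand_px, img_h, img_w):
    """Take the union of all calibration-frame corner coordinates to get
    the maximal pantograph region, then expand the top and bottom
    boundaries by expand_px to allow for pantograph bounce."""
    boxes = np.asarray(boxes)            # (N, 4) rows of x1, y1, x2, y2
    x1, y1 = boxes[:, 0].min(), boxes[:, 1].min()
    x2, y2 = boxes[:, 2].max(), boxes[:, 3].max()
    y1 = max(0, y1 - expand_px)          # expand upward, clamp to image
    y2 = min(img_h, y2 + expand_px)      # expand downward, clamp to image
    return int(x1), int(y1), int(x2), int(y2)
```

The returned box would then be used to cut the same fixed-size sub-image (194 × 36 in the patent) from every 320 × 240 frame.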
A basic framework of the pantograph detection method based on the Mask Scoring R-CNN network is shown in fig. 5.
Preferably, the step S2, as shown in fig. 6, specifically includes:
S21, extracting semantic information of different layers of the pantograph with a residual network and a feature pyramid to obtain a multi-scale feature map;
S22, generating candidate regions with a region proposal network (RPN), realizing unbiased alignment of the candidate regions and the feature map with region-of-interest alignment (RoIAlign), and simultaneously performing the classification and frame-regression operations for the pantograph;
S23, applying a series of convolution operations to the feature map after RoIAlign to obtain the coarse-granularity pantograph mask segmentation result and an accurate mask score;
S24, performing refined edge repair on the coarse-granularity mask segmentation result to obtain a high-quality pantograph segmentation result.
The embodiment of the invention constructs a pantograph detection network for complex scenes. The network adopts a convolutional neural network to extract the multi-scale features of the pantograph, realizes unbiased alignment of the candidate regions and the feature map through the RPN and RoIAlign, and obtains the coarse-granularity pantograph segmentation mask according to the normal detection-segmentation flow. When the RCNN Head generates candidate frames through CNN operations, frame information can be omitted, so that the pantograph deviates from the real target during iterative regression and may even be suppressed in the NMS process; to address this, an edge repair algorithm is adopted to refine the coarse-granularity segmentation result of the pantograph and improve the detection and segmentation precision. The network realizes high-precision segmentation of the pantograph, effectively solves the problem of health monitoring of the pantograph under high-speed operation, needs no other expensive equipment as an aid, and greatly saves detection cost.
Preferably, the step S21 specifically includes:
ResNet101 and FPN are adopted as the backbone network to perform multi-scale extraction of the pantograph features. ResNet101 is a bottom-up structure that outputs four layers of feature maps through the residual network, defined as C2, C3, C4 and C5. FPN is a top-down structure; it is laterally connected with the corresponding ResNet101 layers to obtain the required four fused feature maps P2, P3, P4 and P5.
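As a rough illustration of the top-down lateral fusion, the following numpy toy stands in for the ResNet101/FPN backbone (an assumption-laden sketch: real FPN uses learned 1×1 lateral and 3×3 output convolutions, which are replaced here by a channel-mean projection and nearest-neighbour upsampling):

```python
import numpy as np

def fpn_merge(c_feats, out_ch=4):
    """Toy top-down merge over bottom-up maps C2..C5, producing P2..P5.
    Shapes and channel counts are illustrative, not the patent's."""
    # 1x1 lateral projection simulated by a channel-mean broadcast
    laterals = [f.mean(axis=0, keepdims=True).repeat(out_ch, axis=0)
                for f in c_feats]
    p_feats = [laterals[-1]]             # P5 comes directly from C5
    for lat in reversed(laterals[:-1]):  # build P4, P3, P2 top-down
        # nearest-neighbour 2x upsample of the coarser map, then add lateral
        up = p_feats[0].repeat(2, axis=1).repeat(2, axis=2)
        p_feats.insert(0, lat + up)
    return p_feats                       # [P2, P3, P4, P5]
```

Each P level keeps the spatial resolution of its C level while mixing in coarser semantic information, which is the property the multi-scale pantograph features rely on.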
Preferably, the step S22 specifically includes:
Each layer of feature map output by the backbone network generates two branches through one 3×3 convolution: one branch distinguishes the positive and negative preset anchor frames through a Reshape-Softmax-Reshape operation, and the other branch obtains the offsets of the anchor frames through a 1×1 convolution, thereby producing the candidate regions.
For the Proposals generated by the RPN, RoIAlign is adopted to align the candidate regions with the feature map, mainly using a bilinear interpolation algorithm to expand the feature map and a maximum pooling operation to adjust the Proposals to a uniform size.
The RCNN Head maps the feature map after RoIAlign to 1024 dimensions through fully connected layers, then classifies the features and performs the regression operation to output the pantograph category and coordinate information.
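The bilinear interpolation at the heart of RoIAlign's unbiased alignment can be illustrated as follows (a single-channel numpy sketch under simplifying assumptions; real RoIAlign averages several such samples per output bin):

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinear interpolation of a 2-D feature map at a fractional
    (y, x) location -- the sampling primitive RoIAlign uses so that
    candidate regions need not be snapped to integer grid cells."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, feat.shape[0] - 1)
    x1 = min(x0 + 1, feat.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * feat[y0, x0]
            + (1 - dy) * dx * feat[y0, x1]
            + dy * (1 - dx) * feat[y1, x0]
            + dy * dx * feat[y1, x1])
```

Because the sample point keeps its fractional coordinates, no quantization error is introduced, which is what "unbiased alignment of the candidate region and the feature map" refers to.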
Preferably, the step S23 specifically includes:
The Mask Head takes the features after RoIAlign, realizes the up-sampling of the feature map and the change of the channel dimension through 6 convolution operations, and generates the corresponding prediction mask. Meanwhile, the output of the prediction mask and the original features after RoIAlign are taken as the input of the MaskIoU Head; the down-sampling of the feature map is realized through 4 convolution layers followed by 3 fully connected layers, and the last layer outputs the MaskIoU scores of the C categories. The predicted MaskIoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, which represents the accurate segmentation score of the mask. The calculation formula is:
S_MS = S_MaskIoU · S_Class
Preferably, the step S24 specifically includes:
The refined edge repair technique obtains the pantograph region that best matches the training set by calculating the intersection ratio between the coarse-granularity mask segmentation region of the pantograph and the pantograph regions marked in the training set, and uses this region to perform refined boundary repair on the pantograph extraction result. The intersection ratio is added as a brand-new level into the learning and inference of pantograph segmentation. The pantograph region images marked in the training set are the pantograph target database images. A schematic diagram of the refined edge repair of the pantograph is shown in fig. 7. The intersection ratio can be calculated as:
IoU = Area(S∩P) / Area(S∪P)
where S represents the true value of the pantograph marked in the training data (the outermost closed region), P represents the coarse-granularity segmentation result of the pantograph (the inner closed region), Area(S∩P) represents the intersection area of the coarse segmentation result and the true value (the region with left-oblique hatching), and Area(S∪P) represents their union area (the region with right-oblique hatching).
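The best-match selection that drives the edge repair can be sketched as follows (binary-mask IoU matching; the function name and the return convention are illustrative assumptions, not the patent's code):

```python
import numpy as np

def best_match_repair(coarse_mask, template_masks):
    """Pick the training-set template region whose overlap with the
    coarse mask maximises IoU = Area(S∩P)/Area(S∪P); the repair step
    would then borrow that template's boundary."""
    def iou(a, b):
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        return inter / union if union else 0.0
    scores = [iou(t, coarse_mask) for t in template_masks]
    best = int(np.argmax(scores))
    return best, scores[best]   # index of best template and its IoU
```

In the patent the templates are the marked pantograph regions of the target database, so the returned boundary is a physically plausible slide-plate contour.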
Preferably, the step S3 specifically includes:
The 6000 training sample pictures of 194 × 36 resolution in the data set are loaded into the network, the network parameters are initialized from a pre-training model on MS-COCO, and the backbone network used for feature extraction is pre-trained on the targets in the data set at the same time;
for the anchor frames generated by the RPN, a non-maximum suppression algorithm obtains the top-100 scoring candidate window regions, and all candidate window regions are normalized to a specific size with RoIAlign for spatial size alignment;
the targets of the candidate regions are classified and regressed through a series of convolution layers and fully connected layers, and pixel-level coarse-granularity mask segmentation is performed on them;
the MaskIoU Head is trained with the Proposals whose IoU in the RPN exceeds 0.5 as training samples to obtain the real mask score.
The intersection ratio between the predicted coarse-granularity mask region and the marked mask regions in the training set is calculated to obtain the pantograph region best matching the training set, and this region is used to perform refined repair of the pantograph boundary on the pantograph extraction result.
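The non-maximum suppression step above can be sketched in plain numpy (an illustrative greedy NMS, not the library implementation used in the patent; a box layout of x1, y1, x2, y2 and an IoU threshold of 0.5 are assumed):

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5, top_k=100):
    """Greedy non-maximum suppression keeping at most top_k of the
    highest-scoring candidate windows."""
    order = np.argsort(scores)[::-1]          # indices, best score first
    keep = []
    while order.size and len(keep) < top_k:
        i = order[0]
        keep.append(int(i))
        # intersection of the kept box with every remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = ((boxes[order[1:], 2] - boxes[order[1:], 0])
                  * (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou <= iou_thr]     # drop heavy overlaps
    return keep
```

With `top_k=100` this matches the "top-100 scoring candidate window regions" retained before RoIAlign.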
The whole network is trained with three losses, namely the classification loss, the detection-frame regression loss and the mask loss, and the total loss can be calculated as:
L = L_Class + L_Box + L_Mask
The classification loss function L_Class selects the logarithmic loss over the two classes, target and non-target, and adopts the cross-entropy form:
L_Class = (1/T_Class) Σ_i L_cls(k_i, k_i*)
where L_cls(k_i, k_i*) is the logarithmic loss over the target and non-target classes, k_i represents the probability that anchor frame i is a target, k_i* = 0 indicates that the anchor frame carries a negative label, k_i* = 1 indicates a positive label, and T_Class is a normalization parameter.
The bounding-box regression loss function is
L_Box = (1/T_Box) Σ_i k_i* · smooth_L1(n_i - n_i*)
where n_i represents the predicted offset of the anchor frame, n_i* represents the true-label offset, and T_Box is a normalization parameter. The smooth_L1 function is used as the bounding-box regression loss; it reduces the error growth rate and the penalty caused by large errors, and can be expressed as the piecewise function
smooth_L1(x) = 0.5 σ² x², if |x| < 1/σ²; |x| - 0.5/σ², otherwise
where σ is used to control the smoothness of the transition region, and the piecewise form solves the problem of non-differentiability at zero and reduces the error.
The mask loss function L_Mask regresses, with an L2 loss, the intersection-over-union score between the mask obtained by the network segmentation and the truth mask.
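The piecewise smooth-L1 term can be written directly (a numpy sketch; `sigma` plays the role of σ above):

```python
import numpy as np

def smooth_l1(x, sigma=1.0):
    """Smooth-L1 regression loss: quadratic near zero, linear for large
    errors, so the error growth rate and the penalty for outliers are
    both reduced. The handover point is at |x| = 1/sigma^2."""
    beta = 1.0 / sigma ** 2
    x = np.abs(x)
    return np.where(x < beta,
                    0.5 * sigma ** 2 * x ** 2,   # quadratic zone
                    x - 0.5 * beta)              # linear zone
```

At the handover point both branches give the same value, so the function is continuous and differentiable everywhere, including at zero.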
All experiments of the invention are carried out under the Ubuntu 19.04 operating system; the laboratory server is mainly configured with an Intel(R) Core(TM) i7-8700K CPU @ 3.70 GHz, a GeForce RTX 2080 GPU and 8 GB of running memory. On this basis the PyTorch deep learning framework is built, and the Python language is used to implement the training and testing of the network. The number of training classes is set to 2 (pantograph and background), the batch size to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
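For illustration, the reported hyper-parameters could be collected as follows before being handed to an optimizer such as `torch.optim.SGD` (the dictionary keys are illustrative names, not the patent's code):

```python
# Training hyper-parameters as reported in the experiments section.
train_cfg = {
    "num_classes": 2,        # pantograph + background
    "batch_size": 100,
    "max_iters": 20000,
    "momentum": 0.9,         # SGD momentum factor
    "weight_decay": 0.001,
    "base_lr": 0.001,        # initial learning rate
}
```

In PyTorch these would typically map onto `SGD(params, lr=base_lr, momentum=momentum, weight_decay=weight_decay)` plus the loop settings.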
Preferably, the step S4 specifically includes:
The segmentation results of the pantograph are evaluated with the precision, the recall and the mean average precision:
P_re = T_P / (T_P + F_P), R_ec = T_P / (T_P + F_N)
where T_P is the number of correct positive examples (true positives), F_P is the number of false positives, F_N is the number of false negatives, P_re is the precision, R_ec is the recall, and mAP is the mean average precision, i.e. the area under the precision-recall curve. In the test stage, the detection is considered successful when the overlapping area of the prediction frame and the calibration frame exceeds 90% of the marked peripheral frame. Fig. 8 shows pantograph extraction results in part of the scenes of the 4000 test samples.
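The precision and recall above can be computed as follows (a minimal sketch; the inputs are the true-positive, false-positive and false-negative counts):

```python
def detection_metrics(tp, fp, fn):
    """Precision and recall from detection counts, as used to score the
    pantograph test set: precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall
```

mAP would then be obtained by sweeping the detection threshold and integrating precision over recall.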
The embodiment of the invention constructs an infrared pantograph data set for complex scenes. The data set contains 8 complex scenes such as barrier-free track, bridges, overhead lines and viaducts, 9 complex weather conditions such as overcast and rainy weather, heavy fog, direct sunlight and heavy snow, and 10000 infrared images in total, providing data support for pantograph detection research and promoting the development of the railway industry in China.
The embodiment of the invention further provides a pantograph detection system based on the Mask Scoring R-CNN network, which comprises the following modules:
the data preprocessing module is used for collecting pantograph-catenary infrared data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing the pantograph target database;
the pantograph detection network construction module is used for constructing the pantograph detection network, extracting the multi-scale pantograph feature map with the backbone network, obtaining the classification information, position coordinates and coarse-granularity segmentation result of the pantograph through the prediction head and the mask head, and providing the edge repair technique to perform refined repair on the coarse-granularity segmentation result;
the network training module is used for loading data on a network, repeatedly and iteratively training, and adjusting parameters to obtain a high-quality pantograph detection model;
and the network evaluation module is used for loading the high-quality model, carrying out pantograph extraction test and evaluating a pantograph detection and segmentation result.
The embodiment of the invention also provides electronic equipment comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the steps of the above pantograph detection method based on the Mask Scoring R-CNN network are realized when the processor executes the computer program.
The embodiment of the invention can automatically monitor the state of the pantograph in real time simply by means of the online photographing system arranged in front of the pantograph. It reaches an AP value of 93.26% with high segmentation accuracy, detects a picture in 0.302 s on average on the GPU, needs no other expensive equipment as an aid, and can greatly save detection cost.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.
Claims (7)
1. A pantograph detection method based on the Mask Scoring R-CNN network, characterized by comprising the following steps:
S1, collecting pantograph-catenary infrared data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
S2, constructing a pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-granularity segmentation result of the pantograph through a prediction head and a mask head, and providing an edge repair method to perform refined repair on the coarse-granularity segmentation result;
S3, loading the data of the training sample set into the pantograph detection network, training it through repeated iterations, and adjusting the parameters to obtain a high-quality pantograph detection model;
S4, loading the high-quality pantograph detection model, inputting the data of the test sample set into the model, and evaluating the pantograph detection and segmentation results;
the step S2 of establishing a pantograph detection network specifically includes:
S21, the Backbone network uses the residual network ResNet101 and the feature pyramid network FPN to extract the multi-scale pantograph feature map;
S22, the region proposal network RPN proposes the candidate window regions, Proposals; the detection head RCNN Head performs the region-of-interest alignment RoIAlign operation on each candidate region generated by the RPN;
S23, the feature map after RoIAlign passes through a series of convolution layers and fully connected layers to classify the pantograph targets of the candidate regions and perform the bounding-box regression operation;
the Mask Head is used for obtaining instance pixel-level features, realizes the up-sampling of the feature map and the change of the channel dimension through a 6-layer convolution operation, and generates the corresponding prediction mask;
the mask intersection-over-union head MaskIoU Head takes the output of the prediction mask and the features after RoIAlign as input, realizes the down-sampling of the feature map through 4 convolution layers followed by 3 fully connected layers, and its last layer outputs the MaskIoU scores of the C categories, which are multiplied by the category scores in the detection head RCNN Head to obtain more accurate mask scores;
S24, the mask edge repair calculates the intersection ratio between the coarse-granularity mask obtained by the Mask Head and the truth mask, matches the best pantograph boundary in the training set, and obtains the refined repair result of the pantograph boundary;
the step S21 specifically includes:
ResNet101 and FPN are adopted as the backbone network to perform multi-scale extraction of the pantograph features; ResNet101 is a bottom-up structure that outputs four layers of feature maps through the residual network, defined as C2, C3, C4 and C5; FPN is a top-down structure, laterally connected with the outputs of the ResNet101 layers to obtain the four fused feature maps P2, P3, P4 and P5;
the step S22 specifically includes:
each layer of feature map output by the backbone network generates two branches through one 3×3 convolution: one branch distinguishes the positive and negative preset anchor frames through a Reshape-Softmax-Reshape operation, and the other branch obtains the offsets of the anchor frames through a 1×1 convolution, thereby obtaining the candidate regions;
for the Proposals generated by the RPN, RoIAlign is adopted to align the candidate regions with the feature map, mainly using a bilinear interpolation algorithm to expand the feature map and a maximum pooling operation to adjust the Proposals to a uniform size;
the RCNN Head maps the feature map after RoIAlign to 1024 dimensions through fully connected layers, classifies the features and performs the regression operation to output the pantograph category and coordinate information;
the step S23 specifically includes:
the Mask Head uses the features after RoIAlign to realize the up-sampling of the feature map and the change of the channel dimension through 6 convolution operations, and generates the corresponding prediction mask; meanwhile, the output of the prediction mask and the original features after RoIAlign are taken as the input of the MaskIoU Head, the down-sampling of the feature map is realized through 4 convolution layers followed by 3 fully connected layers, and the last layer outputs the MaskIoU scores of the C categories; the predicted MaskIoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, which represents the accurate segmentation score of the mask; the calculation formula is:
S_MS = S_MaskIoU · S_Class
the step S24 specifically includes:
the refined edge repair method calculates the intersection ratio between the coarse-granularity mask segmentation region of the pantograph and the pantograph regions marked in the training set, adds it as a brand-new level into the learning and inference of pantograph segmentation, obtains the pantograph region best matching the training set, and uses this region to perform refined boundary repair on the pantograph extraction result; the pantograph region images marked in the training set are the pantograph target database images, and the pantograph edge-repair intersection-over-union can be calculated as:
IoU = Area(S∩P) / Area(S∪P)
wherein S represents the true value of the pantograph marked in the training data, P represents the coarse-granularity segmentation result of the pantograph, Area(S∩P) represents the intersection area of the coarse-granularity segmentation result and the true value, and Area(S∪P) represents their union area;
the step S3 specifically includes:
the training sample set database obtained in the step S1 is input into the pantograph detection network established in the step S2 for training to obtain the high-quality pantograph detection model;
the 6000 training sample pictures of 194 × 36 resolution in the data set are loaded into the network, the network parameters are initialized from a pre-training model on the MS-COCO data set, and the backbone network used for feature extraction is pre-trained on the targets in the data set at the same time;
a non-maximum suppression algorithm obtains the top-100 scoring candidate window regions from the anchor frames generated by the RPN, and all candidate window regions are normalized to a specific size with the RoIAlign module for spatial size alignment;
the targets of the candidate regions are classified and regressed through a series of convolution layers and fully connected layers, and pixel-level coarse-granularity mask segmentation is performed on them;
the MaskIoU Head is trained with the Proposals whose IoU in the RPN exceeds 0.5 as training samples to obtain the real mask score;
the intersection ratio between the coarse-granularity mask segmentation region and the marked mask regions in the training set is calculated to obtain the best-matching pantograph region, and this region is used to repair the pantograph segmentation boundary;
the whole process is trained with three losses, namely the classification loss, the detection-frame regression loss and the mask loss, calculated as:
L = L_Class + L_Box + L_Mask
wherein the classification loss function L_Class selects the logarithmic loss over the two categories, target and non-target, and adopts the cross-entropy form:
L_Class = (1/T_Class) Σ_i L_cls(k_i, k_i*)
where L_cls(k_i, k_i*) is the logarithmic loss over the target and non-target classes, k_i represents the probability that anchor frame i is a target, k_i* = 0 indicates that the anchor frame carries a negative label, k_i* = 1 indicates a positive label, and T_Class is a normalization parameter;
the bounding-box regression loss function is
L_Box = (1/T_Box) Σ_i k_i* · smooth_L1(n_i - n_i*)
where n_i represents the predicted offset of the anchor frame, n_i* represents the true-label offset, and T_Box is a normalization parameter; the smooth_L1 function is used as the bounding-box regression loss, which reduces the error growth rate and the penalty caused by large errors, and is expressed as the piecewise function
smooth_L1(x) = 0.5 σ² x², if |x| < 1/σ²; |x| - 0.5/σ², otherwise
wherein σ is used to control the smoothness of the transition region;
the mask loss function L_Mask regresses, with an L2 loss, the intersection-over-union score between the mask obtained by the deep neural network segmentation and the truth mask;
the training class-number parameter is set to 2, mainly comprising the pantograph target and the background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight attenuation coefficient to 0.001, and the initial learning rate to 0.001.
2. The pantograph detection method based on the Mask Scoring R-CNN network according to claim 1, wherein the step S1 specifically includes:
S11, acquiring pantograph-catenary infrared data through the infrared camera, decoding the data and classifying the scenes;
S12, preprocessing the data, distributing the training sample set and the test sample set, and constructing the target database.
3. The pantograph detection method based on the Mask Scoring R-CNN network according to claim 2, wherein the step S11 specifically includes:
shooting the pantograph-catenary infrared video material through the infrared camera arranged in the vehicle-mounted catenary running-state detection device in front of the pantograph, and decoding the video to obtain 10000 frames of images; the scenes of the collected pantograph-catenary infrared images comprise mountains, bridges, tunnels and iron bridges, and the weather conditions comprise overcast and rainy weather, heavy fog, direct sunlight and heavy snow, satisfying the diversity of the pantograph detection data; all pictures are stored in the pantograph-catenary infrared image folder of a specified path.
4. The pantograph detection method based on the Mask Scoring R-CNN network according to claim 3, wherein the step S12 specifically includes:
counting the abscissas and ordinates of the four coordinate points of the pantograph calibration frames in the 10000 pantograph-catenary infrared images, and taking the unions of the abscissa values and of the ordinate values respectively, the maximum region formed being the range of the pantograph; in order to allow for the pantograph rising and falling during travel and to improve robustness, the upper and lower boundaries of the calibration frame corresponding to the range of the pantograph are enlarged by pixel values, sub-images of fixed size are cut from the original images, and the pantograph template database is constructed; the 10000 images in the pantograph template database are marked with the deep-learning open-source marking tool LabelMe, 6000 images are distributed as the training set and the remaining 4000 images as the test set, with no repeated images between the training data and the test data, whereby the pantograph target database is successfully constructed.
5. The pantograph detection method based on the Mask Scoring R-CNN network according to claim 1, wherein the step S4 specifically includes:
the segmentation results of the pantograph are evaluated with the precision, the recall and the mean average precision:
P_re = T_P / (T_P + F_P), R_ec = T_P / (T_P + F_N)
wherein T_P is the number of correct positive examples (true positives), F_P is the number of false positives, F_N is the number of false negatives, P_re is the precision, R_ec is the recall, and mAP is the mean average precision; in the test stage, the detection is considered successful when the overlapping area of the prediction frame and the calibration frame exceeds 90% of the marked peripheral frame.
6. A pantograph detection system based on the Mask Scoring R-CNN network, characterized in that the system comprises the following modules:
the data preprocessing module is used for collecting infrared pantograph data of the pantograph net, preprocessing the data, dividing the data into a training sample set and a test sample set, and constructing a pantograph target database;
the pantograph detection network construction module is used for constructing the pantograph detection network, extracting the multi-scale pantograph feature map with the backbone network, obtaining the classification information, position coordinates and coarse-granularity segmentation result of the pantograph through the prediction head and the mask head, and providing the edge repair method to perform refined repair on the coarse-granularity segmentation result;
the network training module is used for loading the data of the training sample set into the pantograph detection network, repeatedly and iteratively training, and adjusting parameters to obtain a high-quality pantograph detection model;
The network evaluation module is used for loading a high-quality pantograph detection model, inputting data of a test sample set into the model and evaluating a pantograph segmentation result;
the implementation method of the pantograph detection network construction module specifically comprises the following steps:
S21, the Backbone network uses the residual network ResNet101 and the feature pyramid network FPN to extract the multi-scale pantograph feature map;
S22, the region proposal network RPN proposes the candidate window regions, Proposals; the detection head RCNN Head performs the region-of-interest alignment RoIAlign operation on each candidate region generated by the RPN;
S23, the feature map after RoIAlign passes through a series of convolution layers and fully connected layers to classify the pantograph targets of the candidate regions and perform the bounding-box regression operation;
the Mask Head is used for obtaining instance pixel-level features, realizes the up-sampling of the feature map and the change of the channel dimension through a 6-layer convolution operation, and generates the corresponding prediction mask;
the mask intersection-over-union head MaskIoU Head takes the output of the prediction mask and the features after RoIAlign as input, realizes the down-sampling of the feature map through 4 convolution layers followed by 3 fully connected layers, and its last layer outputs the MaskIoU scores of the C categories, which are multiplied by the category scores in the detection head RCNN Head to obtain more accurate mask scores;
S24, the mask edge repair calculates the intersection ratio between the coarse-granularity mask obtained by the Mask Head and the truth mask, matches the best pantograph boundary in the training set, and obtains the refined repair result of the pantograph boundary;
the step S21 specifically includes:
ResNet101 and FPN are adopted as the backbone network to perform multi-scale extraction of the pantograph features; ResNet101 is a bottom-up structure that outputs four layers of feature maps through the residual network, defined as C2, C3, C4 and C5; FPN is a top-down structure, laterally connected with the outputs of the ResNet101 layers to obtain the four fused feature maps P2, P3, P4 and P5;
the step S22 specifically includes:
each layer of feature map output by the backbone network generates two branches through one 3×3 convolution: one branch distinguishes the positive and negative preset anchor frames through a Reshape-Softmax-Reshape operation, and the other branch obtains the offsets of the anchor frames through a 1×1 convolution, thereby obtaining the candidate regions;
for the Proposals generated by the RPN, RoIAlign is adopted to align the candidate regions with the feature map, mainly using a bilinear interpolation algorithm to expand the feature map and a maximum pooling operation to adjust the Proposals to a uniform size;
the RCNN Head maps the feature map after RoIAlign to 1024 dimensions through fully connected layers, classifies the features and performs the regression operation to output the pantograph category and coordinate information;
the step S23 specifically includes:
mask Head uses the RoIAlign's features to implement up-sampling of feature map and change of channel dimension through 6 convolution operationsAnd generating a corresponding prediction mask; meanwhile, the output of the prediction mask and the characteristics after the original RoIAlign are used as mask IoU Head input, the downsampling of the characteristic map is realized through 4 convolution layers, 3 full-connection layers are connected, and the last layer outputs mask IoU scores of C categories; mask IoU score S to be predicted MaskIoU And a classification confidence score S of RCNN Head Class Multiplying to obtain the final mask score S MS A segmentation score used to represent the mask accuracy; the calculation formula is as follows:
S_MS = S_MaskIoU · S_Class
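Assuming per-class score matrices, the multiplication above can be sketched in NumPy (the function name and the top-class selection are mine, added for illustration):

```python
import numpy as np

def final_mask_scores(maskiou_scores, class_scores):
    """Combine per-class MaskIoU scores with classification confidences.

    maskiou_scores, class_scores: (N, C) arrays for N detections, C classes.
    Returns an (N,) array of final mask scores S_MS = S_MaskIoU * S_Class,
    taken at each detection's highest-confidence class.
    """
    cls = class_scores.argmax(axis=1)        # predicted class per detection
    idx = np.arange(class_scores.shape[0])
    return maskiou_scores[idx, cls] * class_scores[idx, cls]
```

The product penalizes detections whose classification is confident but whose mask is predicted to overlap the object poorly, which is the point of Mask Scoring.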
the step S24 specifically includes:
in the refined edge restoration method, the intersection-over-union between the coarse-granularity pantograph mask segmentation region and the pantograph region annotated in the training set is calculated and introduced as a new level into the learning and inference process of pantograph segmentation, yielding the pantograph region that best matches the training set; refined boundary repair of the pantograph extraction result is then performed with that region; the pantograph region images annotated in the training set form the pantograph target database, and the pantograph edge-repair intersection-over-union is calculated as:

IoU = area(S ∩ P) / area(S ∪ P)

wherein S represents the pantograph ground truth annotated in the training data, P represents the coarse-granularity pantograph segmentation result, area(S ∩ P) represents the intersection area of the coarse-granularity segmentation result and the ground truth, and area(S ∪ P) represents their union area;
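A minimal NumPy sketch of this intersection-over-union computation on binary masks (the function name is illustrative):

```python
import numpy as np

def mask_iou(s, p):
    """IoU between a ground-truth mask S and a coarse segmentation mask P.

    s, p: boolean (H, W) arrays; returns area(S ∩ P) / area(S ∪ P).
    """
    inter = np.logical_and(s, p).sum()
    union = np.logical_or(s, p).sum()
    return inter / union if union > 0 else 0.0
```

Running this against every annotated pantograph region and keeping the maximum gives the "best matching" region the patent uses for boundary repair.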
the implementation method of the network training module specifically comprises the following steps:
inputting the training sample set database into the established pantograph detection network for training to obtain a high-quality pantograph detection model;
loading the 6000 training sample pictures with 194×36 resolution in the data set into the network, initializing the network parameters from a model pre-trained on the MS-COCO data set, and simultaneously pre-training the backbone network used to extract features of the targets in the data set;
applying a non-maximum suppression algorithm to the anchor boxes generated by the RPN to obtain the 100 top-scoring candidate window regions, and normalizing all candidate window regions to a specific size with the RoIAlign module for spatial-size alignment;
classifying and regressing the targets of the candidate regions with a series of convolution layers and fully connected layers, and performing pixel-level coarse-granularity mask segmentation on the targets of the candidate regions;

training the MaskIoU Head using the RPN Proposals with IoU greater than 0.5 as training samples to obtain the real mask scores;

calculating the intersection ratio of the coarse-granularity mask segmentation region and the annotated mask region in the training set, obtaining the best-matching pantograph region, and repairing the pantograph segmentation boundary with that region;
the whole process is trained with three losses, the classification loss, the detection-box regression loss and the mask loss, calculated in total as:

L = L_Class + L_Box + L_Mask
wherein the classification loss function L_Class selects the logarithmic loss over the two categories, target and non-target, and adopts the cross-entropy loss:

L_Class = (1/T_Class) · Σ_i L_cls(k_i, k_i*)

where L_cls(k_i, k_i*) is the logarithmic loss over the target and non-target classes, k_i represents the probability that anchor box i is a target, k_i* = 0 marks a negative-label anchor box, k_i* = 1 marks a positive-label anchor box, and T_Class is a normalization parameter;
the bounding-box regression loss function is L_Box = Σ_i k_i* · smoothL1(n_i − n_i*), wherein n_i represents the predicted anchor-box offset and n_i* represents the ground-truth label offset; the smoothL1 function serves as the bounding-box regression loss; its advantages are a lower error growth rate and a smaller penalty for large errors, and it is expressed as the piecewise function:

smoothL1(x) = 0.5·(σx)², if |x| < 1/σ²; smoothL1(x) = |x| − 0.5/σ², otherwise

wherein σ is used to control the smoothness of the transition region;
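The piecewise smooth-L1 function can be sketched in NumPy as follows (the σ parameterization follows the common Fast/Faster R-CNN form; the function name is illustrative):

```python
import numpy as np

def smooth_l1(x, sigma=1.0):
    """Smooth-L1 loss: quadratic near zero, linear for large errors.

    Returns 0.5*(sigma*x)^2 where |x| < 1/sigma^2, else |x| - 0.5/sigma^2.
    """
    x = np.abs(np.asarray(x, dtype=float))
    s2 = sigma ** 2
    return np.where(x < 1.0 / s2, 0.5 * s2 * x * x, x - 0.5 / s2)
```

Near zero the quadratic branch gives small gradients for small errors; beyond the cutoff the linear branch caps the penalty growth, which is the outlier robustness the patent cites.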
the mask loss function L_Mask regresses, through the deep segmentation neural network, the intersection-over-union score between the predicted mask and the ground-truth mask, and uses the L2 loss to regress that IoU score;
the number of training classes is set to 2, mainly comprising the pantograph target and the background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
7. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the pantograph detection method based on the Mask scanning R-CNN network according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110890405.7A CN113763326B (en) | 2021-08-04 | 2021-08-04 | Pantograph detection method based on Mask scanning R-CNN network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110890405.7A CN113763326B (en) | 2021-08-04 | 2021-08-04 | Pantograph detection method based on Mask scanning R-CNN network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113763326A CN113763326A (en) | 2021-12-07 |
CN113763326B true CN113763326B (en) | 2023-11-21 |
Family
ID=78788491
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110890405.7A Active CN113763326B (en) | 2021-08-04 | 2021-08-04 | Pantograph detection method based on Mask scanning R-CNN network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113763326B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117690096B (en) * | 2024-02-04 | 2024-04-12 | 成都中轨轨道设备有限公司 | Contact net safety inspection system adapting to different scenes |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109658387A (en) * | 2018-11-27 | 2019-04-19 | 北京交通大学 | The detection method of the pantograph carbon slide defect of power train |
WO2019192397A1 (en) * | 2018-04-04 | 2019-10-10 | 华中科技大学 | End-to-end recognition method for scene text in any shape |
CN111640125A (en) * | 2020-05-29 | 2020-09-08 | 广西大学 | Mask R-CNN-based aerial photograph building detection and segmentation method and device |
CN112132789A (en) * | 2020-08-30 | 2020-12-25 | 南京理工大学 | Pantograph online detection device and method based on cascade neural network |
WO2021056705A1 (en) * | 2019-09-23 | 2021-04-01 | 平安科技(深圳)有限公司 | Method for detecting damage to outside of human body on basis of semantic segmentation network, and related device |
CN112766195A (en) * | 2021-01-26 | 2021-05-07 | 西南交通大学 | Electrified railway bow net arcing visual detection method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10679351B2 (en) * | 2017-08-18 | 2020-06-09 | Samsung Electronics Co., Ltd. | System and method for semantic segmentation of images |
- 2021
  - 2021-08-04 CN CN202110890405.7A patent/CN113763326B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019192397A1 (en) * | 2018-04-04 | 2019-10-10 | 华中科技大学 | End-to-end recognition method for scene text in any shape |
CN109658387A (en) * | 2018-11-27 | 2019-04-19 | 北京交通大学 | The detection method of the pantograph carbon slide defect of power train |
WO2021056705A1 (en) * | 2019-09-23 | 2021-04-01 | 平安科技(深圳)有限公司 | Method for detecting damage to outside of human body on basis of semantic segmentation network, and related device |
CN111640125A (en) * | 2020-05-29 | 2020-09-08 | 广西大学 | Mask R-CNN-based aerial photograph building detection and segmentation method and device |
CN112132789A (en) * | 2020-08-30 | 2020-12-25 | 南京理工大学 | Pantograph online detection device and method based on cascade neural network |
CN112766195A (en) * | 2021-01-26 | 2021-05-07 | 西南交通大学 | Electrified railway bow net arcing visual detection method |
Non-Patent Citations (1)
Title |
---|
Bounding box repairing algorithm for underwater object detection based on IoU optimization; Chen B et al.; IEEE; pp. 369-373 * |
Also Published As
Publication number | Publication date |
---|---|
CN113763326A (en) | 2021-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109166094B (en) | Insulator fault positioning and identifying method based on deep learning | |
CN107563372B (en) | License plate positioning method based on deep learning SSD frame | |
Zhang et al. | RCNN-based foreign object detection for securing power transmission lines (RCNN4SPTL) | |
CN106919902B (en) | Vehicle identification and track tracking method based on CNN | |
CN103544483B (en) | A kind of joint objective method for tracing based on local rarefaction representation and system thereof | |
CN105404857A (en) | Infrared-based night intelligent vehicle front pedestrian detection method | |
CN105260749B (en) | Real-time target detection method based on direction gradient binary pattern and soft cascade SVM | |
CN105044122A (en) | Copper part surface defect visual inspection system and inspection method based on semi-supervised learning model | |
CN103745224A (en) | Image-based railway contact net bird-nest abnormal condition detection method | |
CN104881661B (en) | Vehicle checking method based on structural similarity | |
CN113409252B (en) | Obstacle detection method for overhead transmission line inspection robot | |
CN104992429A (en) | Mountain crack detection method based on image local reinforcement | |
CN113947731A (en) | Foreign matter identification method and system based on contact net safety inspection | |
Dong et al. | Intelligent segmentation and measurement model for asphalt road cracks based on modified mask R-CNN algorithm | |
CN113763326B (en) | Pantograph detection method based on Mask scanning R-CNN network | |
CN113780200A (en) | Computer vision-based pavement multi-disease area detection and positioning method | |
CN115619719A (en) | Pine wood nematode infected wood detection method based on improved Yolo v3 network model | |
CN106611147A (en) | Vehicle tracking method and device | |
CN113312987B (en) | Recognition method based on unmanned aerial vehicle road surface crack image | |
CN103996207A (en) | Object tracking method | |
CN111881914B (en) | License plate character segmentation method and system based on self-learning threshold | |
CN113762247A (en) | Road crack automatic detection method based on significant instance segmentation algorithm | |
CN114742975B (en) | Vehicle-mounted image rail curve modeling method | |
CN109934172B (en) | GPS-free full-operation line fault visual detection and positioning method for high-speed train pantograph | |
CN116758421A (en) | Remote sensing image directed target detection method based on weak supervised learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||