CN113763326B - Pantograph detection method based on Mask Scoring R-CNN network - Google Patents

Pantograph detection method based on Mask Scoring R-CNN network

Info

Publication number
CN113763326B
Authority
CN
China
Prior art keywords
pantograph
mask
network
training
detection
Legal status: Active
Application number
CN202110890405.7A
Other languages
Chinese (zh)
Other versions
CN113763326A (en)
Inventor
洪汉玉
陈冰川
马雷
罗心怡
Current Assignee
Wuhan Institute of Technology
Original Assignee
Wuhan Institute of Technology
Application filed by Wuhan Institute of Technology
Priority to CN202110890405.7A
Publication of CN113763326A
Application granted
Publication of CN113763326B

Classifications

    • G06T 7/0004: Image analysis; industrial image inspection
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • G06T 5/77: Image enhancement or restoration; retouching, inpainting, scratch removal
    • G06T 7/11: Region-based segmentation
    • G06T 7/62: Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06T 2207/10004: Image acquisition modality; still image, photographic image
    • G06T 2207/20192: Edge enhancement; edge preservation
    • G06T 2207/30164: Industrial image inspection; workpiece, machine component
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a pantograph detection method based on a Mask Scoring R-CNN network, comprising the following steps: S1, collecting infrared pantograph-catenary image data, preprocessing the data, and dividing it into a training sample set and a test sample set; S2, constructing a pantograph detection network that extracts a multi-scale pantograph feature map with a backbone network, obtains the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applies a proposed edge repair method to refine the coarse-grained segmentation result; S3, loading the training sample set into the pantograph detection network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model; S4, loading the high-quality pantograph detection model, feeding the test sample set into the model, and evaluating the pantograph detection and segmentation results. The method has high accuracy and strong robustness, requires no additional expensive auxiliary equipment, and can greatly reduce detection cost.

Description

Pantograph detection method based on Mask Scoring R-CNN network
Technical Field
The invention relates to the technical field of computer digital image processing and pattern recognition, in particular to a pantograph detection method based on a Mask Scoring R-CNN network.
Background
With the rapid development of electrified railways in China, represented by high-speed rail, higher requirements have been placed on the safety of traction power supply systems. The pantograph slide plate is the only component of the electric locomotive in contact with the catenary and is the most important current-collecting equipment in the locomotive's entire power supply system; its condition directly affects whether the locomotive can run safely and stably. During operation, however, the slide plate wears continuously through contact with the catenary. If the wear is severe, the pantograph can strike hard points on the catenary power supply line, causing it to shake, deform or even fall off and trigger locomotive faults; in light cases trains are delayed, and in severe cases major railway accidents occur, causing casualties and property losses. Timely and accurate pantograph detection and recognition is therefore particularly important for ensuring safe pantograph operation and avoiding safety accidents. Research on pantograph detection remains relatively scarce, however, partly because little relevant data exists and partly because no high-quality algorithms provide technical support. Pantograph detection thus remains a technical challenge.
At present there are three main approaches to pantograph detection: ground-based online detection, manual roof-climbing inspection and on-board equipment detection, each with certain limitations. Ground-based online detection can only measure the thickness of the pantograph slide plate, so its function is single and its application range limited. Manual roof-climbing inspection can only be performed when a train enters the depot and the catenary is powered off, which consumes manpower and resources and is inefficient. On-board detection requires equipping every locomotive, which is costly and unsuitable for large-scale deployment. In recent years, with improvements in monitoring-equipment imaging and advances in related algorithms, deep learning has achieved excellent results in the field of target detection. Deep-learning-based target detection algorithms fall into two main categories: one-stage and two-stage detection. One-stage detectors, such as YOLO and SSD, are relatively fast but less accurate; two-stage detectors, such as R-CNN and Fast R-CNN, are slower but more accurate. Although these methods handle simple, single objects well, their detection performance is not outstanding for objects in complex contexts such as a pantograph.
Disclosure of Invention
To address the defects of the prior art, the invention provides a pantograph detection method based on a Mask Scoring R-CNN network.
The technical scheme adopted for solving the technical problems is as follows:
the invention provides a pantograph detection method based on a Mask Scoring R-CNN network, comprising the following steps:
S1, collecting infrared pantograph-catenary image data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
S2, constructing a pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applying a proposed edge repair method to refine the coarse-grained segmentation result;
S3, loading the training sample set into the pantograph detection network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model;
S4, loading the high-quality pantograph detection model, feeding the test sample set into the model, and evaluating the pantograph detection and segmentation results.
Further, the step S1 of the present invention specifically includes:
S11, acquiring infrared pantograph-catenary image data through an infrared camera, then decoding the data and classifying it by scene;
S12, preprocessing the data, allocating a training sample set and a test sample set, and constructing the target database.
Further, the step S11 of the present invention specifically includes:
pantograph-catenary infrared video is shot by an infrared camera installed in the vehicle-mounted catenary running-state detection device in front of the pantograph, and the video is decoded into 10000 image frames; the collected pantograph-catenary infrared images cover scenes including mountains, bridges, tunnels and iron bridges, and weather conditions including overcast and rainy weather, heavy fog, direct sunlight and heavy snow, satisfying the diversity required of pantograph detection data; all pictures are stored in a pantograph-catenary infrared image folder under a specified path.
Further, the step S12 of the present invention specifically includes:
the abscissas and ordinates of the four corner points of the pantograph calibration box are counted over the 10000 pantograph-catenary infrared images, and the union of the abscissa values and of the ordinate values is taken; the maximum area thus formed is the range of the pantograph; to allow for the pantograph rising and falling while the train is running and to improve robustness, the upper and lower boundaries of the calibration box corresponding to this range are expanded by a certain number of pixels, and fixed-size sub-images are cropped from each original image to build a pantograph template database; the 10000 images in the template database are annotated with the open-source deep learning annotation tool LabelMe, 6000 images are allocated as the training set and the remaining 4000 as the test set, with no image repeated between training and test data, completing the construction of the pantograph target database.
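For illustration, the sketch below implements the cropping and splitting just described: it takes the union of calibration boxes, expands it vertically, cuts fixed-size templates, and produces a disjoint 6000/4000 split. The file layout, the 5-pixel margin and all helper names are assumptions made for this sketch, not values stated in the patent.

```python
# Illustrative sketch of step S12 (paths, margin and helper names are
# assumptions). Calibration boxes are (x1, y1, x2, y2) pixel tuples.
import random
from pathlib import Path

from PIL import Image

def pantograph_range(boxes, margin=5, img_h=240):
    """Union of all calibration boxes, expanded vertically by `margin` px."""
    x1 = min(b[0] for b in boxes)
    y1 = max(0, min(b[1] for b in boxes) - margin)
    x2 = max(b[2] for b in boxes)
    y2 = min(img_h, max(b[3] for b in boxes) + margin)
    return (x1, y1, x2, y2)

def build_template_db(img_dir, boxes_per_image, out_dir, size=(194, 36)):
    """Crop the shared pantograph region from every 320x240 frame."""
    region = pantograph_range([b for bs in boxes_per_image.values() for b in bs])
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for name in boxes_per_image:
        sub = Image.open(Path(img_dir) / name).crop(region).resize(size)
        sub.save(Path(out_dir) / name)

def train_test_split(names, n_train=6000, seed=0):
    """Disjoint 6000/4000 split with no repeated images."""
    names = list(names)
    random.Random(seed).shuffle(names)
    return names[:n_train], names[n_train:]
```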
Further, the step S2 of establishing a pantograph detection network specifically includes:
S21, the Backbone network extracts the multi-scale pantograph feature map using the residual network ResNet101 and the feature pyramid network FPN;
S22, the region proposal network RPN proposes candidate window regions (Proposals); the detection head (RCNN Head) performs a region-of-interest alignment (RoIAlign) operation on each candidate region generated by the RPN;
S23, a series of convolutional and fully connected layers classify the pantograph targets of the candidate regions and perform bounding-box regression on the RoIAlign feature map;
the Mask Head obtains instance-level pixel features: six convolutional layers implement the upsampling of the feature map and the change of channel dimension, generating the corresponding predicted mask;
the mask intersection-over-union head (MaskIoU Head) takes the predicted mask output and the RoIAlign-aligned features as input; four convolutional layers downsample the feature map, three fully connected layers follow, and the last layer outputs mask IoU scores for C categories, which are multiplied by the category scores from the RCNN Head to obtain more accurate mask scores;
S24, the mask edge repair head computes the intersection-over-union of the coarse-grained mask produced by the Mask Head with the ground-truth masks, matches the best pantograph boundary in the training set, and obtains a refined repair of the pantograph boundary.
Further, the step S21 of the present invention specifically includes:
ResNet101 and FPN are adopted as the backbone network to extract multi-scale pantograph features; ResNet101 is a bottom-up structure that outputs four levels of feature maps through the residual network, defined as C2, C3, C4 and C5; FPN is a top-down structure that laterally connects to the output of each ResNet101 level to obtain the four fused feature maps P2, P3, P4 and P5;
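A minimal sketch of such a backbone using torchvision's built-in ResNet+FPN helper is shown below. The helper exists in torchvision, but its keyword arguments vary between versions, so this is an assumption-laden illustration rather than the patent's actual implementation.

```python
# Sketch: ResNet101 + FPN backbone yielding multi-scale feature maps
# (keys '0'..'3' correspond to the fused P2..P5 levels, plus 'pool').
import torch
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

backbone = resnet_fpn_backbone("resnet101", pretrained=False)

x = torch.randn(1, 3, 224, 224)     # dummy input image tensor
features = backbone(x)              # OrderedDict of FPN feature maps
for name, fmap in features.items():
    print(name, tuple(fmap.shape))  # e.g. '0' -> (1, 256, 56, 56)
```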
the step S22 specifically includes:
each feature map output by the backbone network passes through one 3×3 convolution to generate two branches: one branch distinguishes positive and negative preset anchor boxes through a Reshape-Softmax-Reshape operation, and the other obtains the anchor-box offsets through a 1×1 convolution, yielding the candidate regions;
RoIAlign is applied to the Proposals generated by the RPN to align the candidate regions with the feature map, mainly using bilinear interpolation to expand the feature map, followed by a max-pooling operation that adjusts the Proposals to a uniform size;
the RCNN Head maps the RoIAlign feature map to 1024 dimensions through a fully connected layer, then classifies the features and performs regression to output the pantograph category and coordinate information;
the step S23 specifically includes:
The Mask Head takes the RoIAlign features and, through six convolution operations, implements the upsampling of the feature map and the change of channel dimension, generating the corresponding predicted mask; meanwhile, the predicted mask output and the original RoIAlign features serve as the MaskIoU Head input, where four convolutional layers downsample the feature map, three fully connected layers follow, and the last layer outputs mask IoU scores for C categories; the predicted mask IoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, a segmentation score representing the mask accuracy; the calculation formula is:
S_MS = S_MaskIoU · S_Class
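A compact sketch of a MaskIoU head matching the four-conv/three-FC description above follows. The channel counts and the 14×14 input resolution (RoI features concatenated with the predicted mask) are assumptions in the spirit of Mask Scoring R-CNN, not values stated in the patent.

```python
# Sketch: MaskIoU head (4 convs, last one strided, then 3 FC layers);
# in_channels = 256 RoI feature channels + 1 predicted-mask channel.
import torch
import torch.nn as nn

class MaskIoUHead(nn.Module):
    def __init__(self, in_channels=257, num_classes=2):
        super().__init__()
        layers, c = [], in_channels
        for i in range(4):
            stride = 2 if i == 3 else 1       # final conv downsamples 14 -> 7
            layers += [nn.Conv2d(c, 256, 3, stride=stride, padding=1),
                       nn.ReLU(inplace=True)]
            c = 256
        self.convs = nn.Sequential(*layers)
        self.fcs = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 7 * 7, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, num_classes),     # one mask IoU score per class
        )

    def forward(self, roi_feats, pred_mask):
        # roi_feats: (N, 256, 14, 14); pred_mask: (N, 1, 14, 14)
        return self.fcs(self.convs(torch.cat([roi_feats, pred_mask], dim=1)))

head = MaskIoUHead()
s_maskiou = head(torch.randn(2, 256, 14, 14), torch.randn(2, 1, 14, 14))
s_class = torch.rand(2, 2)                    # RCNN Head class confidences
s_ms = s_maskiou * s_class                    # S_MS = S_MaskIoU * S_Class
```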
the step S24 specifically includes:
the refined edge repair method computes the intersection-over-union between the coarse-grained pantograph mask segmentation region and the pantograph regions annotated in the training set to obtain the best-matching pantograph region; this intersection-over-union is added as a brand-new level to the learning and inference of pantograph segmentation, and the matched region is used to perform refined boundary repair on the pantograph extraction result; the annotated pantograph region images in the training set are the pantograph target database images, and the pantograph edge repair intersection-over-union is calculated as:
IoU(S, P) = Area(S ∩ P) / Area(S ∪ P)
where S denotes the pantograph ground truth annotated in the training data, P denotes the coarse-grained pantograph segmentation result, Area(S ∩ P) denotes the area of the intersection of the coarse-grained segmentation result and the ground truth, and Area(S ∪ P) denotes the area of their union.
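The sketch below gives one simplified reading of this matching step with NumPy boolean masks; the function names and the return of the whole best-matching mask (rather than only its boundary) are simplifying assumptions.

```python
# Sketch: match the coarse mask against annotated training masks by IoU
# and use the best match to repair the boundary (simplified).
import numpy as np

def mask_iou(s: np.ndarray, p: np.ndarray) -> float:
    """IoU(S, P) = Area(S intersect P) / Area(S union P), boolean masks."""
    union = np.logical_or(s, p).sum()
    if union == 0:
        return 0.0
    return np.logical_and(s, p).sum() / union

def repair_boundary(coarse: np.ndarray, train_masks) -> np.ndarray:
    """Return the training mask whose region best matches the coarse mask."""
    return max(train_masks, key=lambda s: mask_iou(s, coarse))
```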
Further, the step S3 of the present invention specifically includes:
the training sample set database obtained in step S1 is input into the pantograph detection network established in step S2 for training to obtain a high-quality pantograph detection model;
the 6000 training sample pictures of 194×36 resolution in the data set are loaded into the network, network parameters are initialized from a model pre-trained on the MS-COCO data set, and the backbone network used for extracting target features in the data set is pre-trained at the same time;
a non-maximum suppression algorithm is applied to the anchor boxes generated by the RPN to obtain the top-100 scoring candidate window regions, and the RoIAlign module normalizes all candidate window regions to a specific size for spatial alignment; a minimal sketch of this proposal filtering is given below;
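The sketch keeps the top-100 proposals with torchvision's NMS; the 0.7 IoU threshold and the random boxes are illustrative assumptions.

```python
# Sketch: non-maximum suppression, then keep the top-100 proposals.
import torch
from torchvision.ops import nms

boxes = torch.rand(1000, 4) * 100
boxes[:, 2:] += boxes[:, :2]           # make x2 > x1 and y2 > y1
scores = torch.rand(1000)

keep = nms(boxes, scores, iou_threshold=0.7)   # indices, sorted by score
proposals = boxes[keep[:100]]                  # top-100 candidate windows
```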
a series of convolutional and fully connected layers classify and regress the targets of the candidate regions and perform pixel-level coarse-grained mask segmentation on them;
the MaskIoU Head is trained using the RPN Proposals with IoU greater than 0.5 as training samples to obtain real mask scores;
the intersection-over-union between the coarse-grained mask segmentation region and the annotated mask regions in the training set is computed, the best-matching pantograph region is obtained, and this region is used to repair the pantograph segmentation boundary;
The whole process is trained with three losses, namely classification loss, detection-box regression loss and mask loss, computed as:
L = L_Class + L_Box + L_Mask
where the classification loss L_Class = (1/T_Class) Σ_i L_cls(k_i, k_i*) is the log loss over the two categories, target and non-target, implemented as cross entropy; k_i denotes the predicted probability that anchor box i is a target, k_i* = 0 indicates that the anchor box carries a negative label and k_i* = 1 a positive label, and T_Class is a normalization parameter;
the bounding-box regression loss L_Box = Σ_i k_i* · smoothL1(n_i - n_i*), where n_i denotes the predicted anchor-box offset and n_i* the ground-truth label offset; the smoothL1 function serves as the bounding-box regression loss; its advantages are a reduced error growth rate and a smaller penalty for outliers, and it is expressed as the piecewise function
smoothL1(x) = 0.5·(σx)² if |x| < 1/σ², and |x| - 0.5/σ² otherwise,
where σ is used to control the smoothness of the transition region;
the mask loss L_Mask regresses the intersection-over-union score between the mask segmented by the deep neural network and the ground-truth mask, using an L2 loss to regress the IoU score;
the training class parameter is set to 2, comprising the pantograph target and the background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
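Under the formulas above, a minimal PyTorch sketch of the three losses might look as follows; the σ value, reductions and tensor shapes are assumptions for illustration, not the patent's exact implementation.

```python
# Sketch of the three losses: cross-entropy classification, smooth-L1
# box regression over positive anchors, and L2 regression of mask IoU.
import torch
import torch.nn.functional as F

def smooth_l1(x: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """0.5*(sigma*x)^2 if |x| < 1/sigma^2, else |x| - 0.5/sigma^2."""
    beta = 1.0 / sigma ** 2
    return torch.where(x.abs() < beta,
                       0.5 * (sigma * x) ** 2,
                       x.abs() - 0.5 * beta)

def total_loss(cls_logits, labels, box_pred, box_gt, pos, iou_pred, iou_gt):
    l_cls = F.cross_entropy(cls_logits, labels)            # L_Class
    l_box = smooth_l1(box_pred[pos] - box_gt[pos]).sum() \
            / pos.sum().clamp(min=1)                       # L_Box (positives)
    l_mask = F.mse_loss(iou_pred, iou_gt)                  # L_Mask (L2 on IoU)
    return l_cls + l_box + l_mask
```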
Further, the step S4 of the present invention specifically includes:
the pantograph segmentation results are evaluated using precision and recall, from which the average precision is computed:
P_re = T_P / (T_P + F_P), R_ec = T_P / (T_P + F_N)
where T_P denotes true positives, F_P false positives and F_N false negatives; P_re is the precision, R_ec the recall, and mAP the mean average precision; in the test stage, a detection is considered successful when the overlap between the predicted box and the calibration box reaches more than 90% of the annotated bounding box.
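The sketch below spells out this evaluation: precision, recall, and the 90%-overlap criterion for counting a detection as successful; the overlap helper and its box format are assumptions for illustration.

```python
# Sketch: precision/recall and the 90% overlap success criterion.
def precision_recall(tp: int, fp: int, fn: int):
    p_re = tp / (tp + fp) if tp + fp else 0.0   # P_re = T_P / (T_P + F_P)
    r_ec = tp / (tp + fn) if tp + fn else 0.0   # R_ec = T_P / (T_P + F_N)
    return p_re, r_ec

def is_success(pred_box, gt_box, thresh=0.9):
    """Detection succeeds when the overlap covers >90% of the labeled box."""
    ix1, iy1 = max(pred_box[0], gt_box[0]), max(pred_box[1], gt_box[1])
    ix2, iy2 = min(pred_box[2], gt_box[2]), min(pred_box[3], gt_box[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    gt_area = (gt_box[2] - gt_box[0]) * (gt_box[3] - gt_box[1])
    return inter / gt_area >= thresh
```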
The invention provides a pantograph detection system based on a Mask Scoring R-CNN network, comprising the following modules:
a data preprocessing module for collecting infrared pantograph-catenary image data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
a pantograph detection network construction module for constructing the pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applying the proposed edge repair method to refine the coarse-grained segmentation result;
a network training module for loading the training sample set into the pantograph detection network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model;
a network evaluation module for loading the high-quality pantograph detection model, feeding the test sample set into the model, and evaluating the pantograph segmentation results.
The invention provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above pantograph detection method based on the Mask Scoring R-CNN network.
The invention has the following beneficial effects. The pantograph detection method based on the Mask Scoring R-CNN network builds an infrared pantograph data set under complex scenes; the data set contains 8 complex scenes such as obstacle-free track, bridges, supporting wires and viaducts, and 9 complex weather conditions such as overcast and rainy weather, heavy fog, direct sunlight and heavy snow, for a total of 10000 infrared images, providing solid data support for pantograph detection research and promoting the development of China's railway industry.
The invention can automatically monitor the pantograph state in real time relying only on the online photographing system installed in front of the pantograph; the AP value reaches 93.26%, the segmentation accuracy is high, and the average speed of detecting one picture on the GPU is 0.302 s. No other expensive equipment is needed as an aid, so detection cost can be greatly reduced.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
fig. 1 is a schematic flow chart of the pantograph detection method based on the Mask Scoring R-CNN network in an embodiment of the present invention.
fig. 2 shows pantograph-catenary infrared images of partial scenes shot by the infrared camera of the vehicle-mounted catenary running-state detection device (3C) in an embodiment of the invention, where (a) is a bridge scene, (b) a station scene, (c) a mountain scene, and (d) a bridge scene.
fig. 3 is a flow chart of the production of the pantograph target database in an embodiment of the invention.
fig. 4 shows the operation interface of the image annotation tool LabelMe in an embodiment of the invention, where (a) is the LabelMe annotation interface and (b) a LabelMe image annotation example.
fig. 5 is the basic structure diagram of the pantograph detection method based on the Mask Scoring R-CNN network in an embodiment of the invention.
fig. 6 is a flow chart of the modules of the network in an embodiment of the invention.
fig. 7 is a schematic diagram of the refined edge repair of the pantograph in an embodiment of the invention.
fig. 8 shows pantograph detection results in an embodiment of the invention, where (a) is the detection result on a bridge-scene pantograph-catenary infrared image, (b) a station scene, (c) a mountain scene, and (d) a bridge scene.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the invention provides a pantograph detection method based on a Mask Scoring R-CNN network which, as shown in fig. 1, specifically comprises the following steps:
s1, collecting infrared pantograph data of a pantograph net, preprocessing the data, dividing the data into a training sample set and a test sample set, and constructing a pantograph target database;
S2, constructing a pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applying the proposed edge repair technique to refine the coarse-grained segmentation result;
S3, loading the data into the network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model;
S4, loading the high-quality model, performing the pantograph extraction test, and evaluating the pantograph detection and segmentation results.
Preferably, in the step S1, the method specifically includes:
S11, acquiring data through the associated infrared camera, then decoding the data and classifying it by scene;
S12, preprocessing the data, allocating a training set and a test set, and constructing the target database.
The embodiment of the invention constructs a multi-condition infrared pantograph database under complex scenes, different from common single, simple pantograph databases: it contains 10000 pantograph images covering 8 different scenes and 9 different weather conditions, which greatly benefits pantograph detection research and promotes the development of China's high-speed rail industry.
Preferably, the step S11 specifically includes:
The online photographing system is the infrared camera device of the mature vehicle-mounted catenary running-state monitoring (3C) system for high-speed rail. When the train starts running, the camera starts working; when the train stops, the camera stops as well. The invention uses the 3C infrared camera to shoot pantograph-catenary infrared video under different scenes during the journey of train D2236, decoding 10000 pantograph image frames that cover 8 complex scenes such as mountains, bridges, tunnels and viaducts, and 9 complex weather conditions such as overcast and rainy weather, heavy fog, direct sunlight and heavy snow, satisfying the diversity required of pantograph detection data. Pantograph-catenary infrared images of partial scenes shot by the infrared camera are shown in fig. 2.
Preferably, the step S12 specifically includes:
The abscissas and ordinates of the four corner points of the pantograph calibration box are counted over the 10000 pantograph-catenary infrared images, and the union of the abscissa values and of the ordinate values is taken to obtain the maximum area covering the range of the pantograph. To allow for the pantograph rising and falling while the train is running and to improve robustness, the upper and lower boundaries of the corresponding calibration box are expanded by a certain number of pixels, and fixed-size sub-images are cropped from each original image to build a pantograph template database.
The 10000 high-speed-rail infrared pantograph-catenary frames are processed one by one; the original image size is 320×240, and the pantograph target-region images and pantograph annotation images are 194×36. The 10000 images in the template database are manually annotated with LabelMe; 6000 images, distributed proportionally across scenes, are allocated as the training set and the remaining 4000 form the test set, completing the pantograph target database. The image annotation interface is shown in fig. 4(a). The annotated content is a polygonal curve circumscribing the edge of the pantograph slide plate, labeled train_bow_1, as shown in fig. 4(b).
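For reference, the sketch below converts a LabelMe polygon annotation with this label into a binary mask; the JSON field names follow LabelMe's standard output format, and everything else is an illustrative assumption.

```python
# Sketch: rasterize a LabelMe polygon (label "train_bow_1") to a mask.
import json

import numpy as np
from PIL import Image, ImageDraw

def labelme_to_mask(json_path, size=(194, 36), label="train_bow_1"):
    with open(json_path) as f:
        ann = json.load(f)
    mask = Image.new("L", size, 0)
    draw = ImageDraw.Draw(mask)
    for shape in ann["shapes"]:
        if shape["label"] == label:
            pts = [tuple(p) for p in shape["points"]]
            draw.polygon(pts, outline=1, fill=1)  # fill the slide-plate polygon
    return np.array(mask, dtype=bool)
```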
The basic framework of the pantograph detection method based on the Mask Scoring R-CNN network is shown in fig. 5.
Preferably, the step S2, as shown in fig. 6, specifically includes:
S21, extracting semantic information of the pantograph at different levels using the residual network and feature pyramid to obtain the multi-scale feature map;
S22, generating candidate regions with the region proposal network (RPN), achieving unbiased alignment of the candidate regions with the feature map via region-of-interest alignment (RoIAlign), and simultaneously performing pantograph classification and box regression;
S23, applying a series of convolution operations to the RoIAlign feature map to obtain the coarse-grained pantograph mask segmentation result and an accurate mask score;
S24, performing refined edge repair on the coarse-grained mask segmentation result to obtain a high-quality pantograph segmentation result.
The embodiment of the invention constructs a pantograph detection network for complex scenes. The network extracts multi-scale pantograph features with a convolutional neural network, achieves unbiased alignment of candidate regions with the feature map through the RPN and RoIAlign, and obtains the coarse-grained pantograph segmentation mask following the normal detection-segmentation pipeline. Because boundary information is lost while the RCNN Head performs CNN operations to generate candidate boxes, the predicted pantograph can drift from the real target during iterative regression and even be suppressed during NMS; the edge repair algorithm therefore refines the coarse-grained pantograph segmentation result and improves detection and segmentation precision. The network achieves high-precision pantograph segmentation, effectively addresses the health monitoring of pantographs under high-speed operation, needs no other expensive auxiliary equipment, and greatly reduces detection cost.
Preferably, the step S21 specifically includes:
ResNet101 and FPN are adopted as the backbone network to extract multi-scale pantograph features. ResNet101 is a bottom-up structure that outputs four levels of feature maps through the residual network, defined as C2, C3, C4 and C5. FPN is a top-down structure laterally connected with the ResNet101 levels to obtain the required four fused feature maps P2, P3, P4 and P5.
Preferably, the step S22 specifically includes:
Each feature map output by the backbone network passes through one 3×3 convolution to generate two branches: one branch distinguishes positive and negative preset anchor boxes through a Reshape-Softmax-Reshape operation, and the other obtains the anchor-box offsets through a 1×1 convolution, yielding the candidate regions.
RoIAlign is applied to the Proposals generated by the RPN to align the candidate regions with the feature map, mainly using bilinear interpolation to expand the feature map, followed by a max-pooling operation that adjusts the Proposals to a uniform size.
The RCNN Head maps the RoIAlign feature map to 1024 dimensions through a fully connected layer, then classifies the features and performs regression to output the pantograph category and coordinate information.
Preferably, the step S23 specifically includes:
The Mask Head takes the RoIAlign features and, through six convolution operations, implements the upsampling of the feature map and the change of channel dimension, generating the corresponding predicted mask. Meanwhile, the predicted mask output and the original RoIAlign features serve as the MaskIoU Head input, where four convolutional layers downsample the feature map, three fully connected layers follow, and the last layer outputs the MaskIoU scores of C categories. The predicted mask IoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, representing the exact segmentation score of the mask. The calculation formula is:
S_MS = S_MaskIoU · S_Class
preferably, the step S24 specifically includes:
The refined edge repair technique computes the intersection-over-union between the coarse-grained pantograph mask segmentation region and the pantograph regions annotated in the training set to obtain the best-matching pantograph region, and uses that region to perform refined boundary repair on the pantograph extraction result. The intersection-over-union is added as a brand-new level to the learning and inference of pantograph segmentation. The annotated pantograph region images in the training set are the pantograph target database images. A schematic diagram of the refined pantograph edge repair is shown in fig. 7. The intersection-over-union is calculated as:
IoU(S, P) = Area(S ∩ P) / Area(S ∪ P)
where S denotes the pantograph ground truth annotated in the training data (the outermost closed region in fig. 7), P denotes the coarse-grained pantograph segmentation result (the inner closed region), Area(S ∩ P) denotes the area of their intersection (the left-hatched region), and Area(S ∪ P) the area of their union (the right-hatched region).
Preferably, the step S3 specifically includes:
6000 training sample pictures of 194×36 resolution in the data set are loaded into the network, network parameters are initialized from a model pre-trained on MS-COCO, and the backbone network used for extracting target features in the data set is pre-trained at the same time;
a non-maximum suppression algorithm is applied to the anchor boxes generated by the RPN to obtain the top-100 scoring candidate window regions, and RoIAlign normalizes all candidate window regions to a specific size for spatial alignment;
a series of convolutional and fully connected layers classify and regress the targets of the candidate regions and perform pixel-level coarse-grained mask segmentation on them;
the MaskIoU Head is trained using the RPN Proposals with IoU greater than 0.5 as training samples to obtain real mask scores.
The intersection-over-union between the predicted coarse-grained mask region and the annotated mask regions in the training set is computed to obtain the best-matching pantograph region, which is used to perform refined repair of the pantograph boundary on the extraction result.
The whole network is trained with three losses, namely classification loss, detection-box regression loss and mask loss, computed as:
L = L_Class + L_Box + L_Mask
The classification loss L_Class = (1/T_Class) Σ_i L_cls(k_i, k_i*) is the log loss over the two categories, target and non-target, implemented as cross entropy, where k_i denotes the predicted probability that anchor box i is a target, k_i* = 0 indicates a negative label and k_i* = 1 a positive label, and T_Class is a normalization parameter.
The bounding-box regression loss L_Box = Σ_i k_i* · smoothL1(n_i - n_i*), where n_i denotes the predicted anchor-box offset and n_i* the ground-truth label offset; the smoothL1 function serves as the bounding-box regression loss. Its advantages are a reduced error growth rate and a smaller penalty for outliers, and it can be expressed as the piecewise function
smoothL1(x) = 0.5·(σx)² if |x| < 1/σ², and |x| - 0.5/σ² otherwise,
where σ is used to control the smoothness of the transition region; using a piecewise function solves the non-differentiability at zero and reduces errors.
The mask loss L_Mask regresses the intersection-over-union score between the network-segmented mask and the ground-truth mask, using an L2 loss to regress the IoU score.
All experiments of the invention were carried out under the Ubuntu 19.04 operating system, on a laboratory server configured with an Intel(R) Core(TM) i7-8700K CPU @ 3.70 GHz, a GeForce RTX 2080 GPU and 8 GB of RAM. A PyTorch deep learning framework was built on this basis and programmed in Python to implement training and testing of the network. The training class parameter is set to 2, comprising pantograph and background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
Preferably, the step S4 specifically includes:
The pantograph segmentation results are evaluated using precision and recall, from which the average precision is computed:
P_re = T_P / (T_P + F_P), R_ec = T_P / (T_P + F_N)
where T_P denotes true positives, F_P false positives and F_N false negatives; P_re is the precision, R_ec the recall, and mAP the mean average precision. In the test stage, a detection is considered successful when the overlap between the predicted box and the calibration box reaches more than 90% of the annotated bounding box. Fig. 8 shows pantograph extraction results for partial scenes among the 4000 test samples.
The embodiment of the invention constructs an infrared pantograph data set under complex scenes, containing 8 complex scenes such as obstacle-free track, bridges, overhead lines and viaducts, and 9 complex weather conditions such as overcast and rainy weather, heavy fog, direct sunlight and heavy snow, for a total of 10000 infrared images, providing solid data support for pantograph detection research and promoting the development of China's railway industry.
The embodiment of the invention provides a pantograph detection system based on a Mask Scoring R-CNN network, comprising the following modules:
a data preprocessing module for collecting infrared pantograph-catenary image data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
a pantograph detection network construction module for constructing the pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applying the proposed edge repair technique to refine the coarse-grained segmentation result;
a network training module for loading data into the network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model;
a network evaluation module for loading the high-quality model, performing the pantograph extraction test, and evaluating the pantograph detection and segmentation results.
The embodiment of the invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above pantograph detection method based on the Mask Scoring R-CNN network.
The embodiment of the invention can automatically monitor the pantograph state in real time relying only on the online photographing system installed in front of the pantograph, with an AP value of 93.26%, high segmentation accuracy and an average detection speed of 0.302 s per picture on the GPU; no other expensive equipment is needed as an aid, greatly reducing detection cost.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims (7)

1. A pantograph detection method based on a Mask Scoring R-CNN network, characterized by comprising the following steps:
S1, collecting infrared pantograph-catenary image data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
S2, constructing a pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applying a proposed edge repair method to refine the coarse-grained segmentation result;
S3, loading the training sample set into the pantograph detection network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model;
S4, loading the high-quality pantograph detection model, feeding the test sample set into the model, and evaluating the pantograph detection and segmentation results;
the step S2 of establishing a pantograph detection network specifically includes:
S21, the Backbone network extracts the multi-scale pantograph feature map using the residual network ResNet101 and the feature pyramid network FPN;
S22, the region proposal network RPN proposes candidate window regions (Proposals); the detection head (RCNN Head) performs a region-of-interest alignment (RoIAlign) operation on each candidate region generated by the RPN;
S23, a series of convolutional and fully connected layers classify the pantograph targets of the candidate regions and perform bounding-box regression on the RoIAlign feature map;
the Mask Head obtains instance-level pixel features: six convolutional layers implement the upsampling of the feature map and the change of channel dimension, generating the corresponding predicted mask;
the mask intersection-over-union head (MaskIoU Head) takes the predicted mask output and the RoIAlign-aligned features as input; four convolutional layers downsample the feature map, three fully connected layers follow, and the last layer outputs mask IoU scores for C categories, which are multiplied by the category scores from the RCNN Head to obtain more accurate mask scores;
S24, the mask edge repair head computes the intersection-over-union of the coarse-grained mask produced by the Mask Head with the ground-truth masks, matches the best pantograph boundary in the training set, and obtains a refined repair of the pantograph boundary;
the step S21 specifically includes:
ResNet101 and FPN are adopted as the backbone network to extract multi-scale pantograph features; ResNet101 is a bottom-up structure that outputs four levels of feature maps through the residual network, defined as C2, C3, C4 and C5; FPN is a top-down structure that laterally connects to the output of each ResNet101 level to obtain the four fused feature maps P2, P3, P4 and P5;
the step S22 specifically includes:
each feature map output by the backbone network passes through one 3×3 convolution to generate two branches: one branch distinguishes positive and negative preset anchor boxes through a Reshape-Softmax-Reshape operation, and the other obtains the anchor-box offsets through a 1×1 convolution, yielding the candidate regions;
RoIAlign is applied to the Proposals generated by the RPN to align the candidate regions with the feature map, mainly using bilinear interpolation to expand the feature map, followed by a max-pooling operation that adjusts the Proposals to a uniform size;
the RCNN Head maps the RoIAlign feature map to 1024 dimensions through a fully connected layer, then classifies the features and performs regression to output the pantograph category and coordinate information;
the step S23 specifically includes:
the Mask Head takes the RoIAlign features and, through six convolution operations, implements the upsampling of the feature map and the change of channel dimension, generating the corresponding predicted mask; meanwhile, the predicted mask output and the original RoIAlign features serve as the MaskIoU Head input, where four convolutional layers downsample the feature map, three fully connected layers follow, and the last layer outputs mask IoU scores for C categories; the predicted mask IoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, a segmentation score representing the mask accuracy; the calculation formula is:
S_MS = S_MaskIoU · S_Class
the step S24 specifically includes:
the refined edge repair method computes the intersection-over-union between the coarse-grained pantograph mask segmentation region and the pantograph regions annotated in the training set, adds it as a brand-new level to the learning and inference of pantograph segmentation, obtains the pantograph region best matching the training set, and uses that region to perform refined boundary repair on the pantograph extraction result; the annotated pantograph region images in the training set are the pantograph target database images, and the pantograph edge repair intersection-over-union is calculated as:
IoU(S, P) = Area(S ∩ P) / Area(S ∪ P)
where S denotes the pantograph ground truth annotated in the training data, P denotes the coarse-grained pantograph segmentation result, Area(S ∩ P) denotes the area of the intersection of the coarse-grained segmentation result and the ground truth, and Area(S ∪ P) denotes the area of their union;
the step S3 specifically includes:
inputting the training sample set database obtained in the step S1 into a pantograph detection network established in the step S2 for training to obtain a high-quality pantograph detection model;
the 6000 training sample pictures of 194×36 resolution in the data set are loaded into the network, network parameters are initialized from a model pre-trained on the MS-COCO data set, and the backbone network used for extracting target features in the data set is pre-trained at the same time;
a non-maximum suppression algorithm is applied to the anchor boxes generated by the RPN to obtain the top-100 scoring candidate window regions, and the RoIAlign module normalizes all candidate window regions to a specific size for spatial alignment;
a series of convolutional and fully connected layers classify and regress the targets of the candidate regions and perform pixel-level coarse-grained mask segmentation on them;
the MaskIoU Head is trained using the RPN Proposals with IoU greater than 0.5 as training samples to obtain real mask scores;
the intersection-over-union between the coarse-grained mask segmentation region and the annotated mask regions in the training set is computed, the best-matching pantograph region is obtained, and this region is used to repair the pantograph segmentation boundary;
the whole process is trained with three losses, namely classification loss, detection-box regression loss and mask loss, computed as:
L = L_Class + L_Box + L_Mask
where the classification loss L_Class = (1/T_Class) Σ_i L_cls(k_i, k_i*) is the log loss over the two categories, target and non-target, implemented as cross entropy; k_i denotes the predicted probability that anchor box i is a target, k_i* = 0 indicates a negative label and k_i* = 1 a positive label, and T_Class is a normalization parameter;
the bounding-box regression loss L_Box = Σ_i k_i* · smoothL1(n_i - n_i*), where n_i denotes the predicted anchor-box offset and n_i* the ground-truth label offset; the smoothL1 function serves as the bounding-box regression loss; its advantages are a reduced error growth rate and a smaller penalty for outliers, and it is expressed as the piecewise function
smoothL1(x) = 0.5·(σx)² if |x| < 1/σ², and |x| - 0.5/σ² otherwise,
where σ is used to control the smoothness of the transition region;
the mask loss L_Mask regresses the intersection-over-union score between the mask segmented by the deep neural network and the ground-truth mask, using an L2 loss to regress the IoU score;
the training class parameter is set to 2, comprising the pantograph target and the background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
2. The pantograph detection method based on a Mask Scoring R-CNN network according to claim 1, wherein the step S1 specifically comprises:
S11, acquiring infrared pantograph-catenary image data through an infrared camera, then decoding the data and classifying it by scene;
S12, preprocessing the data, allocating a training sample set and a test sample set, and constructing the target database.
3. The pantograph detection method based on a Mask Scoring R-CNN network according to claim 2, wherein the step S11 specifically comprises:
pantograph-catenary infrared video is shot by an infrared camera installed in the vehicle-mounted catenary running-state detection device in front of the pantograph, and the video is decoded into 10000 image frames; the collected pantograph-catenary infrared images cover scenes including mountains, bridges, tunnels and iron bridges, and weather conditions including overcast and rainy weather, heavy fog, direct sunlight and heavy snow, satisfying the diversity required of pantograph detection data; all pictures are stored in a pantograph-catenary infrared image folder under a specified path.
4. The pantograph detection method based on a Mask Scoring R-CNN network according to claim 3, wherein the step S12 specifically comprises:
the abscissas and ordinates of the four corner points of the pantograph calibration box are counted over the 10000 pantograph-catenary infrared images, and the union of the abscissa values and of the ordinate values is taken; the maximum area thus formed is the range of the pantograph; to allow for the pantograph rising and falling while the train is running and to improve robustness, the upper and lower boundaries of the corresponding calibration box are expanded by a certain number of pixels, and fixed-size sub-images are cropped from each original image to build a pantograph template database; the 10000 images in the template database are annotated with the open-source deep learning annotation tool LabelMe, 6000 images are allocated as the training set and the remaining 4000 as the test set, with no image repeated between training and test data, completing the construction of the pantograph target database.
5. The pantograph detection method based on a Mask Scoring R-CNN network according to claim 1, wherein the step S4 specifically comprises:
the pantograph segmentation results are evaluated using precision and recall, from which the average precision is computed:
P_re = T_P / (T_P + F_P), R_ec = T_P / (T_P + F_N)
where T_P denotes true positives, F_P false positives and F_N false negatives; P_re is the precision, R_ec the recall, and mAP the mean average precision; in the test stage, a detection is considered successful when the overlap between the predicted box and the calibration box reaches more than 90% of the annotated bounding box.
6. A pantograph detection system based on a Mask Scoring R-CNN network, characterized in that the system comprises the following modules:
a data preprocessing module for collecting infrared pantograph-catenary image data, preprocessing the data, dividing it into a training sample set and a test sample set, and constructing a pantograph target database;
a pantograph detection network construction module for constructing the pantograph detection network, extracting a multi-scale pantograph feature map with a backbone network, obtaining the classification information, position coordinates and coarse-grained segmentation result of the pantograph through a prediction head and a mask head, and applying a proposed edge repair method to refine the coarse-grained segmentation result;
a network training module for loading the training sample set into the pantograph detection network, training it over repeated iterations, and tuning parameters to obtain a high-quality pantograph detection model;
a network evaluation module for loading the high-quality pantograph detection model, feeding the test sample set into the model, and evaluating the pantograph segmentation results;
the implementation method of the pantograph detection network construction module specifically comprises the following steps:
S21, the backbone network Backbone uses the residual network ResNet101 and the feature pyramid network FPN to extract a multi-scale pantograph feature map;
S22, the region proposal network RPN is used to generate candidate regions (Proposals); the detection head RCNN Head performs a region-of-interest alignment (RoIAlign) operation on each candidate region generated by the RPN;
S23, the pantograph targets of the candidate regions are classified and bounding-box regression is performed on the RoIAlign feature map through a series of convolution and fully connected layers;
the Mask Head is used to obtain instance pixel-level features; up-sampling of the feature map and the change of channel dimension are realized through a 6-layer convolution operation, generating the corresponding prediction mask;
the mask intersection-over-union head MaskIoU Head takes the prediction mask output and the RoIAlign region-of-interest features as input; downsampling of the feature map is realized through 4 convolution layers followed by 3 fully connected layers, and the last layer outputs mask IoU scores for the C categories, which are multiplied by the category scores in the detection head RCNN Head to obtain more accurate mask scores;
S24, the mask edge repair head calculates the intersection-over-union between the coarse-grained mask output by the Mask Head and the ground-truth mask, matches the best pantograph boundary in the training set, and obtains the refined pantograph boundary repair result;
the step S21 specifically includes:
ResNet101 and FPN are adopted as the backbone network to extract multi-scale pantograph features; ResNet101 is a bottom-up structure whose residual network outputs four layers of feature maps, defined as C2, C3, C4 and C5; FPN is a top-down structure that is laterally connected to each ResNet101 output layer to obtain four fused feature maps P2, P3, P4 and P5;
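For illustration, a ResNet101+FPN backbone of this kind can be instantiated with torchvision's ready-made helper; this is an assumed stand-in API (torchvision ≥ 0.13), not the authors' implementation:

```python
import torch
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# Bottom-up ResNet101 with a top-down FPN and lateral connections.
backbone = resnet_fpn_backbone(backbone_name="resnet101", weights=None)

x = torch.randn(1, 3, 224, 224)  # dummy input; real inputs are infrared sub-images
features = backbone(x)           # OrderedDict of fused multi-scale maps
for name, fmap in features.items():
    print(name, tuple(fmap.shape))  # levels corresponding roughly to P2-P5 (+pool)
```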
the step S22 specifically includes:
each feature map layer output by the backbone network passes through a 3×3 convolution and generates two branches: one branch distinguishes positive and negative preset anchor boxes through a Reshape-Softmax-Reshape operation, and the other obtains the anchor box offsets through a 1×1 convolution, thereby yielding the candidate regions;
RoIAlign is applied to the Proposals generated by the RPN to align the candidate regions with the feature maps, mainly using a bilinear interpolation algorithm to expand the feature map, followed by a max-pooling operation to adjust the Proposals to a uniform size;
the RCNN Head maps the RoIAlign feature map to 1024 dimensions through a fully connected layer, then classifies the features and performs regression to output the pantograph category and coordinate information;
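The RoIAlign operation in step S22 is available off the shelf; a minimal sketch with torchvision.ops.roi_align follows, where the 7×7 output size and the proposal coordinates are assumptions for illustration:

```python
import torch
from torchvision.ops import roi_align

fmap = torch.randn(1, 256, 50, 50)  # one FPN level (N, C, H, W)
# Proposals as (batch_index, x1, y1, x2, y2) in feature-map coordinates.
proposals = torch.tensor([[0.0, 4.0, 4.0, 40.0, 40.0]])

# Bilinear interpolation samples the region, then it is pooled to a
# uniform 7x7 size so the RCNN Head sees fixed-length features.
rois = roi_align(fmap, proposals, output_size=(7, 7),
                 spatial_scale=1.0, sampling_ratio=2, aligned=True)
print(rois.shape)  # torch.Size([1, 256, 7, 7])
```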
the step S23 specifically includes:
the Mask Head uses the RoIAlign features to realize up-sampling of the feature map and the change of channel dimension through 6 convolution operations, generating the corresponding prediction mask; meanwhile, the prediction mask output and the original RoIAlign features are taken as the MaskIoU Head input, downsampling of the feature map is realized through 4 convolution layers followed by 3 fully connected layers, and the last layer outputs mask IoU scores for the C categories; the predicted mask IoU score S_MaskIoU is multiplied by the classification confidence score S_Class of the RCNN Head to obtain the final mask score S_MS, a segmentation score representing the mask accuracy; the calculation formula is:

S_MS = S_MaskIoU · S_Class
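A hypothetical PyTorch sketch of a MaskIoU Head of the shape described above (4 convolutions, 3 fully connected layers, and the final score multiplication); the channel counts, the 14×14 feature size, and the pre-pooling of the predicted mask to that size are assumptions:

```python
import torch
import torch.nn as nn

class MaskIoUHead(nn.Module):
    def __init__(self, in_ch: int = 257, num_classes: int = 2):
        super().__init__()
        convs, ch = [], in_ch
        for i in range(4):
            stride = 2 if i == 3 else 1  # downsample in the last conv
            convs += [nn.Conv2d(ch, 256, 3, stride=stride, padding=1), nn.ReLU()]
            ch = 256
        self.convs = nn.Sequential(*convs)
        self.fcs = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 7 * 7, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, num_classes))  # one mask IoU score per category

    def forward(self, roi_feat, pred_mask):
        # concatenate RoIAlign features with the predicted mask (pooled to
        # the same spatial size) along the channel dimension
        return self.fcs(self.convs(torch.cat([roi_feat, pred_mask], dim=1)))

roi_feat = torch.randn(1, 256, 14, 14)
pred_mask = torch.randn(1, 1, 14, 14)
s_maskiou = torch.sigmoid(MaskIoUHead()(roi_feat, pred_mask))[:, 1]
s_class = torch.tensor([0.95])  # classification score from the RCNN Head
s_ms = s_maskiou * s_class      # final mask score S_MS = S_MaskIoU * S_Class
```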
the step S24 specifically includes:
the refined edge repair method calculates the intersection-over-union between the coarse-grained pantograph mask segmentation region and the pantograph regions annotated in the training set, adds it to the pantograph segmentation learning and inference process as a brand-new level, obtains the pantograph region that best matches the training set, and uses that region to perform refined boundary repair on the pantograph extraction result; the annotated pantograph region images in the training set are the pantograph target database images, and the pantograph edge repair intersection-over-union is calculated as:

IoU_edge = Area(S ∩ P) / Area(S ∪ P)

where S denotes the ground-truth pantograph annotated in the training data, P denotes the coarse-grained pantograph segmentation result, Area(S ∩ P) is the intersection area of the coarse-grained segmentation result and the ground truth, and Area(S ∪ P) is their union area;
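For binary masks, the edge-repair intersection-over-union above reduces to a few lines of NumPy; S and P are boolean arrays of the ground-truth and coarse-grained regions:

```python
import numpy as np

def edge_iou(s: np.ndarray, p: np.ndarray) -> float:
    """Area(S ∩ P) / Area(S ∪ P) for boolean masks s and p."""
    union = np.logical_or(s, p).sum()
    return float(np.logical_and(s, p).sum()) / float(union) if union else 0.0
```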
the implementation method of the network training module specifically comprises the following steps:
the training sample set database is input into the established pantograph detection network and trained to obtain a high-quality pantograph detection model;
the 6000 training sample images of 194 × 36 resolution in the data set are loaded into the network; the network parameters are initialized from a model pre-trained on the MS-COCO data set, and the backbone network used for extracting the features of the targets in the data set is pre-trained at the same time;
a non-maximum suppression algorithm is applied to the anchor boxes generated by the RPN to obtain the top-100 scoring candidate regions, and all candidate regions are normalized to a specific size by the RoIAlign module for spatial size alignment;
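A sketch of this proposal-filtering step with torchvision's NMS; keeping the 100 highest-scoring survivors follows the text, while the 0.7 IoU threshold is an assumed conventional RPN value not stated in the claim:

```python
import torch
from torchvision.ops import nms

boxes = torch.rand(1000, 4) * 100   # dummy anchor boxes as (x1, y1, x2, y2)
boxes[:, 2:] += boxes[:, :2]        # ensure x2 > x1 and y2 > y1
scores = torch.rand(1000)           # dummy objectness scores

keep = nms(boxes, scores, iou_threshold=0.7)  # indices, sorted by score
proposals = boxes[keep[:100]]       # top-100 scoring candidate windows
```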
the targets of the candidate regions are classified and regressed through a series of convolution and fully connected layers, and pixel-level coarse-grained mask segmentation is performed on the candidate-region targets;
the MaskIoU Head is trained using the RPN Proposals with IoU greater than 0.5 as training samples to obtain real mask scores;
the intersection-over-union between the coarse-grained mask segmentation region and the annotated mask region in the training set is calculated to obtain the best-matching pantograph region, and that region is used to repair the pantograph segmentation boundary;
the whole process is trained with three losses, namely the classification loss, the detection-box regression loss and the mask loss, calculated as:

L = L_Class + L_Box + L_Mask

where the classification loss L_Class selects the logarithmic loss over the two categories, target and non-target, and adopts the cross-entropy loss:

L_Class = (1 / T_Class) · Σ_i L_cls(k_i, k_i*)

where L_cls is the logarithmic loss over the target and non-target classes, k_i represents the probability that anchor box i is a target, k_i* = 0 indicates that the anchor box carries a negative label, k_i* = 1 a positive label, and T_Class is a normalization parameter;
the bounding-box regression loss is

L_Box = (1 / T_Box) · Σ_i k_i* · smoothL1(n_i − n_i*)

where n_i represents the predicted anchor box offset, n_i* represents the ground-truth label offset, and T_Box is a normalization parameter; the smoothL1 function is adopted as the bounding-box regression loss because it reduces the error growth rate and lightens the penalty caused by large errors; expressed as a piecewise function:

smoothL1(x) = 0.5 · σ² · x²      if |x| < 1/σ²
smoothL1(x) = |x| − 0.5/σ²       otherwise

where σ is used to control the smoothness of the transition region;
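The σ-parameterized piecewise function above is straightforward to implement; a minimal PyTorch sketch, matching the form used by the original Faster R-CNN implementation:

```python
import torch

def smooth_l1(x: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Smooth L1: 0.5*sigma^2*x^2 if |x| < 1/sigma^2, else |x| - 0.5/sigma^2."""
    beta = 1.0 / (sigma ** 2)
    return torch.where(x.abs() < beta,
                       0.5 * (sigma ** 2) * x ** 2,
                       x.abs() - 0.5 * beta)
```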
the mask loss function L_Mask regresses the intersection-over-union score between the mask segmented by the deep neural network and the ground-truth mask, using the L_2 loss to regress the IoU score;
the number of training classes is set to 2, comprising the pantograph target and the background; the batch size is set to 100, the number of iterations to 20000, the momentum factor to 0.9, the weight decay coefficient to 0.001, and the initial learning rate to 0.001.
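A hypothetical optimizer configuration mirroring the hyper-parameters listed above (the values come from the claim; the placeholder model and the use of SGD are assumptions, since the claim names momentum and weight decay but not the optimizer):

```python
import torch

NUM_CLASSES = 2   # pantograph target and background
BATCH_SIZE = 100
MAX_ITERS = 20000

# Placeholder standing in for the constructed pantograph detection network.
model = torch.nn.Linear(10, NUM_CLASSES)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.001,             # initial learning rate
    momentum=0.9,         # momentum factor
    weight_decay=0.001)   # weight decay coefficient
```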
7. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the pantograph detection method based on Mask scanning R-CNN network according to any one of claims 1 to 5.
CN202110890405.7A 2021-08-04 2021-08-04 Pantograph detection method based on Mask scanning R-CNN network Active CN113763326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110890405.7A CN113763326B (en) 2021-08-04 2021-08-04 Pantograph detection method based on Mask scanning R-CNN network


Publications (2)

Publication Number Publication Date
CN113763326A CN113763326A (en) 2021-12-07
CN113763326B true CN113763326B (en) 2023-11-21

Family

ID=78788491


Country Status (1)

Country Link
CN (1) CN113763326B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117690096B * 2024-02-04 2024-04-12 Chengdu Zhonggui Rail Equipment Co., Ltd. Catenary safety inspection system adapted to different scenes


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10679351B2 (en) * 2017-08-18 2020-06-09 Samsung Electronics Co., Ltd. System and method for semantic segmentation of images

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019192397A1 * 2018-04-04 2019-10-10 Huazhong University of Science and Technology End-to-end recognition method for scene text in any shape
CN109658387A * 2018-11-27 2019-04-19 Beijing Jiaotong University Detection method for pantograph carbon slide defects of power trains
WO2021056705A1 * 2019-09-23 2021-04-01 Ping An Technology (Shenzhen) Co., Ltd. Method for detecting damage to the outside of a human body on the basis of a semantic segmentation network, and related device
CN111640125A * 2020-05-29 2020-09-08 Guangxi University Mask R-CNN-based aerial photograph building detection and segmentation method and device
CN112132789A * 2020-08-30 2020-12-25 Nanjing University of Science and Technology Pantograph online detection device and method based on cascade neural network
CN112766195A * 2021-01-26 2021-05-07 Southwest Jiaotong University Electrified railway pantograph-catenary arcing visual detection method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen B et al., "Bounding box repairing algorithm for underwater object detection based on IoU optimization," IEEE, pp. 369-373 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant