CN112183453B - Deep learning-based method and system for detecting a water injection port cover plate not-locked-in-place fault

Info

Publication number: CN112183453B
Authority: CN (China)
Prior art keywords: attention, self, cover plate, layer, water injection
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202011105125.2A
Other languages: Chinese (zh)
Other versions: CN112183453A (en)
Inventor: 战岭
Current Assignee: Harbin Kejia General Mechanical and Electrical Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Harbin Kejia General Mechanical and Electrical Co Ltd
Application filed by: Harbin Kejia General Mechanical and Electrical Co Ltd
Priority to: CN202011105125.2A
Publication of application: CN112183453A
Publication of grant: CN112183453B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

A deep learning-based method and system for detecting the fault that a water injection port cover plate is not locked in place, belonging to the technical field of image detection. The invention addresses the low detection accuracy and low detection efficiency of conventional manual image inspection. A trained target detection model is used to obtain the position of the "water injection port" text and the position of the water injection port, from which the accurate position of the water injection port cover plate is determined; the corresponding image is then binarized to obtain a binary image of the cover plate. Next, according to the opening and closing direction of the cover plate for each vehicle type, a gap shadow image between the cover plate and the vehicle body is cropped from the binary image, and the gap shadow area between the cover plate and the vehicle body is obtained. Finally, the gap shadow area is compared with the gap shadow area threshold for the corresponding vehicle type to determine whether the accurate position of the cover plate is a fault region. The method is mainly used for detecting the fault that a water injection port cover plate is not locked in place.

Description

Deep learning-based method and system for detecting a water injection port cover plate not-locked-in-place fault
Technical Field
The invention belongs to the technical field of image detection, and particularly relates to a method and a system for detecting a fault that a water injection port cover plate is not locked in place.
Background
The water injection port cover plate ensures the air tightness of the body of a motor train unit running at high speed, and protects the water injection port and nearby components from faults such as water leakage and component loss caused by the difference between internal and external air pressure. Detecting a cover plate that is not locked in place is therefore very important.
In existing practice, fault detection is usually carried out by manually checking images. Because vehicle inspection personnel fatigue easily during work, missed detections and false detections occur. Timely, automatic alarming of the fault that a water injection port cover plate of a motor train unit is not locked in place is therefore of great significance.
Detection methods based on existing detection operators cannot detect this fault well, and their accuracy often fails to meet practical requirements. With the development of deep learning, existing neural networks can be applied to this fault detection, but the following problems remain:
When an existing neural network model with a relatively simple structure is used to detect whether the cover plate is locked, the detection accuracy is low and the false detection and missed detection rates are high. When a neural network model with a complex structure is used, a long training time is needed and the number of model parameters is very large, which seriously reduces the operating efficiency of the model and lengthens the detection time.
Disclosure of Invention
The invention aims to solve the problems of low detection accuracy and low detection efficiency of the conventional manual image checking mode.
The deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place comprises the following steps:
S1, acquiring a region-of-interest image containing the water injection port;
S2, obtaining the position of the "water injection port" text and the position of the water injection port by using a trained target detection model; the target detection model is a target detection neural network model based on a multi-head region self-attention neural network;
S3, determining the accurate position of the water injection port cover plate according to the position of the "water injection port" text and the position of the water injection port;
S4, determining, by the OTSU algorithm, an adaptive binarization threshold for the image corresponding to the accurate position of the cover plate obtained in S3, and binarizing that image by setting pixel values smaller than the threshold to 0 and pixel values greater than or equal to the threshold to 255, to obtain a binary image of the water injection port cover plate;
S5, according to the opening and closing direction of the water injection port cover plate for each vehicle type, cropping a gap shadow image between the cover plate and the vehicle body from the binary image of the cover plate;
S6, counting the pixels with value 0 in the gap shadow image to obtain the gap shadow area between the cover plate and the vehicle body;
S7, judging, against the gap shadow area threshold for the corresponding vehicle type, whether the gap shadow area obtained in S6 exceeds the threshold; if so, recording the accurate position of the cover plate obtained in S3 as fault information.
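Steps S4 to S7 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `(top, bottom, left, right)` layout of `gap_box`, and the per-vehicle `area_threshold` value are hypothetical, and the input is assumed to be an 8-bit grayscale crop of the cover plate.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return an Otsu threshold t such that pixels < t form the dark class."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                     # probability of levels <= k
    mu = np.cumsum(prob * np.arange(256))       # cumulative mean up to level k
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Between-class variance for every candidate split k
        sigma_b = np.where(denom > 0, (mu_total * omega - mu) ** 2 / denom, 0.0)
    k = int(np.argmax(sigma_b))                 # best split: levels <= k vs. > k
    return k + 1                                # so that "value < t" selects the dark class

def binarize(gray: np.ndarray, t: int) -> np.ndarray:
    """S4: values below t become 0, values >= t become 255."""
    return np.where(gray < t, 0, 255).astype(np.uint8)

def gap_shadow_area(binary: np.ndarray, gap_box) -> int:
    """S5 and S6: crop the gap region (hypothetical box layout) and count 0-valued pixels."""
    top, bottom, left, right = gap_box
    return int(np.count_nonzero(binary[top:bottom, left:right] == 0))

def is_fault(area: int, area_threshold: int) -> bool:
    """S7: a fault is reported when the shadow area exceeds the vehicle-type threshold."""
    return area > area_threshold
```

In use, `gap_box` would be chosen from the cover plate's opening direction for the vehicle type, and `area_threshold` looked up per vehicle type; both are inputs the text leaves to the implementer.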
Further, the processing procedure of the multi-head region self-attention neural network is as follows:
(1) selecting, from the input image, a sub-region pixel matrix with a window size of m×m pixels;
(2) making 3 copies of the matrix obtained in step (1), named the query matrix Q, the key matrix K and the value matrix V respectively;
(3) applying a linear transformation to each of the matrices Q, K and V obtained in step (2) to obtain the linearly transformed matrices $\hat{Q}$, $\hat{K}$ and $\hat{V}$;
(4) calculating the correlation matrix $C$ between the matrices obtained in step (3):
$$C = \mathrm{Concat}(A_{\mathrm{ver}}, A_{\mathrm{hor}})\,W_C + B_C$$
and calculating the self-attention matrix A from it by the softmax function:
$$A = \mathrm{softmax}(C)$$
wherein: $A_{\mathrm{ver}}$ and $A_{\mathrm{hor}}$ are the self-attention maps in the vertical direction and the horizontal direction of the image respectively; Concat is horizontal matrix concatenation; $W_C$ is the weight parameter of the correlation linear transformation, with size 2m×m; $B_C$ is the bias parameter of the correlation linear transformation, with size m×m; the self-attention matrix A has size m×m;
(5) weighting the sub-region matrix $\hat{V}$ obtained in step (3) with the self-attention matrix A obtained in step (4) to obtain the sub-region feature matrix F;
(6) repeating steps (4) and (5) H times, each repetition being called a self-attention head;
(7) concatenating the H sub-region feature matrices F of the H self-attention heads to obtain a concatenated feature matrix of size m×mH;
(8) applying a linear transformation to the concatenated feature matrix obtained in step (7) to obtain the final features of the sub-region;
(9) moving the sub-region of step (1) across the image by the step length S and repeating steps (1) to (8) until features have been extracted from the whole input image, to obtain a feature map of the same size as the input image.
Further, the linear transformation applied in step (3) to the three matrices Q, K and V obtained in step (2) is as follows:
$$\hat{Q} = Q\,W_Q + B_Q$$
$$\hat{K} = K\,W_K + B_K$$
$$\hat{V} = V\,W_V + B_V$$
wherein: $W_Q$, $W_K$ and $W_V$ are the linear transformation weight parameters for the matrices Q, K and V respectively, each of size m×m; $B_Q$, $B_K$ and $B_V$ are the corresponding linear transformation bias parameters, each of size m×m; $\hat{Q}$, $\hat{K}$ and $\hat{V}$ are the matrices Q, K and V after linear transformation, each of size m×m.
Further, the target detection neural network model based on the multi-head region self-attention neural network comprises the 1st to 6th region self-attention layers, a 7th region candidate layer and an output layer;
1st region self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
2nd region self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
3rd region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
4th region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
5th region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
6th region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
7th region candidate layer: 2 multi-head region self-attention neural networks connected in parallel, serving as a position prediction layer and a probability prediction layer respectively;
output layer: outputs the predicted position of each target in the image, the corresponding predicted class and the prediction confidence.
Further, the number of self-attention heads H of the multi-head region self-attention neural networks connected in series in the 1st to 6th region self-attention layers is 2, 4, 8, 16, 32 and 64 respectively;
of the multi-head region self-attention neural networks connected in parallel in the 7th region candidate layer, the position prediction layer has H = 4 self-attention heads and the probability prediction layer has H = 2 self-attention heads.
Further, in the multi-head region self-attention neural networks of the 1st to 6th region self-attention layers, the sub-region size is 3×3 pixels and the step length S is 3 pixels;
in the multi-head region self-attention neural networks of the 7th region candidate layer, the sub-region size of the position prediction layer is 1×1 pixel with a step length S of 1 pixel, and the sub-region size of the probability prediction layer is 1×1 pixel with a step length S of 1 pixel.
Further, the max pooling layers in the 1st to 6th region self-attention layers have a window size of 2×2 and a step length of 2.
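The layer parameters listed above can be collected into a small configuration table, which also makes the spatial downsampling easy to check. The sketch below is illustrative only: the `Stage` class and its field names are hypothetical, and it assumes that each self-attention network preserves the spatial size (as the text states for the feature map) and that 2×2 pooling with step length 2 rounds down on odd sizes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    blocks: int   # multi-head region self-attention networks in the stage
    heads: int    # number of self-attention heads H
    window: int   # sub-region size m (m x m pixels)
    stride: int   # step length S in pixels
    pool: int     # max pooling window and step length (0 = no pooling)

# 1st to 6th region self-attention layers, as listed in the text
BACKBONE = [
    Stage(blocks=2, heads=2,  window=3, stride=3, pool=2),
    Stage(blocks=2, heads=4,  window=3, stride=3, pool=2),
    Stage(blocks=3, heads=8,  window=3, stride=3, pool=2),
    Stage(blocks=3, heads=16, window=3, stride=3, pool=2),
    Stage(blocks=3, heads=32, window=3, stride=3, pool=2),
    Stage(blocks=3, heads=64, window=3, stride=3, pool=2),
]

# 7th region candidate layer: two parallel branches without pooling
POSITION_HEAD = Stage(blocks=1, heads=4, window=1, stride=1, pool=0)
PROBABILITY_HEAD = Stage(blocks=1, heads=2, window=1, stride=1, pool=0)

def output_size(height: int, width: int) -> tuple:
    """Spatial size after the six pooled stages (assumes floor division)."""
    for stage in BACKBONE:
        height //= stage.pool
        width //= stage.pool
    return height, width
```

Under these assumptions the six pooling layers reduce each spatial dimension by a factor of 64, so a hypothetical 384×768 input would reach the candidate layer as a 6×12 feature map.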
Further, the labeling of the data set used to train the target detection model comprises the following steps:
labeling the images with the labelImg data annotation tool according to the motor train unit type to which each water injection port belongs:
the label name of the "water injection port" text is text, and its label class is 1;
the label names of the water injection ports of the different vehicle types are cover + vehicle type, with label classes 2, 3, 4 and so on.
Further, before the data set is labeled, data augmentation is performed on the acquired image data set to obtain the data set to be labeled.
The deep learning-based system for detecting the fault that a water injection port cover plate is not locked in place is used to execute the deep learning-based method for detecting that fault.
Advantageous effects:
1. The deep learning method replaces manual work for automatic fault detection on motor train units, avoiding the influence of inspectors' subjective factors and the limits of working time, and effectively improving both the detection quality (high detection accuracy, low false detection rate and low missed detection rate) and the detection efficiency for water injection port faults of motor train units.
2. The invention proposes a multi-head region self-attention (Multi-head Region Self-attention) neural network to replace the traditional convolutional neural network in constructing the target detection model. The proposed network has the following advantages: (1) dividing the image into sub-regions that are processed independently reduces the number of self-attention parameters and speeds up model training and inference; (2) computing the self-attention matrix from the combined vertical and horizontal self-attention maps gives stronger feature extraction; (3) replacing convolution with self-attention extracts the key features in image sub-regions more effectively and suppresses background and noise better; (4) with several self-attention heads in parallel, each with independent parameters, features of different semantic subspaces within a region can be extracted.
The method offers high detection accuracy and fast detection, and effectively addresses the low accuracy or long detection time that arises when existing neural network models are used to detect whether the water injection port cover plate is locked.
Drawings
FIG. 1 is a schematic view of a fault detection process for a water injection port cover plate not locked in place;
FIG. 2 is a schematic diagram of a multi-head region self-attention neural network structure;
FIG. 3 is a schematic diagram of a self-attention matrix calculation;
FIG. 4 is a schematic diagram of a target detection neural network model structure.
Detailed Description
Embodiment one:
The deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place comprises the following steps:
S1, acquiring a region-of-interest image containing the water injection port;
S2, obtaining the position of the "water injection port" text and the position of the water injection port by using a trained target detection model; the target detection model is a target detection neural network model based on a multi-head region self-attention neural network;
S3, determining the accurate position of the water injection port cover plate according to the position of the "water injection port" text and the position of the water injection port;
S4, determining, by the OTSU algorithm, an adaptive binarization threshold for the image corresponding to the accurate position of the cover plate obtained in S3, and binarizing that image by setting pixel values smaller than the threshold to 0 and pixel values greater than or equal to the threshold to 255, to obtain a binary image of the water injection port cover plate;
S5, according to the opening and closing direction of the water injection port cover plate for each vehicle type, cropping a gap shadow image between the cover plate and the vehicle body from the binary image of the cover plate;
S6, counting the pixels with value 0 in the gap shadow image to obtain the gap shadow area between the cover plate and the vehicle body;
S7, judging, against the gap shadow area threshold for the corresponding vehicle type, whether the gap shadow area obtained in S6 exceeds the threshold; if so, recording the accurate position of the cover plate obtained in S3 as fault information.
Embodiment two:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the processing procedure of the multi-head region self-attention neural network is as follows:
(1) selecting, from the input image, a sub-region pixel matrix with a window size of m×m pixels;
(2) making 3 copies of the matrix obtained in step (1), named the query matrix Q, the key matrix K and the value matrix V respectively;
(3) applying a linear transformation to each of the matrices Q, K and V obtained in step (2) to obtain the linearly transformed matrices $\hat{Q}$, $\hat{K}$ and $\hat{V}$;
(4) calculating the correlation matrix $C$ between the matrices obtained in step (3):
$$C = \mathrm{Concat}(A_{\mathrm{ver}}, A_{\mathrm{hor}})\,W_C + B_C$$
and calculating the self-attention matrix A from it by the softmax function:
$$A = \mathrm{softmax}(C)$$
wherein: $A_{\mathrm{ver}}$ and $A_{\mathrm{hor}}$ are the self-attention maps in the vertical direction and the horizontal direction of the image respectively; Concat is horizontal matrix concatenation; $W_C$ is the weight parameter of the correlation linear transformation, with size 2m×m; $B_C$ is the bias parameter of the correlation linear transformation, with size m×m; the self-attention matrix A has size m×m;
(5) weighting the sub-region matrix $\hat{V}$ obtained in step (3) with the self-attention matrix A obtained in step (4) to obtain the sub-region feature matrix F;
(6) repeating steps (4) and (5) H times, each repetition being called a self-attention head;
(7) concatenating the H sub-region feature matrices F of the H self-attention heads to obtain a concatenated feature matrix of size m×mH;
(8) applying a linear transformation to the concatenated feature matrix obtained in step (7) to obtain the final features of the sub-region;
(9) moving the sub-region of step (1) across the image by the step length S and repeating steps (1) to (8) until features have been extracted from the whole input image, to obtain a feature map of the same size as the input image.
Other steps and parameters are the same as in the first embodiment.
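Steps (1) to (8) for a single m×m sub-region can be sketched in NumPy as follows. This is a rough illustration resting on several assumptions that the text does not confirm: linear transformations are taken as right-multiplication plus bias; the vertical and horizontal self-attention maps are assumed to be the products of the transformed query and key matrices in the two orientations; "weighting" is assumed to be a matrix product; the softmax is assumed to normalize each column; and the final linear transformation is assumed to use a weight of size mH×m. All function and parameter names are hypothetical.

```python
import numpy as np

def column_softmax(c: np.ndarray) -> np.ndarray:
    """Softmax over each column (ASSUMPTION: normalization runs over rows within a column)."""
    e = np.exp(c - c.max(axis=0, keepdims=True))   # subtract max for numerical stability
    return e / e.sum(axis=0, keepdims=True)

def multi_head_region(x, qkv, heads, w_out, b_out):
    """Process one m x m sub-region x with H self-attention heads.

    qkv   : dict with W_Q, B_Q, W_K, B_K, W_V, B_V, each m x m (step 3)
    heads : list of H (W_C, B_C) pairs, W_C is 2m x m and B_C is m x m (step 4)
    w_out : (m*H) x m weight of the final linear transformation (ASSUMED size)
    b_out : m x m bias of the final linear transformation (ASSUMED size)
    """
    # Steps (2) and (3): Q, K, V are copies of x, each linearly transformed
    q = x @ qkv["W_Q"] + qkv["B_Q"]
    k = x @ qkv["W_K"] + qkv["B_K"]
    v = x @ qkv["W_V"] + qkv["B_V"]

    # ASSUMPTION: vertical and horizontal self-attention maps
    a_ver = q @ k.T
    a_hor = q.T @ k

    features = []
    for w_c, b_c in heads:                          # step (6): one pass per head
        # Step (4): horizontal concat (m x 2m) times W_C (2m x m) gives m x m
        c = np.concatenate([a_ver, a_hor], axis=1) @ w_c + b_c
        a = column_softmax(c)
        features.append(a @ v)                      # step (5), ASSUMED matrix product

    f = np.concatenate(features, axis=1)            # step (7): m x (m*H)
    return f @ w_out + b_out                        # step (8): back to m x m
```

Because every sub-region maps from m×m to m×m, tiling the image with windows of size m and step length S = m reproduces the input size, which matches the statement in step (9).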
Embodiment three:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the linear transformation applied in step (3) to the three matrices Q, K and V obtained in step (2) is as follows:
$$\hat{Q} = Q\,W_Q + B_Q$$
$$\hat{K} = K\,W_K + B_K$$
$$\hat{V} = V\,W_V + B_V$$
wherein: $W_Q$, $W_K$ and $W_V$ are the linear transformation weight parameters for the matrices Q, K and V respectively, each of size m×m; $B_Q$, $B_K$ and $B_V$ are the corresponding linear transformation bias parameters, each of size m×m; $\hat{Q}$, $\hat{K}$ and $\hat{V}$ are the matrices Q, K and V after linear transformation, each of size m×m.
Other steps and parameters are the same as in the second embodiment.
Embodiment four:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the target detection neural network model based on the multi-head region self-attention neural network comprises the 1st to 6th region self-attention layers, a 7th region candidate layer and an output layer;
1st region self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
2nd region self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
3rd region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
4th region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
5th region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
6th region self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
7th region candidate layer: 2 multi-head region self-attention neural networks connected in parallel, serving as a position prediction layer and a probability prediction layer respectively;
output layer: outputs the predicted position of each target in the image, the corresponding predicted class and the prediction confidence.
Other steps and parameters are the same as in the second or third embodiment.
Embodiment five:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the number of self-attention heads H of the multi-head region self-attention neural networks connected in series in the 1st to 6th region self-attention layers is 2, 4, 8, 16, 32 and 64 respectively;
of the multi-head region self-attention neural networks connected in parallel in the 7th region candidate layer, the position prediction layer has H = 4 self-attention heads and the probability prediction layer has H = 2 self-attention heads.
Other steps and parameters are the same as in embodiment four.
Embodiment six:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the sub-region size in the multi-head region self-attention neural networks of the 1st to 6th region self-attention layers is 3×3 pixels and the step length S is 3 pixels;
in the multi-head region self-attention neural networks of the 7th region candidate layer, the sub-region size of the position prediction layer is 1×1 pixel with a step length S of 1 pixel, and the sub-region size of the probability prediction layer is 1×1 pixel with a step length S of 1 pixel.
Other steps and parameters are the same as those in the fifth embodiment.
Embodiment seven:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the max pooling layers in the 1st to 6th region self-attention layers have a window size of 2×2 and a step length of 2.
The other steps and parameters are the same as in embodiment six.
Embodiment eight:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, the labeling of the data set used to train the target detection model comprises the following steps:
labeling the images with the labelImg data annotation tool according to the motor train unit type to which each water injection port belongs:
the label name of the "water injection port" text is text, and its label class is 1;
the label names of the water injection ports of the different vehicle types are cover + vehicle type, with label classes 2, 3, 4 and so on.
Other steps and parameters are the same as in one of the first to seventh embodiments.
Embodiment nine:
In this embodiment of the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place, data augmentation is performed on the acquired image data set before labeling, to obtain the data set to be labeled.
The other steps and parameters are the same as in embodiment eight.
Embodiment ten:
This embodiment is a deep learning-based system for detecting the fault that a water injection port cover plate is not locked in place; the system is used to execute the deep learning-based method for detecting that fault.
Examples
This example is described with reference to FIG. 1. The deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place comprises the following steps:
Step one, obtaining an overall side image of the motor train unit
A line-scan camera is mounted on fixed equipment installed beside the track. The shooting frequency of the line-scan camera is calculated from the moving speed of the motor train unit, and the motor train unit is photographed continuously. The captured line images are stitched seamlessly to obtain a complete overall image of the side of the motor train unit.
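The text does not give the formula used to derive the shooting frequency from the train speed. One common choice for line-scan imaging, shown here purely as an assumption, is to trigger one line each time the train advances by the width that one pixel covers on the vehicle body, so that the stitched image is not stretched or compressed along the direction of travel. The function and parameter names are hypothetical.

```python
def line_rate_hz(train_speed_m_s: float, object_pixel_size_m: float) -> float:
    """Lines per second so that each scanned line advances by one object-space pixel.

    train_speed_m_s     : speed of the motor train unit past the camera, in m/s
    object_pixel_size_m : width on the vehicle body covered by one pixel, in m (ASSUMED known)
    """
    if train_speed_m_s <= 0 or object_pixel_size_m <= 0:
        raise ValueError("speed and pixel size must be positive")
    return train_speed_m_s / object_pixel_size_m
```

For example, a train passing at 10 m/s imaged at 1 mm per pixel would need a line rate of about 10 kHz under this assumption.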
Step two, coarse positioning of the water injection port
Wheelbase information between the axles of the motor train unit is acquired by sensors installed beside the track. Based on the wheelbase information and prior information on the position of the water injection port relative to the axles, the water injection port is coarsely located on the overall side image obtained in step one, yielding region-of-interest images that may contain a water injection port.
Step three, preprocessing the image data set
(3.1) Construction of the original image data set
Images that do not contain a water injection port are filtered out of all the region-of-interest images obtained in step two to construct the original image data set.
(3.2) Data augmentation
Image processing operations such as brightness transformation, histogram equalization and Gaussian noise addition are applied to the image samples of the original image data set obtained in (3.1) for data augmentation, increasing the number of samples and improving the robustness of the trained model.
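The three augmentation operations named above can be sketched with NumPy alone. This is a minimal illustration for 8-bit grayscale images; the gain values, noise standard deviation and function names are hypothetical choices, not parameters given in the text.

```python
import numpy as np

def adjust_brightness(img: np.ndarray, gain: float = 1.0, bias: float = 0.0) -> np.ndarray:
    """Linear brightness transformation, clipped back to the 8-bit range."""
    out = img.astype(np.float32) * gain + bias
    return np.clip(out, 0, 255).astype(np.uint8)

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Global histogram equalization through a cumulative-distribution lookup table."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest level present
    total = cdf[-1]
    if total == cdf_min:               # constant image: nothing to equalize
        return img.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = lut.clip(0, 255).astype(np.uint8)
    return lut[img]

def add_gaussian_noise(img: np.ndarray, sigma: float, rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise with standard deviation sigma (in gray levels)."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def augment(img: np.ndarray, rng: np.random.Generator) -> list:
    """Return several augmented variants of one sample (hypothetical parameter values)."""
    return [
        adjust_brightness(img, gain=1.2),
        adjust_brightness(img, gain=0.8),
        equalize_histogram(img),
        add_gaussian_noise(img, sigma=5.0, rng=rng),
    ]
```

Since none of these operations moves the objects in the image, the variants could in principle share the bounding boxes of the source sample; the text, however, performs augmentation before labeling.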
(3.3) Data annotation
The image data set obtained in (3.2) is labeled with the labelImg data annotation tool according to the motor train unit type to which each water injection port belongs.
The label name of the "water injection port" text is text, and its label class is 1.
The label names of the water injection ports of the different vehicle types are cover + vehicle type, such as cover XXX, cover YY and cover ZZZ, with label classes 2, 3, 4 and so on.
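The labeling scheme above amounts to a simple name-to-class mapping. The sketch below builds it for an arbitrary list of vehicle types; the function name is hypothetical, and joining "cover" and the vehicle type without a separator is an assumption about the exact label spelling.

```python
def build_label_map(vehicle_types):
    """Map label names to class ids: 'text' is 1, covers are 2, 3, 4, ... in list order."""
    label_map = {"text": 1}
    for offset, vehicle in enumerate(vehicle_types):
        # ASSUMPTION: label name is "cover" immediately followed by the vehicle type
        label_map[f"cover{vehicle}"] = 2 + offset
    return label_map
```

With the placeholder vehicle types from the text, `build_label_map(["XXX", "YY", "ZZZ"])` assigns classes 2, 3 and 4 to the three cover labels.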
(3.4) Data set partitioning
The image data set and the corresponding labels obtained in (3.2) and (3.3) are divided proportionally into a training data set and a test data set.
Step four, designing the multi-head region self-attention neural network
The structure of the proposed multi-head region self-attention neural network is shown in FIG. 2:
(1) A sub-region pixel matrix with a window size of m×m pixels is selected in the input image. Because pixels that are far apart in an image are generally uncorrelated, the image is divided into a number of sub-regions on which attention is computed independently. This greatly reduces the number of parameters of the attention computation and increases computation speed without impairing the extraction of local image features.
(2) Three copies are made of the matrix obtained in step (1) and named the "Query" matrix Q, the "Key" matrix K and the "Value" matrix V.
(3) The matrices Q, K and V obtained in step (2) are each linearly transformed as follows:

$$\tilde{Q} = Q W_Q + B_Q$$
$$\tilde{K} = K W_K + B_K$$
$$\tilde{V} = V W_V + B_V$$

where $W_Q$, $W_K$ and $W_V$ are the linear transformation weight parameters corresponding to the matrices Q, K and V, respectively, each of size m×m; $B_Q$, $B_K$ and $B_V$ are the corresponding linear transformation bias parameters, each of size m×m; and $\tilde{Q}$, $\tilde{K}$ and $\tilde{V}$ are the linearly transformed Q, K and V matrices, respectively, each of size m×m.
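Steps (2) and (3) can be illustrated with a short NumPy sketch. It assumes that each linear transformation is a right-multiplication by an m×m weight matrix followed by the addition of an m×m bias matrix, which is consistent with the stated parameter sizes; the random weights, zero biases and random sub-region are placeholders for values that would be learned or read from an image.

```python
import numpy as np

def linear_transform(x, weight, bias):
    """Affine map x @ weight + bias; all three operands are m x m."""
    return x @ weight + bias

m = 3
rng = np.random.default_rng(0)
sub_region = rng.random((m, m))           # stand-in for an m x m pixel window

# Step (2): three copies of the same sub-region matrix.
Q, K, V = sub_region.copy(), sub_region.copy(), sub_region.copy()

# Step (3): independent m x m weights and biases per matrix (placeholders).
W_Q, W_K, W_V = (rng.standard_normal((m, m)) for _ in range(3))
B_Q, B_K, B_V = (np.zeros((m, m)) for _ in range(3))

Q_t = linear_transform(Q, W_Q, B_Q)
K_t = linear_transform(K, W_K, B_K)
V_t = linear_transform(V, W_V, B_V)
```

Although Q, K and V start out identical, the separate weights let the three transformed matrices take on different roles.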
(4) The correlation matrix $C$ between the linearly transformed matrices $\tilde{Q}$ and $\tilde{K}$ obtained in step (3) is computed, and the self-attention matrix A is then calculated through the softmax function:

$$C = \mathrm{Concat}(M_{\mathrm{v}}, M_{\mathrm{h}})\, W_C + B_C$$
$$A = \mathrm{softmax}(C)$$

where $M_{\mathrm{v}}$ and $M_{\mathrm{h}}$ denote the self-attention maps in the vertical direction and the horizontal direction of the image, respectively (their explicit expressions are given only in the original formula images); Concat is horizontal matrix concatenation; $W_C$ is the weight parameter of the associated linear transformation, of size 2m×m; $B_C$ is the bias parameter of the associated linear transformation, of size m×m; and the self-attention matrix A has size m×m. The softmax function is:

$$a_{ij} = \frac{e^{c_{ij}}}{\sum_{k=1}^{m} e^{c_{kj}}}$$

where $a_{ij}$ is the element in row i, column j of the matrix A; $c_{ij}$ is the element in row i, column j of $C$; and $c_{kj}$ is the element in row k, column j of $C$.
This process extracts the key information in the current sub-region and stores it in the form of the attention matrix A. The more critical the information contained in a part of the sub-region, the higher the corresponding attention weight.
A schematic of the self-attention matrix calculation is shown in FIG. 3:
Whereas conventional self-attention matrix calculation uses a self-attention map in only a single direction to obtain the correlation matrix, the present invention concatenates the self-attention maps in the vertical and horizontal directions and then applies a linear transformation to obtain the correlation matrix. This approach obtains self-attention maps of the image from both the vertical and the horizontal viewpoint and is better suited to extracting image features. The added linear transformation weights and resizes the concatenated self-attention map, which realizes automatic selection between vertical and horizontal features and matches the size of the result to that of the linearly transformed "value" matrix.
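The computation of the self-attention matrix can be sketched as follows. The softmax normalizes each column over its rows, as described in the text. The concrete expressions for the two directional maps are not recoverable from the text, so the products used here (transformed query times transposed key for one direction, transposed query times key for the other) are purely an assumption for illustration, as are the random parameters.

```python
import numpy as np

def column_softmax(c):
    """Softmax over the rows of each column: a_ij = exp(c_ij) / sum_k exp(c_kj)."""
    e = np.exp(c - c.max(axis=0, keepdims=True))   # subtract column max for stability
    return e / e.sum(axis=0, keepdims=True)

def self_attention_matrix(q_t, k_t, w_c, b_c):
    # ASSUMPTION: the directional maps are modelled as row-wise and
    # column-wise products of the transformed query and key matrices.
    map_vertical = q_t @ k_t.T
    map_horizontal = q_t.T @ k_t
    concat = np.concatenate([map_vertical, map_horizontal], axis=1)  # m x 2m
    correlation = concat @ w_c + b_c                                  # m x m
    return column_softmax(correlation)

m = 3
rng = np.random.default_rng(0)
q_t, k_t = rng.random((m, m)), rng.random((m, m))        # transformed Q and K
w_c, b_c = rng.standard_normal((2 * m, m)), np.zeros((m, m))
A = self_attention_matrix(q_t, k_t, w_c, b_c)
```

The 2m×m weight both mixes the two directional maps and brings the result back to m×m, which is the size-matching role the text attributes to the added linear transformation.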
Compared with performing the self-attention operation directly on the whole input image, dividing the image into a number of sub-regions greatly reduces the computation required by the self-attention mechanism. Take an input image of 9×9 pixels as an example: if the self-attention calculation is performed directly on the entire image, the self-attention matrix A contains 81 × 81 = 6561 parameters; if the input image is divided into 9 sub-regions of 3×3 pixels, the attention matrices contain 9 × (9 × 9) = 729 parameters in total.
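The arithmetic of this example can be checked with a few lines of Python. The helper names are invented for this sketch, and the count assumes, as the example does, one attention entry per pair of pixels within the attended area.

```python
def attention_entries_full(height, width):
    """Entries of one attention matrix covering every pixel pair in the image."""
    pixels = height * width
    return pixels * pixels

def attention_entries_windowed(height, width, window):
    """Total entries when each non-overlapping window gets its own attention matrix."""
    regions = (height // window) * (width // window)
    per_region = (window * window) ** 2
    return regions * per_region

full = attention_entries_full(9, 9)               # 81 * 81
windowed = attention_entries_windowed(9, 9, 3)    # 9 regions * (9 * 9)
reduction = full // windowed
```

For the 9×9 example the windowed scheme needs one ninth of the entries, and the saving grows with image size because the per-region cost stays fixed.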
(5) The self-attention matrix A obtained in step (4) is used to weight the linearly transformed value matrix $\tilde{V}$ obtained in step (3), giving the sub-region feature matrix F. The formula is as follows:
Figure GDA0002998986140000109
where F has size m×m.
(6) Steps (4) and (5) are repeated H times; each repetition is called a self-attention head (Head).
Here H is a manually set hyper-parameter of the neural network.
Because the parameters of each self-attention head are independent, the heads can extract information from different semantic subspaces of the current sub-region, which gives stronger feature extraction capability.
(7) The H sub-region feature matrices F corresponding to the H self-attention heads are concatenated to obtain a matrix, denoted here $F_{\mathrm{cat}}$, of size m×mH.
(8) A linear transformation is applied to the matrix $F_{\mathrm{cat}}$ obtained in step (7) to obtain the final sub-region feature, denoted here $F_{\mathrm{out}}$. The calculation formula is as follows:

$$F_{\mathrm{out}} = F_{\mathrm{cat}} W_F + B_F$$

where $W_F$ is the linear transformation weight parameter corresponding to $F_{\mathrm{cat}}$, of size mH×m; $B_F$ is the linear transformation bias parameter corresponding to $F_{\mathrm{cat}}$, of size m×m; and $F_{\mathrm{out}}$ has size m×m.
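The shape bookkeeping of steps (7) and (8) can be illustrated as follows. This is a sketch under assumptions: the per-head feature matrices are random stand-ins, the variable names are invented, and the final transformation is taken to be a right-multiplication by the mH×m weight plus an m×m bias, which is the only ordering compatible with the stated sizes.

```python
import numpy as np

m, H = 3, 4
rng = np.random.default_rng(0)

# One m x m feature matrix per self-attention head (random stand-ins).
head_features = [rng.random((m, m)) for _ in range(H)]

# Step (7): horizontal concatenation gives an m x mH matrix.
features_concat = np.concatenate(head_features, axis=1)

# Step (8): linear transformation back to m x m (placeholder parameters).
weight = rng.standard_normal((m * H, m))
bias = np.zeros((m, m))
final_feature = features_concat @ weight + bias
```

Whatever the value of H, the output of a sub-region keeps the m×m size of its input window, which is what allows the per-window results to be assembled into a feature map of the input size.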
(9) The sub-region pixel matrix of step (1), which acts as a sliding window, is moved to the right, row by row, with a manually set stride S, and steps (1) to (8) are repeated until features have been extracted from the whole input image. The result is a feature map of the same size as the input image.
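The sliding-window traversal of step (9) might look like the following sketch. It assumes that the stride equals the window size and that the image dimensions are multiples of the window size, so that the windows tile the image exactly; `center_window` is a hypothetical stand-in for the per-window processing of steps (1) to (8), not the patent's computation.

```python
import numpy as np

def slide_over_sub_regions(image, m, stride, region_fn):
    """Apply `region_fn` to each m x m window and write results into a same-size map."""
    height, width = image.shape
    features = np.zeros((height, width), dtype=np.float64)
    for top in range(0, height - m + 1, stride):         # advance to the next row of windows
        for left in range(0, width - m + 1, stride):     # move the window to the right
            window = image[top:top + m, left:left + m]
            features[top:top + m, left:left + m] = region_fn(window)
    return features

def center_window(window):
    """Hypothetical stand-in for steps (1)-(8): subtract the window mean."""
    return window - window.mean()

image = np.arange(36, dtype=np.float64).reshape(6, 6)
feature_map = slide_over_sub_regions(image, m=3, stride=3, region_fn=center_window)
```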
Step five, building a target detection model
The structure of the target detection neural network model provided by the invention is shown in FIG. 4:
Region 1 self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, with a sub-region size of 3×3 pixels, a stride S of 3 pixels and H = 2 self-attention heads, followed by max pooling with a 2×2 window and a stride of 2 to enlarge the receptive field.
Region 2 self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, with a sub-region size of 3×3 pixels, a stride S of 3 pixels and H = 4 self-attention heads, followed by max pooling with a 2×2 window and a stride of 2 to enlarge the receptive field.
Region 3 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, with a sub-region size of 3×3 pixels, a stride S of 3 pixels and H = 8 self-attention heads, followed by max pooling with a 2×2 window and a stride of 2 to enlarge the receptive field.
Region 4 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, with a sub-region size of 3×3 pixels, a stride S of 3 pixels and H = 16 self-attention heads, followed by max pooling with a 2×2 window and a stride of 2 to enlarge the receptive field.
Region 5 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, with a sub-region size of 3×3 pixels, a stride S of 3 pixels and H = 32 self-attention heads, followed by max pooling with a 2×2 window and a stride of 2 to enlarge the receptive field.
Region 6 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, with a sub-region size of 3×3 pixels, a stride S of 3 pixels and H = 64 self-attention heads, followed by max pooling with a 2×2 window and a stride of 2 to enlarge the receptive field.
Region 7 candidate layer: 2 multi-head region self-attention neural networks connected in parallel, serving as the position prediction layer and the probability prediction layer, respectively. The position prediction layer has a sub-region size of 1×1 pixels, a stride S of 1 pixel and H = 4 self-attention heads (corresponding to the predicted target coordinate x, coordinate y, width w and height h). The probability prediction layer has a sub-region size of 1×1 pixels, a stride S of 1 pixel and H = 2 self-attention heads (corresponding to the predicted target class c and the prediction confidence conf).
Output layer: outputs the predicted position information, the corresponding predicted class and the prediction confidence of each target in the image.
This target detection neural network structure was designed specifically for the images being processed. It can effectively extract the key features within image sub-regions and thereby improves prediction accuracy.
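The layer parameters listed above can be collected in a small configuration table, sketched here in Python. The class and variable names are invented, and the 384×384 input size is a hypothetical value chosen only to show how six 2×2, stride-2 poolings shrink the feature map. The sketch assumes unpadded pooling and relies on the statement that each self-attention network preserves the input size.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LayerSpec:
    name: str
    networks: int      # number of multi-head region self-attention networks
    sub_region: int    # sub-region side length in pixels
    stride: int        # stride S in pixels
    heads: int         # number of self-attention heads H
    max_pool: bool     # followed by 2x2 max pooling with stride 2

BACKBONE = [
    LayerSpec("region 1", 2, 3, 3, 2, True),
    LayerSpec("region 2", 2, 3, 3, 4, True),
    LayerSpec("region 3", 3, 3, 3, 8, True),
    LayerSpec("region 4", 3, 3, 3, 16, True),
    LayerSpec("region 5", 3, 3, 3, 32, True),
    LayerSpec("region 6", 3, 3, 3, 64, True),
]

# Region 7: two parallel branches, (x, y, w, h) and (c, conf).
CANDIDATE = [
    LayerSpec("position prediction", 1, 1, 1, 4, False),
    LayerSpec("probability prediction", 1, 1, 1, 2, False),
]

def spatial_size(height, width, layers):
    """Feature-map size after the pooling stages (attention keeps the size)."""
    for layer in layers:
        if layer.max_pool:
            height, width = height // 2, width // 2
    return height, width

final_size = spatial_size(384, 384, BACKBONE)   # hypothetical input size
```

The number of heads doubles from one region layer to the next while the spatial resolution halves.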
Step six, training and testing the target detection neural network model
(6.1) The target detection neural network model proposed in step five is trained with the data set preprocessed in step three. The loss function of the target detection model is the mean squared error loss;
(6.2) Neural network hyper-parameters such as the number of training iterations, the learning rate and the batch size are manually tuned according to the training results;
Steps (6.1) and (6.2) are repeated until the neural network model achieves its best performance.
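The mean squared error loss named in step (6.1) can be written out as below. The layout of the prediction vector as (x, y, w, h, c, conf), with the class encoded as a number, and the example values are assumptions for illustration; the patent does not state how the individual outputs are arranged or weighted in the loss.

```python
import numpy as np

def mse_loss(predicted, target):
    """Mean of the squared element-wise differences."""
    predicted = np.asarray(predicted, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((predicted - target) ** 2))

# Hypothetical layout: x, y, w, h, class c, confidence conf.
predicted = [0.50, 0.40, 0.20, 0.10, 2.0, 0.9]
target = [0.50, 0.50, 0.20, 0.10, 2.0, 1.0]
loss = mse_loss(predicted, target)
```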
Step seven, detecting faults in passing-train images of the motor train unit
(1) Acquire the passing-train image of the motor train unit and the corresponding vehicle model information;
(2) Coarsely locate the water injection port according to the vehicle model to obtain a region-of-interest image;
(3) Load the weights of the target detection model obtained in step six;
(4) Obtain the position of the water injection port text and the position of the water injection port through the target detection model;
(5) Obtain the precise position of the water injection port cover plate from the position of the water injection port text and the position of the water injection port;
(6) Use the OTSU algorithm to find the adaptive binarization threshold for the image at the precise cover plate position obtained in step (5), and binarize that image with the threshold to obtain a binary image of the precise cover plate position. Pixel values smaller than the adaptive binarization threshold are set to 0, and pixel values greater than or equal to it are set to 255;
(7) According to the opening and closing direction of the water injection port cover plate for each vehicle model, crop the gap shadow image between the cover plate and the vehicle body from the binary image obtained in step (6);
(8) Count the pixels with value 0 in the gap shadow image obtained in step (7) to obtain the gap shadow area between the cover plate and the vehicle body;
(9) Determine whether the gap shadow area obtained in step (8) exceeds the gap shadow area threshold set for the vehicle model. If it does, record the precise position of the cover plate obtained in step (5) as fault information;
(10) Upload all fault information to the alarm platform.
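Steps (6) to (9) can be sketched in Python with NumPy as follows. This is an illustrative sketch, not the patented implementation: the Otsu threshold is computed by a straightforward search for the maximum between-class variance, and the synthetic image, the crop box and the area threshold of 25 pixels are hypothetical values.

```python
import numpy as np

def otsu_threshold(gray):
    """Threshold t maximizing between-class variance; the dark class is pixels < t."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    total = hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = hist[:t].sum()
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue                      # one class is empty, no valid split
        mu0 = (hist[:t] * levels[:t]).sum() / w0
        mu1 = (hist[t:] * levels[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def binarize(gray, t):
    """Pixels below t become 0, pixels at or above t become 255."""
    return np.where(gray < t, 0, 255).astype(np.uint8)

def gap_shadow_area(binary, box):
    """Count zero-valued pixels inside the (top, bottom, left, right) crop."""
    top, bottom, left, right = box
    return int(np.count_nonzero(binary[top:bottom, left:right] == 0))

def is_fault(area, area_threshold):
    return area > area_threshold

# Synthetic cover-plate image: bright surface with a dark 2-pixel-high gap.
plate = np.full((20, 20), 200, dtype=np.uint8)
plate[8:10, :] = 30
threshold = otsu_threshold(plate)
binary = binarize(plate, threshold)
area = gap_shadow_area(binary, box=(6, 12, 0, 20))    # hypothetical crop box
fault = is_fault(area, area_threshold=25)             # hypothetical threshold
```

In the method described above, the crop box would depend on the opening and closing direction of the cover plate for the vehicle model, and the area threshold would be set per vehicle model.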
The present invention is capable of other embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and scope of the present invention.

Claims (9)

1. A deep learning-based method for detecting a water injection port cover plate not-locked-in-place fault, characterized by comprising the following steps:
S1, acquiring a region-of-interest image containing the water injection port;
S2, acquiring the position of the water injection port text and the position of the water injection port by using a trained target detection model; the target detection model is a target detection neural network model based on a multi-head region self-attention neural network; the model structure of the target detection neural network model based on the multi-head region self-attention neural network comprises a region 1 self-attention layer through a region 7 candidate layer, and an output layer;
region 1 self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
region 2 self-attention layer: 2 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
region 3 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
region 4 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
region 5 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
region 6 self-attention layer: 3 identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;
region 7 candidate layer: 2 multi-head region self-attention neural networks connected in parallel, serving as a position prediction layer and a probability prediction layer, respectively;
output layer: outputting the predicted position information, the corresponding predicted class and the prediction confidence of the target in the image;
S3, determining the precise position of the water injection port cover plate according to the position of the water injection port text and the position of the water injection port;
S4, determining, by the OTSU algorithm, the adaptive binarization threshold of the image corresponding to the precise position of the water injection port cover plate obtained in S3, and binarizing the image, with pixel values smaller than the adaptive binarization threshold set to 0 and pixel values greater than or equal to the adaptive binarization threshold set to 255, to obtain a binary image of the water injection port cover plate;
S5, cropping, according to the opening and closing direction of the water injection port cover plate of each vehicle model, a gap shadow image between the water injection port cover plate and the vehicle body from the binary image of the water injection port cover plate;
S6, counting the pixels with value 0 in the gap shadow image to obtain the gap shadow area between the water injection port cover plate and the vehicle body;
S7, determining whether the gap shadow area obtained in S6 exceeds the gap shadow area threshold corresponding to the vehicle model; and if so, recording the precise position of the water injection port cover plate obtained in S3 as fault information.
2. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to claim 1, wherein the multi-head region self-attention neural network performs the following processing:
(1) selecting a sub-region pixel matrix with a window size of m×m pixels from the input image;
(2) making three copies of the matrix obtained in step (1) and naming them the query matrix Q, the key matrix K and the value matrix V;
(3) linearly transforming each of the matrices Q, K and V obtained in step (2) to obtain the linearly transformed matrices $\tilde{Q}$, $\tilde{K}$ and $\tilde{V}$;
(4) computing the correlation matrix $C$ between the linearly transformed matrices $\tilde{Q}$ and $\tilde{K}$ obtained in step (3):

$$C = \mathrm{Concat}(M_{\mathrm{v}}, M_{\mathrm{h}})\, W_C + B_C$$

and calculating the self-attention matrix A through the softmax function:

$$A = \mathrm{softmax}(C)$$

where $M_{\mathrm{v}}$ and $M_{\mathrm{h}}$ denote the self-attention maps in the vertical direction and the horizontal direction of the image, respectively (their explicit expressions are given only in the original formula images); Concat is horizontal matrix concatenation; $W_C$ is the weight parameter of the associated linear transformation, of size 2m×m; $B_C$ is the bias parameter of the associated linear transformation, of size m×m; and the self-attention matrix A has size m×m;
(5) weighting the linearly transformed value matrix $\tilde{V}$ obtained in step (3) with the self-attention matrix A obtained in step (4) to obtain a sub-region feature matrix F;
(6) repeating steps (4) and (5) H times, each repetition being called a self-attention head;
(7) concatenating the H sub-region feature matrices F corresponding to the H self-attention heads to obtain a matrix, denoted here $F_{\mathrm{cat}}$, of size m×mH;
(8) linearly transforming the matrix $F_{\mathrm{cat}}$ obtained in step (7) to obtain the final sub-region feature, denoted here $F_{\mathrm{out}}$;
(9) moving the sub-region of step (1) to the right, row by row, with stride S, and repeating steps (1) to (8) until features have been extracted from the whole input image, so as to obtain a feature map of the same size as the input image.
3. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to claim 2, wherein in step (3) the matrices Q, K and V obtained in step (2) are each linearly transformed as follows:

$$\tilde{Q} = Q W_Q + B_Q$$
$$\tilde{K} = K W_K + B_K$$
$$\tilde{V} = V W_V + B_V$$

where $W_Q$, $W_K$ and $W_V$ are the linear transformation weight parameters corresponding to the matrices Q, K and V, respectively, each of size m×m; $B_Q$, $B_K$ and $B_V$ are the corresponding linear transformation bias parameters, each of size m×m; and $\tilde{Q}$, $\tilde{K}$ and $\tilde{V}$ are the linearly transformed Q, K and V matrices, respectively, each of size m×m.
4. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to claim 1, wherein, in the series structures of multi-head region self-attention neural networks in the region 1 self-attention layer through the region 6 self-attention layer, the number H of self-attention heads is 2, 4, 8, 16, 32 and 64, respectively;
in the parallel structure of multi-head region self-attention neural networks in the region 7 candidate layer, the number H of self-attention heads in the position prediction layer is 4, and the number H of self-attention heads in the probability prediction layer is 2.
5. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to claim 4, characterized in that, in the multi-head region self-attention neural networks of the region 1 self-attention layer through the region 6 self-attention layer, the sub-region size is 3×3 pixels and the stride S is 3 pixels;
in the multi-head region self-attention neural networks of the region 7 candidate layer, the sub-region size of the position prediction layer is 1×1 pixels with a stride S of 1 pixel, and the sub-region size of the probability prediction layer is 1×1 pixels with a stride S of 1 pixel.
6. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to claim 5, wherein the max pooling layers in the region 1 self-attention layer through the region 6 self-attention layer have a window size of 2×2 and a stride of 2.
7. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to one of claims 1 to 6, wherein the annotation of the data set used to train the target detection model comprises the following steps:
annotating the images with the labelImg annotation tool according to the motor train unit model to which each water injection port belongs:
the label name of the "water injection port" text is text, and its label class is 1;
the label names of the water injection ports of different vehicle models are cover + vehicle model, and the corresponding label classes are 2, 3, 4, and so on.
8. The deep learning-based water injection port cover plate not-locked-in-place fault detection method according to claim 7, wherein the data set to be annotated is obtained by performing a data augmentation operation on the acquired image data set before annotation.
9. A deep learning-based water injection port cover plate not-locked-in-place fault detection system, characterized in that the system is used to perform the deep learning-based water injection port cover plate not-locked-in-place fault detection method of one of claims 1 to 8.
CN202011105125.2A 2020-10-15 2020-10-15 Deep learning-based water injection port cover plate unlocking-in-place fault detection method and system Active CN112183453B (en)


Publications (2)

Publication Number Publication Date
CN112183453A (en) 2021-01-05
CN112183453B (en) 2021-05-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant