CN112183453B

CN112183453B - Deep learning-based method and system for detecting the fault that a water injection port cover plate is not locked in place

Info

Publication number: CN112183453B
Application number: CN202011105125.2A
Authority: CN
Inventors: 战岭
Original assignee: Harbin Kejia General Mechanical and Electrical Co Ltd
Current assignee: Harbin Kejia General Mechanical and Electrical Co Ltd
Priority date: 2020-10-15
Filing date: 2020-10-15
Publication date: 2021-05-11
Anticipated expiration: 2040-10-15
Also published as: CN112183453A

Abstract

A deep learning-based method and system for detecting the fault that a water injection port cover plate is not locked in place, belonging to the technical field of image detection. The invention addresses the low detection accuracy and low detection efficiency of conventional manual image inspection. The method uses a trained target detection model to obtain the position of the "water injection port" text and the position of the water injection port, determines from them the precise position of the water injection port cover plate, and binarizes the image to obtain a binary image of the cover plate. A gap shadow image between the cover plate and the vehicle body is then cropped from the binary image according to the opening and closing direction of the cover plate for the vehicle type concerned, and the gap shadow area between the cover plate and the vehicle body is obtained. Finally, the gap shadow area is compared with the gap shadow area threshold for the vehicle type to determine whether the precise position of the cover plate is a fault region. The method is mainly used for detecting the fault that a water injection port cover plate is not locked in place.

Description

Deep learning-based method and system for detecting the fault that a water injection port cover plate is not locked in place

Technical Field

The invention belongs to the technical field of image detection, and particularly relates to a method and a system for detecting a fault that a water injection port cover plate is not locked in place.

Background

The water injection port cover plate keeps the body of a motor train unit airtight during high-speed running and protects the water injection port and nearby components from faults such as water leakage and loss of components caused by the difference between internal and external air pressure. Detecting whether the cover plate is locked is therefore very important.

In existing practice, fault detection is usually carried out by manually inspecting images. Because inspectors tire easily during this work, missed detections and false detections occur. Timely, automatic alarming of the fault that a water injection port cover plate of a motor train unit is not locked in place is therefore of great significance.

Detection methods based on existing detection operators cannot reliably detect the fault that the water injection port cover plate is not locked in place, and their accuracy often fails to meet practical requirements. With the development of deep learning, existing neural networks can be applied to this fault detection, but the following problems remain:

when an existing neural network model with a relatively simple structure is used to detect whether the cover plate is locked, detection accuracy is low and the false detection and missed detection rates are high; when a neural network model with a complex structure is used, long training time is needed, and the very large number of model parameters severely reduces the running efficiency of the model and prolongs detection time.

Disclosure of Invention

The invention aims to solve the problems of low detection accuracy and low detection efficiency of the conventional manual image checking mode.

The deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place comprises the following steps:

S1, acquiring a region-of-interest image containing the water injection port;

S2, obtaining the position of the "water injection port" text and the position of the water injection port using the trained target detection model; the target detection model is a target detection neural network model based on a multi-head region self-attention neural network;

S3, determining the precise position of the water injection port cover plate from the position of the "water injection port" text and the position of the water injection port;

S4, using the OTSU algorithm to determine the adaptive binarization threshold of the image at the precise position of the cover plate obtained in S3, and binarizing that image: pixel values smaller than the adaptive binarization threshold are set to 0 and pixel values greater than or equal to it are set to 255, giving a binary image of the water injection port cover plate;

S5, cropping, from the binary image of the cover plate, the gap shadow image between the cover plate and the vehicle body, according to the opening and closing direction of the cover plate for the vehicle type concerned;

S6, counting the pixels with a value of 0 in the gap shadow image to obtain the gap shadow area between the cover plate and the vehicle body;

S7, judging whether the gap shadow area obtained in S6 exceeds the gap shadow area threshold for the vehicle type concerned; if so, recording the precise position of the cover plate obtained in S3 as fault information.
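The binarization of step S4 can be sketched in a few lines. This is a minimal illustration, assuming an 8-bit grayscale crop of the cover plate held in a NumPy array; the function names `otsu_threshold` and `binarize` are hypothetical and are not taken from the patent.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return t such that pixels < t form the dark class (8-bit grayscale input)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # probability of the class {0..k}
    mu = np.cumsum(prob * levels)   # first moment of the class {0..k}
    w1 = 1.0 - w0                   # probability of the class {k+1..255}
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (mu[-1] * w0 - mu) ** 2 / (w0 * w1)  # between-class variance
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    k = int(np.argmax(between))     # best split: {0..k} versus {k+1..255}
    return k + 1

def binarize(gray: np.ndarray) -> np.ndarray:
    """Step S4: values below the Otsu threshold become 0, all others 255."""
    t = otsu_threshold(gray)
    return np.where(gray < t, 0, 255).astype(np.uint8)
```

Returning `k + 1` makes the comparison `gray < t` match the rule stated in S4 (smaller than the threshold gives 0, greater than or equal gives 255).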

Further, the processing procedure of the multi-head region self-attention neural network is as follows:

(1) selecting a sub-region pixel matrix with a window size of m×m pixels from the input image;

(2) making three copies of the matrix obtained in step (1), named the query matrix Q, the key matrix K and the value matrix V;

(3) applying a linear transformation to each of the matrices Q, K and V obtained in step (2) to obtain the linearly transformed matrices $\hat{Q}$, $\hat{K}$ and $\hat{V}$;

(4) computing the correlation matrix $C$ between the matrices $\hat{Q}$ and $\hat{K}$ obtained in step (3), and computing the self-attention matrix $A$ by the softmax function:

$$A = \mathrm{softmax}(C), \qquad C = \mathrm{Concat}(C_v, C_h)\, W_C + B_C$$

wherein: $C_v$ and $C_h$ are the self-attention maps in the vertical direction and the horizontal direction of the image, respectively; Concat is horizontal matrix concatenation; $W_C$ is the weight parameter of the correlation linear transformation, of size 2m×m; $B_C$ is the bias parameter of the correlation linear transformation, of size m×m; the size of the self-attention matrix $A$ is m×m;

(5) using the self-attention matrix $A$ obtained in step (4) to weight the matrix $\hat{V}$ obtained in step (3), obtaining the sub-region feature matrix F;

(6) repeating steps (4) and (5) H times, each repetition being called a self-attention head;

(7) concatenating the H sub-region feature matrices F corresponding to the H self-attention heads to obtain the matrix $F_H$, of size m×mH;

(8) applying a linear transformation to the matrix $F_H$ obtained in step (7) to obtain the final sub-region feature $\hat{F}$;

(9) moving the sub-region of step (1) rightward and row by row with step length S, and repeating steps (1) to (8) until the features of the whole input image have been extracted, obtaining a feature map of the same size as the input image.
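Steps (1) to (8) for a single sub-region can be sketched with NumPy as below. Only the matrix sizes are taken from the text; several details are assumptions made for illustration and are marked as such: the weights are applied by right-multiplication, the vertical and horizontal maps are taken as $\hat{Q}^{T}\hat{K}$ and $\hat{Q}\hat{K}^{T}$, the weighting in step (5) is a matrix product, and each head carries its own Q, K, V parameters. All function and key names are hypothetical.

```python
import numpy as np

def column_softmax(c: np.ndarray) -> np.ndarray:
    """Softmax normalised over the rows of each column."""
    e = np.exp(c - c.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def region_attention_head(x: np.ndarray, p: dict) -> np.ndarray:
    """One self-attention head on an m x m sub-region x."""
    q = x @ p["W_Q"] + p["B_Q"]   # ASSUMPTION: right-multiplication
    k = x @ p["W_K"] + p["B_K"]
    v = x @ p["W_V"] + p["B_V"]
    c_v = q.T @ k                 # ASSUMPTION: vertical-direction map
    c_h = q @ k.T                 # ASSUMPTION: horizontal-direction map
    # (m x 2m) @ (2m x m) + (m x m): sizes as stated in step (4)
    c = np.concatenate([c_v, c_h], axis=1) @ p["W_C"] + p["B_C"]
    a = column_softmax(c)
    return a @ v                  # ASSUMPTION: weighting as a matrix product

def multi_head_region_attention(x, heads, w_out, b_out):
    """Steps (6) to (8): run H heads, concatenate to m x mH, project to m x m."""
    f_h = np.concatenate([region_attention_head(x, p) for p in heads], axis=1)
    return f_h @ w_out + b_out

def random_head(m: int, rng: np.random.Generator) -> dict:
    """Random parameters for one head, with the sizes given in the text."""
    shapes = {"W_Q": (m, m), "B_Q": (m, m), "W_K": (m, m), "B_K": (m, m),
              "W_V": (m, m), "B_V": (m, m), "W_C": (2 * m, m), "B_C": (m, m)}
    return {name: rng.standard_normal(shape) for name, shape in shapes.items()}
```

With m = 3 and H = 2, `multi_head_region_attention` takes a 3×3 sub-region, a list of two parameter dictionaries, a 6×3 output weight and a 3×3 output bias, and returns a 3×3 feature matrix.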

Further, in step (3) the linear transformation applied to the three matrices Q, K and V obtained in step (2) is:

$$\hat{Q} = Q W_Q + B_Q, \qquad \hat{K} = K W_K + B_K, \qquad \hat{V} = V W_V + B_V$$

wherein: $W_Q$, $W_K$ and $W_V$ are the linear transformation weight parameters corresponding to the matrices Q, K and V respectively, each of size m×m; $B_Q$, $B_K$ and $B_V$ are the linear transformation bias parameters corresponding to the matrices Q, K and V respectively, each of size m×m;

$\hat{Q}$, $\hat{K}$ and $\hat{V}$ are the matrices Q, K and V after linear transformation, each of size m×m.

Further, the model structure of the target detection neural network model based on the multi-head region self-attention neural network comprises a region 1 self-attention layer through a region 7 candidate layer, and an output layer:

Region 1 self-attention layer: two identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;

Region 2 self-attention layer: two identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;

Region 3 self-attention layer: three identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;

Region 4 self-attention layer: three identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;

Region 5 self-attention layer: three identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;

Region 6 self-attention layer: three identical multi-head region self-attention neural networks connected in series, followed by a max pooling layer;

Region 7 candidate layer: two multi-head region self-attention neural networks connected in parallel, namely a position prediction layer and a probability prediction layer;

Output layer: outputs the predicted position information, the corresponding predicted category and the prediction confidence of the target in the image.

Further, in the series-connected multi-head region self-attention neural networks of the region 1 to region 6 self-attention layers, the number of self-attention heads H is 2, 4, 8, 16, 32 and 64, respectively;

in the parallel multi-head region self-attention neural networks of the region 7 candidate layer, the number of self-attention heads H is 4 in the position prediction layer and 2 in the probability prediction layer.

Further, in the multi-head region self-attention neural networks of the region 1 to region 6 self-attention layers, the sub-region size is 3×3 pixels and the step length S is 3 pixels;

in the multi-head region self-attention neural networks of the region 7 candidate layer, the sub-region size of the position prediction layer is 1×1 pixel and the step length S is 1 pixel; the sub-region size of the probability prediction layer is 1×1 pixel and the step length S is 1 pixel.

Further, the max pooling layers in the region 1 to region 6 self-attention layers have a window size of 2×2 and a step size of 2.

Further, the labeling process for the data set used to train the target detection model comprises the following steps:

labeling the images with the labelImg data labeling tool according to the type of motor train unit to which each water injection port belongs:

the label name of the "water injection port" text is text, and the label category is 1;

the label names of the water injection ports of the different vehicle types are cover + vehicle type, and the corresponding label categories are 2, 3, 4, and so on.

Further, before the data set is labeled, data augmentation is performed on the acquired image data set to obtain the data set to be labeled.

The deep learning-based system for detecting the fault that a water injection port cover plate is not locked in place is used to execute the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place.

Advantageous effects:

1. The deep learning method replaces manual inspection with automatic fault detection for the motor train unit. This removes the influence of inspectors' subjective factors and the limits of working time, and effectively improves both the detection quality (high detection accuracy, low false detection rate and low missed detection rate) and the detection efficiency for water injection port faults of the motor train unit.

2. The invention provides a multi-head region self-attention (Multi-head Region Self-attention) neural network that replaces the traditional convolutional neural network in constructing the target detection model. The proposed network has the following advantages: (1) by dividing the image into sub-regions that are processed independently, the number of self-attention parameters is reduced and model training and inference are faster; (2) the self-attention matrix is calculated by combining the self-attention maps in the vertical and horizontal directions, so the feature extraction capability is stronger; (3) by replacing the convolution operation with the self-attention operation, key features in the image sub-regions are extracted more effectively, and background and noise are better suppressed; (4) because multiple self-attention heads are connected in parallel with independent parameters, features of different semantic subspaces within the region can be extracted.

The method detects accurately and quickly, and effectively addresses the low detection accuracy or long detection time that arise when existing neural network models are used to detect whether the water injection port cover plate is locked.

Drawings

FIG. 1 is a schematic view of a fault detection process for a water injection port cover plate not locked in place;

FIG. 2 is a schematic diagram of a multi-head region self-attention neural network structure;

FIG. 3 is a schematic diagram of a self-attention matrix calculation;

FIG. 4 is a schematic diagram of a target detection neural network model structure.

Detailed Description

Embodiment 1:

S1, acquiring a region-of-interest image containing the water injection port;

Embodiment 2:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, the multi-head region self-attention neural network processes the input image through steps (1) to (9) set out in the Disclosure of Invention. In particular: in step (4) the correlation matrix between the linearly transformed query and key matrices is computed and the self-attention matrix A is computed by the softmax function; in step (5) A weights the linearly transformed value matrix to obtain the sub-region feature matrix F; in step (7) the H matrices F of the H self-attention heads are concatenated into a matrix of size m×mH; and in step (8) that matrix is linearly transformed to obtain the final sub-region feature.

Other steps and parameters are the same as in Embodiment 1.

Embodiment 3:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, step (3) applies a separate linear transformation to each of the three matrices Q, K and V obtained in step (2), with weight parameters W_Q, W_K, W_V and bias parameters B_Q, B_K, B_V, each of size m×m.

Other steps and parameters are the same as in Embodiment 2.

Embodiment 4:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, the model structure of the target detection neural network model based on the multi-head region self-attention neural network comprises a region 1 self-attention layer through a region 7 candidate layer, and an output layer.

Other steps and parameters are the same as in Embodiment 2 or 3.

Embodiment 5:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, the number of self-attention heads H in the series-connected multi-head region self-attention neural networks of the region 1 to region 6 self-attention layers is 2, 4, 8, 16, 32 and 64, respectively.

Other steps and parameters are the same as in Embodiment 4.

Embodiment 6:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, the sub-region size in the multi-head region self-attention neural networks of the region 1 to region 6 self-attention layers is 3×3 pixels, and the step length S is 3 pixels.

Other steps and parameters are the same as in Embodiment 5.

Embodiment 7:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, the max pooling layers in the region 1 to region 6 self-attention layers have a window size of 2×2 and a step size of 2.

Other steps and parameters are the same as in Embodiment 6.

Embodiment 8:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, the data set used to train the target detection model is labeled with the labelImg data labeling tool according to the type of motor train unit to which each water injection port belongs.

Other steps and parameters are the same as in one of Embodiments 1 to 7.

Embodiment 9:

In the deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place, data augmentation is performed on the acquired image data set before labeling, to obtain the data set to be labeled.

Other steps and parameters are the same as in Embodiment 8.

Embodiment 10:

This embodiment is a deep learning-based system for detecting the fault that a water injection port cover plate is not locked in place; the system is used to execute the deep learning-based method for detecting the fault that a water injection port cover plate is not locked in place.

Examples

The present embodiment is described in detail with reference to fig. 1. The deep learning-based method of this embodiment for detecting the fault that a water injection port cover plate is not locked in place comprises the following steps:

Step one, obtaining an overall side image of the motor train unit

A line-scan camera is mounted on fixed equipment beside the track. The line rate of the camera is calculated from the speed of the motor train unit, and the moving train is imaged continuously. The captured line images are stitched seamlessly into a complete overall image of the side of the motor train unit.
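The text does not give the formula used to derive the line rate from the train speed. One common relation, shown here purely as an assumption, is that each scan line should cover the same length along the direction of travel as one pixel covers across it, so the line rate equals the speed divided by the object-space resolution. The function name and parameters are hypothetical.

```python
def line_rate_hz(train_speed_m_s: float, object_resolution_m: float) -> float:
    """Line rate needed so that one scan line covers object_resolution_m of travel.

    ASSUMPTION: square pixels in object space; the patent does not state this rule.
    """
    if train_speed_m_s <= 0 or object_resolution_m <= 0:
        raise ValueError("speed and resolution must be positive")
    return train_speed_m_s / object_resolution_m
```

For example, a train moving at 10 m/s imaged at 1 mm per line would need a line rate of about 10 kHz under this assumption.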

Step two, coarse positioning of the water injection port

Wheelbase information between the axles of the motor train unit is acquired by a sensor installed beside the track. Using the wheelbase information and prior knowledge of the position of the water injection port relative to the axles, the water injection port is coarsely located in the overall side image obtained in step one, giving a region-of-interest image that may contain the water injection port.

Step three, preprocessing the image data set

(3.1) Construction of the original image data set

Images that do not contain a water injection port are filtered out of the region-of-interest images obtained in step two, and the remaining images form the original image data set.

(3.2) Data augmentation

Image processing operations such as brightness transformation, histogram equalization and the addition of Gaussian noise are applied to the samples of the original image data set obtained in (3.1). This data augmentation increases the number of samples and improves the robustness of the trained model.
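The three augmentation operations named in (3.2) can be sketched with NumPy as follows. This is an illustration only, assuming 8-bit grayscale images; the function names, the linear gain/offset form of the brightness transformation and the noise level are assumptions, not values from the patent.

```python
import numpy as np

def adjust_brightness(img: np.ndarray, gain: float, offset: float = 0.0) -> np.ndarray:
    """Linear brightness transformation, clipped to the 8-bit range."""
    out = img.astype(np.float64) * gain + offset
    return np.clip(out, 0, 255).astype(np.uint8)

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Histogram equalization through the cumulative distribution of grey levels."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest level present
    denom = img.size - cdf_min
    if denom == 0:                     # constant image: nothing to equalize
        return img.copy()
    lut = np.round((cdf - cdf_min) / denom * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]                    # map every pixel through the lookup table

def add_gaussian_noise(img: np.ndarray, sigma: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

Each function returns a new image of the same shape, so the augmented samples can be appended to the original data set before annotation.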

(3.3) Data annotation

The image data set obtained in (3.2) is annotated with the labelImg data annotation tool according to the type of motor train unit to which each water injection port belongs.

The label name of the "water injection port" text is text, and the label category is 1.

The label names of the water injection ports of the different vehicle types are cover + vehicle type, such as cover XXX, cover YY and cover ZZZ, with corresponding label categories 2, 3, 4, and so on.
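The labeling scheme amounts to a small name-to-category table. The sketch below builds it for the placeholder vehicle types used in the text; the helper name and the exact way the vehicle type is appended to "cover" (no separator) are assumptions.

```python
def build_label_map(vehicle_types):
    """Map label names to categories: text -> 1, cover<type> -> 2, 3, 4, ..."""
    label_map = {"text": 1}
    for offset, vehicle_type in enumerate(vehicle_types):
        # ASSUMPTION: the label name is "cover" directly followed by the type name
        label_map[f"cover{vehicle_type}"] = 2 + offset
    return label_map
```

Calling `build_label_map(["XXX", "YY", "ZZZ"])` gives category 1 for the text label and categories 2, 3 and 4 for the three cover labels.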

(3.4) Data set partitioning

The images obtained in (3.2), together with the corresponding labels obtained in (3.3), are divided into a training data set and a test data set according to a set ratio.
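A reproducible split can be sketched as below. The patent does not state the ratio; the 0.8 default, the shuffling and the function name are assumptions chosen for illustration.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle (image, label) samples and split them into training and test sets.

    ASSUMPTION: train_ratio=0.8 is illustrative; the source only says "a set ratio".
    """
    items = list(samples)
    random.Random(seed).shuffle(items)   # fixed seed keeps the split reproducible
    n_train = int(round(len(items) * train_ratio))
    return items[:n_train], items[n_train:]
```

Keeping each image paired with its label before shuffling ensures the two stay aligned after the split.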

Step four, designing a multi-head region self-attention neural network

The proposed multi-head region self-attention neural network structure is shown in fig. 2:

(1) A sub-region pixel matrix with a window size of m×m pixels is selected in the input image. Because pixels that are far apart in an image generally have little correlation, the image is divided into sub-regions on which self-attention is computed independently. This greatly reduces the number of self-attention parameters and speeds up computation without impairing the extraction of local image features.

(2) Three copies of the matrix obtained in (1) are made and named the "Query" matrix Q, the "Key" matrix K and the "Value" matrix V.

(3) A linear transformation is applied to each of the matrices Q, K and V obtained in (2):

$$\hat{Q} = Q W_Q + B_Q, \qquad \hat{K} = K W_K + B_K, \qquad \hat{V} = V W_V + B_V$$

(4) The correlation matrix $C$ between the matrices $\hat{Q}$ and $\hat{K}$ obtained in (3) is computed, and the self-attention matrix $A$ is computed by the softmax function:

$$A = \mathrm{softmax}(C), \qquad C = \mathrm{Concat}(C_v, C_h)\, W_C + B_C$$

wherein: $C_v$ and $C_h$ are the self-attention maps in the vertical direction and the horizontal direction of the image, respectively; Concat is horizontal matrix concatenation; $W_C$ is the weight parameter of the correlation linear transformation, of size 2m×m; $B_C$ is the bias parameter of the correlation linear transformation, of size m×m; the size of the self-attention matrix $A$ is m×m. The softmax function is:

$$a_{ij} = \frac{e^{c_{ij}}}{\sum_{m} e^{c_{mj}}}$$

wherein $a_{ij}$ is the element in row i and column j of the matrix $A$; $c_{ij}$ is the element in row i and column j of $C$; $c_{mj}$ is the element in row m and column j of $C$, the sum being taken over the row index m.

This process extracts the key information in the current sub-region and stores it in the form of the self-attention matrix A. The more critical the information contained in a part of the sub-region, the higher its attention weight.

The calculation of the self-attention matrix is illustrated in fig. 3:

Whereas the conventional self-attention calculation obtains the correlation matrix $C$ from a self-attention map in a single direction only, the invention concatenates the self-attention maps of the vertical and horizontal directions, $C_v$ and $C_h$, and then applies a linear transformation to obtain the correlation matrix $C$. The method thus obtains self-attention maps of the image from both the vertical and the horizontal viewpoint, which is better suited to extracting image features. The added linear transformation weights and resizes the concatenated self-attention map, so that vertical and horizontal features are selected automatically and the result matches the size of the "value" matrix $\hat{V}$.

Compared with performing the self-attention operation directly on the whole input image, dividing the image into sub-regions greatly reduces the computation of the self-attention mechanism. Taking an input image of 9×9 pixels as an example: if self-attention is computed directly on the whole image, the self-attention matrix A contains 81 × 81 = 6561 parameters; if the input image is divided into 9 sub-regions of 3×3 pixels, the self-attention matrices A contain 9 × 9 × 9 = 729 parameters in total.
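The arithmetic of the 9×9 example can be checked with a short calculation. The function names are hypothetical; the counting rule follows the example, in which every pixel in a region attends to every pixel in the same region.

```python
def attention_entries_full(height: int, width: int) -> int:
    """Entries of one attention matrix over all pixels of the image."""
    n = height * width
    return n * n

def attention_entries_regions(height: int, width: int, m: int) -> int:
    """Total entries when the image is tiled into non-overlapping m x m sub-regions."""
    if height % m or width % m:
        raise ValueError("image size must be a multiple of m")
    regions = (height // m) * (width // m)
    return regions * (m * m) ** 2
```

For the 9×9 image, the full computation gives 6561 entries and the tiled computation with m = 3 gives 729, a ninefold reduction.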

(5) The self-attention matrix $A$ obtained in (4) is used to weight the matrix $\hat{V}$ obtained in (3), giving the sub-region feature matrix F, whose size is m×m.

(6) Steps (4) and (5) are repeated H times; each repetition is called a self-attention head (Head).

Wherein: H is a manually set neural network hyper-parameter.

Because the parameters of each self-attention head are independent, the information of different semantic subspaces in the current sub-region can be extracted separately, and the feature extraction capability is stronger.

(7) The H sub-region feature matrices F corresponding to the H self-attention heads are concatenated to obtain the matrix $F_H$, of size m×mH.

(8) A linear transformation is applied to the matrix $F_H$ obtained in (7) to obtain the final sub-region feature $\hat{F}$:

$$\hat{F} = F_H W_F + B_F$$

wherein: $W_F$ is the linear transformation weight parameter corresponding to $F_H$, of size mH×m; $B_F$ is the linear transformation bias parameter corresponding to $F_H$, of size m×m; $\hat{F}$ has a size of m×m.

(9) The sub-region pixel matrix of (1) (equivalent to a sliding window) is moved rightward and row by row with a manually set step length (stride) S, and steps (1) to (8) are repeated until the features of the whole input image have been extracted, giving a feature map of the same size as the input image.
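The sliding-window traversal of step (9) can be sketched as follows. The per-region computation is passed in as a callable `region_fn` so that the sketch stays independent of how the attention itself is implemented; both function names are hypothetical, and the image size is assumed to be covered exactly by the windows.

```python
import numpy as np

def window_origins(height: int, width: int, m: int, stride: int):
    """Yield the top-left corner of each m x m window, left to right, then row by row."""
    for top in range(0, height - m + 1, stride):
        for left in range(0, width - m + 1, stride):
            yield top, left

def extract_features(image: np.ndarray, m: int, stride: int, region_fn) -> np.ndarray:
    """Apply region_fn to every window and write the result back in place."""
    out = np.zeros_like(image, dtype=np.float64)
    for top, left in window_origins(*image.shape, m, stride):
        out[top:top + m, left:left + m] = region_fn(image[top:top + m, left:left + m])
    return out
```

With m = 3 and S = 3 the windows tile the image without overlap, which is why the output feature map has the same size as the input.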

Step five, building a target detection model

The structure of the target detection neural network model provided by the invention is shown in fig. 4:

Region 1 self-attention layer: two identical multi-head region self-attention neural networks are connected in series; the sub-region size is 3×3 pixels, the step length S is 3 pixels, and the number of self-attention heads H is 2. Max pooling with a window size of 2×2 and a step size of 2 is used to enlarge the receptive field.

Region 2 self-attention layer: two identical multi-head region self-attention neural networks are connected in series; the sub-region size is 3×3 pixels, the step length S is 3 pixels, and the number of self-attention heads H is 4. Max pooling with a window size of 2×2 and a step size of 2 is used to enlarge the receptive field.

Region 3 self-attention layer: three identical multi-head region self-attention neural networks are connected in series; the sub-region size is 3×3 pixels, the step length S is 3 pixels, and the number of self-attention heads H is 8. Max pooling with a window size of 2×2 and a step size of 2 is used to enlarge the receptive field.

Region 4 self-attention layer: three identical multi-head region self-attention neural networks are connected in series; the sub-region size is 3×3 pixels, the step length S is 3 pixels, and the number of self-attention heads H is 16. Max pooling with a window size of 2×2 and a step size of 2 is used to enlarge the receptive field.

Region 5 self-attention layer: three identical multi-head region self-attention neural networks are connected in series; the sub-region size is 3×3 pixels, the step length S is 3 pixels, and the number of self-attention heads H is 32. Max pooling with a window size of 2×2 and a step size of 2 is used to enlarge the receptive field.

Region 6 self-attention layer: three identical multi-head region self-attention neural networks are connected in series; the sub-region size is 3×3 pixels, the step length S is 3 pixels, and the number of self-attention heads H is 64. Max pooling with a window size of 2×2 and a step size of 2 is used to enlarge the receptive field.

Region 7 candidate layer: two multi-head region self-attention neural networks are connected in parallel, forming the position prediction layer and the probability prediction layer. In the position prediction layer, the sub-region size is 1×1 pixel, the step length S is 1 pixel, and the number of self-attention heads H is 4 (corresponding to the predicted target coordinate x, coordinate y, width w and height h). In the probability prediction layer, the sub-region size is 1×1 pixel, the step length S is 1 pixel, and the number of self-attention heads H is 2 (corresponding to the predicted target category c and the prediction confidence conf).

The target detection neural network model structure was designed specifically for the images to be processed and can effectively extract the key features in the image sub-regions, thereby improving prediction accuracy.
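The layer parameters listed above can be collected into a small configuration table, which also makes it easy to see how far the six pooling stages shrink the feature map. The class and variable names are hypothetical, the 512-pixel input side in the usage note is an illustrative value, and the size calculation assumes unpadded pooling and size-preserving self-attention blocks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegionLayer:
    name: str
    blocks: int   # identical multi-head region self-attention networks in series
    heads: int    # number of self-attention heads H
    window: int   # sub-region size m (pixels)
    stride: int   # step length S (pixels)

BACKBONE = [
    RegionLayer("region1", 2, 2, 3, 3),
    RegionLayer("region2", 2, 4, 3, 3),
    RegionLayer("region3", 3, 8, 3, 3),
    RegionLayer("region4", 3, 16, 3, 3),
    RegionLayer("region5", 3, 32, 3, 3),
    RegionLayer("region6", 3, 64, 3, 3),
]

# Region 7 candidate layer: two parallel prediction branches
CANDIDATE = {
    "position": RegionLayer("position", 1, 4, 1, 1),        # x, y, w, h
    "probability": RegionLayer("probability", 1, 2, 1, 1),  # c, conf
}

def size_after_pooling(size: int, layers, pool_window: int = 2, pool_stride: int = 2) -> int:
    """Spatial size after one unpadded max pooling per layer (ASSUMPTION: no padding)."""
    for _ in layers:
        size = (size - pool_window) // pool_stride + 1
    return size
```

Under these assumptions a 512-pixel side is halved six times, so `size_after_pooling(512, BACKBONE)` evaluates to 8.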

Step six, training and testing the target detection neural network model

(6.1) The target detection neural network model built in step five is trained on the data set preprocessed in step three. The loss function of the target detection model is the mean square error loss;

(6.2) Neural network hyper-parameters such as the number of training iterations, the learning rate and the batch size are manually tuned according to the training results;

(6.1) and (6.2) are repeated until the neural network model reaches its best performance.
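The mean square error loss named in (6.1) is the average of the squared differences between predictions and targets. The sketch below assumes, for illustration only, that each prediction is a flat vector such as (x, y, w, h, c, conf); the patent does not specify how the outputs are arranged for the loss.

```python
import numpy as np

def mse_loss(pred, target) -> float:
    """Mean square error between predicted and target values."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    return float(np.mean((pred - target) ** 2))
```

A lower value after each training round indicates that the predicted boxes, categories and confidences are moving closer to the annotations.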

Seventhly, detecting the fault of the passing image of the motor train unit

(1) Acquiring a train passing image and corresponding train type information of the motor train unit;

(2) according to different vehicle types, roughly positioning a water injection port to obtain an image of an interested area;

(3) loading the weight of the target detection model obtained in the step six;

(4) acquiring a text position of a water filling port and a position of the water filling port through a target detection model;

(5) acquiring the accurate position of a cover plate of the water filling port according to the text position of the water filling port and the position of the water filling port;

(6) computing, with the OTSU algorithm, the adaptive binarization threshold for the image of the accurate position of the water filling port cover plate obtained in step (5), and binarizing that image with the threshold to obtain a binary image of the accurate position of the water filling port cover plate, wherein pixel values smaller than the adaptive binarization threshold are set to 0 and pixel values greater than or equal to the threshold are set to 255;

(7) according to the opening and closing directions of the water injection port cover plates of different vehicle types, cropping the gap shadow image between the water injection port cover plate and the vehicle body from the binary image obtained in step (6);

(8) counting the pixels with a value of 0 in the gap shadow image obtained in step (7) to obtain the gap shadow area between the water filling port cover plate and the vehicle body;

(9) judging whether the gap shadow area obtained in step (8) exceeds the gap shadow area threshold set for the vehicle type; if so, recording the accurate position of the water filling port cover plate obtained in step (5) as fault information;

(10) uploading all fault information to the alarm platform.
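
Steps (6) to (9) can be illustrated with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation of the invention: the image is assumed to be a 2-D list of 8-bit gray values, the gap region is assumed to be a fixed rectangle per vehicle type, and the names GAP_REGION, AREA_THRESHOLD and "type_A", together with their values, are hypothetical.

```python
def otsu_threshold(gray):
    """Return the OTSU threshold t; values < t form one class, values >= t the other."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels below the candidate threshold
    sum0 = 0  # sum of gray values below the candidate threshold
    for t in range(1, 256):
        w0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance (unnormalised)
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, t):
    """Step (6): pixels below t become 0, pixels at or above t become 255."""
    return [[0 if v < t else 255 for v in row] for row in gray]


def gap_shadow_area(binary, box):
    """Steps (7) and (8): count zero-valued pixels inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    return sum(1 for row in binary[top:bottom] for v in row[left:right] if v == 0)


# Hypothetical per-vehicle-type settings (assumptions, not values from the source).
GAP_REGION = {"type_A": (0, 0, 4, 2)}
AREA_THRESHOLD = {"type_A": 5}


def is_fault(gray, vehicle_type):
    """Step (9): report a fault when the gap shadow area exceeds the type's threshold."""
    binary = binarize(gray, otsu_threshold(gray))
    area = gap_shadow_area(binary, GAP_REGION[vehicle_type])
    return area > AREA_THRESHOLD[vehicle_type]
```

Calling is_fault(cover_image, "type_A") on a cropped cover plate image returns True when the dark gap inside the assumed rectangle covers more than the assumed threshold number of pixels.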

The present invention is capable of other embodiments and its several details are capable of modifications in various obvious respects, all without departing from the spirit and scope of the present invention.

Claims

1. The method for detecting the fault that the water injection port cover plate is not locked in place based on deep learning is characterized by comprising the following steps of:

S1, acquiring a region-of-interest image containing the water injection port;

S2, acquiring the text position of a water filling port and the position of the water filling port by using the trained target detection model; the target detection model is a target detection neural network model based on a multi-head region self-attention neural network; the model structure of the target detection neural network model based on the multi-head region self-attention neural network comprises a region 1 self-attention layer to a region 7 candidate layer and an output layer;

an output layer: outputting the predicted position information, the corresponding predicted class and the prediction confidence of each target in the image;

2. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to claim 1, wherein the multi-head region self-attention neural network performs the following processing:

(1) selecting a sub-region pixel matrix with a window size of m x m pixels from the input image;

(4) calculating the correlation matrix between the matrices obtained in (3); the self-attention matrix A is calculated by the softmax function:

wherein:

and

are the self-attention maps in the vertical direction and the horizontal direction of the image, respectively; concat denotes horizontal matrix concatenation; W_C is the weight parameter of the linear transformation, of size 2m x m; B_C is the bias parameter of the linear transformation, of size m x m; the self-attention matrix A has size m x m;

Weighting to obtain a sub-region characteristic matrix F;

The size is m x mH;

(8) obtained by (7) above

3. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to claim 2, wherein in step (3) the three matrices Q, K, V obtained in step (2) are each subjected to a linear transformation as follows:

wherein: W_Q, W_K, W_V are the linear transformation weight parameters corresponding to the matrices Q, K, V, respectively, each of size m x m; B_Q, B_K, B_V are the linear transformation bias parameters corresponding to the matrices Q, K, V, respectively, each of size m x m;

are the matrices Q, K, V after linear transformation, respectively, each of size m x m.
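
The processing recited in claims 2 and 3 can be illustrated for one head and one m x m sub-region with the following sketch. It rests on several assumptions that the text above does not confirm: the linear transformation is taken as X W + B, the correlation matrix as the product of the transformed Q with the transpose of the transformed K, the horizontal and vertical attention maps as softmax along rows and along columns respectively, and the weighting as A multiplied by the transformed V. All function and parameter names are hypothetical; only the matrix sizes follow the claims.

```python
import math


def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in a:
        mx = max(row)
        e = [math.exp(v - mx) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


def linear(x, w, b):
    """Assumed form of the linear transformation: X W + B."""
    return add(matmul(x, w), b)


def region_self_attention(q, k, v, p):
    """One head on one m x m sub-region; p holds W_Q, B_Q, W_K, B_K, W_V, B_V, W_C, B_C."""
    qt = linear(q, p["W_Q"], p["B_Q"])
    kt = linear(k, p["W_K"], p["B_K"])
    vt = linear(v, p["W_V"], p["B_V"])
    corr = matmul(qt, transpose(kt))                 # m x m correlation matrix (assumed form)
    a_h = softmax_rows(corr)                         # assumed horizontal map: softmax over rows
    a_v = transpose(softmax_rows(transpose(corr)))   # assumed vertical map: softmax over columns
    cat = [rv + rh for rv, rh in zip(a_v, a_h)]      # horizontal concatenation, m x 2m
    a = linear(cat, p["W_C"], p["B_C"])              # (m x 2m)(2m x m) + (m x m) -> m x m
    return matmul(a, vt)                             # weighted features of this head, m x m
```

Under these assumptions, concatenating the outputs of H heads horizontally gives a matrix of size m x mH, which matches the size stated for the sub-region feature matrix in claim 2.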

4. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to claim 1, wherein the numbers of self-attention heads H in the multi-head region self-attention neural network series structures of the region 1 self-attention layer to the region 6 self-attention layer are 2, 4, 8, 16, 32 and 64, respectively;

5. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to claim 4, characterized in that the sub-region size in the multi-head region self-attention neural network of the region 1 self-attention layer to the region 6 self-attention layer is 3x3 pixels, and the stride S is 3 pixels;

6. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to claim 5, wherein the max pooling layers in the region 1 self-attention layer to the region 6 self-attention layer have a window size of 2x2 and a stride of 2.
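
The sub-region selection of claim 5 and the max pooling of claim 6 can be sketched as follows. This is a plain-Python illustration in which the image is assumed to be a 2-D list and incomplete windows at the right and bottom borders are assumed to be dropped; the source does not state how borders are handled, and the function names are hypothetical.

```python
def iter_subregions(image, m=3, stride=3):
    """Yield m x m sub-region matrices, sliding the window by `stride` pixels."""
    rows, cols = len(image), len(image[0])
    for top in range(0, rows - m + 1, stride):
        for left in range(0, cols - m + 1, stride):
            yield [row[left:left + m] for row in image[top:top + m]]


def max_pool(image, window=2, stride=2):
    """Max pooling with a window x window kernel moved by `stride` pixels."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r + dr][c + dc] for dr in range(window) for dc in range(window))
            for c in range(0, cols - window + 1, stride)
        ]
        for r in range(0, rows - window + 1, stride)
    ]


# Head counts of the region 1 to region 6 self-attention layers, as recited in claim 4.
HEADS_PER_LAYER = [2, 4, 8, 16, 32, 64]
```

With m = 3 and a stride of 3 the sub-regions do not overlap, so a 6x6 input yields four sub-regions, and 2x2 pooling with a stride of 2 halves each spatial dimension.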

7. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to one of claims 1 to 6, wherein the labeling process of the data set for the trained target detection model comprises the following steps:

the label name of the water injection port of each vehicle type is cover + vehicle type, and the corresponding label classes are 2, 3, 4, ….

8. The deep learning based water injection port cover plate unlocking-in-place fault detection method according to claim 7, wherein, before labeling, the data set to be labeled is obtained by performing a data augmentation operation on the acquired image data set.

9. A deep learning based water injection port cover plate unlocking-in-place fault detection system, characterized in that the system is used to perform the deep learning based water injection port cover plate unlocking-in-place fault detection method according to one of claims 1 to 8.