CN112464846A - Automatic identification method for abnormal fault of freight train carriage at station - Google Patents

Automatic identification method for abnormal fault of freight train carriage at station

Info

Publication number
CN112464846A
Authority
CN
China
Prior art keywords
fault
image
representing
prediction
layer
Prior art date
Legal status
Granted
Application number
CN202011415713.6A
Other languages
Chinese (zh)
Other versions
CN112464846B (en)
Inventor
刘清
刘同财
李雪琪
谢兆青
王靖博
郭建明
Current Assignee
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date
Filing date
Publication date
Application filed by Wuhan University of Technology WUT filed Critical Wuhan University of Technology WUT
Priority to CN202011415713.6A priority Critical patent/CN112464846B/en
Publication of CN112464846A publication Critical patent/CN112464846A/en
Application granted granted Critical
Publication of CN112464846B publication Critical patent/CN112464846B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Abstract

The invention provides an automatic identification method for abnormalities and faults of freight train cars at a station. First, left, right and top images of each car are captured with high-speed linear array cameras without stopping the freight train so as to construct a car image data set, and a car fault image training set is formed after preprocessing operations such as cutting, screening and manual labeling. A train car abnormal fault recognition network and its loss function are then constructed; the car fault images in the training set are input, and the recognition network is optimized by a gradient descent algorithm. During testing, the image to be recognized is input into the optimized train car abnormal fault recognition network to obtain a preliminary recognition result, and post-processing operations such as confidence filtering and non-maximum suppression are then performed to obtain the final recognition result. The invention has the advantages of a high recognition rate, high speed and strong real-time performance, realizes monitoring of the train running state and automatic alarming for abnormalities or faults, and further improves the intelligence level of railway transportation.

Description

Automatic identification method for abnormal fault of freight train carriage at station
Technical Field
The invention relates to the field of intelligent supervision of railway traffic safety, in particular to an automatic identification method for abnormal faults of a carriage of a freight train at a station.
Background
To address the safety problems of railway freight cars during operation, a number of safety monitoring systems have already been put into use. They mainly comprise five subsystems: THDS (vehicle axle temperature intelligent detection system), TPDS (vehicle operation quality trackside dynamic monitoring system), TADS (vehicle rolling bearing fault trackside acoustic diagnosis system), TFDS (freight car fault trackside image detection system) and TCDS (passenger car operation safety monitoring system). These systems collect data related to train operation using infrared, acoustic, mechanical, image acquisition and computer technologies, and the running condition of the train is then monitored by manual screening. With the growth of railway lines and rising safety guarantee requirements, the efficiency of traditional train operation monitoring systems and their methods of checking potential safety hazards can no longer meet demand. In the traditional monitoring and troubleshooting workflow, after train images are collected by cameras, workers must examine and verify the collected image samples one by one and manually record information such as train number, train type and fault. Because dwell times in the station are short and the number of passing trains is large, manual identification is prone to missed faults, false fault detections, recording errors and low efficiency. For the train fault identification scenario, object detection technology has natural advantages over manual inspection: on one hand, image-based detection can inspect the train in a non-contact manner without stopping it, so normal operation is not affected; on the other hand, it can greatly reduce the labor intensity of personnel and lower labor costs. Therefore, introducing object detection technology into the train fault identification scenario can greatly promote the development of the railway transportation industry.
Disclosure of Invention
Aiming at the problems that freight train abnormalities and faults are currently identified by manually checking images, that information such as train number, train type and fault is recorded by hand, and that manual identification is prone to missed faults, false fault detections, recording errors and low efficiency, the invention provides an automatic identification method for abnormalities and faults of freight train cars at a station.
To achieve this purpose, the invention adopts the following technical scheme:
step 1: without stopping the freight train, use high-speed linear array cameras to capture a left, a right and a top high-resolution image of each car so as to construct a car high-resolution image data set; shrink each car high-resolution image to a suitable proportion by equal-proportion linear interpolation, cut it into four overlapping image blocks of the same size, screen out the car image samples containing faults from all the image blocks, and construct a car fault image data set from the car image samples containing faults;
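The preprocessing in step 1 (equal-proportion shrinking by linear interpolation followed by cutting into four overlapping blocks of the same size) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the scale factor, the 2×2 patch arrangement and the overlap ratio are assumptions, since the patent only states that four overlapping blocks of equal size are produced.

```python
# Minimal preprocessing sketch for step 1 (illustrative assumptions, not the patented code).
import cv2
import numpy as np

def preprocess_car_image(img, scale=0.25, overlap=0.2):
    """Shrink a high-resolution car image and cut it into 4 overlapping patches."""
    h, w = img.shape[:2]
    small = cv2.resize(img, (int(w * scale), int(h * scale)),
                       interpolation=cv2.INTER_LINEAR)   # equal-proportion linear interpolation
    sh, sw = small.shape[:2]
    # Patch size chosen so that two patches per axis with the given overlap cover the image.
    pw = int(sw / (2 - overlap))
    ph = int(sh / (2 - overlap))
    xs = [0, sw - pw]        # left / right patch origins
    ys = [0, sh - ph]        # top / bottom patch origins
    patches = [small[y:y + ph, x:x + pw] for y in ys for x in xs]
    return patches           # four overlapping blocks of identical size

# Example: patches = preprocess_car_image(cv2.imread("car_left.png"))
```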
step 2: manually label the car fault marking boxes and fault types of each car fault image in the car fault image data set of step 1, count the number of car fault image samples for each fault type, and collect additional samples for any fault type whose number of image samples is below a sample number threshold until the number of car fault image samples of every fault type exceeds the threshold, so as to construct the train car abnormal fault recognition network training set;
step 3: construct a train car abnormal fault recognition network, take the train car abnormal fault recognition network training set of step 2 as input data, construct the train car abnormal fault recognition network loss function by combining the fault types of the car fault image samples in the training set, and obtain the optimized train car abnormal fault recognition network through training with a gradient descent algorithm;
step 4: input the image to be recognized into the optimized train car abnormal fault recognition network, predict the first, second and third prediction feature maps of the image to be recognized, stitch the three prediction feature maps to obtain the preliminary recognition result of the image to be recognized, and then perform operations such as confidence screening and non-maximum suppression to obtain the final recognition result.
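The post-processing in step 4 (confidence screening followed by non-maximum suppression) can be sketched as follows. The thresholds are illustrative assumptions; the boxes, foreground probabilities and class probabilities correspond to the quantities of the preliminary recognition result defined below.

```python
# Post-processing sketch: confidence filtering + class-wise NMS (thresholds are assumptions).
import numpy as np

def iou(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
    a = (box[2] - box[0]) * (box[3] - box[1])
    b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (a + b - inter + 1e-9)

def postprocess(boxes, scores, probs, conf_thresh=0.5, iou_thresh=0.45):
    cls = probs.argmax(axis=1)
    conf = scores * probs.max(axis=1)            # foreground probability x class probability
    keep = conf > conf_thresh                    # confidence screening
    boxes, conf, cls = boxes[keep], conf[keep], cls[keep]
    final = []
    for c in np.unique(cls):                     # non-maximum suppression per fault class
        idx = np.where(cls == c)[0]
        order = idx[np.argsort(-conf[idx])]
        while order.size:
            best = order[0]
            final.append((boxes[best], float(conf[best]), int(c)))
            rest = order[1:]
            order = rest[iou(boxes[best], boxes[rest]) < iou_thresh]
    return final                                 # [(box, confidence, fault class), ...]
```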
Preferably, the car fault image data set in step 1 is:
{train_s(m, n), s∈[1, S], m∈[1, M], n∈[1, N]}
wherein train_s(m, n) represents the pixel information at the m-th row and n-th column of the s-th car fault image in the car fault image data set, S represents the number of all image samples in the car fault image data set, M is the number of rows of each fault image in the data set, and N is the number of columns of each fault image in the data set;
preferably, the coordinates of the car fault marking box of each car fault image in the car fault image data set in the step 2 are as follows:
box_{s,k} = ((x_{s,k}^{l,t}, y_{s,k}^{l,t}), (x_{s,k}^{r,b}, y_{s,k}^{r,b})), s∈[1, S], k∈[1, K_s]
wherein l denotes the left on the car fault image, t denotes the top, r denotes the right, and b denotes the bottom; S represents the number of all car fault images in the car fault image data set, and K_s represents the total number of car fault marking boxes in the s-th car fault image of the data set; box_{s,k} represents the coordinates of the k-th car fault marking box in the s-th car fault image of the data set; (x_{s,k}^{l,t}, y_{s,k}^{l,t}) are the coordinates of the upper-left corner of the k-th car fault marking box in the s-th car fault image, x_{s,k}^{l,t} being its abscissa and y_{s,k}^{l,t} its ordinate; (x_{s,k}^{r,b}, y_{s,k}^{r,b}) are the coordinates of the lower-right corner of the k-th car fault marking box in the s-th car fault image, x_{s,k}^{r,b} being its abscissa and y_{s,k}^{r,b} its ordinate;
step 2, the compartment fault marking frame category information of each compartment fault image in the compartment fault image data set is as follows:
label_{s,k,c}, s∈[1, S], k∈[1, K_s], c∈[1, C]
wherein C is the total number of fault types in the car fault image data set; label_{s,k,c} indicates that the k-th car fault marking box of the s-th car fault image in the car fault image data set belongs to the c-th fault type;
step 2, the training set of the train compartment abnormal fault recognition network is as follows:
{train_s(m, n), (box_{s,k}, label_{s,k,c})}
s∈[1, S], m∈[1, M], n∈[1, N], k∈[1, K_s], c∈[1, C]
wherein train_s(m, n) is the pixel information at the m-th row and n-th column of the s-th car fault image in the train car abnormal fault recognition network training set, box_{s,k} represents the coordinates of the k-th car fault marking box in the s-th car fault image of the training set, and label_{s,k,c} indicates that the k-th car fault marking box of the s-th car fault image in the training set belongs to the c-th fault type; S represents the number of all image samples in the training set, M is the number of rows of each fault image in the training set, N is the number of columns of each fault image in the training set, K_s represents the total number of car fault marking boxes in the s-th car fault image of the training set, and C is the total number of fault types in the training set;
preferably, the network for identifying an abnormal fault in a train car in step 3 specifically includes: the system comprises a feature extraction network, a channel feature fusion network, a first spatial feature fusion network, a second spatial feature fusion network and a multi-scale prediction layer;
the channel feature fusion network is embedded in the feature extraction network as a sub-module; the feature extraction network is serially cascaded with the first spatial feature fusion network and then is connected with the second spatial feature fusion network in parallel; the second spatial feature fusion network is serially cascaded with the multi-scale prediction layer;
the feature extraction network: the dimensionality reduction convolution module and the residual error module are sequentially stacked and cascaded;
the dimension reduction convolution module is formed by sequentially stacking and cascading a dimension reduction convolution layer, a dimension reduction batch normalization layer and a Leaky ReLU activation layer;
the residual module is formed by sequentially stacking and cascading a plurality of Ghost residual blocks;
the Ghost residual block is composed of a residual convolution layer, a residual batch normalization layer and a ReLU activation layer according to the stacking mode of the traditional residual block;
the feature extraction network is defined as:
the sequential stack of dimension-reduction convolution modules and residual modules described above, parameterized by the following quantities to be optimized: the parameters of the b1-th dimension-reduction convolution layer in the a1-th dimension-reduction convolution module; the translation amount and the scaling quantity of the b2-th dimension-reduction batch normalization layer in the a1-th dimension-reduction convolution module; the parameters of the b3-th residual convolution layer in the a3-th Ghost residual block under the a2-th residual module; and the translation amount and the scaling quantity of the b4-th residual batch normalization layer in the a3-th Ghost residual block under the a2-th residual module;
wherein a1∈[1, N_J], a2∈[1, N_C] and a3∈[1, N_G]; N_J represents the number of dimension-reduction convolution modules in the feature extraction network, N_C represents the number of residual modules in the feature extraction network, and N_G represents the number of Ghost residual blocks in each residual module; b1 ranges over the number of dimension-reduction convolution layers in each dimension-reduction convolution module, b2 over the number of dimension-reduction batch normalization layers in each dimension-reduction convolution module, b3 over the number of residual convolution layers in each Ghost residual block, and b4 over the number of residual batch normalization layers in each Ghost residual block;
The input data of the feature extraction network is a single image in the train car abnormal fault recognition network training set of step 2, and the output data are a low-dimensional feature map Feat1 (M1×N1×C1), a medium-dimensional feature map Feat2 (M2×N2×C2) and a high-dimensional feature map Feat3 (M3×N3×C3);
In the output data of the feature extraction network, M1 is the width of the low-dimensional feature map Feat1, N1 is its height, and C1 is its number of channels; M2 is the width of the medium-dimensional feature map Feat2, N2 is its height, and C2 is its number of channels; M3 is the width of the high-dimensional feature map Feat3, N3 is its height, and C3 is its number of channels;
the first spatial feature fusion network: the first space convolution layer, the first space batch normalization layer and the maximum pooling module are sequentially stacked and cascaded;
the maximum pooling module is formed by connecting a first maximum pooling layer, a second maximum pooling layer, a third maximum pooling layer and a fourth maximum pooling layer in parallel;
the first spatial feature fusion network is defined as:
f_SPP(SPP_kernel_e, SPP_gamma_g, SPP_beta_g), e∈[1, N_SPP_conv], g∈[1, N_SPP_BN]
wherein N_SPP_conv represents the number of first spatial convolution layers in the first spatial feature fusion network and N_SPP_BN represents the number of first spatial batch normalization layers in the first spatial feature fusion network; SPP_kernel_e represents the parameters of the e-th first spatial convolution layer in the first spatial feature fusion network and is a parameter to be optimized; SPP_gamma_g represents the translation amount of the g-th first spatial batch normalization layer in the first spatial feature fusion network and is a parameter to be optimized; SPP_beta_g represents the scaling quantity of the g-th first spatial batch normalization layer in the first spatial feature fusion network and is a parameter to be optimized;
The input data of the first spatial feature fusion network is the high-dimensional feature map Feat3, and the output data is the spatial fusion feature map Feat4 (M4×N4×C4);
In the output data of the first spatial feature fusion network, M4 is the width of the spatial fusion feature map Feat4, N4 is its height, and C4 is its number of channels;
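A minimal sketch of the first spatial feature fusion network described above: a first spatial convolution layer and batch normalization followed by four parallel max-pooling layers whose outputs are concatenated. The pooling window sizes 1, 5, 9 and 13 follow the usual YOLOv4 SPP configuration and are assumptions here; with 512 convolution filters and four pooled branches, the concatenated output has 2048 channels, which is consistent with the Feat4 size (13×13×2048) given in the embodiment below.

```python
# SPP-style block sketch (window sizes and filter count are assumptions).
import tensorflow as tf

def spp_block(x, filters=512):
    x = tf.keras.layers.Conv2D(filters, 1, padding="same", use_bias=False)(x)  # first spatial convolution
    x = tf.keras.layers.BatchNormalization()(x)                                # first spatial batch normalization
    pools = [tf.keras.layers.MaxPooling2D(pool_size=k, strides=1, padding="same")(x)
             for k in (1, 5, 9, 13)]             # four parallel max-pooling branches
    return tf.keras.layers.Concatenate()(pools)  # channel-wise fusion of the branches
```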
the second spatial feature fusion network: the device consists of a second space convolution layer, a second space deconvolution layer, a second space batch normalization layer and a ReLU activation layer which are connected in a cross way;
the second spatial feature fusion network is defined as:
f_PAN(PAN_kernel_p, PAN_kernel_q, PAN_gamma_r, PAN_beta_r), p∈[1, N_PAN_conv], q∈[1, N_PAN_deconv], r∈[1, N_PAN_BN]
wherein N_PAN_conv represents the number of second spatial convolution layers in the second spatial feature fusion network, N_PAN_deconv represents the number of second spatial deconvolution layers in the second spatial feature fusion network, and N_PAN_BN represents the number of second spatial batch normalization layers in the second spatial feature fusion network; PAN_kernel_p represents the parameters of the p-th second spatial convolution layer in the second spatial feature fusion network and is a parameter to be optimized; PAN_kernel_q represents the parameters of the q-th second spatial deconvolution layer in the second spatial feature fusion network and is a parameter to be optimized; PAN_gamma_r represents the translation amount of the r-th second spatial batch normalization layer in the second spatial feature fusion network and is a parameter to be optimized; PAN_beta_r represents the scaling quantity of the r-th second spatial batch normalization layer in the second spatial feature fusion network and is a parameter to be optimized;
The input data of the second spatial feature fusion network are the low-dimensional feature map Feat1, the medium-dimensional feature map Feat2 and the spatial fusion feature map Feat4, and the output data are the first fused feature map Feat5 (M5×N5×C5), the second fused feature map Feat6 (M6×N6×C6) and the third fused feature map Feat7 (M7×N7×C7);
In the output data of the second spatial feature fusion network, M5 is the width of the first fused feature map Feat5, N5 is its height, and C5 is its number of channels; M6 is the width of the second fused feature map Feat6, N6 is its height, and C6 is its number of channels; M7 is the width of the third fused feature map Feat7, N7 is its height, and C7 is its number of channels;
the channel feature fusion network comprises: the average pooling layer, the full-connection layer, the ReLU activation layer and the Sigmoid activation layer are sequentially stacked and cascaded;
the channel feature fusion network is defined as:
f_SE(SE_kernel_z), z∈[1, N_SE]
wherein N_SE represents the number of fully connected layers in the channel feature fusion network; SE_kernel_z represents the parameters of the z-th fully connected layer in the channel feature fusion network and is a parameter to be optimized;
The input data of the channel feature fusion network are the low-dimensional feature map Feat1, the medium-dimensional feature map Feat2 and the high-dimensional feature map Feat3, and the output data are the first tensor Tensor1 (T×T1), the second tensor Tensor2 (T×T2) and the third tensor Tensor3 (T×T3);
In the output data of the channel feature fusion network, T is the number of rows of the first tensor Tensor1, the second tensor Tensor2 and the third tensor Tensor3, T1 is the number of columns of the first tensor Tensor1, T2 is the number of columns of the second tensor Tensor2, and T3 is the number of columns of the third tensor Tensor3;
the multi-scale prediction layer: sequentially stacking and cascading a prediction convolution layer, a prediction batch normalization layer and a ReLU activation layer;
the multi-scale prediction layer is defined as:
f_YO(YO_kernel_x, YO_gamma_y, YO_beta_y), x∈[1, N_YO_conv], y∈[1, N_YO_BN]
wherein N_YO_conv represents the number of prediction convolution layers in the multi-scale prediction layer and N_YO_BN represents the number of prediction batch normalization layers in the multi-scale prediction layer; YO_kernel_x represents the parameters of the x-th prediction convolution layer in the multi-scale prediction layer and is a parameter to be optimized; YO_gamma_y represents the translation amount of the y-th prediction batch normalization layer in the multi-scale prediction layer and is a parameter to be optimized; YO_beta_y represents the scaling quantity of the y-th prediction batch normalization layer in the multi-scale prediction layer and is a parameter to be optimized;
The input data of the multi-scale prediction layer are the first fused feature map Feat5, the second fused feature map Feat6 and the third fused feature map Feat7, and the output data are the first prediction feature map Feat8 (M8×N8×C8), the second prediction feature map Feat9 (M9×N9×C9) and the third prediction feature map Feat10 (M10×N10×C10);
In the output data of the multi-scale prediction layer, M8 is the width of the first prediction feature map Feat8, N8 is its height, and C8 is its number of channels; M9 is the width of the second prediction feature map Feat9, N9 is its height, and C9 is its number of channels; M10 is the width of the third prediction feature map Feat10, N10 is its height, and C10 is its number of channels;
In step 3, the train car abnormal fault recognition network loss function is constructed from a localization loss function, a confidence loss function and a classification loss function;
When a train car fault image is input into the train car abnormal fault recognition network for training, the image is divided into A×A grids, each grid is preset with B anchor boxes, and the network regresses the corresponding A×A×B prediction boxes, but not all prediction boxes participate in the computation of the loss function. When the center point of a fault marking box (box_{s,k}, label_{s,k,c}) in a car fault image train_s(m, n) falls in the i-th grid, the anchor box among the B anchor boxes of that grid that has the largest IoU with the fault marking box is selected to learn the characteristic information of the fault and is regarded as a positive sample, and the remaining B-1 anchor boxes are regarded as negative samples.
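A minimal sketch of this positive/negative assignment rule. Comparing anchors to the marking box by width and height only (shapes aligned at a common corner) is the usual YOLO-style matching and is an assumption here; the patent only states that the anchor with the largest IoU is selected.

```python
# Anchor assignment sketch (width/height IoU matching is an assumption).
import numpy as np

def assign_anchor(gt_box, anchors, grid_size, img_size):
    """gt_box = (x1, y1, x2, y2) in pixels; anchors = [(w, h), ...] in pixels."""
    cx = (gt_box[0] + gt_box[2]) / 2.0
    cy = (gt_box[1] + gt_box[3]) / 2.0
    col = int(cx / img_size * grid_size)          # grid cell containing the center point
    row = int(cy / img_size * grid_size)
    gw, gh = gt_box[2] - gt_box[0], gt_box[3] - gt_box[1]
    ious = []
    for aw, ah in anchors:                        # IoU of the two shapes aligned at one corner
        inter = min(gw, aw) * min(gh, ah)
        union = gw * gh + aw * ah - inter
        ious.append(inter / union)
    best = int(np.argmax(ious))                   # positive anchor; the remaining B-1 are negatives
    return row, col, best
```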
The positioning loss function is:
L_loc = Σ_{i=1..A×A} Σ_{j=1..B} 1_{i,j}^obj · [1 − IoU_i + d_i²/l_i² + α_i·v_i]
with
IoU_i = |box ∩ p_box| / |box ∪ p_box|
v_i = (4/π²) · (arctan(w/h) − arctan(w_p/h_p))²
α_i = v_i / (1 − IoU_i + v_i)
wherein 1_{i,j}^obj indicates whether the j-th anchor box under the i-th grid is responsible for predicting a certain fault: it takes the value 1 if so and 0 otherwise; "responsible" means that, among all B anchor boxes under the i-th grid, the j-th anchor box has the largest IoU with the fault marking box; IoU_i is the intersection-over-union between the fault marking box (box_{s,k}, label_{s,k,c}) of the car fault image train_s(m, n) that falls in the i-th grid and the corresponding fault prediction box (p_box_{s,k}, p_label_{s,k,c}); d_i is the Euclidean distance between the center points of the fault marking box (box_{s,k}, label_{s,k,c}) in the i-th grid and the corresponding fault prediction box (p_box_{s,k}, p_label_{s,k,c}); l_i is the diagonal length of the smallest rectangle that can simultaneously cover the fault marking box (box_{s,k}, label_{s,k,c}) and the fault prediction box (p_box_{s,k}, p_label_{s,k,c}); v_i measures the consistency of the aspect ratios, with w and h the width and height of the fault marking box and w_p and h_p those of the fault prediction box; and α_i is the trade-off parameter. The localization loss L_loc therefore expresses that, when the k-th fault marking box (box_{s,k}, label_{s,k,c}) of an input image train_s(m, n) falls in the i-th grid and the j-th anchor box is responsible for predicting the fault, the fault prediction box (p_box_{s,k}, p_label_{s,k,c}) generated by that anchor box is used together with the fault marking box (box_{s,k}, label_{s,k,c}) to compute the localization loss.
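The terms above (IoU_i, center distance d_i, enclosing-rectangle diagonal l_i, aspect-ratio term v_i and trade-off parameter α_i) can be illustrated with the following minimal TensorFlow sketch, under the assumption that boxes are given as (x1, y1, x2, y2); it is an illustration of the loss terms, not the exact patented code.

```python
# CIoU-style localization term sketch (box format and epsilon values are assumptions).
import math
import tensorflow as tf

def ciou_loss(pred, target):
    px1, py1, px2, py2 = tf.unstack(pred, axis=-1)
    tx1, ty1, tx2, ty2 = tf.unstack(target, axis=-1)
    inter = tf.maximum(0.0, tf.minimum(px2, tx2) - tf.maximum(px1, tx1)) * \
            tf.maximum(0.0, tf.minimum(py2, ty2) - tf.maximum(py1, ty1))
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + 1e-9)
    # squared distance between box centers (d_i squared)
    d2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0
    # squared diagonal of the smallest rectangle covering both boxes (l_i squared)
    cw = tf.maximum(px2, tx2) - tf.minimum(px1, tx1)
    ch = tf.maximum(py2, ty2) - tf.minimum(py1, ty1)
    l2 = cw ** 2 + ch ** 2 + 1e-9
    # aspect-ratio consistency term v_i and trade-off parameter alpha_i
    v = (4.0 / math.pi ** 2) * tf.square(
        tf.atan((tx2 - tx1) / (ty2 - ty1 + 1e-9)) - tf.atan((px2 - px1) / (py2 - py1 + 1e-9)))
    alpha = v / (1.0 - iou + v + 1e-9)
    return 1.0 - iou + d2 / l2 + alpha * v       # summed over positive anchors in L_loc
```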
The confidence loss function is:
L_conf = −λ_obj · Σ_{i=1..A×A} Σ_{j=1..B} 1_{i,j}^obj · [Ĉ_i·log(C_i) + (1−Ĉ_i)·log(1−C_i)] − λ_noobj · Σ_{i=1..A×A} Σ_{j=1..B} 1_{i,j}^noobj · [Ĉ_i·log(C_i) + (1−Ĉ_i)·log(1−C_i)]
wherein 1_{i,j}^noobj indicates that the j-th anchor box of the i-th grid is not responsible for predicting the fault, i.e. in the i-th grid the IoU between the j-th anchor box and the fault marking box is not the largest among all B anchor boxes; λ_obj and λ_noobj respectively represent the weights used when an anchor box is responsible and not responsible for predicting a certain fault; Ĉ_i is the true value of the confidence, which takes 1 if the j-th anchor box of the i-th grid is responsible for predicting a certain fault and 0 otherwise; and C_i is the confidence of the prediction box output by the multi-scale prediction layer YOLO_head. The confidence loss L_conf therefore consists of the confidence loss of the prediction boxes that contain an object and the confidence loss of the prediction boxes that contain no object.
The classification loss function is:
L_cls = −Σ_{i=1..A×A} 1_i^obj · Σ_{c=1..C} P̂_i(c)·log(P_i(c))
wherein P̂_i(c) is the true value of the class probability: when the j-th anchor box under the i-th grid is responsible for predicting a certain fault (box_{s,k}, label_{s,k,c}), P̂_i is a one-hot matrix of dimension C×1 whose c-th element is 1 and whose remaining elements are 0; P_i(c) denotes the class probability of the prediction box output by the multi-scale prediction layer YOLO_head and is likewise a matrix of dimension C×1; the loss value L_cls between the two is computed by cross entropy.
The train compartment abnormal fault identification network loss function is as follows:
L = L_loc + L_conf + L_cls
wherein L_loc is the localization loss function, L_conf is the confidence loss function, and L_cls is the classification loss function.
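A minimal sketch of how the three terms could be combined in code. The binary cross-entropy confidence term weighted by λ_obj / λ_noobj and the cross-entropy classification term follow the definitions above; the weight values and the masking scheme are assumptions.

```python
# Combined loss sketch: L = L_loc + L_conf + L_cls (weights and masking are assumptions).
import tensorflow as tf

def total_loss(loc_loss, conf_true, conf_pred, obj_mask,
               cls_true, cls_pred, lambda_obj=1.0, lambda_noobj=0.5):
    eps = 1e-9
    # element-wise binary cross-entropy on the confidence of every anchor
    bce = -(conf_true * tf.math.log(conf_pred + eps)
            + (1.0 - conf_true) * tf.math.log(1.0 - conf_pred + eps))
    conf_loss = lambda_obj * tf.reduce_sum(obj_mask * bce) \
              + lambda_noobj * tf.reduce_sum((1.0 - obj_mask) * bce)
    # cross-entropy between one-hot class labels and predicted class probabilities,
    # evaluated only on anchors responsible for a fault
    cls_loss = -tf.reduce_sum(obj_mask[..., tf.newaxis] * cls_true * tf.math.log(cls_pred + eps))
    return loc_loss + conf_loss + cls_loss
```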
Preferably, the first prediction feature map of the image to be recognized in step 4 is Feat8;
the second prediction feature map of the image to be recognized in step 4 is Feat9;
the third prediction feature map of the image to be recognized in step 4 is Feat10;
The preliminary recognition result of the image to be recognized in step 4 comprises the probability that each prediction box belongs to the foreground, the coordinates of the prediction box and the class probabilities of the prediction box.
The probability that the prediction frame belongs to the foreground in the preliminary recognition result of the image to be recognized is defined as follows:
I_v∈[0, 1], v∈[1, N_Rs]
wherein N_Rs represents the number of preliminary recognition results of the image to be recognized, and I_v represents the probability that the prediction box in the v-th preliminary recognition result of the image to be recognized belongs to the foreground;
and the coordinates of a prediction frame in the preliminary identification result of the image to be identified are defined as:
p_box_v = ((px_v^{l,t}, py_v^{l,t}), (px_v^{r,b}, py_v^{r,b})), v∈[1, N_Rs]
wherein l denotes the left on the image to be recognized, t denotes the top, r denotes the right, and b denotes the bottom; (px_v^{l,t}, py_v^{l,t}) are the coordinates of the upper-left corner of the prediction box in the v-th preliminary recognition result of the image to be recognized, px_v^{l,t} being its abscissa and py_v^{l,t} its ordinate; (px_v^{r,b}, py_v^{r,b}) are the coordinates of the lower-right corner of the prediction box in the v-th preliminary recognition result of the image to be recognized, px_v^{r,b} being its abscissa and py_v^{r,b} its ordinate;
and the prediction frame category probability in the preliminary identification result of the image to be identified is defined as:
Pr_v = {pr_v^0, pr_v^1, pr_v^2, pr_v^3, pr_v^4, pr_v^5}, v∈[1, N_Rs]
wherein Pr_v represents the set of all six fault class probabilities in the v-th preliminary recognition result of the image to be recognized; pr_v^0 represents the probability that the v-th preliminary recognition result belongs to fault type 0; pr_v^1 the probability that it belongs to fault type 1; pr_v^2 the probability that it belongs to fault type 2; pr_v^3 the probability that it belongs to fault type 3; pr_v^4 the probability that it belongs to fault type 4; and pr_v^5 the probability that it belongs to fault type 5;
the preliminary identification result of the image to be identified is defined as:
R_first = {(I_v, p_box_v, Pr_v), v∈[1, N_Rs]}
wherein R_first represents the preliminary recognition result of the image to be recognized;
the final recognition result of the image to be recognized is defined as:
R_final = {((px_ε^{l,t}, py_ε^{l,t}), (px_ε^{r,b}, py_ε^{r,b}), Plabel_ε), ε∈[1, N_Re]}
wherein R_final represents the final recognition result of the image to be recognized and N_Re represents the number of final recognition results of the image to be recognized; (px_ε^{l,t}, py_ε^{l,t}) are the coordinates of the upper-left corner of the prediction box in the ε-th final recognition result of the image to be recognized; (px_ε^{r,b}, py_ε^{r,b}) are the coordinates of the lower-right corner of the prediction box in the ε-th final recognition result of the image to be recognized; and Plabel_ε indicates which fault type the ε-th final recognition result of the image to be recognized belongs to.
The invention has the following beneficial effects:
Problems likely to be encountered during actual recognition, such as environmental noise, ambient light that is too strong or too weak, and color deviation, are simulated by manually adjusting the original pictures. This improves the robustness of the recognition network and, at the same time, alleviates to a certain extent the over-fitting problem caused by having few training samples.
The Ghost-block fuses in a novel convolution computation mode: fewer convolution kernels are used to generate the primary feature maps, a linear transformation then produces additional phantom feature maps, and finally the two sets of feature maps are concatenated. In this way nearly half of the parameters in the feature extraction network are compressed, and the recognition speed of the network on the input image is accelerated without reducing the recognition accuracy.
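A minimal Ghost-module sketch matching this description: a standard convolution produces only part of the output channels, a cheap depthwise convolution generates the remaining phantom channels, and the two parts are concatenated. The kernel sizes and the 1:1 split ratio are common GhostNet defaults and are assumptions here.

```python
# Ghost module sketch (kernel sizes and split ratio are assumptions; out_channels assumed even).
import tensorflow as tf

def ghost_module(x, out_channels, kernel_size=1, dw_size=3):
    primary_ch = out_channels // 2
    primary = tf.keras.layers.Conv2D(primary_ch, kernel_size, padding="same",
                                     use_bias=False)(x)          # fewer kernels: primary features
    primary = tf.keras.layers.BatchNormalization()(primary)
    primary = tf.keras.layers.ReLU()(primary)
    ghost = tf.keras.layers.DepthwiseConv2D(dw_size, padding="same",
                                            use_bias=False)(primary)  # cheap linear transformation
    ghost = tf.keras.layers.BatchNormalization()(ghost)
    ghost = tf.keras.layers.ReLU()(ghost)
    return tf.keras.layers.Concatenate()([primary, ghost])       # splice the two parts
```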
The SE-block is introduced so that the network model automatically learns the importance of different channel features: instead of treating all channels equally when feature maps are combined, each channel is weighted, with the weights learned automatically by the SE-block. Although this slightly increases the execution time of the algorithm, it better alleviates the false-detection problem.
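A minimal squeeze-and-excitation sketch matching the channel feature fusion network described above: global average pooling, fully connected layers with ReLU and Sigmoid activations, and channel-wise re-weighting of the input feature map. The reduction ratio of 16 is a common default and an assumption here.

```python
# SE-block sketch (reduction ratio is an assumption).
import tensorflow as tf

def se_block(x, reduction=16):
    channels = x.shape[-1]
    w = tf.keras.layers.GlobalAveragePooling2D()(x)                  # squeeze: 1 x C channel descriptor
    w = tf.keras.layers.Dense(channels // reduction, activation="relu")(w)
    w = tf.keras.layers.Dense(channels, activation="sigmoid")(w)     # learned per-channel weights
    w = tf.keras.layers.Reshape((1, 1, channels))(w)
    return tf.keras.layers.Multiply()([x, w])                        # excitation: re-weight channels
```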
The improved recognition network can complete the recognition task accurately and quickly even on an industrial computer with a low hardware configuration and feed the recognition results back to workers through a display. This solves the problem of high identification cost in the prior art: staff no longer need to check the passing-train samples one by one and only need to confirm the faults in the images output by the network, which greatly improves operating efficiency.
Furthermore, the data acquisition equipment involved in the invention is simple and convenient to install and deploy, the freight train does not need to be stopped, and workers do not need to go to the site to operate; automatic identification of abnormalities and faults of freight train cars can be realized using only the passing-train samples stored on the server.
Drawings
FIG. 1: network structure diagram of the train car abnormal fault recognition network of the invention;
FIG. 2: flow chart of training the train car abnormal fault recognition network of the invention;
FIG. 3: execution flow chart of the train car abnormal fault recognition method of the invention;
FIG. 4: examples of abnormalities or faults that can be recognized by the train car abnormal fault recognition network of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following describes embodiments of the present invention with reference to fig. 1 to 4:
step 1: without stopping the freight train, use high-speed linear array cameras to capture a left, a right and a top high-resolution image of each car so as to construct a car high-resolution image data set; shrink each car high-resolution image to a suitable proportion by equal-proportion linear interpolation, cut it into four overlapping image blocks of the same size, screen out the car image samples containing faults from all the image blocks, and construct a car fault image data set from the car image samples containing faults;
step 1, the compartment fault image data set comprises:
{train_s(m, n), s∈[1, S], m∈[1, M], n∈[1, N]}
wherein train_s(m, n) represents the pixel information at the m-th row and n-th column of the s-th car fault image in the car fault image data set, S = 18031 is the number of all image samples in the car fault image data set, M = 416 is the number of rows of each fault image in the data set, and N = 416 is the number of columns of each fault image in the data set;
step 2: manually label the car fault marking boxes and fault types of each car fault image in the car fault image data set of step 1, count the number of car fault image samples for each fault type, and collect additional samples for any fault type whose number of image samples is below a sample number threshold until the number of car fault image samples of every fault type exceeds the threshold, so as to construct the train car abnormal fault recognition network training set;
step 2, the coordinates of the compartment fault marking frame of each compartment fault image in the compartment fault image data set are as follows:
box_{s,k} = ((x_{s,k}^{l,t}, y_{s,k}^{l,t}), (x_{s,k}^{r,b}, y_{s,k}^{r,b})), s∈[1, S], k∈[1, K_s]
wherein l denotes the left on the car fault image, t denotes the top, r denotes the right, and b denotes the bottom; S = 18031 denotes the number of all car fault images in the car fault image data set, and K_s represents the total number of car fault marking boxes in the s-th car fault image of the data set; box_{s,k} represents the coordinates of the k-th car fault marking box in the s-th car fault image of the data set; (x_{s,k}^{l,t}, y_{s,k}^{l,t}) are the coordinates of the upper-left corner of the k-th car fault marking box in the s-th car fault image, x_{s,k}^{l,t} being its abscissa and y_{s,k}^{l,t} its ordinate; (x_{s,k}^{r,b}, y_{s,k}^{r,b}) are the coordinates of the lower-right corner of the k-th car fault marking box in the s-th car fault image, x_{s,k}^{r,b} being its abscissa and y_{s,k}^{r,b} its ordinate;
step 2, the compartment fault marking frame category information of each compartment fault image in the compartment fault image data set is as follows:
label_{s,k,c}, s∈[1, S], k∈[1, K_s], c∈[1, C]
wherein C = 6 is the total number of fault types in the car fault image data set; label_{s,k,c} indicates that the k-th car fault marking box of the s-th car fault image in the car fault image data set belongs to the c-th fault type;
step 2, the training set of the train compartment abnormal fault recognition network is as follows:
{train_s(m, n), (box_{s,k}, label_{s,k,c})}
s∈[1, S], m∈[1, M], n∈[1, N], k∈[1, K_s], c∈[1, C]
wherein train_s(m, n) is the pixel information at the m-th row and n-th column of the s-th car fault image in the train car abnormal fault recognition network training set, box_{s,k} represents the coordinates of the k-th car fault marking box in the s-th car fault image of the training set, and label_{s,k,c} indicates that the k-th car fault marking box of the s-th car fault image in the training set belongs to the c-th fault type; S = 18031 represents the number of all image samples in the training set, M = 416 is the number of rows of each fault image in the training set, N = 416 is the number of columns of each fault image in the training set, K_s represents the total number of car fault marking boxes in the s-th car fault image of the training set, and C = 6 is the total number of fault types in the training set;
step 3: build the original YOLOv4 network using the deep learning framework TensorFlow, replace part of the convolution computation in the YOLOv4 feature extraction network with Ghost-blocks, and introduce an attention mechanism (SE-block) to construct the train car abnormal fault recognition network. The structure of the train car abnormal fault recognition network is shown in fig. 2. Take the train car abnormal fault recognition network training set of step 2 as input data and construct the train car abnormal fault recognition network loss function by combining the fault types of the car fault image samples in the training set. Train for 50000 iterations with a gradient descent algorithm to obtain the optimized train car abnormal fault recognition network. The training process of the train car abnormal fault recognition network is shown in fig. 1;
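A minimal training-loop sketch for this step. The optimizer variant, learning rate and batch size are assumptions, and the dataset is assumed to yield (image, target) pairs; loss_fn stands for the combined loss L = L_loc + L_conf + L_cls defined elsewhere in this description.

```python
# Training sketch: gradient descent for 50000 iterations (hyperparameters are assumptions).
import tensorflow as tf

def train(model, dataset, loss_fn, iterations=50000, lr=1e-3, batch_size=8):
    optimizer = tf.keras.optimizers.SGD(learning_rate=lr)
    it = iter(dataset.shuffle(1024).batch(batch_size).repeat())
    for step in range(iterations):
        images, targets = next(it)
        with tf.GradientTape() as tape:
            predictions = model(images, training=True)        # three prediction feature maps
            loss = loss_fn(predictions, targets)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        if step % 1000 == 0:
            tf.print("step", step, "loss", loss)
    return model
```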
In step 3, the train car abnormal fault identification network specifically comprises: a feature extraction network, a channel feature fusion network, a first spatial feature fusion network, a second spatial feature fusion network and a multi-scale prediction layer;
the channel feature fusion network is embedded in the feature extraction network as a sub-module; the feature extraction network is serially cascaded with the first spatial feature fusion network and then is connected with the second spatial feature fusion network in parallel; the second spatial feature fusion network is serially cascaded with the multi-scale prediction layer;
the feature extraction network: the dimensionality reduction convolution module and the residual error module are sequentially stacked and cascaded;
the dimension reduction convolution module is formed by sequentially stacking and cascading a dimension reduction convolution layer, a dimension reduction batch normalization layer and a Leaky ReLU activation layer;
the residual module is formed by sequentially stacking and cascading a plurality of Ghost residual blocks;
the Ghost residual block is composed of a residual convolution layer, a residual batch normalization layer and a ReLU activation layer according to the stacking mode of the traditional residual block;
the feature extraction network is defined as:
the sequential stack of dimension-reduction convolution modules and residual modules described above, parameterized by the following quantities to be optimized: the parameters of the b1-th dimension-reduction convolution layer in the a1-th dimension-reduction convolution module; the translation amount and the scaling quantity of the b2-th dimension-reduction batch normalization layer in the a1-th dimension-reduction convolution module; the parameters of the b3-th residual convolution layer in the a3-th Ghost residual block under the a2-th residual module; and the translation amount and the scaling quantity of the b4-th residual batch normalization layer in the a3-th Ghost residual block under the a2-th residual module;
wherein a1∈[1, N_J], a2∈[1, N_C] and a3∈[1, N_G]; N_J represents the number of dimension-reduction convolution modules in the feature extraction network, N_C represents the number of residual modules in the feature extraction network, and N_G represents the number of Ghost residual blocks in each residual module; b1 ranges over the number of dimension-reduction convolution layers in each dimension-reduction convolution module, b2 over the number of dimension-reduction batch normalization layers in each dimension-reduction convolution module, b3 over the number of residual convolution layers in each Ghost residual block, and b4 over the number of residual batch normalization layers in each Ghost residual block;
The input data of the feature extraction network is a single image in the train car abnormal fault recognition network training set of step 2, and the output data are a low-dimensional feature map Feat1 (M1×N1×C1), a medium-dimensional feature map Feat2 (M2×N2×C2) and a high-dimensional feature map Feat3 (M3×N3×C3);
In the output data of the feature extraction network, M1 = 52 is the width of the low-dimensional feature map Feat1, N1 = 52 is its height, and C1 = 256 is its number of channels; M2 = 26 is the width of the medium-dimensional feature map Feat2, N2 = 26 is its height, and C2 = 512 is its number of channels; M3 = 13 is the width of the high-dimensional feature map Feat3, N3 = 13 is its height, and C3 = 1024 is its number of channels;
the first spatial feature fusion network: the first space convolution layer, the first space batch normalization layer and the maximum pooling module are sequentially stacked and cascaded;
the maximum pooling module is formed by connecting a first maximum pooling layer, a second maximum pooling layer, a third maximum pooling layer and a fourth maximum pooling layer in parallel;
the first spatial feature fusion network is defined as:
f_SPP(SPP_kernel_e, SPP_gamma_g, SPP_beta_g), e∈[1, N_SPP_conv], g∈[1, N_SPP_BN]
wherein N_SPP_conv represents the number of first spatial convolution layers in the first spatial feature fusion network and N_SPP_BN represents the number of first spatial batch normalization layers in the first spatial feature fusion network; SPP_kernel_e represents the parameters of the e-th first spatial convolution layer in the first spatial feature fusion network and is a parameter to be optimized; SPP_gamma_g represents the translation amount of the g-th first spatial batch normalization layer in the first spatial feature fusion network and is a parameter to be optimized; SPP_beta_g represents the scaling quantity of the g-th first spatial batch normalization layer in the first spatial feature fusion network and is a parameter to be optimized;
The input data of the first spatial feature fusion network is the high-dimensional feature map Feat3, and the output data is the spatial fusion feature map Feat4 (M4×N4×C4);
In the output data of the first spatial feature fusion network, M4 = 13 is the width of the spatial fusion feature map Feat4, N4 = 13 is its height, and C4 = 2048 is its number of channels;
the second spatial feature fusion network: the device consists of a second space convolution layer, a second space deconvolution layer, a second space batch normalization layer and a ReLU activation layer which are connected in a cross way;
the second spatial feature fusion network is defined as:
f_PAN(PAN_kernel_p, PAN_kernel_q, PAN_gamma_r, PAN_beta_r), p∈[1, N_PAN_conv], q∈[1, N_PAN_deconv], r∈[1, N_PAN_BN]
wherein N_PAN_conv represents the number of second spatial convolution layers in the second spatial feature fusion network, N_PAN_deconv represents the number of second spatial deconvolution layers in the second spatial feature fusion network, and N_PAN_BN represents the number of second spatial batch normalization layers in the second spatial feature fusion network; PAN_kernel_p represents the parameters of the p-th second spatial convolution layer in the second spatial feature fusion network and is a parameter to be optimized; PAN_kernel_q represents the parameters of the q-th second spatial deconvolution layer in the second spatial feature fusion network and is a parameter to be optimized; PAN_gamma_r represents the translation amount of the r-th second spatial batch normalization layer in the second spatial feature fusion network and is a parameter to be optimized; PAN_beta_r represents the scaling quantity of the r-th second spatial batch normalization layer in the second spatial feature fusion network and is a parameter to be optimized;
The input data of the second spatial feature fusion network are the low-dimensional feature map Feat1, the medium-dimensional feature map Feat2 and the spatial fusion feature map Feat4, and the output data are the first fused feature map Feat5 (M5×N5×C5), the second fused feature map Feat6 (M6×N6×C6) and the third fused feature map Feat7 (M7×N7×C7);
In the output data of the second spatial feature fusion network, M5 = 52 is the width of the first fused feature map Feat5, N5 = 52 is its height, and C5 = 128 is its number of channels; M6 = 26 is the width of the second fused feature map Feat6, N6 = 26 is its height, and C6 = 256 is its number of channels; M7 = 13 is the width of the third fused feature map Feat7, N7 = 13 is its height, and C7 = 512 is its number of channels;
the channel feature fusion network comprises: the average pooling layer, the full-connection layer, the ReLU activation layer and the Sigmoid activation layer are sequentially stacked and cascaded;
the channel feature fusion network is defined as:
f_SE(SE_kernel_z), z∈[1, N_SE]
wherein N_SE represents the number of fully connected layers in the channel feature fusion network; SE_kernel_z represents the parameters of the z-th fully connected layer in the channel feature fusion network and is a parameter to be optimized;
The input data of the channel feature fusion network are the low-dimensional feature map Feat1, the medium-dimensional feature map Feat2 and the high-dimensional feature map Feat3, and the output data are the first tensor Tensor1 (T×T1), the second tensor Tensor2 (T×T2) and the third tensor Tensor3 (T×T3);
In the output data of the channel feature fusion network, T = 1 is the number of rows of the first tensor Tensor1, the second tensor Tensor2 and the third tensor Tensor3, T1 = 256 is the number of columns of the first tensor Tensor1, T2 = 512 is the number of columns of the second tensor Tensor2, and T3 = 1024 is the number of columns of the third tensor Tensor3;
the multi-scale prediction layer: sequentially stacking and cascading a prediction convolution layer, a prediction batch normalization layer and a ReLU activation layer;
the multi-scale prediction layer is defined as:
Figure BDA0002815286210000181
wherein the content of the first and second substances,
Figure BDA0002815286210000182
indicating the number of predicted convolutional layers in the multi-scale prediction layer,
Figure BDA0002815286210000183
representing the number of layers of a prediction batch normalization layer in the multi-scale prediction layer; YO _ kernelxRepresenting the parameter of the xth predicted convolutional layer in the multi-scale predicted layer, which is the parameter to be optimized; YO _ gammayRepresenting the translation amount of the ith prediction batch normalization layer in the multi-scale prediction layer, wherein the translation amount is a parameter to be optimized; YO _ betayRepresenting the zoom quantity of the ith prediction batch normalization layer in the multi-scale prediction layer as a parameter to be optimized;
the input data of the multi-scale prediction layer are the first fused feature map Feat5, the second fused feature map Feat6, and the third fused feature map Feat7, and the output data are the first prediction feature map Feat8 (M8 × N8 × C8), the second prediction feature map Feat9 (M9 × N9 × C9), and the third prediction feature map Feat10 (M10 × N10 × C10);
In the output data of the multi-scale prediction layer, M8 = 52 is the width of the first prediction feature map Feat8, N8 = 52 is the height of the first prediction feature map Feat8, and C8 = 33 is the number of channels of the first prediction feature map Feat8; M9 = 26 is the width of the second prediction feature map Feat9, N9 = 26 is the height of the second prediction feature map Feat9, and C9 = 33 is the number of channels of the second prediction feature map Feat9; M10 = 13 is the width of the third prediction feature map Feat10, N10 = 13 is the height of the third prediction feature map Feat10, and C10 = 33 is the number of channels of the third prediction feature map Feat10;
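The 33 output channels of each prediction feature map are consistent with B = 3 anchor boxes times (4 box coordinates + 1 confidence + 6 fault classes). A sketch of one such prediction branch is given below; the hidden width of the 3 x 3 convolution and the helper name yolo_head are assumptions, and only the layer order (prediction convolution layer, prediction batch normalization layer, ReLU) and the 33-channel output follow the text.

import torch.nn as nn

NUM_ANCHORS, NUM_CLASSES = 3, 6                     # B = 3 anchor boxes, 6 fault types
OUT_CHANNELS = NUM_ANCHORS * (5 + NUM_CLASSES)      # 3 x (4 coords + 1 confidence + 6 classes) = 33

def yolo_head(in_channels, hidden=256):
    # prediction branch applied to one fused feature map (Feat5, Feat6 or Feat7)
    return nn.Sequential(
        nn.Conv2d(in_channels, hidden, 3, padding=1, bias=False),  # prediction convolution layer
        nn.BatchNorm2d(hidden),                                    # prediction batch normalization layer
        nn.ReLU(inplace=True),
        nn.Conv2d(hidden, OUT_CHANNELS, 1),                        # 33-channel prediction map
    )

Applying such a branch to Feat5, Feat6 and Feat7 yields prediction maps of sizes 52 x 52 x 33, 26 x 26 x 33 and 13 x 13 x 33, matching Feat8, Feat9 and Feat10 above.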
step 3, constructing the train compartment abnormal fault recognition network loss function through a positioning loss function, a confidence loss function and a classification loss function;
when the train car fault image is input into the train car abnormal fault recognition network for training, the image is divided into A × A grids (A = 52, 26 and 13), each grid is preset with B = 3 anchor boxes, and the network regresses A × A × B corresponding prediction frames, but not all of the prediction frames participate in the calculation of the loss function. When the center point of a fault marking frame (boxs,k, labels,k,c) in a carriage fault image trains(m, n) falls in the i-th grid, the one of the B anchor boxes with the largest IOU with that fault marking frame is selected to learn the characteristic information of the fault and is regarded as a positive sample, and the remaining B - 1 anchor boxes are regarded as negative samples.
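A small sketch of this positive and negative sample assignment is given below; the helper name assign_anchor, the argument layout and the shape-only IOU comparison against the anchor priors are illustrative assumptions, while the rule itself (the anchor box with the largest IOU with the fault marking frame becomes the positive sample) follows the paragraph above.

def assign_anchor(gt_box, anchors_wh, grid_size, img_size):
    # gt_box: fault marking frame (x1, y1, x2, y2) in pixels
    # anchors_wh: list of B (width, height) anchor priors for this scale (placeholder values)
    x1, y1, x2, y2 = gt_box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0          # center point of the marking frame
    cell = img_size / float(grid_size)
    col, row = int(cx // cell), int(cy // cell)        # the grid cell that contains the center

    gw, gh = x2 - x1, y2 - y1
    ious = []
    for aw, ah in anchors_wh:                          # compare shapes as if both boxes share a center
        inter = min(gw, aw) * min(gh, ah)
        union = gw * gh + aw * ah - inter
        ious.append(inter / union)
    best_j = max(range(len(ious)), key=ious.__getitem__)   # positive anchor; the others are negatives
    return col, row, best_j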
The positioning loss function is:
Lloc = Σi=1..A×A Σj=1..B 1ij^obj · [1 - IoUi + di²/li² + αi·vi]
and
IoUi = |boxs,k ∩ p_boxs,k| / |boxs,k ∪ p_boxs,k|
vi = (4/π²) · (arctan(wi/hi) - arctan(wi^p/hi^p))²
αi = vi / (1 - IoUi + vi)
wherein 1ij^obj indicates whether the j-th anchor box under the i-th grid is responsible for predicting a certain fault; if so its value is 1, otherwise it is 0; the so-called "responsible" means that, among all B anchor boxes under the i-th grid, the IOU between the j-th anchor box and the marking frame of that fault is the largest; IoUi is the intersection-over-union between the fault marking frame (boxs,k, labels,k,c) of the carriage fault image trains(m, n) falling in the i-th grid and the corresponding fault prediction frame (p_boxs,k, p_labels,k,c); di is the Euclidean distance between the two center points of the fault marking frame (boxs,k, labels,k,c) and the corresponding fault prediction frame (p_boxs,k, p_labels,k,c); li is the diagonal distance of the smallest rectangle that can simultaneously cover the fault marking frame (boxs,k, labels,k,c) and the fault prediction frame (p_boxs,k, p_labels,k,c); vi measures the consistency of the aspect ratios, with wi and hi the width and height of the fault marking frame and wi^p and hi^p those of the fault prediction frame; αi is the trade-off parameter. The positioning loss Lloc therefore indicates that, when the k-th fault marking frame (boxs,k, labels,k,c) of an input image trains(m, n) falls in the i-th grid and the j-th anchor box is responsible for predicting the fault, the fault prediction frame (p_boxs,k, p_labels,k,c) generated by that anchor box and the fault marking frame (boxs,k, labels,k,c) are used together to calculate the positioning loss.
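The quantities above (IoUi, the center distance di, the enclosing-box diagonal li, the aspect term vi and the trade-off αi) correspond to a CIoU-style regression loss. A minimal PyTorch sketch under that reading is given below; the function name and the (x1, y1, x2, y2) box layout are assumptions.

import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    # pred, target: (N, 4) tensors of matched prediction frames and fault marking frames
    px1, py1, px2, py2 = pred.unbind(-1)
    tx1, ty1, tx2, ty2 = target.unbind(-1)

    inter = (torch.min(px2, tx2) - torch.max(px1, tx1)).clamp(0) * \
            (torch.min(py2, ty2) - torch.max(py1, ty1)).clamp(0)
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # squared center distance d^2 and squared diagonal l^2 of the smallest covering rectangle
    d2 = ((px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2) / 4.0
    cw = torch.max(px2, tx2) - torch.min(px1, tx1)
    ch = torch.max(py2, ty2) - torch.min(py1, ty1)
    l2 = cw ** 2 + ch ** 2 + eps

    # aspect-ratio consistency v and trade-off parameter alpha
    v = (4 / math.pi ** 2) * (torch.atan((tx2 - tx1) / (ty2 - ty1 + eps)) -
                              torch.atan((px2 - px1) / (py2 - py1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return (1 - iou + d2 / l2 + alpha * v).mean()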
The confidence loss function is:
Lconf = - Σi=1..A×A Σj=1..B 1ij^obj · λobj · [Ĉi·ln(Ci) + (1 - Ĉi)·ln(1 - Ci)] - Σi=1..A×A Σj=1..B 1ij^noobj · λnoobj · [Ĉi·ln(Ci) + (1 - Ĉi)·ln(1 - Ci)]
and
1ij^noobj = 1 - 1ij^obj
wherein 1ij^noobj indicates that the j-th anchor box of the i-th grid is not responsible for predicting the fault, i.e. in the i-th grid the IOU between the j-th anchor box and the fault marking frame is not the largest among all B anchor boxes; λobj and λnoobj respectively denote the weights used when the anchor box is responsible and not responsible for predicting a certain fault; Ĉi is the true value of the confidence, which takes 1 if the j-th anchor box of the i-th grid is responsible for predicting a certain fault and 0 otherwise; Ci is the confidence of the prediction frame output by the multi-scale prediction layer YOLO_head. The confidence loss Lconf therefore consists of the confidence loss of the prediction frames in which an object exists and the confidence loss of the prediction frames in which no object exists.
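A compact sketch of this two-part confidence loss is given below, reading it as binary cross entropy between the predicted and true confidences; the function signature, the mask representation and the example values of λobj and λnoobj are assumptions, since their values are not fixed in the text.

import torch.nn.functional as F

def confidence_loss(pred_conf, obj_mask, noobj_mask, lambda_obj=1.0, lambda_noobj=0.5):
    # pred_conf: confidences of all prediction frames, values in [0, 1]
    # obj_mask: anchors responsible for a fault; noobj_mask: the remaining anchors
    target = obj_mask.float()                               # true confidence is 1 or 0
    bce = F.binary_cross_entropy(pred_conf, target, reduction="none")
    return lambda_obj * bce[obj_mask].sum() + lambda_noobj * bce[noobj_mask].sum()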
The classification loss function is:
Lcls = - Σi=1..A×A Σj=1..B 1ij^obj · Σc=1..C [P̂ij(c)·ln(Pij(c))]
wherein P̂ij is the true value of the class probability: when the j-th anchor box under the i-th grid is responsible for predicting a certain fault (boxs,k, labels,k,c), P̂ij is a one-hot matrix of dimension C × 1 whose c-th dimension is 1 and whose remaining dimensions are 0; Pij is the class probability of the prediction frame output by the multi-scale prediction layer YOLO_head, also a matrix of dimension C × 1; the loss value Lcls between the two is calculated with the cross entropy.
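The cross-entropy computation described above can be sketched as follows; the function name, tensor shapes and masking convention are assumptions, while restricting the loss to anchors responsible for a fault and comparing against a C x 1 one-hot target follow the text.

import torch

def classification_loss(pred_cls, target_onehot, obj_mask, eps=1e-7):
    # pred_cls, target_onehot: (num_anchors, C) class probabilities and one-hot ground truth
    p = pred_cls[obj_mask].clamp(eps, 1.0)        # only anchors responsible for a fault contribute
    t = target_onehot[obj_mask]
    return -(t * torch.log(p)).sum(dim=1).mean()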
The train compartment abnormal fault identification network loss function is as follows:
L=Lloc+Lconf+Lcls
wherein L islocFor the localization loss function, LconfAs a function of confidence loss, LclsIs a classification loss function.
Step 4: inputting the image to be recognized into the optimized train compartment abnormal fault recognition network, predicting the first prediction feature map, the second prediction feature map and the third prediction feature map of the image to be recognized, splicing the three prediction feature maps to obtain the preliminary recognition result of the image to be recognized, and then performing confidence screening, non-maximum suppression and similar operations to obtain the final recognition result. Finally, the recognition result is stored as an image and a log entry and awaits confirmation of the fault by the staff. The execution flow of the train car abnormal fault identification method is shown in fig. 3, and examples of the abnormalities or faults to be identified by the recognition network are shown in fig. 4.
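The splicing of the three prediction feature maps into one list of candidate frames can be sketched as follows; the (batch, channels, height, width) tensor layout and the omission of anchor-based box decoding are simplifying assumptions.

import torch

def splice_predictions(feat8, feat9, feat10, num_anchors=3, num_classes=6):
    # flatten the A x A x 33 maps (A = 52, 26, 13) into rows of (4 coords + 1 confidence + C classes)
    rows = []
    for feat in (feat8, feat9, feat10):
        b, c, h, w = feat.shape
        feat = feat.view(b, num_anchors, 5 + num_classes, h, w)
        feat = feat.permute(0, 1, 3, 4, 2).reshape(b, -1, 5 + num_classes)
        rows.append(feat)
    return torch.cat(rows, dim=1)      # all candidate prediction frames before screening and NMS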
Step 4, the first prediction characteristic map of the image to be recognized is Feat 8;
step 4, the second prediction characteristic map of the image to be recognized is Feat 9;
step 4, the third prediction characteristic map of the image to be recognized is Feat 10;
and 4, the preliminary identification result of the image to be identified comprises the probability that the prediction frame belongs to the foreground, the coordinate of the prediction frame and the class probability of the prediction frame.
The probability that the prediction frame belongs to the foreground in the preliminary recognition result of the image to be recognized is defined as follows:
Iv ∈ [0,1], v ∈ [1, NRs]
wherein NRs represents the number of preliminary recognition results of the image to be recognized, and Iv represents the probability that the prediction frame in the v-th preliminary recognition result of the image to be recognized belongs to the foreground;
and the coordinates of a prediction frame in the preliminary identification result of the image to be identified are defined as:
p_boxv = (p_boxv^lt, p_boxv^rb), v ∈ [1, NRs]
p_boxv^lt = (p_xv^lt, p_yv^lt), p_boxv^rb = (p_xv^rb, p_yv^rb)
wherein l represents the left of the image to be recognized, t represents the top of the image to be recognized, r represents the right of the image to be recognized, and b represents the bottom of the image to be recognized; p_boxv^lt represents the coordinates of the upper left corner of the prediction frame in the v-th preliminary recognition result of the image to be recognized, p_xv^lt represents the abscissa of the upper left corner of that prediction frame, and p_yv^lt represents the ordinate of the upper left corner of that prediction frame; p_boxv^rb represents the coordinates of the lower right corner of the prediction frame in the v-th preliminary recognition result of the image to be recognized, p_xv^rb represents the abscissa of the lower right corner of that prediction frame, and p_yv^rb represents the ordinate of the lower right corner of that prediction frame;
and the prediction frame category probability in the preliminary identification result of the image to be identified is defined as:
Prv = {Prv^0, Prv^1, Prv^2, Prv^3, Prv^4, Prv^5}
wherein Prv represents the set of all six fault class probabilities in the v-th preliminary recognition result of the image to be recognized, and Prv^c (c = 0, 1, 2, 3, 4, 5) represents the probability that the v-th preliminary recognition result of the image to be recognized belongs to the c-th class of fault;
the preliminary identification result of the image to be identified is defined as:
Rfirst = {(Iv, p_boxv^lt, p_boxv^rb, Prv), v ∈ [1, NRs]}
wherein Rfirst represents the preliminary recognition result of the image to be recognized;
The final recognition result of the image to be recognized is defined as:
Rfinal = {(p_boxε^lt, p_boxε^rb, Plabelε), ε ∈ [1, NRe]}
wherein Rfinal represents the final recognition result of the image to be recognized, and NRe represents the number of final recognition results of the image to be recognized; p_boxε^lt represents the coordinates of the upper left corner of the prediction frame in the ε-th final recognition result of the image to be recognized; p_boxε^rb represents the coordinates of the lower right corner of the prediction frame in the ε-th final recognition result of the image to be recognized; Plabelε indicates the fault class to which the ε-th final recognition result of the image to be recognized belongs.
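Confidence screening followed by non-maximum suppression, which turns the preliminary recognition result Rfirst into the final recognition result Rfinal, can be sketched as follows; the threshold values and the use of torchvision's nms operator are assumptions and not values fixed by the text.

import torch
from torchvision.ops import nms

def postprocess(boxes, fg_prob, cls_prob, conf_thresh=0.5, iou_thresh=0.45):
    # boxes: (N, 4) corner coordinates; fg_prob: (N,) foreground probabilities Iv; cls_prob: (N, C) class probabilities Prv
    scores, labels = (fg_prob.unsqueeze(1) * cls_prob).max(dim=1)   # best class score per prediction frame
    keep = scores > conf_thresh                                     # confidence screening
    boxes, scores, labels = boxes[keep], scores[keep], labels[keep]
    final = []
    for c in labels.unique():                                       # class-wise non-maximum suppression
        idx = (labels == c).nonzero(as_tuple=True)[0]
        kept = nms(boxes[idx], scores[idx], iou_thresh)
        for i in idx[kept]:
            final.append((boxes[i, :2].tolist(), boxes[i, 2:].tolist(), int(labels[i])))
    return final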
The invention provides an automatic identification method for abnormalities and faults of freight train carriages at railway stations, which accesses remote pictures by parsing a command file, identifies the samples of a passing train, and completes functions such as storing fault images and generating log files. In on-site tests, the recognition algorithm after training optimization is greatly improved in both precision and recall compared with the unoptimized recognition algorithm.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (5)

1. An automatic identification method for abnormal faults of freight train carriages at a station is characterized by comprising the following steps:
step 1: respectively shooting a left high-resolution image, a right high-resolution image and a top high-resolution image of each carriage by using a high-speed linear array camera under the condition that a freight train does not stop to construct a carriage high-resolution image data set, reducing the carriage high-resolution image to a proper proportion by using a linear interpolation method in an equal proportion, cutting the carriage high-resolution image into four overlapped image blocks with the same size, screening out carriage image samples containing faults from all the image blocks, and constructing a carriage fault image data set by using the carriage image samples containing the faults;
step 2: manually labeling a carriage fault marking frame and fault types of each carriage fault image in the carriage fault image data set in the step 1, respectively counting the number of carriage fault image samples of each fault type, and collecting fault types of which the number of the image samples is less than a sample number threshold value until the number of the carriage fault image samples of each fault type is greater than the sample number threshold value so as to construct a train carriage abnormal fault identification network training set;
and step 3: constructing a train carriage abnormal fault recognition network, taking the train carriage abnormal fault recognition network training set in the step 2 as input data, constructing a train carriage abnormal fault recognition network loss function by combining the fault types of the carriage fault image samples in the train carriage abnormal fault recognition network training set, and obtaining the optimized train carriage abnormal fault recognition network through gradient descent algorithm training;
and 4, step 4: inputting the image to be recognized into the optimized train compartment abnormal fault recognition network, predicting to obtain a first prediction characteristic diagram, a second prediction characteristic diagram and a third prediction characteristic diagram of the image to be recognized, splicing the first prediction characteristic diagram, the second prediction characteristic diagram and the third prediction characteristic diagram of the image to be recognized to obtain a primary recognition result of the image to be recognized, and performing operations such as confidence screening, non-maximum value suppression and the like to obtain a final recognition result.
2. The automatic identification method for the abnormal fault of the freight train car at the station as claimed in claim 1, characterized in that:
step 1, the compartment fault image data set comprises:
{trains(m,n),s∈[1,S],m∈[1,M],n∈[1,N]}
wherein trains(m, n) represents the pixel information of the m-th row and the n-th column of the s-th carriage fault image in the carriage fault image data set, S represents the number of all image samples in the carriage fault image data set, M is the number of rows of each fault image in the carriage fault image data set, and N is the number of columns of each fault image in the carriage fault image data set.
3. The automatic identification method for the abnormal fault of the freight train car at the station as claimed in claim 1, characterized in that:
step 2, the coordinates of the compartment fault marking frame of each compartment fault image in the compartment fault image data set are as follows:
boxs,k = (boxs,k^lt, boxs,k^rb), s ∈ [1, S], k ∈ [1, Ks]
boxs,k^lt = (xs,k^lt, ys,k^lt)
boxs,k^rb = (xs,k^rb, ys,k^rb)
where l denotes the left of the carriage fault image, t denotes the top of the carriage fault image, r denotes the right of the carriage fault image, and b denotes the bottom of the carriage fault image; S represents the number of all carriage fault images in the carriage fault image data set, and Ks represents the total number of carriage fault marking frames in the s-th carriage fault image of the carriage fault image data set; boxs,k represents the coordinates of the k-th carriage fault marking frame in the s-th carriage fault image of the carriage fault image data set; boxs,k^lt represents the coordinates of the upper left corner of the k-th carriage fault marking frame in the s-th carriage fault image of the carriage fault image data set, xs,k^lt represents the abscissa of that upper left corner, and ys,k^lt represents the ordinate of that upper left corner; boxs,k^rb represents the coordinates of the lower right corner of the k-th carriage fault marking frame in the s-th carriage fault image of the carriage fault image data set, xs,k^rb represents the abscissa of that lower right corner, and ys,k^rb represents the ordinate of that lower right corner;
step 2, the compartment fault marking frame category information of each compartment fault image in the compartment fault image data set is as follows:
labels,k,c,s∈[1,S],k∈[1,Ks],c∈[1,C]
wherein C is the total number of fault types in the carriage fault image data set; labels,k,c indicates that the k-th carriage fault marking frame of the s-th carriage fault image in the carriage fault image data set belongs to the c-th fault type;
step 2, the training set of the train compartment abnormal fault recognition network is as follows:
{trains(m,n),(boxs,k,labels,k,c)}
s∈[1,S],m∈[1,M],n∈[1,N],k∈[1,Ks],c∈[1,C]
wherein trains(m, n) represents the pixel information of the m-th row and the n-th column of the s-th carriage fault image in the train carriage abnormal fault recognition network training set, boxs,k represents the coordinates of the k-th carriage fault marking frame in the s-th carriage fault image of the train carriage abnormal fault recognition network training set, and labels,k,c indicates that the k-th carriage fault marking frame of the s-th carriage fault image in the train carriage abnormal fault recognition network training set belongs to the c-th fault type; S represents the number of all image samples in the train carriage abnormal fault recognition network training set, M is the number of rows of each fault image in the train carriage abnormal fault recognition network training set, N is the number of columns of each fault image in the train carriage abnormal fault recognition network training set, Ks is the total number of carriage fault marking frames in the s-th carriage fault image, and C is the total number of fault types in the train carriage abnormal fault recognition network training set.
4. The automatic identification method for the abnormal fault of the freight train car at the station as claimed in claim 1, characterized in that:
and 3, the train compartment abnormal fault identification network specifically comprises: the system comprises a feature extraction network, a channel feature fusion network, a first spatial feature fusion network, a second spatial feature fusion network and a multi-scale prediction layer;
the channel feature fusion network is embedded in the feature extraction network as a sub-module; the feature extraction network is serially cascaded with the first spatial feature fusion network and then is connected with the second spatial feature fusion network in parallel; the second spatial feature fusion network is serially cascaded with the multi-scale prediction layer;
the feature extraction network: the dimensionality reduction convolution module and the residual error module are sequentially stacked and cascaded;
the dimension reduction convolution module is formed by sequentially stacking and cascading a dimension reduction convolution layer, a dimension reduction batch normalization layer and a Leaky ReLU activation layer;
the residual module is formed by sequentially stacking and cascading a plurality of Ghost residual blocks;
the Ghost residual block is composed of a residual convolution layer, a residual batch normalization layer and a ReLU activation layer according to the stacking mode of the traditional residual block;
the feature extraction network is defined as:
Figure FDA0002815286200000031
a1∈[1,NJ],a2∈[1,NC],a3∈[1,NG]
Figure FDA0002815286200000032
wherein NJ represents the number of dimension-reduction convolution modules in the feature extraction network, NC represents the number of residual modules in the feature extraction network, and NG represents the number of Ghost residual blocks in each residual module of the feature extraction network,
Figure FDA0002815286200000033
represents the number of layers of the dimensionality reduction convolution layer in each dimensionality reduction convolution module,
Figure FDA0002815286200000034
representing the number of layers of the dimensionality reduction batch normalization layer in each dimensionality reduction convolution module,
Figure FDA0002815286200000035
indicates the number of layers of the residual convolutional layer in each Ghost residual block,
Figure FDA0002815286200000036
representing the number of layers of a residual error batch normalization layer in each Ghost residual block;
Figure FDA0002815286200000037
representing the parameters in the b1 dimension reduction convolution layer in the a1 dimension reduction convolution module as the parameters to be optimized;
Figure FDA0002815286200000038
representing the translation amount of a b2 dimension reduction batch normalization layer in an a1 dimension reduction convolution module as a parameter to be optimized;
Figure FDA0002815286200000039
representing the scaling quantity of a b2 dimension reduction batch normalization layer in an a1 dimension reduction convolution module as a parameter to be optimized;
Figure FDA0002815286200000041
representing the parameters in the b3 th residual convolution layer in the a3 th Ghost residual block under the a2 th residual module as the parameters to be optimized;
Figure FDA0002815286200000042
representing the translation amount of a b4 th residual error batch normalization layer in a3 th Ghost residual error block under an a2 th residual error module, wherein the translation amount is a parameter to be optimized;
Figure FDA0002815286200000043
representing the scaling quantity of a b4 th residual error batch normalization layer in a3 th Ghost residual error block under an a2 th residual error module as a parameter to be optimized;
the input data of the feature extraction network is a single image in the train carriage abnormal fault recognition network training set of step 2, and the output data are the low-dimensional feature map Feat1 (M1 × N1 × C1), the medium-dimensional feature map Feat2 (M2 × N2 × C2), and the high-dimensional feature map Feat3 (M3 × N3 × C3);
In the output data of the feature extraction network, M1 is the width of the low-dimensional feature map Feat1, N1 is the height of the low-dimensional feature map Feat1, and C1 is the number of channels of the low-dimensional feature map Feat1; M2 is the width of the medium-dimensional feature map Feat2, N2 is the height of the medium-dimensional feature map Feat2, and C2 is the number of channels of the medium-dimensional feature map Feat2; M3 is the width of the high-dimensional feature map Feat3, N3 is the height of the high-dimensional feature map Feat3, and C3 is the number of channels of the high-dimensional feature map Feat3;
the first spatial feature fusion network: the first space convolution layer, the first space batch normalization layer and the maximum pooling module are sequentially stacked and cascaded;
the maximum pooling module is formed by connecting a first maximum pooling layer, a second maximum pooling layer, a third maximum pooling layer and a fourth maximum pooling layer in parallel;
the first spatial feature fusion network is defined as:
fSPP(SPP_kernele, SPP_γg, SPP_βg), e ∈ [1, N_SPP^conv], g ∈ [1, N_SPP^bn]
wherein N_SPP^conv represents the number of first spatial convolution layers in the first spatial feature fusion network, and N_SPP^bn represents the number of first spatial batch normalization layers in the first spatial feature fusion network; SPP_kernele represents the parameters of the e-th first spatial convolution layer in the first spatial feature fusion network, which are parameters to be optimized; SPP_γg represents the translation amount of the g-th first spatial batch normalization layer in the first spatial feature fusion network, a parameter to be optimized; SPP_βg represents the scaling amount of the g-th first spatial batch normalization layer in the first spatial feature fusion network, a parameter to be optimized;
the input data of the first spatial feature fusion network is the high-dimensional feature map Feat3, and the output data is the spatial fusion feature map Feat4 (M4 × N4 × C4);
In the output data of the first spatial feature fusion network, M4 is the width of the spatial fusion feature map Feat4, N4 is the height of the spatial fusion feature map Feat4, and C4 is the number of channels of the spatial fusion feature map Feat4;
the second spatial feature fusion network: the device consists of a second space convolution layer, a second space deconvolution layer, a second space batch normalization layer and a ReLU activation layer which are connected in a cross way;
the second spatial feature fusion network is defined as:
fPAN(PAN_kernelp, PAN_Ukernelq, PAN_γr, PAN_βr), p ∈ [1, N_PAN^conv], q ∈ [1, N_PAN^deconv], r ∈ [1, N_PAN^bn]
wherein N_PAN^conv represents the number of second spatial convolution layers in the second spatial feature fusion network, N_PAN^deconv represents the number of second spatial deconvolution layers in the second spatial feature fusion network, and N_PAN^bn represents the number of second spatial batch normalization layers in the second spatial feature fusion network; PAN_kernelp represents the parameters of the p-th second spatial convolution layer in the second spatial feature fusion network, which are parameters to be optimized; PAN_Ukernelq represents the parameters of the q-th second spatial deconvolution layer in the second spatial feature fusion network, which are parameters to be optimized; PAN_γr represents the translation amount of the r-th second spatial batch normalization layer in the second spatial feature fusion network, a parameter to be optimized; PAN_βr represents the scaling amount of the r-th second spatial batch normalization layer in the second spatial feature fusion network, a parameter to be optimized;
the input data of the second spatial feature fusion network are the low-dimensional feature map Feat1, the medium-dimensional feature map Feat2, and the spatial fusion feature map Feat4, and the output data are the first fused feature map Feat5 (M5 × N5 × C5), the second fused feature map Feat6 (M6 × N6 × C6), and the third fused feature map Feat7 (M7 × N7 × C7);
In the output data of the second spatial feature fusion network, M5 is the width of the first fused feature map Feat5, N5 is the height of the first fused feature map Feat5, and C5 is the number of channels of the first fused feature map Feat5; M6 is the width of the second fused feature map Feat6, N6 is the height of the second fused feature map Feat6, and C6 is the number of channels of the second fused feature map Feat6; M7 is the width of the third fused feature map Feat7, N7 is the height of the third fused feature map Feat7, and C7 is the number of channels of the third fused feature map Feat7;
the channel feature fusion network comprises: the average pooling layer, the full-connection layer, the ReLU activation layer and the Sigmoid activation layer are sequentially stacked and cascaded;
the channel feature fusion network is defined as:
fSE(SE_kernelz),z∈[1,NSE]
wherein NSE represents the number of fully connected layers in the channel feature fusion network; SE_kernelz represents the parameters of the z-th fully connected layer in the channel feature fusion network, which are parameters to be optimized;
the input data of the channel feature fusion network are the low-dimensional feature map Feat1, the medium-dimensional feature map Feat2, and the high-dimensional feature map Feat3, and the output data are the first tensor Tensor1 (T × T1), the second tensor Tensor2 (T × T2), and the third tensor Tensor3 (T × T3);
In the output data of the channel feature fusion network, T is the number of rows of the first tensor Tensor1, the second tensor Tensor2 and the third tensor Tensor3, T1 is the number of columns of the first tensor Tensor1, T2 is the number of columns of the second tensor Tensor2, and T3 is the number of columns of the third tensor Tensor3;
the multi-scale prediction layer: sequentially stacking and cascading a prediction convolution layer, a prediction batch normalization layer and a ReLU activation layer;
the multi-scale prediction layer is defined as:
fYOLO(YO_kernelx, YO_γy, YO_βy), x ∈ [1, N_YO^conv], y ∈ [1, N_YO^bn]
wherein N_YO^conv represents the number of prediction convolution layers in the multi-scale prediction layer, and N_YO^bn represents the number of prediction batch normalization layers in the multi-scale prediction layer; YO_kernelx represents the parameters of the x-th prediction convolution layer in the multi-scale prediction layer, which are parameters to be optimized; YO_γy represents the translation amount of the y-th prediction batch normalization layer in the multi-scale prediction layer, a parameter to be optimized; YO_βy represents the scaling amount of the y-th prediction batch normalization layer in the multi-scale prediction layer, a parameter to be optimized;
the input data of the multi-scale prediction layer are the first fused feature map Feat5, the second fused feature map Feat6, and the third fused feature map Feat7, and the output data are the first prediction feature map Feat8 (M8 × N8 × C8), the second prediction feature map Feat9 (M9 × N9 × C9), and the third prediction feature map Feat10 (M10 × N10 × C10);
In the output data of the multi-scale prediction layer, M8 is the width of the first prediction feature map Feat8, N8 is the height of the first prediction feature map Feat8, and C8 is the number of channels of the first prediction feature map Feat8; M9 is the width of the second prediction feature map Feat9, N9 is the height of the second prediction feature map Feat9, and C9 is the number of channels of the second prediction feature map Feat9; M10 is the width of the third prediction feature map Feat10, N10 is the height of the third prediction feature map Feat10, and C10 is the number of channels of the third prediction feature map Feat10;
step 3, constructing the train compartment abnormal fault recognition network loss function through a positioning loss function, a confidence loss function and a classification loss function;
when the train carriage fault image is input into the train carriage abnormal fault recognition network for training, the image is divided into A × A grids, each grid is preset with B anchor boxes, and the network regresses A × A × B corresponding prediction frames, but not all of the prediction frames participate in the calculation of the loss function; when the center point of a fault marking frame (boxs,k, labels,k,c) in a carriage fault image trains(m, n) falls in the i-th grid, the one of the B anchor boxes with the largest IOU with that fault marking frame is selected to learn the characteristic information of the fault and is regarded as a positive sample, and the remaining B - 1 anchor boxes are regarded as negative samples;
the positioning loss function is:
Lloc = Σi=1..A×A Σj=1..B 1ij^obj · [1 - IoUi + di²/li² + αi·vi]
and
IoUi = |boxs,k ∩ p_boxs,k| / |boxs,k ∪ p_boxs,k|
vi = (4/π²) · (arctan(wi/hi) - arctan(wi^p/hi^p))²
αi = vi / (1 - IoUi + vi)
wherein 1ij^obj indicates whether the j-th anchor box under the i-th grid is responsible for predicting a certain fault; if so its value is 1, otherwise it is 0; the so-called "responsible" means that, among all B anchor boxes under the i-th grid, the IOU between the j-th anchor box and the marking frame of that fault is the largest; IoUi is the intersection-over-union between the fault marking frame (boxs,k, labels,k,c) of the carriage fault image trains(m, n) falling in the i-th grid and the corresponding fault prediction frame (p_boxs,k, p_labels,k,c); di is the Euclidean distance between the two center points of the fault marking frame (boxs,k, labels,k,c) and the corresponding fault prediction frame (p_boxs,k, p_labels,k,c); li is the diagonal distance of the smallest rectangle that can simultaneously cover the fault marking frame (boxs,k, labels,k,c) and the fault prediction frame (p_boxs,k, p_labels,k,c); vi measures the consistency of the aspect ratios, with wi and hi the width and height of the fault marking frame and wi^p and hi^p those of the fault prediction frame; αi is the trade-off parameter; the positioning loss Lloc therefore indicates that, when the k-th fault marking frame (boxs,k, labels,k,c) of an input image trains(m, n) falls in the i-th grid and the j-th anchor box is responsible for predicting the fault, the fault prediction frame (p_boxs,k, p_labels,k,c) generated by that anchor box and the fault marking frame (boxs,k, labels,k,c) are used together to calculate the positioning loss;
the confidence loss function is:
Lconf = - Σi=1..A×A Σj=1..B 1ij^obj · λobj · [Ĉi·ln(Ci) + (1 - Ĉi)·ln(1 - Ci)] - Σi=1..A×A Σj=1..B 1ij^noobj · λnoobj · [Ĉi·ln(Ci) + (1 - Ĉi)·ln(1 - Ci)]
and
1ij^noobj = 1 - 1ij^obj
wherein 1ij^noobj indicates that the j-th anchor box of the i-th grid is not responsible for predicting the fault, i.e. in the i-th grid the IOU between the j-th anchor box and the fault marking frame is not the largest among all B anchor boxes; λobj and λnoobj respectively denote the weights used when the anchor box is responsible and not responsible for predicting a certain fault; Ĉi is the true value of the confidence, which takes 1 if the j-th anchor box of the i-th grid is responsible for predicting a certain fault and 0 otherwise; Ci is the confidence of the prediction frame output by the multi-scale prediction layer YOLO_head; the confidence loss Lconf therefore consists of the confidence loss of the prediction frames in which an object exists and the confidence loss of the prediction frames in which no object exists;
the classification loss function is:
Lcls = - Σi=1..A×A Σj=1..B 1ij^obj · Σc=1..C [P̂ij(c)·ln(Pij(c))]
wherein P̂ij is the true value of the class probability: when the j-th anchor box under the i-th grid is responsible for predicting a certain fault (boxs,k, labels,k,c), P̂ij is a one-hot matrix of dimension C × 1 whose c-th dimension is 1 and whose remaining dimensions are 0; Pij is the class probability of the prediction frame output by the multi-scale prediction layer YOLO_head, also a matrix of dimension C × 1; the loss value Lcls between the two is calculated with the cross entropy;
The train compartment abnormal fault identification network loss function is as follows:
L=Lloc+Lconf+Lcls
wherein L islocFor the localization loss function, LconfAs a function of confidence loss, LclsIs a classification loss function.
5. The automatic identification method for the abnormal fault of the freight train car at the station as claimed in claim 1, characterized in that:
step 4, the first prediction characteristic map of the image to be recognized is Feat 8;
step 4, the second prediction characteristic map of the image to be recognized is Feat 9;
step 4, the third prediction characteristic map of the image to be recognized is Feat 10;
step 4, the preliminary identification result of the image to be identified comprises the probability that the prediction frame belongs to the foreground, the coordinate of the prediction frame and the class probability of the prediction frame;
the probability that the prediction frame belongs to the foreground in the preliminary recognition result of the image to be recognized is defined as follows:
Iv ∈ [0,1], v ∈ [1, NRs]
wherein NRs represents the number of preliminary recognition results of the image to be recognized, and Iv represents the probability that the prediction frame in the v-th preliminary recognition result of the image to be recognized belongs to the foreground;
and the coordinates of a prediction frame in the preliminary identification result of the image to be identified are defined as:
p_boxv = (p_boxv^lt, p_boxv^rb), v ∈ [1, NRs]
p_boxv^lt = (p_xv^lt, p_yv^lt), p_boxv^rb = (p_xv^rb, p_yv^rb)
wherein l represents the left of the image to be recognized, t represents the top of the image to be recognized, r represents the right of the image to be recognized, and b represents the bottom of the image to be recognized; p_boxv^lt represents the coordinates of the upper left corner of the prediction frame in the v-th preliminary recognition result of the image to be recognized, p_xv^lt represents the abscissa of the upper left corner of that prediction frame, and p_yv^lt represents the ordinate of the upper left corner of that prediction frame; p_boxv^rb represents the coordinates of the lower right corner of the prediction frame in the v-th preliminary recognition result of the image to be recognized, p_xv^rb represents the abscissa of the lower right corner of that prediction frame, and p_yv^rb represents the ordinate of the lower right corner of that prediction frame;
and the prediction frame category probability in the preliminary identification result of the image to be identified is defined as:
Prv = {Prv^0, Prv^1, Prv^2, Prv^3, Prv^4, Prv^5}
wherein Prv represents the set of all six fault class probabilities in the v-th preliminary recognition result of the image to be recognized, and Prv^c (c = 0, 1, 2, 3, 4, 5) represents the probability that the v-th preliminary recognition result of the image to be recognized belongs to the c-th class of fault;
the preliminary identification result of the image to be identified is defined as:
Rfirst = {(Iv, p_boxv^lt, p_boxv^rb, Prv), v ∈ [1, NRs]}
wherein Rfirst represents the preliminary recognition result of the image to be recognized;
the final recognition result of the image to be recognized is defined as:
Rfinal = {(p_boxε^lt, p_boxε^rb, Plabelε), ε ∈ [1, NRe]}
wherein Rfinal represents the final recognition result of the image to be recognized, and NRe represents the number of final recognition results of the image to be recognized; p_boxε^lt represents the coordinates of the upper left corner of the prediction frame in the ε-th final recognition result of the image to be recognized; p_boxε^rb represents the coordinates of the lower right corner of the prediction frame in the ε-th final recognition result of the image to be recognized; Plabelε indicates the fault class to which the ε-th final recognition result of the image to be recognized belongs.
Publications (2)

Publication Number Publication Date
CN112464846A 2021-03-09
CN112464846B 2024-04-02



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant