CN114758150A - Method, device and equipment for identifying state of train brake chain and storage medium - Google Patents


Publication number
CN114758150A
CN114758150A (application CN202011583368.7A)
Authority
CN
China
Prior art keywords
chain
prediction
image data
prediction model
coordinates
Prior art date
Legal status
Pending
Application number
CN202011583368.7A
Other languages
Chinese (zh)
Inventor
徐永燊
刘玉珠
Current Assignee
Guangzhou Huiruisitong Artificial Intelligence Technology Co ltd
Guangzhou Huiruisitong Technology Co Ltd
Original Assignee
Guangzhou Huiruisitong Artificial Intelligence Technology Co ltd
Guangzhou Huiruisitong Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huiruisitong Artificial Intelligence Technology Co ltd and Guangzhou Huiruisitong Technology Co Ltd
Priority to CN202011583368.7A
Publication of CN114758150A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a method, an apparatus, a device and a storage medium for identifying the state of a train brake chain. The method comprises: acquiring image data of a train brake chain; performing feature extraction on the image data to obtain feature maps of different sizes; inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data; determining, under each prediction model, candidate predicted coordinates in the image data for a plurality of preset key points on the brake chain; de-duplicating the candidate predicted coordinates across the prediction models to obtain optimal predicted coordinates in the image data for the preset key points; and determining the brake chain state in the image data from those optimal predicted coordinates. The method identifies the state of the brake chain automatically from the prediction models, so no manual inspection is needed; it eliminates heavy and inefficient work, effectively improves train operating efficiency, and safeguards operational safety.

Description

Method, device and equipment for identifying state of train brake chain and storage medium
Technical Field
The application relates to the technical field of intelligent rail transit, in particular to a method, a device, equipment and a storage medium for identifying the state of a train brake chain.
Background
Computer vision processing is an important technology in the field of intelligent rail transit and is attracting growing attention. In computer vision processing, imaging systems replace the eye as the input channel for visual information, and a computer replaces the brain to carry out processing and interpretation. Such techniques enable machines not only to perceive geometric information in an environment (e.g., position, size, shape, and motion) but also to describe, interpret, and understand that information. Computer vision processing therefore offers an intuitive and convenient means of analysis for the intelligent rail transit field.
Before a train departs, the brake chains between the carriages must be confirmed to be in a slack state, so that accidents are avoided while the train is running. When a brake chain is in a tight state, the carriages are braked and the train cannot start; only when the brake chain is slack is the coupling between carriages in its normal state and the train free to depart. At present, the brake chain of each carriage is checked by manual inspection; this traditional approach is labour-intensive and inefficient, seriously delays train departure, and in turn reduces the operating efficiency of the whole rail transit system.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, the application provides a method, an apparatus, a device and a storage medium for identifying the state of a train brake chain.
In a first aspect, the present application provides a method for identifying a state of a train brake chain, including: acquiring image data of a train brake chain;
extracting the features of the image data to obtain feature maps with different sizes;
inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, wherein the prediction data comprises first prediction data and second prediction data, the first prediction data comprises the classification probability of each pixel point in the feature map corresponding to a brake chain, and the second prediction data comprises the prediction coordinate offset of a plurality of preset key points on the brake chain corresponding to the pixel point;
determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
performing duplicate removal processing on candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data;
and determining the state of the brake chain in the image data according to the corresponding optimal predicted coordinates of a plurality of preset key points on the brake chain in the image data.
In a second aspect, the present application provides a state recognition device for a train brake chain, comprising: the image acquisition module is used for acquiring image data of the train brake chain;
the characteristic extraction module is used for extracting the characteristics of the image data to obtain characteristic graphs with different sizes;
the model prediction model is used for inputting the feature map of each size into the corresponding prediction model to obtain corresponding prediction data, the prediction data comprise first prediction data and second prediction data, the first prediction data comprise the classification probability of each pixel point in the feature map corresponding to the brake chain, and the second prediction data comprise the prediction coordinate offset of a plurality of preset key points on the brake chain corresponding to the pixel point;
the coordinate calculation module is used for determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
the chain duplicate removal module is used for carrying out duplicate removal processing on candidate prediction coordinates, corresponding to the plurality of preset key points on the brake chain in the image data, under each prediction model to obtain optimal prediction coordinates, corresponding to the plurality of preset key points on the brake chain in the image data;
and the state determining module is used for determining the state of the brake chain in the image data according to the corresponding optimal prediction coordinates of a plurality of preset key points on the brake chain in the image data.
In a third aspect, the present application provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.
In the method for identifying the state of a train brake chain provided by this embodiment, feature extraction is performed on the image data to obtain feature maps of several sizes, and the feature map of each size is fed into a corresponding prediction model. To accommodate brake chains that appear at different scales in the image data, the image data is processed into feature maps of different sizes, each processed by its own prediction model; multiple candidate predictions are obtained and the optimal prediction is selected from among them, which improves the accuracy of brake chain state recognition. Because the state of the brake chain is identified automatically by the prediction models, no manual inspection is needed; heavy and inefficient work is eliminated, the operating efficiency of the train is effectively improved, and the running safety of the train is ensured.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in describing the embodiments or the prior art are briefly introduced below; it is obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of a method for identifying the state of a train brake chain according to an embodiment of the present application;
fig. 2 is a schematic flowchart of feature extraction performed on the image data in the embodiment of the present application;
FIG. 3 is a schematic flowchart of an example of performing feature extraction on the image data and outputting prediction data by using a prediction model in an embodiment of the present application;
FIG. 4 is a schematic diagram of a plurality of key points selected on a brake chain according to an embodiment of the present disclosure;
FIG. 5a is a schematic flowchart of step S140 in the embodiment of the present application;
FIG. 5b is a schematic flowchart illustrating step S142 in the embodiment of the present application;
FIG. 6a is a schematic flow chart of determining a priori chain width in an embodiment of the present application;
FIG. 6b is a schematic flow chart illustrating the determination of a priori chain sag in an embodiment of the present application;
FIG. 7 is a flowchart illustrating step S150 in the embodiment of the present application;
FIG. 8 is a flowchart illustrating step S160 in the embodiment of the present application;
FIG. 9 is a schematic flow chart of a positive sample extraction process in an embodiment of the present application;
fig. 10 is a block diagram showing a state recognition device for a train brake chain according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making creative efforts shall fall within the protection scope of the present application.
Fig. 1 shows a method for identifying the state of a train brake chain according to an embodiment of the present application. As shown in fig. 1, the method includes the following steps:
s110, collecting image data of a train brake chain;
when the system is specifically implemented, a high-definition camera can be erected beside a train track, the high-definition camera is used for shooting an image containing a train brake chain, and then image data are collected from the high-definition camera.
It can be understood that the training samples used to pre-train the prediction models employed in the subsequent steps may likewise be images collected from a high-definition camera beside the train track. Each image is annotated; specifically, information such as the coordinates of the plurality of key points on the brake chain and whether the brake chain is in a slack state or a tight state may be marked.
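As an illustration of such annotation, a training record might be stored as below; the field names, file name, and coordinate values are hypothetical, not taken from the patent:

```python
# Hypothetical annotation record for one training image: the three labeled
# key points (end points A and B of the chain, and the mid-perpendicular
# intersection D) plus the chain state. All names here are illustrative.
def make_annotation(image_path, a, b, d, state):
    assert state in ("slack", "tight")
    return {
        "image": image_path,
        "keypoints": {"A": a, "B": b, "D": d},  # (x, y) pixel coordinates
        "state": state,
    }

ann = make_annotation("frame_0001.jpg", (120, 310), (420, 305), (268, 390), "slack")
print(ann["state"])  # slack
```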
S120, extracting the features of the image data to obtain feature maps with different sizes;
It is understood that the sizes may include 13x13, 26x26, 52x52, and so on. For example, if feature maps of four sizes are extracted in this step, four prediction models are required in the subsequent steps, with the feature map of each size input into its corresponding prediction model.
In specific implementation, assuming that the type of the size of the feature map is N, where N is an integer greater than or equal to 2, as shown in fig. 2, the process of extracting the features of the image data in step S120 may include the following steps:
s121, performing downsampling processing on the image data for multiple times to obtain a first feature map, wherein the size of the first feature map is a first size;
In a specific implementation, a convolution module and/or a bottleneck module may be used for feature extraction; each extraction method has its own advantages. In particular, the bottleneck module has a low computational cost, which can improve the running efficiency of the whole method flow.
For example, referring to fig. 3, an input image of size 416x416 undergoes 5 downsampling steps to obtain a first feature map of size 13x13. Each downsampling reduces the image size: the first downsampling (the conv2d 3x3 module in the figure) yields 208x208, the second (a bottleneck module) yields 104x104, the third yields 52x52, the fourth yields 26x26, and the fifth yields 13x13.
S122, determining each of the second to the N-th feature maps based on the first feature map. The process for determining the i-th feature map is: upsample the (i-1)-th feature map to obtain a feature map of the i-th size; concatenate that feature map with the feature map of the same i-th size generated during the multiple downsampling passes to obtain the i-th feature map. The i-th feature map has the i-th size, and the sizes increase with i; i is an integer greater than or equal to 2 and less than or equal to N.
For example, referring to fig. 3 with N = 3, an image of size 416x416 undergoes one feature extraction by a convolution module (kernel size 3x3) and four feature extractions by bottleneck modules, yielding a first feature map of size 13x13. The first feature map is then upsampled to 26x26 and concatenated with the 26x26 feature map produced after the third bottleneck module's downsampling, giving the second feature map. The second feature map is upsampled to 52x52 and concatenated with the 52x52 feature map produced after the second bottleneck module's downsampling, giving the third feature map. Three feature maps of different sizes are thus obtained: a first of size 13x13, a second of size 26x26, and a third of size 52x52. Of course, before each upsampling, the feature map may first pass through some convolution layers to refine it.
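The size bookkeeping of this pyramid (five 2x downsamplings of a 416x416 input, then upsample-and-concatenate steps back up) can be sketched with plain arithmetic; `pyramid_sizes` is an illustrative helper, not part of the patent:

```python
def pyramid_sizes(input_size=416, n_down=5, n_maps=3):
    """Sizes produced by repeated 2x downsampling, then the sizes of the
    n_maps feature maps obtained by walking back up the pyramid."""
    down = [input_size]
    for _ in range(n_down):
        down.append(down[-1] // 2)      # 416 -> 208 -> 104 -> 52 -> 26 -> 13
    # Feature map i is the previous map upsampled 2x and concatenated with
    # the matching downsampling stage, so its size doubles at each step.
    return [down[-1] * (2 ** i) for i in range(n_maps)]

print(pyramid_sizes())  # [13, 26, 52]
```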
S130, inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, wherein the prediction data comprise first prediction data and second prediction data, the first prediction data comprise classification probabilities of each pixel point in the feature map corresponding to a brake chain, and the second prediction data comprise prediction coordinate offsets of a plurality of preset key points on the brake chain corresponding to the pixel points;
the number of the prediction models is the same as the size category of the feature map;
for example, as shown in fig. 3, where N is 3, the first feature map is input into the first prediction model, the second feature map is input into the second prediction model, and the third feature map is input into the third prediction model, each of which outputs prediction data.
In a specific implementation, the preset key points may be chosen to suit the situation. In fig. 4, for example, the chosen points are one end point A of the arc-shaped brake chain AB, the other end point B, and the intersection point D between the perpendicular bisector of the straight segment AB joining the two end points and the brake chain. Point C in the figure is the midpoint of segment AB (AB is perpendicular to CD), and point E is the midpoint of segment CD. Because the brake chain is arc-shaped, E is also the centre of the chain's minimum enclosing rectangle, so E can serve as an anchor point — a single point representing the whole chain. As feature extraction is repeatedly applied to the image data, the whole chain contracts to one pixel point or a few pixel points in the feature map, and the centre of those pixel points is taken as the anchor point representing the whole chain.
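The geometry of these key points can be sketched as follows, using only the point definitions above (C as the midpoint of AB, E as the midpoint of CD, and — anticipating the slack ratio introduced later — slack as |CD| / |AB|); the function name is hypothetical:

```python
import math

def chain_geometry(a, b, d):
    """Derived points and ratios for key points A, B (chain end points)
    and D (intersection of AB's perpendicular bisector with the chain)."""
    c = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)   # C: midpoint of AB
    e = ((c[0] + d[0]) / 2, (c[1] + d[1]) / 2)   # E: anchor, midpoint of CD
    width = math.dist(a, b)                       # chain width |AB|
    slack = math.dist(c, d) / width               # chain slack |CD| / |AB|
    return c, e, width, slack

c, e, width, slack = chain_geometry((0, 0), (100, 0), (50, 40))
print(width, slack)  # 100.0 0.4
```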
S140, determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
It can be understood that each pixel point in the feature map represents several corresponding pixel points in the image data. In one image, some pixels belong to the brake chain and others to other things. Each feature-map pixel has a corresponding classification probability: a high probability means the corresponding image pixels are very likely brake chain pixels; a low probability means they are likely pixels of something else. For example, if in one feature map the classification probabilities of the four pixels at the intersections of rows ten and eleven with columns three and four all exceed 80%, then the image-data pixels corresponding to those four feature-map pixels belong to the brake chain and the remaining pixels do not. The position of the brake chain in the image data can therefore be determined from the classification probabilities in the feature map.
It can be understood that each pixel point in the feature map corresponds to the predicted coordinate offset of the plurality of preset key points in addition to the classification probability, and therefore after the position of the brake chain in the image data is determined according to the classification probability, the coordinates of the plurality of key points in the image data can be determined according to the predicted coordinate offset of the plurality of preset key points.
Based on the above understanding, as shown in fig. 5a, step S140 may include the steps of:
s141, determining coordinates of corresponding pixel points of the brake chain in the image data under each prediction model according to the classification probability output by each prediction model;
It can be understood that step S141 is the process of determining the position of the brake chain in the image data. The coordinates of the brake chain's corresponding pixel points in the image data may be taken as the centres of the image-data pixels corresponding to those feature-map pixels whose classification probability exceeds a preset threshold, i.e. the coordinates of the brake chain's anchor point in the image data. The coordinates determined in step S141 are therefore the coordinates of the anchor point in the image data.
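A plausible sketch of the mapping from a feature-map cell to anchor coordinates in the image data, assuming each cell of a 13x13 map covers a 32x32 block of a 416x416 input (the exact anchoring convention is not spelled out in the patent):

```python
def cell_to_image(row, col, input_size=416, fmap_size=13):
    """Map a feature-map cell (row, col) to the centre of the image patch
    it represents. Stride = input_size / fmap_size (32 for a 13x13 map of
    a 416x416 input); the half-cell offset picks the patch centre."""
    stride = input_size / fmap_size
    return ((col + 0.5) * stride, (row + 0.5) * stride)

print(cell_to_image(0, 0))  # (16.0, 16.0)
print(cell_to_image(6, 6))  # (208.0, 208.0)
```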
And S142, determining candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the plurality of preset key points.
And the predicted coordinate offset of the key point is the coordinate offset of the coordinate of the key point relative to the anchor point.
It is understood that step S142 is a process of determining coordinates of a plurality of key points in the image data, where the coordinates of the plurality of key points in the image data are calculated according to the prediction data output by each prediction model, and the coordinates are referred to as candidate prediction coordinates under the prediction model.
In a specific implementation, since the brake chain states include a slack state and a tight state, each prediction model can be given two states in order to improve recognition accuracy. The prediction data in each state comprises the classification probability that each feature-map pixel belongs to a brake chain and the predicted coordinate offsets of the preset key points for that pixel; that is, the data output by each prediction model includes the first prediction data and the second prediction data under each prior chain slack. For example, after a first feature map of size 13x13 is input into the first prediction model, prediction data for the slack state (size 13x13x7) and for the tight state (size 13x13x7) are obtained. Each feature-map pixel in each state corresponds to 7 values: 1 classification probability and 6 coordinate offsets (an abscissa and an ordinate offset for each of the three key points).
With a prediction model configured to output prediction data for both brake chain states, each feature-map pixel carries two classification probabilities, and the larger of the two is used to judge whether the pixel corresponds to brake chain pixels in the image data. When the larger probability is very high, the corresponding image-data pixels are very likely brake chain pixels, and the brake chain state associated with that larger probability is taken both as the state of those pixels and as the brake chain state of the image data. When even the larger of the two probabilities is very low, the corresponding image-data pixels belong to other things. Accordingly, in step S141, when determining the coordinates of the brake chain's corresponding pixel points under a prediction model (i.e. the position of the chain in the image data), the basis is the larger of the two classification probabilities of each pixel under that model.
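The selection of the larger of the two per-pixel classification probabilities might be sketched as follows; the 0.8 threshold is an assumed value, not one given in the patent:

```python
def select_state(prob_slack, prob_tight, threshold=0.8):
    """Pick the larger of the two per-pixel classification probabilities.
    The pixel counts as part of the brake chain only if that larger
    probability clears the (assumed) threshold; otherwise return None."""
    p, state = max((prob_slack, "slack"), (prob_tight, "tight"))
    return (state, p) if p >= threshold else (None, p)

print(select_state(0.60, 0.90))  # ('tight', 0.9)
print(select_state(0.30, 0.40))  # (None, 0.4)
```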
Based on the above understanding, referring to fig. 5b, step S142 may specifically comprise: determining the candidate predicted coordinates of the preset key points on the brake chain in the image data under the prediction model from the prior chain width preset for that model, the prior chain slack corresponding to the larger classification probability, the predicted coordinate offsets of the preset key points under that prior chain slack, and the coordinates of the brake chain's corresponding pixel points in the image data under that model.
The prior chain width is a prior value of the shortest distance from one end point of the train brake chain to the other, and the prior chain widths correspond one-to-one with the prediction models. The prior chain slack is a prior value characterising how slack the brake chain is; the prior chain slack preset for each prediction model comprises a first prior chain slack corresponding to the slack state and a second prior chain slack corresponding to the tight state. For example, with points A, B and D as described above as the key points, the chain slack may be the ratio of the length of the perpendicular segment CD to the chain width AB.
The smaller the size of the feature map corresponding to one prediction model is, the larger the prior chain width preset for the prediction model can be.
It will be appreciated that the shortest distance between two points is the length of the straight line segment formed by the two points.
The larger classification probability is, for the same pixel point, the larger of the classification probability corresponding to the first prior chain slack and that corresponding to the second prior chain slack.
For example, suppose a pixel in the feature map has a classification probability of 60% in one state, corresponding to the first prior chain slack, and 90% in the other, corresponding to the second prior chain slack. Since 90% is very high, the pixel corresponds to brake chain pixels in fig. 4, and the prior chain slack corresponding to the larger classification probability when computing the candidate predicted coordinates in step S142 is the second prior chain slack. It can be understood that, in this case, all the brake chain pixels in fig. 4 will generally correspond to the second prior chain slack.
In a specific implementation, if end point A is selected as a key point, step S142 may use a first formula to calculate the candidate predicted coordinates of one end point of the brake chain in the image data under a prediction model, the first formula being:

x_A = x_c - w × sigmoid(t_xA)
y_A = y_c - w × s × sigmoid(t_yA)

where (x_A, y_A) are the candidate predicted coordinates of end point A of the brake chain in the image data, (x_c, y_c) are the coordinates of the brake chain's corresponding pixel point in the image data, (t_xA, t_yA) are the predicted coordinate offsets of end point A under the prior chain slack corresponding to the larger classification probability, w is the prior chain width corresponding to the prediction model, and s is the prior chain slack corresponding to the larger classification probability.

In a specific implementation, if end point B is selected as another key point, step S142 may use a second formula to calculate the candidate predicted coordinates of the other end point of the brake chain in the image data under a prediction model, the second formula being:

x_B = x_c + w × sigmoid(t_xB)
y_B = y_c - w × s × sigmoid(t_yB)

where (x_B, y_B) are the candidate predicted coordinates of the other end point B of the brake chain in the image data, and (t_xB, t_yB) are the predicted coordinate offsets of end point B under the prior chain slack corresponding to the larger classification probability.

In a specific implementation, if intersection point D is selected as the third key point, step S142 may use a third formula to calculate the candidate predicted coordinates of the intersection point in the image data under a prediction model, the third formula being:

x_D = x_c + w × (sigmoid(t_xD) - 0.5)
y_D = y_c + w × s × sigmoid(t_yD)

where (x_D, y_D) are the candidate predicted coordinates of intersection point D in the image data, and (t_xD, t_yD) are the predicted coordinate offsets of intersection point D under the prior chain slack corresponding to the larger classification probability.
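These decoding formulas can be implemented directly; the following sketch mirrors them term by term (function and variable names are illustrative):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def decode_keypoints(xc, yc, w, s, t):
    """Decode candidate coordinates for key points A, B, D from the anchor
    (xc, yc), prior chain width w, prior chain slack s, and the predicted
    offsets t = (txA, tyA, txB, tyB, txD, tyD)."""
    txA, tyA, txB, tyB, txD, tyD = t
    a = (xc - w * sigmoid(txA), yc - w * s * sigmoid(tyA))
    b = (xc + w * sigmoid(txB), yc - w * s * sigmoid(tyB))
    d = (xc + w * (sigmoid(txD) - 0.5), yc + w * s * sigmoid(tyD))
    return a, b, d

# With all offsets 0 (sigmoid(0) = 0.5): A and B sit half a chain width
# left/right of the anchor and above it by half the sagged height; D sits
# on the anchor's vertical, below it by half the sagged height.
a, b, d = decode_keypoints(100.0, 100.0, 80.0, 0.5, (0, 0, 0, 0, 0, 0))
print(a, b, d)  # (60.0, 80.0) (140.0, 80.0) (100.0, 120.0)
```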
In an implementation, the a priori chain width is predetermined, and as shown in fig. 6a, a specific determination process may include:
S001a, calculating the chain width corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used for training the prediction models;
S002a, clustering the chain widths corresponding to all the training samples used for training the prediction models to obtain prior chain widths equal in number to the prediction models.
The clustering mode can adopt K-means clustering.
For example, with three prediction models, three prior chain widths need to be obtained by clustering: w1, w2 and w3, where w1 is the largest and w3 the smallest. w1 is assigned to the first prediction model for the first feature map of size 13x13, w2 to the second prediction model for the second feature map of size 26x26, and w3 to the third prediction model for the third feature map of size 52x52. Because the first feature map is the smallest, it undergoes the largest up-sampling when restored to the size of the image data, so the largest prior chain width is allocated to the first prediction model; the third feature map is the largest and undergoes the smallest up-sampling, so the third prediction model is assigned the smallest prior chain width.
In a specific implementation, the a priori chain slack is also predetermined, and as shown in fig. 6b, a specific determination process may include:
S001b, calculating the chain sag corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used for training the prediction models;
S002b, clustering the chain sags corresponding to all the training samples used for training the prediction models to obtain the first prior chain sag and the second prior chain sag.
The clustering mode can also adopt K-means clustering.
It can be understood that, because the coordinates of each key point are marked in the training samples, the chain slack and the chain width corresponding to each training sample can be calculated according to the marking information.
For example, the distribution of the a priori chain width and the a priori chain slack is shown in table 1 below:
TABLE 1 Allocation of prior chain width and prior chain sag

Prediction model | Feature map size | Prior chain width | Prior chain sag
first prediction model | 13x13 | w1 | s1, s2
second prediction model | 26x26 | w2 | s1, s2
third prediction model | 52x52 | w3 | s1, s2

In the above table, s1 is the first prior chain sag and s2 is the second prior chain sag.
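Steps S001a-S002a and S001b-S002b amount to computing a width and a sag per labeled sample and clustering each set of values. The Python sketch below (names are hypothetical) computes the per-sample chain width |AB| and sag |CD|/|AB| from the annotated key points, then runs a plain one-dimensional K-means to obtain the prior values:

```python
import random

def chain_width_and_sag(A, B, D):
    """Per-sample chain width |AB| and sag |CD|/|AB|, where C is the
    midpoint of the segment AB; A, B, D come from the sample annotations."""
    ab = ((A[0] - B[0]) ** 2 + (A[1] - B[1]) ** 2) ** 0.5
    C = ((A[0] + B[0]) / 2.0, (A[1] + B[1]) / 2.0)
    cd = ((C[0] - D[0]) ** 2 + (C[1] - D[1]) ** 2) ** 0.5
    return ab, cd / ab

def kmeans_1d(values, k, iters=50):
    """Plain one-dimensional K-means, enough to derive k prior values."""
    centers = sorted(random.sample(values, k))
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            buckets[min(range(k), key=lambda j: abs(v - centers[j]))].append(v)
        centers = [sum(b) / len(b) if b else centers[j]
                   for j, b in enumerate(buckets)]
    return sorted(centers)
```

Calling kmeans_1d on all chain widths with k equal to the number of prediction models yields the prior widths, and on all chain sags with k = 2 yields the first and second prior chain sags.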
S150, performing de-duplication processing on candidate prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data;
it is understood that candidate predicted coordinates can be obtained based on each prediction model, and the problem to be solved in step S150 is to determine which prediction model's candidate predicted coordinates should be taken as the optimal predicted coordinates. For example, if N is 3, the candidate predicted coordinates of the key points obtained based on the first prediction model form a first predicted coordinate set G1 = {(x1, y1), (x2, y2), ...}, those obtained based on the second prediction model form a second predicted coordinate set G2 = {(x1, y1), (x2, y2), ...}, and those obtained based on the third prediction model form a third predicted coordinate set G3 = {(x1, y1), (x2, y2), ...}; one of these predicted coordinate sets needs to be selected as the optimal predicted coordinate set.
In a specific implementation, as shown in fig. 7, the deduplication process of step S150 may include: for any two prediction models, judging whether any key point under one prediction model lies in the area formed by the plurality of preset key points under the other prediction model, and if so, taking the candidate predicted coordinates corresponding to the higher classification probability as the optimal predicted coordinates.
For example, an end point A, B and an intersection point D are selected as key points, and for a first prediction model and a second prediction model, if any one key point (A, B or D) in a first prediction coordinate set G1 is located in a triangular region formed by three key points in a second prediction coordinate set G2, the first prediction coordinate set G1 and the second prediction coordinate set G2 are considered to correspond to the same brake chain, at this time, a candidate prediction coordinate (for example, the first prediction coordinate set G1) corresponding to a larger classification probability is retained and used as the optimal prediction coordinate, and prediction data corresponding to a smaller classification probability is deleted.
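The endpoint-in-triangle deduplication can be sketched as below. This is a minimal Python illustration under assumptions: a sign-based point-in-triangle test, a symmetric overlap check between the two sets, and a (probability, keypoints) data layout, none of which are specified by the patent:

```python
def in_triangle(p, a, b, c):
    """Sign-based test: p is inside (or on the edge of) triangle a-b-c."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

def deduplicate(candidates):
    """candidates: list of (classification_probability, keypoints) where
    keypoints maps 'A', 'B', 'D' to (x, y). If any key point of one set
    lies inside the A-B-D triangle of another (in either direction), the
    two sets are treated as the same chain and only the set with the
    higher probability is kept."""
    kept = list(candidates)
    for i in range(len(kept)):
        for j in range(i + 1, len(kept)):
            gi, gj = kept[i], kept[j]
            if gi is None or gj is None:
                continue
            tri_i = (gi[1]['A'], gi[1]['B'], gi[1]['D'])
            tri_j = (gj[1]['A'], gj[1]['B'], gj[1]['D'])
            if (any(in_triangle(gi[1][k], *tri_j) for k in 'ABD')
                    or any(in_triangle(gj[1][k], *tri_i) for k in 'ABD')):
                kept[i if gi[0] < gj[0] else j] = None  # drop lower probability
    return [g for g in kept if g is not None]
```

Sets describing chains far apart never overlap and all survive, so the same routine also handles images containing several distinct brake chains.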
Understandably, when the high-definition camera is used for shooting the image containing the brake chain, some brake chains are closer to the high-definition camera, so that the brake chain is larger in the image data; and some brake chains are far away from the high-definition camera, so that the brake chains are small in image data. If various image data are processed into a feature map with one size and a prediction model is used for prediction processing, errors of predicted coordinates of some image data are large, and the recognition accuracy of the brake chain state is affected. In order to be suitable for brake chains with different sizes in image data, the image data are processed into feature maps with different sizes, then the feature maps are processed by using corresponding prediction models, so that multiple candidate prediction targets are obtained, the optimal prediction target is selected from the multiple candidate prediction targets, and the recognition accuracy of the brake chain state is improved.
And S160, determining the state of the brake chain in the image data according to the corresponding optimal prediction coordinates of the plurality of preset key points on the brake chain in the image data.
In practical implementation, when the endpoints a and B and the intersection D are used as the key points, step S160 may specifically include the steps shown in fig. 8:
s161, calculating chain sag in the image data according to the corresponding optimal prediction coordinates of the two end points and the intersection point on the brake chain in the image data, wherein the chain sag is the ratio of the length of the middle vertical line to the width of the chain;
the chain width is the shortest distance | AB | of the two end points on the brake chain in the image data, and the length of the middle vertical line is the shortest distance | CD | of the middle point of the straight line between the two end points and the intersection point in the image data.
And S162, judging whether the ratio is larger than a preset ratio threshold value or not, and if so, determining that the brake chain state in the image data is a loose state.
It can be understood that if the ratio | CD |/| AB | is less than or equal to the preset ratio threshold, the brake chain state in the image data is a tight state.
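Steps S161-S162 reduce to a small computation over the optimal predicted coordinates; in the sketch below the threshold value 0.25 is an assumed example, not a value from the patent:

```python
def chain_state(A, B, D, ratio_threshold=0.25):
    """Classify the brake chain from the optimal predicted coordinates:
    sag ratio = |CD| / |AB|, with C the midpoint of segment AB.
    The threshold 0.25 is an assumed example value."""
    ab = ((A[0] - B[0]) ** 2 + (A[1] - B[1]) ** 2) ** 0.5
    C = ((A[0] + B[0]) / 2.0, (A[1] + B[1]) / 2.0)
    cd = ((C[0] - D[0]) ** 2 + (C[1] - D[1]) ** 2) ** 0.5
    return "loose" if cd / ab > ratio_threshold else "tight"
```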
In a specific implementation, the prediction model is obtained by pre-training. Referring to fig. 3, a specific structure of the prediction model may include two convolution modules; for example, the convolution kernel of one convolution module is 3x3 and that of the other is 1x1. In the training process of the prediction model, the loss value of the prediction model can be calculated according to a loss function, and each parameter in the prediction model is then adjusted according to the loss value. A specific loss function may be:
L_total = (1/N_cls) × Σ_i L_cls(y_i, ŷ_i) + λ × (1/N_reg) × Σ_i L_reg(tp_i, t̂p_i)

in the formula, L_total is the total loss of the prediction model, λ is a preset balance parameter, N_cls is the number of all training samples required to train the prediction model, L_cls() is the classification loss function, y_i and ŷ_i are respectively the pre-marked brake chain state of the ith training sample and the classification probability output after the ith training sample is input into the prediction model, N_reg is the number of positive samples among all the training samples, L_reg() is the regression loss function, and tp_i and t̂p_i are respectively the real coordinates of key point P in the ith training sample and the predicted coordinates output after the ith training sample is input into the prediction model.
Understandably, the brake chain state y_i takes the value 0 (tight state) or 1 (loose state). Since two prior chain sags are set for each prediction model, each prediction model is trained for both prior chain sags and outputs prediction data for each of them, so that for each prior chain sag the parameters in the prediction model can be adjusted according to the above loss function.
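A minimal numeric sketch of the loss is given below. Binary cross-entropy for the classification loss and smooth L1 for the regression loss are assumed concrete choices, since the patent only names generic classification and regression loss functions:

```python
import math

def total_loss(cls_pairs, reg_pairs, lam=1.0):
    """L_total = (1/N_cls) * sum L_cls + lam * (1/N_reg) * sum L_reg.
    cls_pairs: (y_i, y_hat_i) for every training sample; reg_pairs:
    (tp_i, tp_hat_i) coordinate tuples for positive samples only.
    Binary cross-entropy and smooth L1 are assumed concrete choices."""
    eps = 1e-7  # numerical guard for log

    def bce(y, p):
        return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

    def smooth_l1(d):
        return 0.5 * d * d if abs(d) < 1 else abs(d) - 0.5

    l_cls = sum(bce(y, p) for y, p in cls_pairs) / len(cls_pairs)
    l_reg = sum(smooth_l1(t - th) for tp, tph in reg_pairs
                for t, th in zip(tp, tph)) / len(reg_pairs)
    return l_cls + lam * l_reg
```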
The above loss function refers to a positive sample, and the selection process of the positive sample is shown in fig. 9, and may include the following steps:
s010, calculating the coordinates of a plurality of preset key points in a training sample under the prior chain sag according to the prior chain sag, the prior chain width corresponding to the prediction model and the coordinates of anchor points in the training sample, wherein the anchor points are the centers of the minimum circumscribed rectangles of the brake chains;
for example, for a training sample, the key points are the two endpoints A and B and the intersection point D; a prior chain sag s and a prior chain width |AB| are set for the training sample, and s = |CD|/|AB|, so |CD| can be determined from s and |AB|; given the coordinates of the midpoint of CD, i.e. the anchor point, the coordinates of the three key points can then be determined from the anchor point coordinates, |CD| and |AB|.
S020, determining the sum of the calculated distances between the coordinates of a plurality of preset key points in the training sample under the prior chain sag and the real coordinates of the plurality of preset key points in the training sample;
for example, the distance d1 between the coordinates of point A calculated in step S010 and the real coordinates of point A in the training sample, the distance d2 between the calculated and real coordinates of point B, and the distance d3 between the calculated and real coordinates of point D are computed, and the sum of d1, d2 and d3 is then calculated.
It can be understood that the sum of the distances represents the degree of coincidence between the rectangular frame determined from the prior chain sag, the prior chain width and the anchor point coordinates (namely the minimum bounding rectangle of the brake chain) and the real rectangular frame of the training sample: the smaller the sum of the distances, the higher the coincidence degree, and the larger the sum, the lower the coincidence degree.
And S030, judging whether the sum of the distances is smaller than a preset distance threshold value, and if so, taking the training sample as a positive sample corresponding to the prior chain sag.
It can be appreciated that when the sum of distances is small, the overlap ratio is high, and the training sample is taken as a positive sample corresponding to the a priori chain slack.
In specific implementation, the preset distance threshold is referred to as a first distance threshold, a second distance threshold larger than the first distance threshold is further set, when the sum of the distances is larger than the second distance threshold, the training sample can be used as a negative sample corresponding to the priori chain sag, and if the sum of the distances is between the first distance threshold and the second distance threshold, the training sample is ignored.
Through the steps S010-S030, a positive sample and a negative sample corresponding to each priori chain sag can be determined and further used for calculating the loss function.
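Steps S010-S030, together with the second (negative/ignore) threshold, can be sketched as follows; the anchor is taken as the midpoint of CD as in the example above, and the threshold values t1 and t2 are assumed examples, not values from the patent:

```python
def reconstruct_keypoints(anchor, w, s):
    """Key points implied by an anchor point (midpoint of CD, per the
    example above), a prior chain width w and a prior chain sag s."""
    cx, cy = anchor
    cd = s * w                      # |CD| = s * |AB|
    C = (cx, cy - cd / 2.0)         # midpoint of AB
    D = (cx, cy + cd / 2.0)
    return {'A': (cx - w / 2.0, C[1]),
            'B': (cx + w / 2.0, C[1]),
            'D': D}

def label_sample(anchor, w, s, truth, t1=8.0, t2=20.0):
    """Return 'positive', 'negative' or 'ignored' for one training sample
    under one prior chain sag; t1 < t2 are assumed example thresholds."""
    pred = reconstruct_keypoints(anchor, w, s)
    dist = sum(((pred[k][0] - truth[k][0]) ** 2
                + (pred[k][1] - truth[k][1]) ** 2) ** 0.5 for k in 'ABD')
    if dist < t1:
        return 'positive'
    if dist > t2:
        return 'negative'
    return 'ignored'
```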
In the method for identifying the state of a train brake chain provided by this embodiment, feature extraction is performed on the image data to obtain feature maps of multiple sizes, and the feature map of each size is input into a corresponding prediction model. To suit brake chains of different sizes in the image data, the image data is processed into feature maps of different sizes, each feature map is processed by its corresponding prediction model to obtain multiple candidate prediction targets, and the optimal prediction target is selected from them, which improves the recognition accuracy of the brake chain state. Moreover, the method automatically identifies the brake chain state based on prediction models, requires no manual inspection, eliminates heavy and inefficient work, can effectively improve train operation efficiency, and helps ensure train operation safety.
As shown in fig. 10, an embodiment of the present application provides a state identification device for a train brake chain, which may specifically include:
the image acquisition module 110 is used for acquiring image data of a train brake chain;
a feature extraction module 120, configured to perform feature extraction on the image data to obtain feature maps of multiple different sizes;
the model prediction module 130 is configured to input the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, where the prediction data includes first prediction data and second prediction data, the first prediction data includes a classification probability that each pixel point in the feature map corresponds to a brake chain, and the second prediction data includes predicted coordinate offsets of a plurality of preset key points on the brake chain corresponding to the pixel point;
the coordinate calculation module 140 is configured to determine candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the predicted coordinate offset output by each prediction model;
a chain deduplication module 150, configured to perform deduplication processing on candidate predicted coordinates, corresponding to the image data, of a plurality of preset key points on the brake chain under each prediction model, so as to obtain optimal predicted coordinates, corresponding to the image data, of the plurality of preset key points on the brake chain;
the state determining module 160 is configured to determine a brake chain state in the image data according to the corresponding optimal predicted coordinates of the plurality of preset key points on the brake chain in the image data.
In specific implementation, the size category of the characteristic diagram is N, wherein N is an integer greater than or equal to 2; the feature extraction module 120 includes:
a first extraction unit, configured to perform downsampling processing on the image data for multiple times to obtain a first feature map, where a size of the first feature map is a first size;
the second extraction unit is used for determining any one of the second feature map to the Nth feature map based on the first feature map; the process for determining the ith feature map comprises the following steps: up-sampling the (i-1)th feature map to obtain a feature map of the ith size; splicing this feature map of the ith size with the feature map of the ith size generated in the multiple down-sampling processes to obtain the ith feature map, wherein the ith feature map is of the ith size, and the ith size increases with i; i is an integer of 2 or more and N or less.
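The first and second extraction units can be sketched with plain NumPy as below; nearest-neighbour up-sampling and the channel counts are assumptions, and a real implementation would interleave convolution layers between these steps:

```python
import numpy as np

def upsample2x(f):
    # nearest-neighbour up-sampling of a (C, H, W) map along H and W
    return f.repeat(2, axis=1).repeat(2, axis=2)

def build_feature_maps(backbone_maps):
    """backbone_maps: intermediate maps saved during the down-sampling
    passes, keyed by spatial size, e.g. {52: ..., 26: ..., 13: ...}.
    Returns [f1 (13x13), f2 (26x26), f3 (52x52)], where each later map is
    the previous one up-sampled and concatenated (channel-wise) with the
    same-size map from the down-sampling path."""
    f1 = backbone_maps[13]
    f2 = np.concatenate([upsample2x(f1), backbone_maps[26]], axis=0)
    f3 = np.concatenate([upsample2x(f2), backbone_maps[52]], axis=0)
    return [f1, f2, f3]
```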
In specific implementation, the coordinate calculation module 140 specifically includes:
the position determining unit is used for determining the coordinates of corresponding pixel points of the brake chain in the image data under each prediction model according to the classification probability output by each prediction model;
and the coordinate determination unit is used for determining candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the plurality of preset key points.
In particular implementations, the prediction data output by each prediction model includes the first prediction data and the second prediction data at each prior chain sag; the coordinate determination unit is specifically configured to: determine candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the prior chain width preset for the prediction model, the prior chain sag corresponding to the larger classification probability, the predicted coordinate offsets of the plurality of preset key points under the prior chain sag corresponding to the larger classification probability, and the coordinates of the corresponding pixel point of the brake chain in the image data under the prediction model; the prior chain width is a prior value of the shortest distance from one end point of the train brake chain to the other end point, and the prior chain widths correspond to the prediction models one by one; the prior chain sag is a prior value representing the brake chain sag, the prior chain sags preset for each prediction model include a first prior chain sag corresponding to a loose state and a second prior chain sag corresponding to a tight state, and the larger classification probability is, for the same pixel point, the larger of the classification probability corresponding to the first prior chain sag and the classification probability corresponding to the second prior chain sag.
In particular implementation, the apparatus further comprises:
the priori width determining module is used for calculating the chain width corresponding to each training sample according to the coordinates of a plurality of preset key points in each training sample for training the prediction model; clustering chain widths corresponding to all training samples for training the prediction model to obtain prior chain widths consistent with the number of the prediction model; and/or
The prior sag determining module is used for calculating the chain sag corresponding to each training sample according to the coordinates of a plurality of preset key points in each training sample for training the prediction model; and clustering chain sag corresponding to all training samples for training the prediction model to obtain the first prior chain sag and the second prior chain sag.
In a specific implementation, the plurality of preset key points comprise the two end points of the brake chain and the intersection point of the perpendicular bisector of the straight line segment between the two end points with the brake chain.
In a specific implementation, the coordinate determination unit calculates candidate predicted coordinates corresponding to an end point of the brake chain in the image data under a prediction model by using a first formula, where the first formula includes:
x_A = x_c - w × sigmoid(t_xA)
y_A = y_c - w × s × sigmoid(t_yA)
in the formula, (x_A, y_A) are the candidate predicted coordinates corresponding to endpoint A on the brake chain in the image data, (x_c, y_c) are the coordinates of the corresponding pixel point of the brake chain in the image data, (t_xA, t_yA) are the predicted coordinate offsets of endpoint A on the brake chain under the prior chain sag corresponding to the larger classification probability, w is the prior chain width corresponding to the prediction model, and s is the prior chain sag corresponding to the larger classification probability;
and/or the coordinate determination unit adopts a second formula to calculate the candidate predicted coordinate corresponding to the other end point of the brake chain in the image data under a prediction model, wherein the second formula comprises:
x_B = x_c + w × sigmoid(t_xB)
y_B = y_c - w × s × sigmoid(t_yB)
in the formula, (x_B, y_B) are the candidate predicted coordinates corresponding to the other endpoint B on the brake chain in the image data, and (t_xB, t_yB) are the predicted coordinate offsets of the other endpoint B on the brake chain under the prior chain sag corresponding to the larger classification probability;
and/or the coordinate determination unit calculates candidate predicted coordinates corresponding to the intersection point in the image data under one prediction model by adopting a third formula, wherein the third formula comprises:
x_D = x_c + w × (sigmoid(t_xD) - 0.5)
y_D = y_c + w × s × sigmoid(t_yD)
in the formula, (x_D, y_D) are the candidate predicted coordinates corresponding to intersection point D in the image data, and (t_xD, t_yD) are the predicted coordinate offsets of intersection point D under the prior chain sag corresponding to the larger classification probability.
In a specific implementation, the apparatus further comprises:
the model training module is used for training a prediction model in advance and performing loss calculation on a training sample by adopting a preset loss function in the training process of the prediction model, wherein the loss function comprises:
L_total = (1/N_cls) × Σ_i L_cls(y_i, ŷ_i) + λ × (1/N_reg) × Σ_i L_reg(tp_i, t̂p_i)

in the formula, L_total is the total loss of the prediction model, λ is a preset balance parameter, N_cls is the number of all training samples required to train the prediction model, L_cls() is the classification loss function, y_i and ŷ_i are respectively the pre-marked brake chain state of the ith training sample and the classification probability output after the ith training sample is input into the prediction model, N_reg is the number of positive samples among all the training samples, L_reg() is the regression loss function, and tp_i and t̂p_i are respectively the real coordinates of key point P in the ith training sample and the predicted coordinates output after the ith training sample is input into the prediction model.
In particular, the model training module comprises:
the sample classification module is used for selecting a positive sample, and the selection process comprises the following steps: calculating the coordinates of a plurality of preset key points in a training sample under the prior chain sag according to the prior chain sag, the prior chain width corresponding to the prediction model and the coordinates of an anchor point in the training sample, wherein the anchor point is the center of the minimum circumscribed rectangle of the brake chain; determining the sum of the calculated distances between the coordinates of a plurality of preset key points in the training sample under the prior chain sag and the real coordinates of the plurality of preset key points in the training sample; and judging whether the sum of the distances is smaller than a preset distance threshold value, if so, taking the training sample as a positive sample corresponding to the prior chain sag.
In particular implementation, the chain deduplication module is specifically configured to: and judging whether any key point under one prediction model is in an area formed by a plurality of preset key points under the other prediction model or not aiming at any two prediction models, and if so, taking the candidate prediction coordinate corresponding to the higher classification probability as the optimal prediction coordinate.
In a specific implementation, the state determination module is specifically configured to: calculating the chain sag in the image data according to the corresponding optimal prediction coordinates of two end points and an intersection point on the brake chain in the image data, wherein the chain sag is the ratio of the length of a middle vertical line to the width of the chain; the chain width is the shortest distance between two end points on the brake chain in the image data, and the length of the middle vertical line is the shortest distance between the midpoint of a straight line segment between the two end points and the intersection point in the image data; and judging whether the ratio is larger than a preset ratio threshold value or not, and if so, determining that the state of the brake chain in the image data is a loose state.
Embodiments of the present application further provide a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the steps of the above method when the computer program is executed.
Embodiments of the present application also provide a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the above method.
It is understood that, for the apparatuses, the computer devices, and the computer-readable storage media provided in the embodiments of the present application, for explanation, examples, and beneficial effects of the contents, reference may be made to corresponding parts in the foregoing methods, and details are not described here.
It is to be appreciated that any reference to memory, storage, database, or other medium used in the embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention or portions thereof contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (14)

1. A state identification method for a train brake chain is characterized by comprising the following steps:
acquiring image data of a train brake chain;
extracting the features of the image data to obtain feature maps with different sizes;
inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, wherein the prediction data comprises first prediction data and second prediction data, the first prediction data comprises the classification probability of each pixel point in the feature map corresponding to a brake chain, and the second prediction data comprises the prediction coordinate offset of a plurality of preset key points on the brake chain corresponding to the pixel point;
determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
performing duplicate removal processing on candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data;
and determining the state of the brake chain in the image data according to the corresponding optimal predicted coordinates of a plurality of preset key points on the brake chain in the image data.
2. The method according to claim 1, wherein the size category of the feature map is N, and N is an integer greater than or equal to 2; the feature extraction is performed on the image data to obtain feature maps with different sizes, and the feature maps comprise:
performing down-sampling processing on the image data for multiple times to obtain a first feature map, wherein the size of the first feature map is a first size;
determining any one of a second feature map to an Nth feature map based on the first feature map; the process for determining the ith feature map comprises the following steps: up-sampling the (i-1)th feature map to obtain a feature map of the ith size; splicing this feature map of the ith size with the feature map of the ith size generated in the multiple down-sampling processes to obtain the ith feature map, wherein the ith feature map is of the ith size, and the ith size increases with i; i is an integer of 2 or more and N or less.
3. The method according to claim 1, wherein the determining candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the predicted coordinate offset output by the prediction model comprises:
determining the coordinates of corresponding pixel points of the brake chain in the image data under each prediction model according to the classification probability output by each prediction model;
and determining candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the plurality of preset key points.
4. The method of claim 3, wherein the prediction data output by each prediction model comprise the first prediction data and the second prediction data under each prior chain sag;
determining candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the preset key points, wherein the candidate predicted coordinates comprise:
determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under the prediction model according to the preset prior chain width for the prediction model, the prior chain sag corresponding to the larger classification probability, the prediction coordinate offsets of the plurality of preset key points under the prior chain sag corresponding to the larger classification probability and the coordinates of corresponding pixel points of the brake chain in the image data under the prediction model;
the prior chain width is a prior value of the shortest distance from one end point of the train brake chain to the other end point, and the prior chain width corresponds to the prediction model one by one; the prior chain slack is a prior value representing the brake chain slack, and the prior chain slack preset for each prediction model comprises a first prior chain slack corresponding to a slack state and a second prior chain slack corresponding to a tight state; the larger classification probability is a larger value of the classification probability corresponding to the first priori chain sag and the classification probability corresponding to the second priori chain sag for the same pixel point.
5. The method of claim 4, wherein the determination of the prior chain width comprises: calculating the chain width corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used to train the prediction model; and clustering the chain widths corresponding to all training samples used to train the prediction models to obtain prior chain widths equal in number to the prediction models;
and/or the determination of the prior chain sag comprises: calculating the chain sag corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used to train the prediction model; and clustering the chain sags corresponding to all training samples used to train the prediction model to obtain the first prior chain sag and the second prior chain sag.
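Claim 5 obtains the priors by clustering per-sample chain widths (and, analogously, chain sags) over the training set. A minimal sketch of that step, assuming Euclidean distance between the two end points as the chain width and a plain 1-D k-means; the patent does not name the clustering algorithm, so k-means here is an assumption:

```python
import numpy as np

def chain_width(end_a, end_b):
    # Width = shortest (Euclidean) distance between the two chain end points.
    (ax, ay), (bx, by) = end_a, end_b
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means; the sorted cluster centres serve as the priors,
    one prior chain width per prediction model (k = number of models)."""
    values = np.asarray(values, dtype=float)
    centres = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = values[labels == j].mean()
    return np.sort(centres)
```

For the sags, the same routine with k = 2 would yield the first (loose) and second (tight) prior chain sags.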
6. The method of claim 4, wherein the plurality of predetermined key points comprises two end points of the brake chain and an intersection of a perpendicular bisector of a straight line segment between the two end points and the brake chain.
7. The method of claim 6, wherein the candidate predicted coordinates for an end point on the brake chain in the image data under a prediction model are calculated using a first formula, the first formula comprising:
x_A = x_c - w × sigmoid(t_xA)
y_A = y_c - w × s × sigmoid(t_yA)
where (x_A, y_A) are the candidate predicted coordinates of end point A of the brake chain in the image data, (x_c, y_c) are the coordinates of the corresponding pixel point of the brake chain in the image data, (t_xA, t_yA) are the predicted coordinate offsets of end point A under the prior chain sag corresponding to the larger classification probability, w is the prior chain width corresponding to the prediction model, and s is the prior chain sag corresponding to the larger classification probability;
and/or the candidate predicted coordinates of the other end point of the brake chain in the image data under a prediction model are calculated using a second formula, the second formula comprising:
x_B = x_c + w × sigmoid(t_xB)
y_B = y_c - w × s × sigmoid(t_yB)
where (x_B, y_B) are the candidate predicted coordinates of the other end point B of the brake chain in the image data, and (t_xB, t_yB) are the predicted coordinate offsets of end point B under the prior chain sag corresponding to the larger classification probability;
and/or the candidate predicted coordinates of the intersection point in the image data under a prediction model are calculated using a third formula, the third formula comprising:
x_D = x_c + w × (sigmoid(t_xD) - 0.5)
y_D = y_c + w × s × sigmoid(t_yD)
where (x_D, y_D) are the candidate predicted coordinates of intersection point D in the image data, and (t_xD, t_yD) are the predicted coordinate offsets of intersection point D under the prior chain sag corresponding to the larger classification probability.
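The three formulas of claim 7 decode raw network offsets into image coordinates. A direct sketch; `decode_keypoints` and the `offsets` dictionary are illustrative names, not from the patent:

```python
import math

def sigmoid(t):
    # Squashes a raw offset into (0, 1), as used by all three claim-7 formulas.
    return 1.0 / (1.0 + math.exp(-t))

def decode_keypoints(xc, yc, w, s, offsets):
    """Decode candidate coordinates of end points A, B and intersection D.

    (xc, yc): pixel point attributed to the chain; w: prior chain width for
    this prediction model; s: prior chain sag corresponding to the larger
    classification probability; offsets: raw predicted coordinate offsets.
    """
    xA = xc - w * sigmoid(offsets["txA"])
    yA = yc - w * s * sigmoid(offsets["tyA"])
    xB = xc + w * sigmoid(offsets["txB"])
    yB = yc - w * s * sigmoid(offsets["tyB"])
    xD = xc + w * (sigmoid(offsets["txD"]) - 0.5)   # D may sit left or right of centre
    yD = yc + w * s * sigmoid(offsets["tyD"])       # D hangs below the chord
    return (xA, yA), (xB, yB), (xD, yD)
```

With all offsets at zero (sigmoid = 0.5), A and B land half a prior width left and right of the pixel point, reflecting how the priors anchor the prediction.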
8. The method according to claim 4, wherein the prediction model is obtained by pre-training, and a loss calculation is performed on training samples with a preset loss function during training of the prediction model, the loss function comprising:
L_total = (1/N_cls) × Σ_i L_cls(y_i, ŷ_i) + λ × (1/N_reg) × Σ_i Σ_P L_reg(t_Pi, t̂_Pi)
where L_total is the total loss of the prediction model, λ is a preset balance parameter, N_cls is the number of all training samples required to train the prediction model, L_cls() is the classification loss function, y_i and ŷ_i are respectively the pre-labelled brake chain state of the ith training sample and the classification probability output after the ith training sample is input into the prediction model, N_reg is the number of positive samples among all training samples, L_reg() is the regression loss function, and t_Pi and t̂_Pi are respectively the real coordinates of key point P in the ith training sample and the predicted coordinates output after the ith training sample is input into the prediction model.
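Claim 8's total loss averages a classification term over all samples and adds a balance-weighted regression term averaged over positive samples. A numerical sketch, assuming binary cross-entropy for L_cls and smooth-L1 for L_reg; the claim leaves both component losses unspecified, so those choices are assumptions:

```python
import math

def bce(y, p, eps=1e-7):
    # Binary cross-entropy for one sample; eps guards against log(0).
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def smooth_l1(t, t_hat):
    # Quadratic near zero, linear for large errors.
    d = abs(t - t_hat)
    return 0.5 * d * d if d < 1 else d - 0.5

def total_loss(samples, positives, lam=1.0):
    """L_total = (1/N_cls) Σ L_cls + λ (1/N_reg) Σ Σ_P L_reg.

    samples: (y_i, p_i) pairs over all training samples;
    positives: (true_coords, pred_coords) pairs over positive samples only,
    each a flat list of key-point coordinate values.
    """
    cls = sum(bce(y, p) for y, p in samples) / len(samples)
    reg = sum(smooth_l1(t, th)
              for coords, coords_hat in positives
              for t, th in zip(coords, coords_hat)) / max(len(positives), 1)
    return cls + lam * reg
```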
9. The method of claim 8, wherein the selecting of the positive sample comprises:
calculating the coordinates of a plurality of preset key points in a training sample under the prior chain sag according to the prior chain sag, the prior chain width corresponding to the prediction model and the coordinates of an anchor point in the training sample, wherein the anchor point is the center of a minimum circumscribed rectangle of the brake chain;
determining the sum of the calculated distances between the coordinates of a plurality of preset key points in the training sample under the prior chain sag and the real coordinates of the plurality of preset key points in the training sample;
and judging whether the sum of the distances is smaller than a preset distance threshold value, if so, taking the training sample as a positive sample corresponding to the prior chain sag.
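The positive-sample test of claim 9 can be sketched as follows. The geometric construction of the prior-implied key points (end points half a prior width either side of the anchor, intersection displaced vertically by s·w) is an assumption for illustration; the claim states only that such coordinates are calculated from the priors and the anchor point:

```python
import math

def keypoints_from_anchor(xa, ya, w, s):
    # Key points implied by the priors alone (assumed layout): end points at
    # ±w/2 from the anchor, intersection D sagging s·w below the chord.
    A = (xa - w / 2, ya)
    B = (xa + w / 2, ya)
    D = (xa, ya + s * w)
    return [A, B, D]

def is_positive(anchor, w, s, true_kps, threshold):
    """Sum the distances between prior-implied and ground-truth key points;
    the sample is positive for this prior sag if the sum is under threshold."""
    pred = keypoints_from_anchor(anchor[0], anchor[1], w, s)
    total = sum(math.dist(p, t) for p, t in zip(pred, true_kps))
    return total < threshold
```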
10. The method of claim 1, wherein the performing de-duplication processing on candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data comprises:
and for any two prediction models, judging whether any key point under one prediction model lies within the region formed by the plurality of preset key points under the other prediction model, and if so, taking the candidate predicted coordinates corresponding to the higher classification probability as the optimal predicted coordinates.
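Claim 10's de-duplication can be sketched as a greedy keep-the-higher-probability pass. Approximating the "region formed by the preset key points" with their bounding box is an assumption; the claim does not define the region's shape:

```python
def point_in_region(pt, kps):
    # Region of a detection approximated by the bounding box of its key points.
    xs = [x for x, _ in kps]
    ys = [y for _, y in kps]
    return min(xs) <= pt[0] <= max(xs) and min(ys) <= pt[1] <= max(ys)

def dedupe(detections):
    """detections: (classification_probability, keypoints) pairs, one per
    prediction model. Whenever a key point of one detection falls inside the
    region of another, only the higher-probability detection survives."""
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    for prob, kps in detections:
        overlaps = any(point_in_region(pt, kk) for _, kk in kept for pt in kps)
        if not overlaps:
            kept.append((prob, kps))
    return kept
```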
11. The method of claim 6, wherein determining the brake chain status in the image data according to the corresponding optimal predicted coordinates of the plurality of predetermined key points on the brake chain in the image data comprises:
calculating the chain sag in the image data according to the optimal predicted coordinates of the two end points and the intersection point on the brake chain in the image data, wherein the chain sag is the ratio of the length of the perpendicular-bisector segment to the chain width; the chain width is the shortest distance between the two end points of the brake chain in the image data, and the length of the perpendicular-bisector segment is the shortest distance in the image data between the midpoint of the straight line segment between the two end points and the intersection point;
and judging whether the ratio is larger than a preset ratio threshold value or not, and if so, determining that the state of the brake chain in the image data is a loose state.
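The decision rule of claim 11 reduces to one ratio. A minimal sketch; the function name and the return labels "loose"/"tight" are illustrative:

```python
import math

def chain_state(A, B, D, ratio_threshold):
    """Loose/tight decision from the optimal key-point coordinates.

    Sag ratio = |MD| / |AB|, where M is the midpoint of segment AB and D is
    the intersection of AB's perpendicular bisector with the chain.
    """
    width = math.dist(A, B)                       # chain width
    mid = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)  # midpoint M of AB
    sag = math.dist(mid, D) / width               # perpendicular length / width
    return "loose" if sag > ratio_threshold else "tight"
```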
12. A state recognition device of a train brake chain is characterized by comprising:
the image acquisition module is used for acquiring image data of the train brake chain;
the characteristic extraction module is used for extracting the characteristics of the image data to obtain characteristic graphs with different sizes;
the model prediction module is used for inputting the feature map of each size into the corresponding prediction model to obtain corresponding prediction data, the prediction data comprising first prediction data and second prediction data, the first prediction data comprising the classification probability that each pixel point in the feature map corresponds to the brake chain, and the second prediction data comprising the predicted coordinate offsets, relative to the pixel point, of the plurality of preset key points on the brake chain;
the coordinate calculation module is used for determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
the chain duplicate removal module is used for performing duplicate removal processing on candidate predicted coordinates, corresponding to the image data, of a plurality of preset key points on the brake chain under each prediction model to obtain optimal predicted coordinates, corresponding to the image data, of the plurality of preset key points on the brake chain;
and the state determining module is used for determining the state of the brake chain in the image data according to the corresponding optimal prediction coordinates of the plurality of preset key points on the brake chain in the image data.
13. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 11 are implemented by the processor when executing the computer program.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 11.
CN202011583368.7A 2020-12-28 2020-12-28 Method, device and equipment for identifying state of train brake chain and storage medium Pending CN114758150A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011583368.7A CN114758150A (en) 2020-12-28 2020-12-28 Method, device and equipment for identifying state of train brake chain and storage medium


Publications (1)

Publication Number Publication Date
CN114758150A true CN114758150A (en) 2022-07-15

Family

ID=82324667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011583368.7A Pending CN114758150A (en) 2020-12-28 2020-12-28 Method, device and equipment for identifying state of train brake chain and storage medium

Country Status (1)

Country Link
CN (1) CN114758150A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103454285A (en) * 2013-08-28 2013-12-18 南京师范大学 Transmission chain quality detection system based on machine vision
US20160010977A1 (en) * 2014-07-09 2016-01-14 Charles C. Frost Chain wear monitoring device
CN107346437A (en) * 2017-07-03 2017-11-14 大连理工大学 The extraction method of body side view parameter model
CN111055890A (en) * 2019-12-18 2020-04-24 成都国铁电气设备有限公司 Intelligent detection method and detection system for railway vehicle anti-slip


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO, Ning et al.: "New method for meshing analysis of chain-saw chain transmission and tooth profile design of the drive link", Mechanical Science and Technology, 31 December 2010 (2010-12-31), pages 1-9 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576635A (en) * 2024-01-17 2024-02-20 中国石油集团川庆钻探工程有限公司 Method for judging linear target tensioning state in video identification
CN117576635B (en) * 2024-01-17 2024-03-29 中国石油集团川庆钻探工程有限公司 Method for judging linear target tensioning state in video identification

Similar Documents

Publication Publication Date Title
CN110414507B (en) License plate recognition method and device, computer equipment and storage medium
CN108038474B (en) Face detection method, convolutional neural network parameter training method, device and medium
US10229332B2 (en) Method and apparatus for recognizing obstacle of vehicle
CN113033604B (en) Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN111797829A (en) License plate detection method and device, electronic equipment and storage medium
US11042742B1 (en) Apparatus and method for detecting road based on convolutional neural network
CN108428248B (en) Vehicle window positioning method, system, equipment and storage medium
CN111582339B (en) Vehicle detection and recognition method based on deep learning
CN110956081B (en) Method and device for identifying position relationship between vehicle and traffic marking and storage medium
CN111523429A (en) Deep learning-based steel pile identification method
CN111144425B (en) Method and device for detecting shot screen picture, electronic equipment and storage medium
CN114266894A (en) Image segmentation method and device, electronic equipment and storage medium
CN115995056A (en) Automatic bridge disease identification method based on deep learning
CN114758150A (en) Method, device and equipment for identifying state of train brake chain and storage medium
CN111738040A (en) Deceleration strip identification method and system
CN113269156A (en) Signal lamp detection and identification method and system based on multi-scale feature fusion
CN111950415A (en) Image detection method and device
CN111914706A (en) Method and device for detecting and controlling quality of character detection output result
CN116343148A (en) Lane line detection method, device, vehicle and storage medium
CN115631197A (en) Image processing method, device, medium, equipment and system
CN113012132B (en) Image similarity determination method and device, computing equipment and storage medium
CN114926803A (en) Lane line detection model establishing method, lane line detection method, device and equipment
CN114399657A (en) Vehicle detection model training method and device, vehicle detection method and electronic equipment
CN113269137A (en) Non-fit face recognition method combining PCANet and shielding positioning
CN112380913A (en) License plate detection and identification method based on combination of dynamic adjustment and local feature vector

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination