CN114758150A - Method, device and equipment for identifying state of train brake chain and storage medium - Google Patents


Publication number
CN114758150A
CN114758150A (application CN202011583368.7A)
Authority
CN
China
Prior art keywords
chain
prediction
image data
prediction model
coordinates
Prior art date
Legal status
Pending
Application number
CN202011583368.7A
Other languages
Chinese (zh)
Inventor
徐永燊
刘玉珠
Current Assignee
Guangzhou Huiruisitong Artificial Intelligence Technology Co ltd
Guangzhou Huiruisitong Technology Co Ltd
Original Assignee
Guangzhou Huiruisitong Artificial Intelligence Technology Co ltd
Guangzhou Huiruisitong Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huiruisitong Artificial Intelligence Technology Co ltd and Guangzhou Huiruisitong Technology Co Ltd
Priority to CN202011583368.7A
Publication of CN114758150A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a method, an apparatus, a device and a storage medium for identifying the state of a train brake chain. The method comprises: acquiring image data of a train brake chain; performing feature extraction on the image data to obtain feature maps of different sizes; inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data; determining, under each prediction model, candidate predicted coordinates in the image data for a plurality of preset key points on the brake chain; de-duplicating the candidate predicted coordinates across the prediction models to obtain optimal predicted coordinates in the image data for the preset key points; and determining the brake chain state in the image data from those optimal predicted coordinates. The method identifies the state of the brake chain automatically from the prediction models, so no manual inspection is needed; it eliminates heavy and inefficient work, effectively improves train operating efficiency, and safeguards operational safety.

Description

Method, device and equipment for identifying state of train brake chain and storage medium
Technical Field
The application relates to the technical field of intelligent rail transit, in particular to a method, a device, equipment and a storage medium for identifying the state of a train brake chain.
Background
Computer vision processing is an important technology in the field of intelligent rail transit and is attracting growing attention. In computer vision processing, imaging systems replace the eye as the input channel for visual information, and a computer replaces the brain to carry out processing and interpretation. Such techniques enable machines not only to perceive geometric information in an environment (e.g., position, size, shape, and motion) but also to describe, interpret, and understand that information. Computer vision processing therefore offers an intuitive and convenient means of analysis for the intelligent rail transit field.
Before a train departs, the brake chains between the carriages must be confirmed to be in a slack state, so that accidents are avoided while the train is running. When a brake chain is in a tight state, the carriages are braked and the train cannot start; only when the brake chain is slack is the coupling between carriages in its normal state and the train free to depart. At present, the brake chain of each carriage is checked by manual inspection; this traditional approach is labour-intensive and inefficient, seriously delays train departure, and in turn reduces the operating efficiency of the whole rail transit system.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, the application provides a method, an apparatus, a device and a storage medium for identifying the state of a train brake chain.
In a first aspect, the present application provides a method for identifying a state of a train brake chain, including: acquiring image data of a train brake chain;
extracting the features of the image data to obtain feature maps with different sizes;
inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, wherein the prediction data comprises first prediction data and second prediction data, the first prediction data comprises the classification probability of each pixel point in the feature map corresponding to a brake chain, and the second prediction data comprises the prediction coordinate offset of a plurality of preset key points on the brake chain corresponding to the pixel point;
determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
performing duplicate removal processing on candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data;
and determining the state of the brake chain in the image data according to the corresponding optimal predicted coordinates of a plurality of preset key points on the brake chain in the image data.
In a second aspect, the present application provides a state recognition device for a train brake chain, comprising: the image acquisition module is used for acquiring image data of the train brake chain;
the characteristic extraction module is used for extracting the characteristics of the image data to obtain characteristic graphs with different sizes;
the model prediction model is used for inputting the feature map of each size into the corresponding prediction model to obtain corresponding prediction data, the prediction data comprise first prediction data and second prediction data, the first prediction data comprise the classification probability of each pixel point in the feature map corresponding to the brake chain, and the second prediction data comprise the prediction coordinate offset of a plurality of preset key points on the brake chain corresponding to the pixel point;
the coordinate calculation module is used for determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
the chain duplicate removal module is used for carrying out duplicate removal processing on candidate prediction coordinates, corresponding to the plurality of preset key points on the brake chain in the image data, under each prediction model to obtain optimal prediction coordinates, corresponding to the plurality of preset key points on the brake chain in the image data;
and the state determining module is used for determining the state of the brake chain in the image data according to the corresponding optimal prediction coordinates of a plurality of preset key points on the brake chain in the image data.
In a third aspect, the present application provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.
In the method for identifying the state of a train brake chain provided by this embodiment, feature extraction is performed on the image data to obtain feature maps of several sizes, and the feature map of each size is fed into a corresponding prediction model. To accommodate brake chains that appear at different scales in the image data, the image data is processed into feature maps of different sizes, each processed by its own prediction model; multiple candidate predictions are obtained and the optimal prediction is selected from among them, which improves the accuracy of brake chain state recognition. Because the state of the brake chain is identified automatically by the prediction models, no manual inspection is needed; heavy and inefficient work is eliminated, the operating efficiency of the train is effectively improved, and the running safety of the train is ensured.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in describing the embodiments or the prior art are briefly introduced below; it is obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of a method for identifying the state of a train brake chain according to an embodiment of the present application;
fig. 2 is a schematic flowchart of feature extraction performed on the image data in the embodiment of the present application;
FIG. 3 is a schematic flowchart of an example of performing feature extraction on the image data and outputting prediction data by using a prediction model in an embodiment of the present application;
FIG. 4 is a schematic diagram of a plurality of key points selected on a brake chain according to an embodiment of the present disclosure;
FIG. 5a is a schematic flowchart of step S140 in the embodiment of the present application;
FIG. 5b is a schematic flowchart illustrating step S142 in the embodiment of the present application;
FIG. 6a is a schematic flow chart of determining a priori chain width in an embodiment of the present application;
FIG. 6b is a schematic flow chart illustrating the determination of a priori chain sag in an embodiment of the present application;
FIG. 7 is a flowchart illustrating step S150 in the embodiment of the present application;
FIG. 8 is a flowchart illustrating step S160 in the embodiment of the present application;
FIG. 9 is a schematic flow chart of a positive sample extraction process in an embodiment of the present application;
fig. 10 is a block diagram showing a state recognition device for a train brake chain according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making creative efforts shall fall within the protection scope of the present application.
Fig. 1 shows a method for identifying the state of a train brake chain according to an embodiment of the present application. As shown in fig. 1, the method includes the following steps:
s110, collecting image data of a train brake chain;
when the system is specifically implemented, a high-definition camera can be erected beside a train track, the high-definition camera is used for shooting an image containing a train brake chain, and then image data are collected from the high-definition camera.
It can be understood that the training samples used to pre-train the prediction models employed in the subsequent steps may likewise be images collected from a high-definition camera beside the train track. Each image is annotated; specifically, information such as the coordinates of the plurality of key points on the brake chain and whether the brake chain is in a slack state or a tight state may be marked.
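As an illustration of such annotation, a training record might be stored as below; the field names, file name, and coordinate values are hypothetical, not taken from the patent:

```python
# Hypothetical annotation record for one training image: the three labeled
# key points (end points A and B of the chain, and the mid-perpendicular
# intersection D) plus the chain state. All names here are illustrative.
def make_annotation(image_path, a, b, d, state):
    assert state in ("slack", "tight")
    return {
        "image": image_path,
        "keypoints": {"A": a, "B": b, "D": d},  # (x, y) pixel coordinates
        "state": state,
    }

ann = make_annotation("frame_0001.jpg", (120, 310), (420, 305), (268, 390), "slack")
print(ann["state"])  # slack
```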
S120, extracting the features of the image data to obtain feature maps with different sizes;
It is understood that the sizes may include 13x13, 26x26, 52x52, and so on. For example, if feature maps of four sizes are extracted in this step, four prediction models are required in the subsequent steps, with the feature map of each size input into its corresponding prediction model.
In specific implementation, assuming that the type of the size of the feature map is N, where N is an integer greater than or equal to 2, as shown in fig. 2, the process of extracting the features of the image data in step S120 may include the following steps:
s121, performing downsampling processing on the image data for multiple times to obtain a first feature map, wherein the size of the first feature map is a first size;
In a specific implementation, a convolution module and/or a bottleneck module may be used for feature extraction; each extraction method has its own advantages. In particular, the bottleneck module has a low computational cost, which can improve the running efficiency of the whole method flow.
For example, referring to fig. 3, an input image of size 416x416 undergoes 5 downsampling steps to obtain a first feature map of size 13x13. Each downsampling reduces the image size: the first downsampling (the conv2d 3x3 module in the figure) yields 208x208, the second (a bottleneck module) yields 104x104, the third yields 52x52, the fourth yields 26x26, and the fifth yields 13x13.
S122, determining each of the second to the N-th feature maps based on the first feature map. The process for determining the i-th feature map is: upsample the (i-1)-th feature map to obtain a feature map of the i-th size; concatenate that feature map with the feature map of the same i-th size generated during the multiple downsampling passes to obtain the i-th feature map. The i-th feature map has the i-th size, and the sizes increase with i; i is an integer greater than or equal to 2 and less than or equal to N.
For example, referring to fig. 3 with N = 3, an image of size 416x416 undergoes one feature extraction by a convolution module (kernel size 3x3) and four feature extractions by bottleneck modules, yielding a first feature map of size 13x13. The first feature map is then upsampled to 26x26 and concatenated with the 26x26 feature map produced after the third bottleneck module's downsampling, giving the second feature map. The second feature map is upsampled to 52x52 and concatenated with the 52x52 feature map produced after the second bottleneck module's downsampling, giving the third feature map. Three feature maps of different sizes are thus obtained: a first of size 13x13, a second of size 26x26, and a third of size 52x52. Of course, before each upsampling, the feature map may first pass through some convolution layers to refine it.
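The size bookkeeping of this pyramid (five 2x downsamplings of a 416x416 input, then upsample-and-concatenate steps back up) can be sketched with plain arithmetic; `pyramid_sizes` is an illustrative helper, not part of the patent:

```python
def pyramid_sizes(input_size=416, n_down=5, n_maps=3):
    """Sizes produced by repeated 2x downsampling, then the sizes of the
    n_maps feature maps obtained by walking back up the pyramid."""
    down = [input_size]
    for _ in range(n_down):
        down.append(down[-1] // 2)      # 416 -> 208 -> 104 -> 52 -> 26 -> 13
    # Feature map i is the previous map upsampled 2x and concatenated with
    # the matching downsampling stage, so its size doubles at each step.
    return [down[-1] * (2 ** i) for i in range(n_maps)]

print(pyramid_sizes())  # [13, 26, 52]
```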
S130, inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, wherein the prediction data comprise first prediction data and second prediction data, the first prediction data comprise classification probabilities of each pixel point in the feature map corresponding to a brake chain, and the second prediction data comprise prediction coordinate offsets of a plurality of preset key points on the brake chain corresponding to the pixel points;
the number of the prediction models is the same as the size category of the feature map;
for example, as shown in fig. 3, where N is 3, the first feature map is input into the first prediction model, the second feature map is input into the second prediction model, and the third feature map is input into the third prediction model, each of which outputs prediction data.
In a specific implementation, the preset key points may be chosen to suit the situation. In fig. 4, for example, the chosen points are one end point A of the arc-shaped brake chain AB, the other end point B, and the intersection point D between the perpendicular bisector of the straight segment AB joining the two end points and the brake chain. Point C in the figure is the midpoint of segment AB (AB is perpendicular to CD), and point E is the midpoint of segment CD. Because the brake chain is arc-shaped, E is also the centre of the chain's minimum enclosing rectangle, so E can serve as an anchor point — a single point representing the whole chain. As feature extraction is repeatedly applied to the image data, the whole chain contracts to one pixel point or a few pixel points in the feature map, and the centre of those pixel points is taken as the anchor point representing the whole chain.
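The geometry of these key points can be sketched as follows, using only the point definitions above (C as the midpoint of AB, E as the midpoint of CD, and — anticipating the slack ratio introduced later — slack as |CD| / |AB|); the function name is hypothetical:

```python
import math

def chain_geometry(a, b, d):
    """Derived points and ratios for key points A, B (chain end points)
    and D (intersection of AB's perpendicular bisector with the chain)."""
    c = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)   # C: midpoint of AB
    e = ((c[0] + d[0]) / 2, (c[1] + d[1]) / 2)   # E: anchor, midpoint of CD
    width = math.dist(a, b)                       # chain width |AB|
    slack = math.dist(c, d) / width               # chain slack |CD| / |AB|
    return c, e, width, slack

c, e, width, slack = chain_geometry((0, 0), (100, 0), (50, 40))
print(width, slack)  # 100.0 0.4
```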
S140, determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
It can be understood that each pixel point in the feature map represents several corresponding pixel points in the image data. In one image, some pixels belong to the brake chain and others to other things. Each feature-map pixel has a corresponding classification probability: a high probability means the corresponding image pixels are very likely brake chain pixels; a low probability means they are likely pixels of something else. For example, if in one feature map the classification probabilities of the four pixels at the intersections of rows ten and eleven with columns three and four all exceed 80%, then the image-data pixels corresponding to those four feature-map pixels belong to the brake chain and the remaining pixels do not. The position of the brake chain in the image data can therefore be determined from the classification probabilities in the feature map.
It can be understood that each pixel point in the feature map corresponds to the predicted coordinate offset of the plurality of preset key points in addition to the classification probability, and therefore after the position of the brake chain in the image data is determined according to the classification probability, the coordinates of the plurality of key points in the image data can be determined according to the predicted coordinate offset of the plurality of preset key points.
Based on the above understanding, as shown in fig. 5a, step S140 may include the steps of:
s141, determining coordinates of corresponding pixel points of the brake chain in the image data under each prediction model according to the classification probability output by each prediction model;
It can be understood that step S141 is the process of determining the position of the brake chain in the image data. The coordinates of the brake chain's corresponding pixel points in the image data may be taken as the centres of the image-data pixels corresponding to those feature-map pixels whose classification probability exceeds a preset threshold, i.e. the coordinates of the brake chain's anchor point in the image data. The coordinates determined in step S141 are therefore the coordinates of the anchor point in the image data.
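A plausible sketch of the mapping from a feature-map cell to anchor coordinates in the image data, assuming each cell of a 13x13 map covers a 32x32 block of a 416x416 input (the exact anchoring convention is not spelled out in the patent):

```python
def cell_to_image(row, col, input_size=416, fmap_size=13):
    """Map a feature-map cell (row, col) to the centre of the image patch
    it represents. Stride = input_size / fmap_size (32 for a 13x13 map of
    a 416x416 input); the half-cell offset picks the patch centre."""
    stride = input_size / fmap_size
    return ((col + 0.5) * stride, (row + 0.5) * stride)

print(cell_to_image(0, 0))  # (16.0, 16.0)
print(cell_to_image(6, 6))  # (208.0, 208.0)
```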
And S142, determining candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the plurality of preset key points.
And the predicted coordinate offset of the key point is the coordinate offset of the coordinate of the key point relative to the anchor point.
It is understood that step S142 is a process of determining coordinates of a plurality of key points in the image data, where the coordinates of the plurality of key points in the image data are calculated according to the prediction data output by each prediction model, and the coordinates are referred to as candidate prediction coordinates under the prediction model.
In a specific implementation, since the brake chain states include a slack state and a tight state, each prediction model can be given two states in order to improve recognition accuracy. The prediction data in each state comprises the classification probability that each feature-map pixel belongs to a brake chain and the predicted coordinate offsets of the preset key points for that pixel; that is, the data output by each prediction model includes the first prediction data and the second prediction data under each prior chain slack. For example, after a first feature map of size 13x13 is input into the first prediction model, prediction data for the slack state (size 13x13x7) and for the tight state (size 13x13x7) are obtained. Each feature-map pixel in each state corresponds to 7 values: 1 classification probability and 6 coordinate offsets (an abscissa and an ordinate offset for each of the three key points).
With a prediction model configured to output prediction data for both brake chain states, each feature-map pixel carries two classification probabilities, and the larger of the two is used to judge whether the pixel corresponds to brake chain pixels in the image data. When the larger probability is very high, the corresponding image-data pixels are very likely brake chain pixels, and the brake chain state associated with that larger probability is taken both as the state of those pixels and as the brake chain state of the image data. When even the larger of the two probabilities is very low, the corresponding image-data pixels belong to other things. Accordingly, in step S141, when determining the coordinates of the brake chain's corresponding pixel points under a prediction model (i.e. the position of the chain in the image data), the basis is the larger of the two classification probabilities of each pixel under that model.
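The selection of the larger of the two per-pixel classification probabilities might be sketched as follows; the 0.8 threshold is an assumed value, not one given in the patent:

```python
def select_state(prob_slack, prob_tight, threshold=0.8):
    """Pick the larger of the two per-pixel classification probabilities.
    The pixel counts as part of the brake chain only if that larger
    probability clears the (assumed) threshold; otherwise return None."""
    p, state = max((prob_slack, "slack"), (prob_tight, "tight"))
    return (state, p) if p >= threshold else (None, p)

print(select_state(0.60, 0.90))  # ('tight', 0.9)
print(select_state(0.30, 0.40))  # (None, 0.4)
```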
Based on the above understanding, referring to fig. 5b, step S142 may specifically comprise: determining the candidate predicted coordinates of the preset key points on the brake chain in the image data under the prediction model from the prior chain width preset for that model, the prior chain slack corresponding to the larger classification probability, the predicted coordinate offsets of the preset key points under that prior chain slack, and the coordinates of the brake chain's corresponding pixel points in the image data under that model.
The prior chain width is a prior value of the shortest distance from one end point of the train brake chain to the other, and the prior chain widths correspond one-to-one with the prediction models. The prior chain slack is a prior value characterising how slack the brake chain is; the prior chain slack preset for each prediction model comprises a first prior chain slack corresponding to the slack state and a second prior chain slack corresponding to the tight state. For example, with points A, B and D as described above as the key points, the chain slack may be the ratio of the length of the perpendicular segment CD to the chain width AB.
The smaller the size of the feature map corresponding to one prediction model is, the larger the prior chain width preset for the prediction model can be.
It will be appreciated that the shortest distance between two points is the length of the straight line segment formed by the two points.
The larger classification probability is, for the same pixel point, the larger of the classification probability corresponding to the first prior chain slack and that corresponding to the second prior chain slack.
For example, suppose a pixel in the feature map has a classification probability of 60% in one state, corresponding to the first prior chain slack, and 90% in the other, corresponding to the second prior chain slack. Since 90% is very high, the pixel corresponds to brake chain pixels in fig. 4, and the prior chain slack corresponding to the larger classification probability when computing the candidate predicted coordinates in step S142 is the second prior chain slack. It can be understood that, in this case, all the brake chain pixels in fig. 4 will generally correspond to the second prior chain slack.
In a specific implementation, if end point A is selected as a key point, step S142 may use a first formula to calculate the candidate predicted coordinates of one end point of the brake chain in the image data under a prediction model, the first formula being:

x_A = x_c - w × sigmoid(t_xA)
y_A = y_c - w × s × sigmoid(t_yA)

where (x_A, y_A) are the candidate predicted coordinates of end point A of the brake chain in the image data, (x_c, y_c) are the coordinates of the brake chain's corresponding pixel point in the image data, (t_xA, t_yA) are the predicted coordinate offsets of end point A under the prior chain slack corresponding to the larger classification probability, w is the prior chain width corresponding to the prediction model, and s is the prior chain slack corresponding to the larger classification probability.

In a specific implementation, if end point B is selected as another key point, step S142 may use a second formula to calculate the candidate predicted coordinates of the other end point of the brake chain in the image data under a prediction model, the second formula being:

x_B = x_c + w × sigmoid(t_xB)
y_B = y_c - w × s × sigmoid(t_yB)

where (x_B, y_B) are the candidate predicted coordinates of the other end point B of the brake chain in the image data, and (t_xB, t_yB) are the predicted coordinate offsets of end point B under the prior chain slack corresponding to the larger classification probability.

In a specific implementation, if intersection point D is selected as the third key point, step S142 may use a third formula to calculate the candidate predicted coordinates of the intersection point in the image data under a prediction model, the third formula being:

x_D = x_c + w × (sigmoid(t_xD) - 0.5)
y_D = y_c + w × s × sigmoid(t_yD)

where (x_D, y_D) are the candidate predicted coordinates of intersection point D in the image data, and (t_xD, t_yD) are the predicted coordinate offsets of intersection point D under the prior chain slack corresponding to the larger classification probability.
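These decoding formulas can be implemented directly; the following sketch mirrors them term by term (function and variable names are illustrative):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def decode_keypoints(xc, yc, w, s, t):
    """Decode candidate coordinates for key points A, B, D from the anchor
    (xc, yc), prior chain width w, prior chain slack s, and the predicted
    offsets t = (txA, tyA, txB, tyB, txD, tyD)."""
    txA, tyA, txB, tyB, txD, tyD = t
    a = (xc - w * sigmoid(txA), yc - w * s * sigmoid(tyA))
    b = (xc + w * sigmoid(txB), yc - w * s * sigmoid(tyB))
    d = (xc + w * (sigmoid(txD) - 0.5), yc + w * s * sigmoid(tyD))
    return a, b, d

# With all offsets 0 (sigmoid(0) = 0.5): A and B sit half a chain width
# left/right of the anchor and above it by half the sagged height; D sits
# on the anchor's vertical, below it by half the sagged height.
a, b, d = decode_keypoints(100.0, 100.0, 80.0, 0.5, (0, 0, 0, 0, 0, 0))
print(a, b, d)  # (60.0, 80.0) (140.0, 80.0) (100.0, 120.0)
```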
In an implementation, the a priori chain width is predetermined, and as shown in fig. 6a, a specific determination process may include:
S001a, calculating the chain width corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used for training the prediction models;
S002a, clustering the chain widths corresponding to all the training samples used for training the prediction models to obtain prior chain widths equal in number to the prediction models.
The clustering mode can adopt K-means clustering.
For example, with three prediction models, three prior chain widths need to be obtained by clustering: w1, w2 and w3, where w1 is the largest and w3 the smallest. w1 is assigned to the first prediction model for the first feature map of size 13x13, w2 to the second prediction model for the second feature map of size 26x26, and w3 to the third prediction model for the third feature map of size 52x52. Because the first feature map is the smallest, it undergoes the largest up-sampling when restored to the size of the image data, so the largest prior chain width is allocated to the first prediction model; the third feature map is the largest and undergoes the smallest up-sampling, so the third prediction model is assigned the smallest prior chain width.
In a specific implementation, the a priori chain slack is also predetermined, and as shown in fig. 6b, a specific determination process may include:
S001b, calculating the chain sag corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used for training the prediction models;
S002b, clustering the chain sags corresponding to all the training samples used for training the prediction models to obtain the first prior chain sag and the second prior chain sag.
The clustering mode can also adopt K-means clustering.
It can be understood that, because the coordinates of each key point are marked in the training samples, the chain slack and the chain width corresponding to each training sample can be calculated according to the marking information.
For example, the distribution of the a priori chain width and the a priori chain slack is shown in table 1 below:
TABLE 1 Allocation of prior chain width and prior chain sag

Prediction model | Feature map size | Prior chain width | Prior chain sag
first prediction model | 13x13 | w1 | s1, s2
second prediction model | 26x26 | w2 | s1, s2
third prediction model | 52x52 | w3 | s1, s2

In the above table, s1 is the first prior chain sag and s2 is the second prior chain sag.
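Steps S001a-S002a and S001b-S002b amount to computing a width and a sag per labeled sample and clustering each set of values. The Python sketch below (names are hypothetical) computes the per-sample chain width |AB| and sag |CD|/|AB| from the annotated key points, then runs a plain one-dimensional K-means to obtain the prior values:

```python
import random

def chain_width_and_sag(A, B, D):
    """Per-sample chain width |AB| and sag |CD|/|AB|, where C is the
    midpoint of the segment AB; A, B, D come from the sample annotations."""
    ab = ((A[0] - B[0]) ** 2 + (A[1] - B[1]) ** 2) ** 0.5
    C = ((A[0] + B[0]) / 2.0, (A[1] + B[1]) / 2.0)
    cd = ((C[0] - D[0]) ** 2 + (C[1] - D[1]) ** 2) ** 0.5
    return ab, cd / ab

def kmeans_1d(values, k, iters=50):
    """Plain one-dimensional K-means, enough to derive k prior values."""
    centers = sorted(random.sample(values, k))
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            buckets[min(range(k), key=lambda j: abs(v - centers[j]))].append(v)
        centers = [sum(b) / len(b) if b else centers[j]
                   for j, b in enumerate(buckets)]
    return sorted(centers)
```

Calling kmeans_1d on all chain widths with k equal to the number of prediction models yields the prior widths, and on all chain sags with k = 2 yields the first and second prior chain sags.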
S150, performing de-duplication processing on candidate prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data;
it is understood that candidate predicted coordinates can be obtained based on each prediction model, and the problem to be solved in step S150 is to determine which prediction model's candidate predicted coordinates should be taken as the optimal predicted coordinates. For example, if N is 3, the candidate predicted coordinates of the key points obtained based on the first prediction model form a first predicted coordinate set G1 = {(x1, y1), (x2, y2), ...}, those obtained based on the second prediction model form a second predicted coordinate set G2 = {(x1, y1), (x2, y2), ...}, and those obtained based on the third prediction model form a third predicted coordinate set G3 = {(x1, y1), (x2, y2), ...}; one of these predicted coordinate sets needs to be selected as the optimal predicted coordinate set.
In a specific implementation, as shown in fig. 7, the deduplication process of step S150 may include: for any two prediction models, judging whether any key point under one prediction model lies in the area formed by the plurality of preset key points under the other prediction model, and if so, taking the candidate predicted coordinates corresponding to the higher classification probability as the optimal predicted coordinates.
For example, an end point A, B and an intersection point D are selected as key points, and for a first prediction model and a second prediction model, if any one key point (A, B or D) in a first prediction coordinate set G1 is located in a triangular region formed by three key points in a second prediction coordinate set G2, the first prediction coordinate set G1 and the second prediction coordinate set G2 are considered to correspond to the same brake chain, at this time, a candidate prediction coordinate (for example, the first prediction coordinate set G1) corresponding to a larger classification probability is retained and used as the optimal prediction coordinate, and prediction data corresponding to a smaller classification probability is deleted.
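The endpoint-in-triangle deduplication can be sketched as below. This is a minimal Python illustration under assumptions: a sign-based point-in-triangle test, a symmetric overlap check between the two sets, and a (probability, keypoints) data layout, none of which are specified by the patent:

```python
def in_triangle(p, a, b, c):
    """Sign-based test: p is inside (or on the edge of) triangle a-b-c."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)

def deduplicate(candidates):
    """candidates: list of (classification_probability, keypoints) where
    keypoints maps 'A', 'B', 'D' to (x, y). If any key point of one set
    lies inside the A-B-D triangle of another (in either direction), the
    two sets are treated as the same chain and only the set with the
    higher probability is kept."""
    kept = list(candidates)
    for i in range(len(kept)):
        for j in range(i + 1, len(kept)):
            gi, gj = kept[i], kept[j]
            if gi is None or gj is None:
                continue
            tri_i = (gi[1]['A'], gi[1]['B'], gi[1]['D'])
            tri_j = (gj[1]['A'], gj[1]['B'], gj[1]['D'])
            if (any(in_triangle(gi[1][k], *tri_j) for k in 'ABD')
                    or any(in_triangle(gj[1][k], *tri_i) for k in 'ABD')):
                kept[i if gi[0] < gj[0] else j] = None  # drop lower probability
    return [g for g in kept if g is not None]
```

Sets describing chains far apart never overlap and all survive, so the same routine also handles images containing several distinct brake chains.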
Understandably, when the high-definition camera is used for shooting the image containing the brake chain, some brake chains are closer to the high-definition camera, so that the brake chain is larger in the image data; and some brake chains are far away from the high-definition camera, so that the brake chains are small in image data. If various image data are processed into a feature map with one size and a prediction model is used for prediction processing, errors of predicted coordinates of some image data are large, and the recognition accuracy of the brake chain state is affected. In order to be suitable for brake chains with different sizes in image data, the image data are processed into feature maps with different sizes, then the feature maps are processed by using corresponding prediction models, so that multiple candidate prediction targets are obtained, the optimal prediction target is selected from the multiple candidate prediction targets, and the recognition accuracy of the brake chain state is improved.
And S160, determining the state of the brake chain in the image data according to the corresponding optimal prediction coordinates of the plurality of preset key points on the brake chain in the image data.
In practical implementation, when the endpoints a and B and the intersection D are used as the key points, step S160 may specifically include the steps shown in fig. 8:
s161, calculating chain sag in the image data according to the corresponding optimal prediction coordinates of the two end points and the intersection point on the brake chain in the image data, wherein the chain sag is the ratio of the length of the middle vertical line to the width of the chain;
the chain width is the shortest distance | AB | of the two end points on the brake chain in the image data, and the length of the middle vertical line is the shortest distance | CD | of the middle point of the straight line between the two end points and the intersection point in the image data.
And S162, judging whether the ratio is larger than a preset ratio threshold value or not, and if so, determining that the brake chain state in the image data is a loose state.
It can be understood that if the ratio | CD |/| AB | is less than or equal to the preset ratio threshold, the brake chain state in the image data is a tight state.
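Steps S161-S162 reduce to a small computation over the optimal predicted coordinates; in the sketch below the threshold value 0.25 is an assumed example, not a value from the patent:

```python
def chain_state(A, B, D, ratio_threshold=0.25):
    """Classify the brake chain from the optimal predicted coordinates:
    sag ratio = |CD| / |AB|, with C the midpoint of segment AB.
    The threshold 0.25 is an assumed example value."""
    ab = ((A[0] - B[0]) ** 2 + (A[1] - B[1]) ** 2) ** 0.5
    C = ((A[0] + B[0]) / 2.0, (A[1] + B[1]) / 2.0)
    cd = ((C[0] - D[0]) ** 2 + (C[1] - D[1]) ** 2) ** 0.5
    return "loose" if cd / ab > ratio_threshold else "tight"
```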
In a specific implementation, the prediction model is obtained by pre-training. Referring to fig. 3, a specific structure of the prediction model may include two convolution modules; for example, the convolution kernel of one convolution module is 3x3 and that of the other is 1x1. In the training process of the prediction model, the loss value of the prediction model can be calculated according to a loss function, and each parameter in the prediction model is then adjusted according to the loss value. A specific loss function may be:
L_total = (1/N_cls) × Σ_i L_cls(y_i, ŷ_i) + λ × (1/N_reg) × Σ_i L_reg(tp_i, t̂p_i)

in the formula, L_total is the total loss of the prediction model, λ is a preset balance parameter, N_cls is the number of all training samples required to train the prediction model, L_cls() is the classification loss function, y_i and ŷ_i are respectively the pre-marked brake chain state of the ith training sample and the classification probability output after the ith training sample is input into the prediction model, N_reg is the number of positive samples among all the training samples, L_reg() is the regression loss function, and tp_i and t̂p_i are respectively the real coordinates of key point P in the ith training sample and the predicted coordinates output after the ith training sample is input into the prediction model.
Understandably, the brake chain state y_i takes the value 0 (tight state) or 1 (loose state). Since two prior chain sags are set for each prediction model, each prediction model is trained for both prior chain sags and outputs prediction data for each of them, so that for each prior chain sag the parameters in the prediction model can be adjusted according to the above loss function.
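A minimal numeric sketch of the loss is given below. Binary cross-entropy for the classification loss and smooth L1 for the regression loss are assumed concrete choices, since the patent only names generic classification and regression loss functions:

```python
import math

def total_loss(cls_pairs, reg_pairs, lam=1.0):
    """L_total = (1/N_cls) * sum L_cls + lam * (1/N_reg) * sum L_reg.
    cls_pairs: (y_i, y_hat_i) for every training sample; reg_pairs:
    (tp_i, tp_hat_i) coordinate tuples for positive samples only.
    Binary cross-entropy and smooth L1 are assumed concrete choices."""
    eps = 1e-7  # numerical guard for log

    def bce(y, p):
        return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

    def smooth_l1(d):
        return 0.5 * d * d if abs(d) < 1 else abs(d) - 0.5

    l_cls = sum(bce(y, p) for y, p in cls_pairs) / len(cls_pairs)
    l_reg = sum(smooth_l1(t - th) for tp, tph in reg_pairs
                for t, th in zip(tp, tph)) / len(reg_pairs)
    return l_cls + lam * l_reg
```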
The above loss function refers to a positive sample, and the selection process of the positive sample is shown in fig. 9, and may include the following steps:
s010, calculating the coordinates of a plurality of preset key points in a training sample under the prior chain sag according to the prior chain sag, the prior chain width corresponding to the prediction model and the coordinates of anchor points in the training sample, wherein the anchor points are the centers of the minimum circumscribed rectangles of the brake chains;
for example, for a training sample, the key points are the two endpoints A and B and the intersection point D; a prior chain sag s and a prior chain width |AB| are set for the training sample, and s = |CD|/|AB|, so |CD| can be determined from s and |AB|; given the coordinates of the midpoint of CD, i.e. the anchor point, the coordinates of the three key points can then be determined from the anchor point coordinates, |CD| and |AB|.
S020, determining the sum of the calculated distances between the coordinates of a plurality of preset key points in the training sample under the prior chain sag and the real coordinates of the plurality of preset key points in the training sample;
for example, the distance d1 between the coordinates of point A calculated in step S010 and the real coordinates of point A in the training sample, the distance d2 between the calculated and real coordinates of point B, and the distance d3 between the calculated and real coordinates of point D are computed, and the sum of d1, d2 and d3 is then calculated.
It can be understood that the sum of the distances represents the degree of coincidence between the rectangular frame determined from the prior chain sag, the prior chain width and the anchor point coordinates (namely the minimum bounding rectangle of the brake chain) and the real rectangular frame of the training sample: the smaller the sum of the distances, the higher the coincidence degree, and the larger the sum, the lower the coincidence degree.
And S030, judging whether the sum of the distances is smaller than a preset distance threshold value, and if so, taking the training sample as a positive sample corresponding to the prior chain sag.
It can be appreciated that when the sum of distances is small, the overlap ratio is high, and the training sample is taken as a positive sample corresponding to the a priori chain slack.
In specific implementation, the preset distance threshold is referred to as a first distance threshold, a second distance threshold larger than the first distance threshold is further set, when the sum of the distances is larger than the second distance threshold, the training sample can be used as a negative sample corresponding to the priori chain sag, and if the sum of the distances is between the first distance threshold and the second distance threshold, the training sample is ignored.
Through the steps S010-S030, a positive sample and a negative sample corresponding to each priori chain sag can be determined and further used for calculating the loss function.
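Steps S010-S030, together with the second (negative/ignore) threshold, can be sketched as follows; the anchor is taken as the midpoint of CD as in the example above, and the threshold values t1 and t2 are assumed examples, not values from the patent:

```python
def reconstruct_keypoints(anchor, w, s):
    """Key points implied by an anchor point (midpoint of CD, per the
    example above), a prior chain width w and a prior chain sag s."""
    cx, cy = anchor
    cd = s * w                      # |CD| = s * |AB|
    C = (cx, cy - cd / 2.0)         # midpoint of AB
    D = (cx, cy + cd / 2.0)
    return {'A': (cx - w / 2.0, C[1]),
            'B': (cx + w / 2.0, C[1]),
            'D': D}

def label_sample(anchor, w, s, truth, t1=8.0, t2=20.0):
    """Return 'positive', 'negative' or 'ignored' for one training sample
    under one prior chain sag; t1 < t2 are assumed example thresholds."""
    pred = reconstruct_keypoints(anchor, w, s)
    dist = sum(((pred[k][0] - truth[k][0]) ** 2
                + (pred[k][1] - truth[k][1]) ** 2) ** 0.5 for k in 'ABD')
    if dist < t1:
        return 'positive'
    if dist > t2:
        return 'negative'
    return 'ignored'
```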
In the method for identifying the state of a train brake chain provided by this embodiment, feature extraction is performed on the image data to obtain feature maps of multiple sizes, and the feature map of each size is input into a corresponding prediction model. To suit brake chains of different sizes in the image data, the image data is processed into feature maps of different sizes, each feature map is processed by its corresponding prediction model to obtain multiple candidate prediction targets, and the optimal prediction target is selected from them, which improves the recognition accuracy of the brake chain state. Moreover, the method automatically identifies the brake chain state based on prediction models, requires no manual inspection, eliminates heavy and inefficient work, can effectively improve train operation efficiency, and helps ensure train operation safety.
As shown in fig. 10, an embodiment of the present application provides a state identification device for a train brake chain, which may specifically include:
the image acquisition module 110 is used for acquiring image data of a train brake chain;
a feature extraction module 120, configured to perform feature extraction on the image data to obtain feature maps of multiple different sizes;
the model prediction module 130 is configured to input the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, where the prediction data includes first prediction data and second prediction data, the first prediction data includes a classification probability that each pixel point in the feature map corresponds to a brake chain, and the second prediction data includes predicted coordinate offsets of a plurality of preset key points on the brake chain corresponding to the pixel point;
the coordinate calculation module 140 is configured to determine candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the predicted coordinate offset output by each prediction model;
a chain deduplication module 150, configured to perform deduplication processing on candidate predicted coordinates, corresponding to the image data, of a plurality of preset key points on the brake chain under each prediction model, so as to obtain optimal predicted coordinates, corresponding to the image data, of the plurality of preset key points on the brake chain;
the state determining module 160 is configured to determine a brake chain state in the image data according to the corresponding optimal predicted coordinates of the plurality of preset key points on the brake chain in the image data.
In specific implementation, the size category of the characteristic diagram is N, wherein N is an integer greater than or equal to 2; the feature extraction module 120 includes:
a first extraction unit, configured to perform downsampling processing on the image data for multiple times to obtain a first feature map, where a size of the first feature map is a first size;
the second extraction unit is used for determining any one of the second feature map to the Nth feature map based on the first feature map; the process for determining the ith feature map comprises the following steps: up-sampling the (i-1)th feature map to obtain a feature map of the ith size; splicing this feature map of the ith size with the feature map of the ith size generated in the multiple down-sampling processes to obtain the ith feature map, wherein the ith feature map is of the ith size, and the ith size increases with i; i is an integer of 2 or more and N or less.
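The first and second extraction units can be sketched with plain NumPy as below; nearest-neighbour up-sampling and the channel counts are assumptions, and a real implementation would interleave convolution layers between these steps:

```python
import numpy as np

def upsample2x(f):
    # nearest-neighbour up-sampling of a (C, H, W) map along H and W
    return f.repeat(2, axis=1).repeat(2, axis=2)

def build_feature_maps(backbone_maps):
    """backbone_maps: intermediate maps saved during the down-sampling
    passes, keyed by spatial size, e.g. {52: ..., 26: ..., 13: ...}.
    Returns [f1 (13x13), f2 (26x26), f3 (52x52)], where each later map is
    the previous one up-sampled and concatenated (channel-wise) with the
    same-size map from the down-sampling path."""
    f1 = backbone_maps[13]
    f2 = np.concatenate([upsample2x(f1), backbone_maps[26]], axis=0)
    f3 = np.concatenate([upsample2x(f2), backbone_maps[52]], axis=0)
    return [f1, f2, f3]
```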
In specific implementation, the coordinate calculation module 140 specifically includes:
the position determining unit is used for determining the coordinates of corresponding pixel points of the brake chain in the image data under each prediction model according to the classification probability output by each prediction model;
and the coordinate determination unit is used for determining candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the plurality of preset key points.
In particular implementations, the prediction data output by each prediction model includes the first prediction data and the second prediction data at each prior chain sag; the coordinate determination unit is specifically configured to: determine candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the prior chain width preset for the prediction model, the prior chain sag corresponding to the larger classification probability, the predicted coordinate offsets of the plurality of preset key points under the prior chain sag corresponding to the larger classification probability, and the coordinates of the corresponding pixel point of the brake chain in the image data under the prediction model; the prior chain width is a prior value of the shortest distance from one end point of the train brake chain to the other end point, and the prior chain widths correspond to the prediction models one by one; the prior chain sag is a prior value representing the brake chain sag, the prior chain sags preset for each prediction model include a first prior chain sag corresponding to a loose state and a second prior chain sag corresponding to a tight state, and the larger classification probability is, for the same pixel point, the larger of the classification probability corresponding to the first prior chain sag and the classification probability corresponding to the second prior chain sag.
In particular implementation, the apparatus further comprises:
the priori width determining module is used for calculating the chain width corresponding to each training sample according to the coordinates of a plurality of preset key points in each training sample for training the prediction model; clustering chain widths corresponding to all training samples for training the prediction model to obtain prior chain widths consistent with the number of the prediction model; and/or
The prior sag determining module is used for calculating the chain sag corresponding to each training sample according to the coordinates of a plurality of preset key points in each training sample for training the prediction model; and clustering chain sag corresponding to all training samples for training the prediction model to obtain the first prior chain sag and the second prior chain sag.
In a specific implementation, the plurality of preset key points comprise the two end points of the brake chain and the intersection point of the perpendicular bisector of the straight line segment between the two end points with the brake chain.
In a specific implementation, the coordinate determination unit calculates candidate predicted coordinates corresponding to an end point of the brake chain in the image data under a prediction model by using a first formula, where the first formula includes:
x_A = x_c - w × sigmoid(t_xA)
y_A = y_c - w × s × sigmoid(t_yA)
in the formula, (x_A, y_A) are the candidate predicted coordinates corresponding to endpoint A on the brake chain in the image data, (x_c, y_c) are the coordinates of the corresponding pixel point of the brake chain in the image data, (t_xA, t_yA) are the predicted coordinate offsets of endpoint A on the brake chain under the prior chain sag corresponding to the larger classification probability, w is the prior chain width corresponding to the prediction model, and s is the prior chain sag corresponding to the larger classification probability;
and/or the coordinate determination unit adopts a second formula to calculate the candidate predicted coordinate corresponding to the other end point of the brake chain in the image data under a prediction model, wherein the second formula comprises:
x_B = x_c + w × sigmoid(t_xB)
y_B = y_c - w × s × sigmoid(t_yB)
in the formula, (x_B, y_B) are the candidate predicted coordinates corresponding to the other endpoint B on the brake chain in the image data, and (t_xB, t_yB) are the predicted coordinate offsets of the other endpoint B on the brake chain under the prior chain sag corresponding to the larger classification probability;
and/or the coordinate determination unit calculates candidate predicted coordinates corresponding to the intersection point in the image data under one prediction model by adopting a third formula, wherein the third formula comprises:
x_D = x_c + w × (sigmoid(t_xD) - 0.5)
y_D = y_c + w × s × sigmoid(t_yD)
in the formula, (x_D, y_D) are the candidate predicted coordinates corresponding to intersection point D in the image data, and (t_xD, t_yD) are the predicted coordinate offsets of intersection point D under the prior chain sag corresponding to the larger classification probability.
In a specific implementation, the apparatus further comprises:
the model training module is used for training a prediction model in advance and performing loss calculation on a training sample by adopting a preset loss function in the training process of the prediction model, wherein the loss function comprises:
L_total = (1/N_cls) × Σ_i L_cls(y_i, ŷ_i) + λ × (1/N_reg) × Σ_i L_reg(tp_i, t̂p_i)

in the formula, L_total is the total loss of the prediction model, λ is a preset balance parameter, N_cls is the number of all training samples required to train the prediction model, L_cls() is the classification loss function, y_i and ŷ_i are respectively the pre-marked brake chain state of the ith training sample and the classification probability output after the ith training sample is input into the prediction model, N_reg is the number of positive samples among all the training samples, L_reg() is the regression loss function, and tp_i and t̂p_i are respectively the real coordinates of key point P in the ith training sample and the predicted coordinates output after the ith training sample is input into the prediction model.
In particular, the model training module comprises:
the sample classification module is used for selecting a positive sample, and the selection process comprises the following steps: calculating the coordinates of a plurality of preset key points in a training sample under the prior chain sag according to the prior chain sag, the prior chain width corresponding to the prediction model and the coordinates of an anchor point in the training sample, wherein the anchor point is the center of the minimum circumscribed rectangle of the brake chain; determining the sum of the calculated distances between the coordinates of a plurality of preset key points in the training sample under the prior chain sag and the real coordinates of the plurality of preset key points in the training sample; and judging whether the sum of the distances is smaller than a preset distance threshold value, if so, taking the training sample as a positive sample corresponding to the prior chain sag.
In particular implementation, the chain deduplication module is specifically configured to: and judging whether any key point under one prediction model is in an area formed by a plurality of preset key points under the other prediction model or not aiming at any two prediction models, and if so, taking the candidate prediction coordinate corresponding to the higher classification probability as the optimal prediction coordinate.
In a specific implementation, the state determination module is specifically configured to: calculating the chain sag in the image data according to the corresponding optimal prediction coordinates of two end points and an intersection point on the brake chain in the image data, wherein the chain sag is the ratio of the length of a middle vertical line to the width of the chain; the chain width is the shortest distance between two end points on the brake chain in the image data, and the length of the middle vertical line is the shortest distance between the midpoint of a straight line segment between the two end points and the intersection point in the image data; and judging whether the ratio is larger than a preset ratio threshold value or not, and if so, determining that the state of the brake chain in the image data is a loose state.
Embodiments of the present application further provide a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the steps of the above method when the computer program is executed.
Embodiments of the present application also provide a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the above method.
It is understood that, for the apparatuses, the computer devices, and the computer-readable storage media provided in the embodiments of the present application, for explanation, examples, and beneficial effects of the contents, reference may be made to corresponding parts in the foregoing methods, and details are not described here.
It is to be appreciated that any reference to memory, storage, database, or other medium used in the embodiments provided herein can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention or portions thereof contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the methods according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (14)

1. A state identification method for a train brake chain is characterized by comprising the following steps:
acquiring image data of a train brake chain;
extracting the features of the image data to obtain feature maps with different sizes;
inputting the feature map of each size into a corresponding prediction model to obtain corresponding prediction data, wherein the prediction data comprises first prediction data and second prediction data, the first prediction data comprises the classification probability of each pixel point in the feature map corresponding to a brake chain, and the second prediction data comprises the prediction coordinate offset of a plurality of preset key points on the brake chain corresponding to the pixel point;
determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
performing duplicate removal processing on candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal prediction coordinates corresponding to the plurality of preset key points on the brake chain in the image data;
and determining the state of the brake chain in the image data according to the corresponding optimal predicted coordinates of a plurality of preset key points on the brake chain in the image data.
2. The method according to claim 1, wherein the size category of the feature map is N, and N is an integer greater than or equal to 2; the feature extraction is performed on the image data to obtain feature maps with different sizes, and the feature maps comprise:
performing down-sampling processing on the image data for multiple times to obtain a first feature map, wherein the size of the first feature map is a first size;
determining any one of a second feature map to an Nth feature map based on the first feature map; the process for determining the ith feature map comprises the following steps: up-sampling the (i-1)th feature map to obtain a feature map of the ith size; splicing this feature map of the ith size with the feature map of the ith size generated in the multiple down-sampling processes to obtain the ith feature map, wherein the ith feature map is of the ith size, and the ith size increases with i; i is an integer of 2 or more and N or less.
3. The method according to claim 1, wherein the determining candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the predicted coordinate offset output by the prediction model comprises:
determining the coordinates of corresponding pixel points of the brake chain in the image data under each prediction model according to the classification probability output by each prediction model;
and determining candidate predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the plurality of preset key points.
4. The method of claim 3, wherein the prediction data output by each prediction model comprise the first prediction data and the second prediction data under each prior chain sag;
determining candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under the prediction model according to the coordinates of the corresponding pixel points of the brake chain in the image data under the prediction model and the predicted coordinate offsets of the preset key points, wherein the candidate predicted coordinates comprise:
determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under the prediction model according to the preset prior chain width for the prediction model, the prior chain sag corresponding to the larger classification probability, the prediction coordinate offsets of the plurality of preset key points under the prior chain sag corresponding to the larger classification probability and the coordinates of corresponding pixel points of the brake chain in the image data under the prediction model;
the prior chain width is a prior value of the shortest distance from one end point of the train brake chain to the other end point, and the prior chain width corresponds to the prediction model one by one; the prior chain slack is a prior value representing the brake chain slack, and the prior chain slack preset for each prediction model comprises a first prior chain slack corresponding to a slack state and a second prior chain slack corresponding to a tight state; the larger classification probability is a larger value of the classification probability corresponding to the first priori chain sag and the classification probability corresponding to the second priori chain sag for the same pixel point.
5. The method of claim 4, wherein the determination of the prior chain width comprises: calculating the chain width corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used to train the prediction model; and clustering the chain widths corresponding to all training samples used to train the prediction models to obtain prior chain widths equal in number to the prediction models;
and/or the determination of the prior chain sag comprises: calculating the chain sag corresponding to each training sample according to the coordinates of the plurality of preset key points in each training sample used to train the prediction model; and clustering the chain sags corresponding to all training samples used to train the prediction model to obtain the first prior chain sag and the second prior chain sag.
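Claim 5 obtains the priors by clustering per-sample chain widths (and, analogously, chain sags) over the training set. A minimal sketch of that step, assuming Euclidean distance between the two end points as the chain width and a plain 1-D k-means; the patent does not name the clustering algorithm, so k-means here is an assumption:

```python
import numpy as np

def chain_width(end_a, end_b):
    # Width = shortest (Euclidean) distance between the two chain end points.
    (ax, ay), (bx, by) = end_a, end_b
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means; the sorted cluster centres serve as the priors,
    one prior chain width per prediction model (k = number of models)."""
    values = np.asarray(values, dtype=float)
    centres = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = values[labels == j].mean()
    return np.sort(centres)
```

For the sags, the same routine with k = 2 would yield the first (loose) and second (tight) prior chain sags.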
6. The method of claim 4, wherein the plurality of predetermined key points comprises two end points of the brake chain and an intersection of a perpendicular bisector of a straight line segment between the two end points and the brake chain.
7. The method of claim 6, wherein the candidate predicted coordinates for an end point on the brake chain in the image data under a prediction model are calculated using a first formula, the first formula comprising:
x_A = x_c - w × sigmoid(t_xA)
y_A = y_c - w × s × sigmoid(t_yA)
where (x_A, y_A) are the candidate predicted coordinates of end point A of the brake chain in the image data, (x_c, y_c) are the coordinates of the corresponding pixel point of the brake chain in the image data, (t_xA, t_yA) are the predicted coordinate offsets of end point A under the prior chain sag corresponding to the larger classification probability, w is the prior chain width corresponding to the prediction model, and s is the prior chain sag corresponding to the larger classification probability;
and/or the candidate predicted coordinates of the other end point of the brake chain in the image data under a prediction model are calculated using a second formula, the second formula comprising:
x_B = x_c + w × sigmoid(t_xB)
y_B = y_c - w × s × sigmoid(t_yB)
where (x_B, y_B) are the candidate predicted coordinates of the other end point B of the brake chain in the image data, and (t_xB, t_yB) are the predicted coordinate offsets of end point B under the prior chain sag corresponding to the larger classification probability;
and/or the candidate predicted coordinates of the intersection point in the image data under a prediction model are calculated using a third formula, the third formula comprising:
x_D = x_c + w × (sigmoid(t_xD) - 0.5)
y_D = y_c + w × s × sigmoid(t_yD)
where (x_D, y_D) are the candidate predicted coordinates of intersection point D in the image data, and (t_xD, t_yD) are the predicted coordinate offsets of intersection point D under the prior chain sag corresponding to the larger classification probability.
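The three formulas of claim 7 decode raw network offsets into image coordinates. A direct sketch; `decode_keypoints` and the `offsets` dictionary are illustrative names, not from the patent:

```python
import math

def sigmoid(t):
    # Squashes a raw offset into (0, 1), as used by all three claim-7 formulas.
    return 1.0 / (1.0 + math.exp(-t))

def decode_keypoints(xc, yc, w, s, offsets):
    """Decode candidate coordinates of end points A, B and intersection D.

    (xc, yc): pixel point attributed to the chain; w: prior chain width for
    this prediction model; s: prior chain sag corresponding to the larger
    classification probability; offsets: raw predicted coordinate offsets.
    """
    xA = xc - w * sigmoid(offsets["txA"])
    yA = yc - w * s * sigmoid(offsets["tyA"])
    xB = xc + w * sigmoid(offsets["txB"])
    yB = yc - w * s * sigmoid(offsets["tyB"])
    xD = xc + w * (sigmoid(offsets["txD"]) - 0.5)   # D may sit left or right of centre
    yD = yc + w * s * sigmoid(offsets["tyD"])       # D hangs below the chord
    return (xA, yA), (xB, yB), (xD, yD)
```

With all offsets at zero (sigmoid = 0.5), A and B land half a prior width left and right of the pixel point, reflecting how the priors anchor the prediction.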
8. The method according to claim 4, wherein the prediction model is obtained by pre-training, and a loss calculation is performed on training samples with a preset loss function during training of the prediction model, the loss function comprising:
L_total = (1/N_cls) × Σ_i L_cls(y_i, ŷ_i) + λ × (1/N_reg) × Σ_i Σ_P L_reg(t_Pi, t̂_Pi)
where L_total is the total loss of the prediction model, λ is a preset balance parameter, N_cls is the number of all training samples required to train the prediction model, L_cls() is the classification loss function, y_i and ŷ_i are respectively the pre-labelled brake chain state of the ith training sample and the classification probability output after the ith training sample is input into the prediction model, N_reg is the number of positive samples among all training samples, L_reg() is the regression loss function, and t_Pi and t̂_Pi are respectively the real coordinates of key point P in the ith training sample and the predicted coordinates output after the ith training sample is input into the prediction model.
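Claim 8's total loss averages a classification term over all samples and adds a balance-weighted regression term averaged over positive samples. A numerical sketch, assuming binary cross-entropy for L_cls and smooth-L1 for L_reg; the claim leaves both component losses unspecified, so those choices are assumptions:

```python
import math

def bce(y, p, eps=1e-7):
    # Binary cross-entropy for one sample; eps guards against log(0).
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def smooth_l1(t, t_hat):
    # Quadratic near zero, linear for large errors.
    d = abs(t - t_hat)
    return 0.5 * d * d if d < 1 else d - 0.5

def total_loss(samples, positives, lam=1.0):
    """L_total = (1/N_cls) Σ L_cls + λ (1/N_reg) Σ Σ_P L_reg.

    samples: (y_i, p_i) pairs over all training samples;
    positives: (true_coords, pred_coords) pairs over positive samples only,
    each a flat list of key-point coordinate values.
    """
    cls = sum(bce(y, p) for y, p in samples) / len(samples)
    reg = sum(smooth_l1(t, th)
              for coords, coords_hat in positives
              for t, th in zip(coords, coords_hat)) / max(len(positives), 1)
    return cls + lam * reg
```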
9. The method of claim 8, wherein the selecting of the positive sample comprises:
calculating the coordinates of a plurality of preset key points in a training sample under the prior chain sag according to the prior chain sag, the prior chain width corresponding to the prediction model and the coordinates of an anchor point in the training sample, wherein the anchor point is the center of a minimum circumscribed rectangle of the brake chain;
determining the sum of the calculated distances between the coordinates of a plurality of preset key points in the training sample under the prior chain sag and the real coordinates of the plurality of preset key points in the training sample;
and judging whether the sum of the distances is smaller than a preset distance threshold value, if so, taking the training sample as a positive sample corresponding to the prior chain sag.
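The positive-sample test of claim 9 can be sketched as follows. The geometric construction of the prior-implied key points (end points half a prior width either side of the anchor, intersection displaced vertically by s·w) is an assumption for illustration; the claim states only that such coordinates are calculated from the priors and the anchor point:

```python
import math

def keypoints_from_anchor(xa, ya, w, s):
    # Key points implied by the priors alone (assumed layout): end points at
    # ±w/2 from the anchor, intersection D sagging s·w below the chord.
    A = (xa - w / 2, ya)
    B = (xa + w / 2, ya)
    D = (xa, ya + s * w)
    return [A, B, D]

def is_positive(anchor, w, s, true_kps, threshold):
    """Sum the distances between prior-implied and ground-truth key points;
    the sample is positive for this prior sag if the sum is under threshold."""
    pred = keypoints_from_anchor(anchor[0], anchor[1], w, s)
    total = sum(math.dist(p, t) for p, t in zip(pred, true_kps))
    return total < threshold
```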
10. The method of claim 1, wherein the performing de-duplication processing on candidate predicted coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model to obtain optimal predicted coordinates corresponding to the plurality of preset key points on the brake chain in the image data comprises:
and for any two prediction models, judging whether any key point under one prediction model lies within the region formed by the plurality of preset key points under the other prediction model, and if so, taking the candidate predicted coordinates corresponding to the higher classification probability as the optimal predicted coordinates.
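Claim 10's de-duplication can be sketched as a greedy keep-the-higher-probability pass. Approximating the "region formed by the preset key points" with their bounding box is an assumption; the claim does not define the region's shape:

```python
def point_in_region(pt, kps):
    # Region of a detection approximated by the bounding box of its key points.
    xs = [x for x, _ in kps]
    ys = [y for _, y in kps]
    return min(xs) <= pt[0] <= max(xs) and min(ys) <= pt[1] <= max(ys)

def dedupe(detections):
    """detections: (classification_probability, keypoints) pairs, one per
    prediction model. Whenever a key point of one detection falls inside the
    region of another, only the higher-probability detection survives."""
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    for prob, kps in detections:
        overlaps = any(point_in_region(pt, kk) for _, kk in kept for pt in kps)
        if not overlaps:
            kept.append((prob, kps))
    return kept
```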
11. The method of claim 6, wherein determining the brake chain status in the image data according to the corresponding optimal predicted coordinates of the plurality of predetermined key points on the brake chain in the image data comprises:
calculating the chain sag in the image data according to the optimal predicted coordinates of the two end points and the intersection point on the brake chain in the image data, wherein the chain sag is the ratio of the length of the perpendicular-bisector segment to the chain width; the chain width is the shortest distance between the two end points of the brake chain in the image data, and the length of the perpendicular-bisector segment is the shortest distance in the image data between the midpoint of the straight line segment between the two end points and the intersection point;
and judging whether the ratio is larger than a preset ratio threshold value or not, and if so, determining that the state of the brake chain in the image data is a loose state.
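The decision rule of claim 11 reduces to one ratio. A minimal sketch; the function name and the return labels "loose"/"tight" are illustrative:

```python
import math

def chain_state(A, B, D, ratio_threshold):
    """Loose/tight decision from the optimal key-point coordinates.

    Sag ratio = |MD| / |AB|, where M is the midpoint of segment AB and D is
    the intersection of AB's perpendicular bisector with the chain.
    """
    width = math.dist(A, B)                       # chain width
    mid = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)  # midpoint M of AB
    sag = math.dist(mid, D) / width               # perpendicular length / width
    return "loose" if sag > ratio_threshold else "tight"
```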
12. A state recognition device of a train brake chain is characterized by comprising:
the image acquisition module is used for acquiring image data of the train brake chain;
the characteristic extraction module is used for extracting the characteristics of the image data to obtain characteristic graphs with different sizes;
the model prediction module is used for inputting the feature map of each size into the corresponding prediction model to obtain corresponding prediction data, the prediction data comprising first prediction data and second prediction data, the first prediction data comprising the classification probability that each pixel point in the feature map corresponds to the brake chain, and the second prediction data comprising the predicted coordinate offsets, relative to the pixel point, of the plurality of preset key points on the brake chain;
the coordinate calculation module is used for determining candidate prediction coordinates corresponding to a plurality of preset key points on the brake chain in the image data under each prediction model according to the classification probability and the prediction coordinate offset output by each prediction model;
the chain duplicate removal module is used for performing duplicate removal processing on candidate predicted coordinates, corresponding to the image data, of a plurality of preset key points on the brake chain under each prediction model to obtain optimal predicted coordinates, corresponding to the image data, of the plurality of preset key points on the brake chain;
and the state determining module is used for determining the state of the brake chain in the image data according to the corresponding optimal prediction coordinates of the plurality of preset key points on the brake chain in the image data.
13. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 11 are implemented by the processor when executing the computer program.
14. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 11.
CN202011583368.7A 2020-12-28 2020-12-28 Method, device and equipment for identifying state of train brake chain and storage medium Pending CN114758150A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011583368.7A CN114758150A (en) 2020-12-28 2020-12-28 Method, device and equipment for identifying state of train brake chain and storage medium


Publications (1)

Publication Number Publication Date
CN114758150A true CN114758150A (en) 2022-07-15

Family

ID=82324667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011583368.7A Pending CN114758150A (en) 2020-12-28 2020-12-28 Method, device and equipment for identifying state of train brake chain and storage medium

Country Status (1)

Country Link
CN (1) CN114758150A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103454285A (en) * 2013-08-28 2013-12-18 南京师范大学 Transmission chain quality detection system based on machine vision
US20160010977A1 (en) * 2014-07-09 2016-01-14 Charles C. Frost Chain wear monitoring device
CN107346437A (en) * 2017-07-03 2017-11-14 大连理工大学 The extraction method of body side view parameter model
CN111055890A (en) * 2019-12-18 2020-04-24 成都国铁电气设备有限公司 Intelligent detection method and detection system for railway vehicle anti-slip


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO, Ning et al.: "New method for meshing analysis of chain-saw chain transmission and tooth profile design of the drive link", Mechanical Science and Technology, 31 December 2010 (2010-12-31), pages 1-9 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117576635A (en) * 2024-01-17 2024-02-20 中国石油集团川庆钻探工程有限公司 Method for judging linear target tensioning state in video identification
CN117576635B (en) * 2024-01-17 2024-03-29 中国石油集团川庆钻探工程有限公司 Method for judging linear target tensioning state in video identification

Similar Documents

Publication Publication Date Title
CN110414507B (en) License plate recognition method and device, computer equipment and storage medium
CN108038474B (en) Face detection method, convolutional neural network parameter training method, device and medium
US10229332B2 (en) Method and apparatus for recognizing obstacle of vehicle
CN113033604B (en) Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN111797829A (en) License plate detection method and device, electronic equipment and storage medium
US11042742B1 (en) Apparatus and method for detecting road based on convolutional neural network
CN108428248B (en) Vehicle window positioning method, system, equipment and storage medium
CN111582339B (en) Vehicle detection and recognition method based on deep learning
CN110956081B (en) Method and device for identifying position relationship between vehicle and traffic marking and storage medium
CN111523429A (en) Deep learning-based steel pile identification method
CN111144425B (en) Method and device for detecting shot screen picture, electronic equipment and storage medium
CN114266894A (en) Image segmentation method and device, electronic equipment and storage medium
CN115995056A (en) Automatic bridge disease identification method based on deep learning
CN114758150A (en) Method, device and equipment for identifying state of train brake chain and storage medium
CN111738040A (en) Deceleration strip identification method and system
CN113269156A (en) Signal lamp detection and identification method and system based on multi-scale feature fusion
CN111950415A (en) Image detection method and device
CN111914706A (en) Method and device for detecting and controlling quality of character detection output result
CN116343148A (en) Lane line detection method, device, vehicle and storage medium
CN115631197A (en) Image processing method, device, medium, equipment and system
CN113012132B (en) Image similarity determination method and device, computing equipment and storage medium
CN114926803A (en) Lane line detection model establishing method, lane line detection method, device and equipment
CN114399657A (en) Vehicle detection model training method and device, vehicle detection method and electronic equipment
CN113269137A (en) Non-fit face recognition method combining PCANet and shielding positioning
CN112380913A (en) License plate detection and identification method based on combination of dynamic adjustment and local feature vector

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination