CN109948690A

CN109948690A - A high-speed rail scene perception method based on deep learning and structural information

Info

Publication number: CN109948690A
Application number: CN201910193175.1A
Authority: CN
Inventors: 李兆洋; 罗建桥; 李柏林; 程洋
Original assignee: Southwest Jiaotong University
Current assignee: Southwest Jiaotong University
Priority date: 2019-03-14
Filing date: 2019-03-14
Publication date: 2019-06-28

Abstract

The invention discloses a high-speed rail scene perception method based on deep learning and structural information, comprising the following steps. Step 1: acquire track images, divide them into a training set and a test set, and annotate the images in the training set to form a data set. Step 2: construct an SSD network model and its loss function. Step 3: iteratively train the network obtained in step 2 on the data set formed in step 1 to obtain a trained model. Step 4: input the video to be inspected into the trained model from step 3 frame by frame, extract features, obtain the position and class information of the fasteners and shoulder blocks, and distinguish turnout track from ordinary track according to that information. Step 5: cluster the position information of the localization results of step 4 separately to complete the perception of the rails and the sleepers. The invention can detect and semantically segment the track components in turnout areas, with high detection accuracy and high detection speed.

Description

A high-speed rail scene perception method based on deep learning and structural information

Technical field

The present invention relates to image-processing-based track component detection methods, and in particular to a high-speed rail scene perception method based on deep learning and structural information.

Background art

Rail transport is an important pillar of socio-economic development and an indispensable means of transport in daily life, and it plays a decisive role in the development of society as a whole. This is especially true for China, with its vast territory, large population flows and unevenly distributed resources. With its large capacity, low transport cost and relatively small land use, rail transport holds a clear advantage over other forms of public transport. As railway construction in China advances, the state plans that by 2020 the operating mileage of China's high-speed railways will reach 30,000 kilometers and the high-speed rail network will cover more than 80% of large cities.

The maintenance and inspection of rail equipment is a much-discussed problem in railway technology and one of the projects receiving the most research and development funding at this stage. A high-speed rail track consists mainly of fasteners, rails, sleepers, track slabs and so on. Traditional inspection is carried out by experienced track walkers, who patrol the line periodically and find and report abnormal parts. Manual inspection is very labor-intensive, the working environment is harsh, and vehicles running on the line pose a potential threat to the inspectors' personal safety. The traditional track-walking method can hardly meet the needs of today's high-speed railway operation.

With the development of machine vision, image-based line inspection technology has received increasing attention. However, traditional visual inspection methods usually detect problems in only one individual component and do not perceive fasteners, rails and sleepers separately. Since the track components form an organic whole, perceiving each of them is highly desirable. Traditional visual inspection methods also usually rely on template matching, which largely fails to detect and perceive components at the turnouts of high-speed rail tracks because the track component structures there differ.

Summary of the invention

The present invention provides a high-speed rail scene perception method based on deep learning and structural information that can detect and semantically segment the track components in turnout areas.

The technical solution adopted by the present invention is a high-speed rail scene perception method based on deep learning and structural information, comprising the following steps:

Step 1: acquire track images, divide them into a training set and a test set, and annotate the images in the training set to form a data set;

Step 2: construct an SSD network model and its loss function;

Step 3: iteratively train the network obtained in step 2 on the data set formed in step 1 to obtain a trained model;

Step 4: input the video to be inspected into the trained model from step 3 frame by frame, extract features, obtain the position and class information of the fasteners and shoulder blocks, and distinguish turnout track from ordinary track according to that information;

Step 5: cluster the position information of the localization results of step 4 separately to complete the perception of the rails and the sleepers.

Further, in step 4, features are extracted and the position and class information of the fasteners and shoulder blocks is obtained as follows:

In each frame input to the trained model, multiple regions of the whole image are selected at random, and a score is obtained for each region by convolution. If the score is greater than the set threshold, the region is judged to be a fastener or shoulder block; if the score is less than the set threshold, the result is discarded. Once all regions have been traversed, the fasteners and shoulder blocks are marked and their position and class information is obtained.

Further, the clustering in step 5 is performed with the DBSCAN algorithm.

Further, the clustering process is as follows:

S1: set the x-direction coordinate positions D of the targets detected in step 4, the radius eps and the density threshold Minpts;

S2: arbitrarily select a target p;

S3: judge whether p already belongs to a cluster or has been marked as a noise point; if so, return to step S2; otherwise go to step S4;

S4: judge whether the number of points in the neighborhood of p is less than Minpts; if so, mark p as a noise point and go to step S2; otherwise go to step S5;

S5: traverse the unlabeled points q within the neighborhood radius of p;

S6: judge whether the number of points in the neighborhood of q is less than Minpts; if so, go to step S5; otherwise go to step S7;

S7: add the points in the neighborhood of q that have not been assigned to another cluster to the set C, then end.
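Steps S1 to S7 can be sketched in Python as a one-dimensional DBSCAN over coordinate values. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the function names are hypothetical, a point counts as its own neighbor, and noise points that later fall inside a cluster's neighborhood are absorbed as border points, as in the standard algorithm.

```python
from typing import List, Optional

NOISE = -1  # label given to noise points (step S4)


def region_query(xs: List[float], i: int, eps: float) -> List[int]:
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, x in enumerate(xs) if abs(x - xs[i]) <= eps]


def dbscan_1d(xs: List[float], eps: float, min_pts: int) -> List[int]:
    """Cluster 1-D coordinates; returns one label per point, NOISE for outliers."""
    labels: List[Optional[int]] = [None] * len(xs)
    cluster = -1
    for p in range(len(xs)):                      # S2: pick a target p
        if labels[p] is not None:                 # S3: already clustered or noise
            continue
        neighbours = region_query(xs, p, eps)
        if len(neighbours) < min_pts:             # S4: too sparse, mark as noise
            labels[p] = NOISE
            continue
        cluster += 1                              # p is a core point: new cluster
        labels[p] = cluster
        queue = [q for q in neighbours if q != p]
        while queue:                              # S5: visit the neighbours q
            q = queue.pop()
            if labels[q] == NOISE:                # former noise becomes a border point
                labels[q] = cluster
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbours = region_query(xs, q, eps)
            if len(q_neighbours) >= min_pts:      # S6/S7: q is core, grow the cluster
                queue.extend(q_neighbours)
    return [label if label is not None else NOISE for label in labels]


# Hypothetical x coordinates: two columns of four targets and one stray detection.
example_xs = [100, 105, 98, 102, 300, 500, 503, 498, 505]
example_labels = dbscan_1d(example_xs, eps=50, min_pts=4)
# -> [0, 0, 0, 0, -1, 1, 1, 1, 1]
```

The stray value 300 has no neighbors within the radius, so it is labeled as noise and dropped from both columns.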

Further, during training in step 3, the similarity between each prior box and each ground-truth box is computed with the Jaccard coefficient; if the similarity is greater than the set threshold the prior box is added to the candidate list, otherwise it is not.

Further, in step 2 the parameters of the original SSD model are used as the initialization parameters of the new model to be trained.

Further, in step 1 the data set is prepared in the VOC data set format.

Further, the loss function L(x, c, l, g) in step 2 is as follows:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

where N is the number of prior boxes matched to real targets, L_conf(x, c) is the confidence loss function, L_loc(x, l, g) is the position loss function, l is the prior box, g is the ground-truth box, c is the confidence that the Softmax function assigns to each class, α is a parameter that adjusts the ratio between the position loss and the confidence loss, and x is the center coordinate.

Further, the rails are perceived in step 5 as follows:

S21: with the top-left corner of the picture as the coordinate origin, and the abscissa and ordinate increasing to the right and downward respectively, cluster the targets obtained in step 4 into n classes using the x-direction coordinate as the feature;

S22: according to the clustering result of step S21 and the position information of the fasteners and shoulder blocks obtained in step 4, connect, in each class, the left-side or right-side abscissa positions of the boxes with the smallest and largest y coordinates; the region enclosed between the connecting lines is the rail position.

Further, the sleepers are perceived in step 5 as follows:

S31: with the top-left corner of the picture as the coordinate origin, and the abscissa and ordinate increasing to the right and downward respectively, cluster the targets obtained in step 4 into m classes using the y-direction coordinate as the feature;

S32: according to the clustering result of step S31 and prior information, select one row of sleepers and crop the region extending k pixels above and below each fastener's ordinate out to the boundary;

S33: project the region cropped in step S32 and, using prior information, find the points in the projection whose gradient is greater than a certain threshold to obtain the boundary points, completing the perception of the sleeper position.

The beneficial effects of the present invention are:

(1) The method combines deep learning, clustering and prior information, which ensures detection accuracy while also meeting the speed required for real-time detection;

(2) The invention uses a deep learning algorithm, which is robust and not easily affected by illumination or noise; the subsequent clustering step removes abnormal points;

(3) By using prior information, the invention improves both detection accuracy and detection speed;

(4) The invention solves the problem that conventional visual inspection methods cannot detect and semantically segment the track components in turnout areas.

Brief description of the drawings

Fig. 1 is a structural diagram of the SSD network model used in the present invention.

Fig. 2 shows examples of the acquired images: (a) ordinary track, (b) turnout track.

Fig. 3 shows examples of target one and target two in the acquired images: (a) ordinary track, (b) turnout track.

Fig. 4 is a schematic diagram of the data set preparation process.

Fig. 5 is a schematic diagram of the clustering result for the fasteners and shoulder blocks in the turnout rail region in an embodiment.

Fig. 6 is a flow diagram of the DBSCAN clustering algorithm.

Fig. 7 is a schematic diagram of the rail perception results in an embodiment: (a) ordinary track, (b) turnout track.

Fig. 8 is a schematic diagram of the clustering result for the fasteners and shoulder blocks in the turnout sleeper region in an embodiment.

Fig. 9 is a schematic diagram of the turnout sleeper boundary recognition result in an embodiment.

Fig. 10 is a schematic diagram of the projection results for the sleeper in region A of Fig. 9.

Fig. 11 is a schematic diagram of the ordinary sleeper boundary recognition result in an embodiment: (a) vertical projection, (b) horizontal projection.

Fig. 12 is a schematic diagram of the vertical projection result of Fig. 11.

Fig. 13 is a flow diagram of the method of the present invention.

Fig. 14 is a schematic diagram of the perception results in an embodiment: (a) ordinary track, (b) turnout track.

Detailed description of the embodiments

The present invention is further described below with reference to the drawings and specific embodiments.

As shown in Fig. 13, a high-speed rail scene perception method based on deep learning and structural information comprises the following steps:

Step 2: construct an SSD network model and its loss function;

Step 3: iteratively train the network obtained in step 2 on the data set formed in step 1 to obtain a trained model. The parameters of the original SSD model are used as the initialization parameters of the new model to be trained. During training, the similarity between each prior box and each ground-truth box is computed with the Jaccard coefficient; if the similarity is greater than the set threshold the prior box is added to the candidate list, otherwise it is not.

The loss function L(x, c, l, g) is as follows:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

Step 4: input the video to be inspected into the trained model from step 3 frame by frame, extract features, obtain the position and class information of the fasteners and shoulder blocks, and distinguish turnout track from ordinary track according to that information.

Clustering is performed with the DBSCAN algorithm, and the clustering process is as follows:

S2: arbitrarily select a target p;

The rails are perceived as follows:

The sleepers are perceived as follows:

The present invention uses the SSD network model to detect the fasteners. Traditional object detection methods include the optical flow method and background modeling. The optical flow method extracts and tracks objects through the motion vectors of pixels in a grayscale image; it has high detection accuracy but poor noise immunity. Background modeling extracts moving regions by thresholding the difference between the current frame and a background template, but because weather-dependent illumination causes large changes in the shadows on the track, false detections occur. Because the fastener structures at turnouts differ, conventional object detection methods cannot detect the fasteners there. Current deep learning object detection methods include Faster R-CNN, YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector). Faster R-CNN is highly accurate but slow, which makes it unsuitable for real-time detection; YOLO is fast but less accurate. SSD combines the regression idea of YOLO with the anchor box mechanism of Faster R-CNN and regresses on multi-scale region features at each position of the whole image, which retains the speed of YOLO while keeping window predictions nearly as accurate as those of Faster R-CNN.

The SSD framework is shown in Fig. 1. The front part uses the first five layers of the VGG-16 image classification model as the base network; the last two fully connected layers are converted into two convolutional layers, and three further convolutional layers and one average pooling layer are added. The input is an image and a set of ground-truth labels, where each label contains the class and position information of a target. SSD passes the image through a series of convolutional layers and generates several feature maps at different scales.

During training, the overall objective loss function of the detection framework is as follows:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

where N is the number of prior boxes matched to real targets, L_conf(x, c) is the confidence loss function, L_loc(x, l, g) is the position loss function, l is the prior box, g is the ground-truth box, c is the confidence that the Softmax function assigns to each class, α is a parameter that adjusts the ratio between the position loss and the confidence loss and defaults to 1, and x is the center coordinate. The confidence loss drives the predicted class to be as accurate as possible. The position loss drives the offset between the predicted bounding box and the prior box to be as close as possible to the offset between the ground-truth box and the prior box, so that the predicted bounding box matches the ground-truth box as closely as possible.
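As a minimal sketch of how the overall objective combines its two terms, the Python below evaluates it for a handful of matched prior boxes. It assumes the softmax cross-entropy confidence loss and the Smooth L1 position loss of the original SSD formulation, which this text does not spell out, and for brevity it ignores unmatched (negative) prior boxes. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def smooth_l1(d: float) -> float:
    """Smooth L1 penalty on one coordinate difference (assumed position loss)."""
    d = abs(d)
    return 0.5 * d * d if d < 1.0 else d - 0.5


def softmax_cross_entropy(scores: Sequence[float], target: int) -> float:
    """Negative log of the softmax probability assigned to the target class."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[target]


def ssd_loss(conf_scores: List[Sequence[float]], conf_targets: List[int],
             loc_pred: List[Sequence[float]], loc_target: List[Sequence[float]],
             alpha: float = 1.0) -> float:
    """L = (L_conf + alpha * L_loc) / N over the N matched prior boxes."""
    n = len(loc_pred)
    if n == 0:  # no matched prior box: the loss is taken as 0
        return 0.0
    l_conf = sum(softmax_cross_entropy(s, t)
                 for s, t in zip(conf_scores, conf_targets))
    l_loc = sum(smooth_l1(p - g)
                for pred, truth in zip(loc_pred, loc_target)
                for p, g in zip(pred, truth))
    return (l_conf + alpha * l_loc) / n
```

With α left at its default of 1, the position and confidence terms carry equal weight.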

During training, the similarity between each prior box and each ground-truth box is computed with the Jaccard coefficient

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

and a prior box is added to the candidate list only when the similarity is greater than 0.5. Suppose N boxes with similarity greater than 0.5 are selected, and let i denote the i-th prediction box, j the j-th ground-truth box and p the p-th class. Then $x_{ij}^{p} = 1$ indicates that the i-th prediction box is matched, by the Jaccard coefficient, to the j-th ground-truth box of class p; if they do not match, $x_{ij}^{p} = 0$.
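The matching rule can be illustrated with a short Python sketch. It assumes boxes given as corner coordinates (x_min, y_min, x_max, y_max), a representation chosen here only for convenience; the function names are hypothetical, and the 0.5 threshold is the one stated above.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def jaccard(a: Box, b: Box) -> float:
    """Jaccard coefficient (intersection over union) of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_priors(priors: List[Box], truths: List[Box],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (i, j) pairs whose Jaccard coefficient exceeds the threshold."""
    return [(i, j)
            for i, prior in enumerate(priors)
            for j, truth in enumerate(truths)
            if jaccard(prior, truth) > threshold]


# Hypothetical example: the first prior overlaps the ground truth, the second does not.
example_matches = match_priors([(0, 0, 10, 10), (20, 20, 30, 30)], [(1, 0, 11, 10)])
# -> [(0, 0)]
```

Each returned pair (i, j) corresponds to an indicator equal to 1 for that prior box and ground-truth box; every other pair is 0.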

Model training

First, the data set must be prepared. A video camera is mounted under a rail car; as the rail car travels along the track, the camera films the track, and track images are extracted from the video, as shown in Fig. 2. There are 5000 images of ordinary track and 3000 images of track at turnouts. The training and test sets are allocated in an 8:2 ratio, giving 6400 training images and 1600 test images.
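The 8:2 allocation can be checked with a few lines of Python. This sketch assumes a random shuffle with a fixed seed, which the text does not specify, and uses hypothetical image names.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str], train_ratio: float = 0.8,
                  seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the items reproducibly and split them into training and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the 5000 + 3000 acquired images.
names = [f"ordinary_{i:04d}" for i in range(5000)] + \
        [f"turnout_{i:04d}" for i in range(3000)]
train, test = split_dataset(names)
print(len(train), len(test))  # 6400 1600
```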

Unlike image classification, training samples for SSD must be annotated manually, so all training pictures contain position information for the fasteners and other targets. The fastener and sleeper shoulder block structures differ between ordinary track and track at turnouts, as shown in Fig. 3. The fasteners and shoulder blocks on ordinary track and on turnout track are therefore treated as two different target classes: those on ordinary track are defined as target one, and those on turnout track as target two. The positions of the different targets are annotated separately to obtain the annotated position and class information.

Because only 3000 annotated turnout images are available, the data volume is very small relative to the large number of neural network parameters. If the SSD network were trained from scratch, the parameters would be hard to tune and the extracted features would generalize poorly. Transfer learning is therefore used when training the SSD network: the parameters of the original model serve as the initialization parameters of the new model to be trained.

In addition, the original model uses the PASCAL VOC data set, so the training data set must also be prepared in the VOC data set format. A folder named Railway is created, and three folders, Annotations, ImageSets and JPEGImages, are created under it. The annotation information for training and testing, including the class and position of each target, is placed in the Annotations folder. A Main folder is created under ImageSets to hold the text files for training and testing, which list the picture locations and picture numbers used for training and testing. All pictures are placed in the JPEGImages folder. The structure is shown in Fig. 4.
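The folder layout described above can be created with a short Python helper. This is only a sketch: the helper name is hypothetical, and the demonstration builds the tree in a temporary directory rather than at any path used by the authors.

```python
import tempfile
from pathlib import Path


def make_voc_layout(root: Path) -> Path:
    """Create the Railway data set tree: Annotations, ImageSets/Main, JPEGImages."""
    for sub in ("Annotations", "ImageSets/Main", "JPEGImages"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    railway = make_voc_layout(Path(tmp) / "Railway")
    created = sorted(p.relative_to(railway).as_posix()
                     for p in railway.rglob("*") if p.is_dir())
    print(created)  # ['Annotations', 'ImageSets', 'ImageSets/Main', 'JPEGImages']
```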

The annotated training data set is fed to the SSD neural network model for iterative training. The initial learning rate is 0.001; it is reduced to 0.0001 after 25000 iterations and to 0.00001 after 40000 iterations, and training ends after 60000 iterations. The weight decay is 0.0005 and the momentum factor is 0.9. This yields the classification model and parameters for the fasteners and sleeper shoulder blocks on turnout and ordinary track, from which the final trained model is exported.
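The step schedule can be written down as a small Python function. The sketch assumes zero-based iteration counting and that each new rate takes effect exactly at the stated iteration; the constant and function names are hypothetical.

```python
MAX_ITERATIONS = 60000   # training stops after this many iterations
WEIGHT_DECAY = 0.0005    # parameter decay value from the text
MOMENTUM = 0.9           # momentum factor from the text


def learning_rate(iteration: int) -> float:
    """Piecewise-constant learning rate: 1e-3, then 1e-4, then 1e-5."""
    if iteration < 25000:
        return 1e-3
    if iteration < 40000:
        return 1e-4
    return 1e-5


for it in (0, 25000, 40000, MAX_ITERATIONS - 1):
    print(it, learning_rate(it))
```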

Perception and detection

The video to be inspected is input into the trained model frame by frame. For each input frame, the model randomly selects multiple regions of the whole image and scores each region by convolution. A threshold of 0.8 is set and the score of each region is compared with it: if the score is higher than the threshold, the region is judged to be a fastener or shoulder block; if the score is lower, the result is discarded. This continues until all fasteners and shoulder blocks in the image have been marked, and the final position and class information is output. The output has the form A{a1, a2, a3, ..., an}, where n is the number of outputs and a1 = {c, x, y, w, h}: c is the output class, x and y are the coordinates of the starting point of the output box, and w and h are the width and height of the output box.
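The thresholding step and the output record can be sketched in Python as follows. The `Detection` type mirrors the fields {c, x, y, w, h}; pairing each candidate with a separate score, and dropping scores exactly equal to the threshold, are assumptions made for illustration only.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    c: int      # class: 1 = target one (ordinary track), 2 = target two (turnout)
    x: float    # x coordinate of the box starting point
    y: float    # y coordinate of the box starting point
    w: float    # box width
    h: float    # box height


def filter_detections(candidates: List[Tuple[Detection, float]],
                      threshold: float = 0.8) -> List[Detection]:
    """Keep only the candidate boxes whose score is higher than the threshold."""
    return [det for det, score in candidates if score > threshold]


# Hypothetical candidates for one frame.
candidates = [
    (Detection(1, 120.0, 40.0, 36.0, 36.0), 0.95),
    (Detection(1, 118.0, 160.0, 36.0, 36.0), 0.42),   # discarded: score too low
    (Detection(2, 400.0, 42.0, 38.0, 38.0), 0.88),
]
A = filter_detections(candidates)
print(len(A))  # 2
```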

Because the fasteners and shoulder blocks differ between ordinary track and turnouts, once SSD has output all perceived fasteners and shoulder blocks, the type of track can be determined from their classes. From the spatial relationship it is known that, both at turnouts and on ordinary track, the fasteners and shoulder blocks lie on both sides of the rails and form four columns, such as A1, A2, A3 and A4 in Fig. 4. The rail regions can therefore be determined from the target positions, namely between columns A1 and A2 and between columns A3 and A4. Within each of the columns A1 to A4, the x coordinates of the targets are essentially the same. A clustering method that uses the x coordinate of each target as the feature therefore divides the targets a1 to a17 automatically into the four classes A1, A2, A3 and A4.

Clustering is the process of grouping a set of data into different clusters or classes such that the data within one cluster are more similar to each other than to the data in other clusters. Common clustering methods include the K-means, AGNES and DBSCAN algorithms. K-means is easy to implement and fast, but the number of classes must be set manually in advance. AGNES scales poorly and has high computational complexity. DBSCAN places no constraint on the shape or size of the clusters, does not require the number of clusters to be known beforehand, and is resistant to noise. As shown in Fig. 5, the fastener and shoulder block in box a5 is an abnormal point, and DBSCAN can effectively remove such noise. Therefore, to guarantee the accuracy of the subsequent segmentation, the abscissas of the fasteners and shoulder blocks are clustered: the fasteners and shoulder blocks are grouped into four classes, and detections from SSD that may be wrong are removed.

DBSCAN requires two parameters, the radius eps and the density threshold Minpts. The radius eps limits the search range around a core point, and the density threshold Minpts determines the minimum number of data points that a resulting cluster must contain. The radius eps strongly affects the clustering result: when it is too large, two or more different columns of fasteners and shoulder blocks are wrongly merged into one cluster; when it is too small, the fasteners and shoulder blocks of a single column may be split into two clusters. The radius eps was determined by experiment to be 50. As for the density threshold Minpts, observation of a large number of track images shows that each column contains at least 4 fasteners and shoulder blocks, so Minpts is set to 4.

As shown in Fig. 5, with the top-left corner of the image as the coordinate origin and the abscissa and ordinate increasing to the right and downward respectively, the targets a1 to a17 are clustered by their x-direction coordinates into the four classes A1{a1, a2, a3, a4}, A2{a6, a7, a8, a9}, A3{a10, a11, a12, a13} and A4{a14, a15, a16, a17}, and the abnormal point a5 is removed. Ordinary track outside turnout areas is likewise clustered into four classes. The flow of the DBSCAN algorithm is shown in Fig. 6; for rail perception, the input D is the x-direction coordinate positions of all targets detected by SSD in one image, and p is any one of those targets.

Rail region perception based on the fastener-rail spatial relationship

Once the fasteners and shoulder blocks have been clustered, prior information indicates that, both on ordinary track and at turnouts, each rail is fixed between the fasteners and shoulder blocks, and the rail boundary is parallel to the line connecting the outlines of one column of fasteners. As shown in Fig. 7, using the position information of the clustered fasteners and shoulder blocks, the right-side abscissa positions of the boxes a1 (smallest y coordinate) and a4 (largest y coordinate) in class A1 are connected, and the left-side abscissa positions of the boxes a5 (smallest y coordinate) and a8 (largest y coordinate) in class A2 are connected. Both lines are extended across the whole image, and the region enclosed between them gives the position of the left rail. Similarly, the right rail is located by connecting the right-side abscissa positions of the boxes a9 and a12 and the left-side abscissa positions of the boxes a13 and a16, again extended across the whole image, which completes the perception of the right rail.
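The construction of the rail boundaries from two neighboring columns can be sketched in Python. This is an illustration under assumptions, not the authors' code: boxes are taken as (x, y, w, h) tuples with (x, y) the top-left starting point, the line is anchored at each box's starting y coordinate, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x, y, w, h), top-left origin
Point = Tuple[float, float]


def boundary_line(column: List[Box], side: str, height: int) -> Tuple[Point, Point]:
    """Line through one edge of the top-most and bottom-most boxes of a column,
    extended from the first image row (y = 0) to the last (y = height - 1)."""
    top = min(column, key=lambda b: b[1])       # box with the smallest y
    bottom = max(column, key=lambda b: b[1])    # box with the largest y

    def edge_x(b: Box) -> float:
        return b[0] + b[2] if side == "right" else b[0]

    x1, y1, x2, y2 = edge_x(top), top[1], edge_x(bottom), bottom[1]
    if y1 == y2:                                # single-row column: vertical line
        return (float(x1), 0), (float(x1), height - 1)
    slope = (x2 - x1) / (y2 - y1)               # change in x per pixel of y
    return (x1 + slope * (0 - y1), 0), (x1 + slope * (height - 1 - y1), height - 1)


def rail_region(left_column: List[Box], right_column: List[Box],
                height: int) -> Tuple[Tuple[Point, Point], Tuple[Point, Point]]:
    """The rail lies between the right edge of one column and the left edge of the next."""
    return (boundary_line(left_column, "right", height),
            boundary_line(right_column, "left", height))


# Hypothetical columns A1 and A2 of four boxes each in a 400-pixel-high image.
A1 = [(100, 50, 40, 40), (100, 150, 40, 40), (100, 250, 40, 40), (100, 350, 40, 40)]
A2 = [(200, 50, 40, 40), (200, 150, 40, 40), (200, 250, 40, 40), (200, 350, 40, 40)]
left_rail = rail_region(A1, A2, height=400)
```

For these perfectly vertical columns the left rail is bounded by the lines x = 140 and x = 200.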

Sleeper perception based on the fastener-sleeper spatial relationship

Targets on the same sleeper have similar y coordinates, as shown in Fig. 8, so clustering the detected targets by their y coordinates yields the sleepers in the image. The number of sleepers may differ from image to image, whereas DBSCAN places no constraint on the shape or size of the clusters, does not require the number of clusters beforehand and is resistant to noise. The ordinates of the fasteners and shoulder blocks detected by SSD are therefore clustered with DBSCAN, and the number of classes obtained varies with the number of sleepers. In Fig. 8 there are four classes, B1, B2, B3 and B4. The radius eps was determined by experiment to be 25. As for the density threshold Minpts, observation of a large number of track images shows that the number of sleepers in each image is at least 4, so Minpts is set to 4.

The sleepers at a turnout are long sleepers, as shown in Fig. 9, so they are perceived differently from those on ordinary track. A long sleeper is continuous in the middle part of the track, that is, between the second and third columns of fasteners and shoulder blocks, and the sleeper boundaries lie to the left of the first column and to the right of the fourth column of fasteners and shoulder blocks. Taking the first row of sleepers as an example, prior information indicates that the upper and lower sleeper boundaries lie within 30 pixels above and below the fastener ordinate. The strip from 30 pixels above to 30 pixels below the ordinate of fastener b1, extending to the left image boundary, is cropped as region A; the strips within 30 pixels above and below the fastener ordinates at b5 and b9 are cropped as region B; and the strip from 30 pixels above to 30 pixels below the ordinate of fastener b13, extending to the right image boundary, is cropped as region C.

Since sleeper boundaries generally run in the horizontal or vertical direction, the vector obtained by projecting the image along the vertical direction reflects the distribution of the sleeper boundaries well. Projection also reduces the two-dimensional matrix to a one-dimensional one, which reduces the amount of computation. The projection formula is as follows:

$$V(x) = \sum_{y=1}^{h} I(x, y), \quad x = 1, \dots, w$$

where I(x, y) is the cropped sleeper boundary region image, w is the image width, h is the image height, and V is the one-dimensional vector obtained by projection. Region B is projected laterally to determine the horizontal boundaries of the sleeper; regions A and C are projected both vertically and horizontally to determine the horizontal and longitudinal boundaries of the sleeper, as shown in Fig. 9.
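The projection of a cropped region onto one axis can be sketched in plain Python. The sketch assumes a grayscale region stored as a list of rows and that projection means summing pixel values along one axis; whether the original method sums or averages is not stated, and the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]   # image[y][x], h rows of w grayscale values


def vertical_projection(image: Image) -> List[int]:
    """Collapse the rows: one value per column x (vector of length w)."""
    return [sum(column) for column in zip(*image)]


def horizontal_projection(image: Image) -> List[int]:
    """Collapse the columns: one value per row y (vector of length h)."""
    return [sum(row) for row in image]


# Tiny hypothetical region: a dark background (10) next to a bright sleeper (200).
region = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
print(vertical_projection(region))    # [30, 30, 600, 600]
print(horizontal_projection(region))  # [420, 420, 420]
```

The jump between the second and third entries of the vertical projection marks the vertical edge of the bright area.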

Taking region A as an example, its projection results are shown in Fig. 10. Prior information indicates that the longitudinal boundary of region A must lie in the left part of the image, so in panel a the first point from left to right whose gradient exceeds a certain threshold is sought, namely the circled point. Similarly, the horizontal boundaries of the sleeper are sought from top to bottom in region A, which in panel b corresponds to searching from left to right: the first point whose gradient exceeds the threshold is the upper boundary and the last such point is the lower boundary. Determining the sleeper boundaries in this way completes the perception of the turnout sleeper position.
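The search for boundary points in a projection vector can be sketched as follows. The gradient is assumed here to be the absolute difference between neighboring entries, and the threshold value in the example is arbitrary; neither is specified in the text, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def boundary_indices(projection: List[float], threshold: float) -> List[int]:
    """Indices where the jump from the previous entry exceeds the threshold."""
    return [i + 1
            for i in range(len(projection) - 1)
            if abs(projection[i + 1] - projection[i]) > threshold]


def first_and_last_boundary(projection: List[float],
                            threshold: float) -> Optional[Tuple[int, int]]:
    """First and last boundary index, e.g. the upper and lower sleeper edges."""
    hits = boundary_indices(projection, threshold)
    return (hits[0], hits[-1]) if hits else None


# Hypothetical projection: background, then sleeper, then background again.
profile = [10, 10, 10, 80, 80, 80, 12, 12]
print(boundary_indices(profile, threshold=30))         # [3, 6]
print(first_and_last_boundary(profile, threshold=30))  # (3, 6)
```

For a longitudinal boundary only the first index is needed; for the horizontal boundaries the first and last indices give the upper and lower edges.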

Because the sleeper structures differ between ordinary track and turnouts, different sleepers are perceived in different ways. Ordinary track generally uses double-block sleepers, and the fasteners and sleeper shoulder blocks outlined by SSD make up the major part of the sleeper. However, the sleeper under the second column generally extends some distance to the right and the sleeper under the third column some distance to the left, so the sleeper boundaries are determined by projection.

As shown in Fig. 11, after clustering into 4 classes, in each class the region starting at the right edge of the box b5 of the fastener and shoulder block with the second-largest x coordinate is extended rightward to the center line to crop the image in which the left sleeper boundary may lie. Similarly, in each class the region starting at the left edge of the box b9 of the fastener and shoulder block with the third-largest x coordinate is extended leftward to the center to crop the image in which the right sleeper boundary may lie.

The vertical projection of the sleeper boundary region is shown in Fig. 12. The points whose gradient meets a certain threshold are the sleeper boundaries, and determining the boundaries of the ordinary sleepers completes the perception of the sleepers.

Conventional visual inspection methods cannot semantically segment and perceive an entire high-speed rail track image; they can only detect individual track components and cannot make effective use of the prior information present in the track (the positional relationships between track components). At turnouts, because the track components differ from those on normal track, conventional template matching largely fails to perceive the track components. To address these problems of existing detection methods, the present invention combines deep learning, clustering and prior information, which ensures detection accuracy while also meeting the speed required for real-time detection. The initial target localization uses a deep learning algorithm, which is robust and not easily affected by illumination or noise, and the subsequent clustering removes abnormal points. Using prior information improves detection accuracy and detection speed. The invention thus solves the problem that conventional visual inspection methods cannot detect and semantically segment the track components in turnout areas.

Claims

1. a kind of high-speed rail scene perception method based on deep learning and structural information, which is characterized in that include the following:

Step 1: obtaining orbital image, be divided into training set and test set, the image in training set is labeled to form data set；

Step 2: building SSD network model, and construct loss function；

Step 3: the data set formed using step 1 is iterated training to the network that step 2 obtains and obtains training pattern；

Step 4: to needing the video of detection senses to be input in the training pattern that step 3 obtains by frame, extracting feature, detained Turnout rail and common rail are distinguished according to the position and classification information of fastener and shoulder block in the position and classification information of part and shoulder block Road；

2. The high-speed rail scene perception method based on deep learning and structural information according to claim 1, characterized in that the process in step 4 of extracting features and obtaining the position and class information of fasteners and shoulder blocks is as follows:

Multiple regions of the whole image are randomly selected in each frame input to the trained model; a score for each region is obtained by convolution; if the score is greater than a set threshold, the region is judged to be a fastener or shoulder block; if the score is less than the set threshold, the result is discarded; once all regions have been traversed, the marking of fasteners and shoulder blocks is complete, and the position and class information is obtained.
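The thresholding step of this claim can be illustrated with a minimal sketch. It assumes the network has already produced candidate regions with a class label and a score; the `Detection` type, the `filter_detections` function, the label strings, and the threshold value 0.5 are hypothetical names and values chosen for illustration, not taken from the claim.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    # Hypothetical container for one candidate region.
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    label: str                              # e.g. "fastener" or "shoulder_block"
    score: float                            # score produced by the network


def filter_detections(candidates: List[Detection],
                      threshold: float = 0.5) -> List[Detection]:
    """Keep regions whose score exceeds the threshold; discard the rest."""
    return [d for d in candidates if d.score > threshold]


# Illustrative candidates (made-up numbers).
candidates = [
    Detection((10, 20, 40, 60), "fastener", 0.92),
    Detection((200, 22, 230, 61), "shoulder_block", 0.31),
    Detection((400, 25, 430, 63), "fastener", 0.77),
]
kept = filter_detections(candidates, threshold=0.5)
print([(d.label, d.box) for d in kept])
```

Only the two candidates scoring above the threshold remain; their boxes and labels are the position and class information passed on to later steps.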

3. The high-speed rail scene perception method based on deep learning and structural information according to claim 1, characterized in that the clustering in step 5 is performed by the DBSCAN algorithm.

4. The high-speed rail scene perception method based on deep learning and structural information according to claim 3, characterized in that the clustering process is as follows:

S2: arbitrarily selecting a target p;

S3: determining whether p already belongs to a cluster or has been marked as a noise point; if so, returning to step S2; otherwise, going to step S4;

S4: determining whether the number of points in the neighborhood of p is less than Minpts; if so, marking p as a noise point and going to step S2; otherwise, going to step S5;

S6: determining whether the number of points in the neighborhood of q is less than Minpts; if so, going to step S5; otherwise, going to step S7;
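The clustering procedure can be sketched as follows. Because not every step of the claim is listed here, this follows the textbook DBSCAN algorithm rather than the claim's exact step numbering, and it clusters one-dimensional values for brevity. The function names, the `eps` radius, and the sample coordinates are assumptions; `min_pts` plays the role of Minpts.

```python
from typing import List, Optional

NOISE = -1  # label used for noise points


def region_query(points: List[float], i: int, eps: float) -> List[int]:
    """Indices of all points within eps of points[i] (the point itself included)."""
    return [j for j, q in enumerate(points) if abs(points[i] - q) <= eps]


def dbscan_1d(points: List[float], eps: float, min_pts: int) -> List[int]:
    """Textbook DBSCAN on 1-D values; returns one cluster label per point."""
    labels: List[Optional[int]] = [None] * len(points)
    cluster_id = -1
    for p in range(len(points)):
        if labels[p] is not None:          # already clustered or marked as noise
            continue
        neighbors = region_query(points, p, eps)
        if len(neighbors) < min_pts:       # too few neighbors: mark p as noise
            labels[p] = NOISE
            continue
        cluster_id += 1                    # p is a core point: start a new cluster
        labels[p] = cluster_id
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:         # former noise becomes a border point
                labels[q] = cluster_id
            if labels[q] is not None:      # already assigned: do not expand again
                continue
            labels[q] = cluster_id
            q_neighbors = region_query(points, q, eps)
            if len(q_neighbors) >= min_pts:  # q is also a core point: keep expanding
                queue.extend(q_neighbors)
    return [NOISE if lab is None else lab for lab in labels]


# Illustrative coordinates: two dense groups and one isolated point.
xs = [100, 102, 105, 400, 403, 398, 900]
labels = dbscan_1d(xs, eps=10, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The isolated value 900 has no neighbor within `eps`, so it is labeled as noise, which is how abnormal detections are weeded out.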

5. The high-speed rail scene perception method based on deep learning and structural information according to claim 1, characterized in that, during training in step 3, the similarity between each prior box and the ground-truth box is calculated by the Jaccard coefficient; if the similarity is greater than a set threshold, the prior box is added to the candidate list; otherwise it is not added.
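The matching rule of this claim can be sketched as below. The Jaccard coefficient of two boxes is the area of their intersection divided by the area of their union. The `(x_min, y_min, x_max, y_max)` box layout, the function names, and the threshold 0.5 are assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def jaccard(a: Box, b: Box) -> float:
    """Jaccard coefficient (intersection over union) of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_priors(priors: List[Box], truths: List[Box],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (prior index, truth index) pairs whose similarity exceeds the threshold."""
    return [(i, j)
            for i, p in enumerate(priors)
            for j, t in enumerate(truths)
            if jaccard(p, t) > threshold]


# Illustrative boxes: one exact match, one partial overlap, one disjoint box.
priors = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
truths = [(0, 0, 10, 10)]
candidate_list = match_priors(priors, truths, threshold=0.5)
print(candidate_list)  # [(0, 0)]
```

The second prior box overlaps the ground-truth box with a coefficient of 1/3, below the threshold, so only the first prior box enters the candidate list.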

6. The high-speed rail scene perception method based on deep learning and structural information according to claim 1, characterized in that, in step 2, the parameters of the original SSD model are used as the initialization parameters of the new model to be trained.

7. The high-speed rail scene perception method based on deep learning and structural information according to claim 1, characterized in that, in step 1, the data set is made according to the VOC data set format.

8. The high-speed rail scene perception method based on deep learning and structural information according to claim 1, characterized in that the loss function L(x, c, l, g) in step 2 is as follows:

In the formula: N is the number of prior boxes matched to real targets, L_conf(x, c) is the confidence loss function, L_loc(x, l, g) is the localization loss function, l is the prior box, g is the ground-truth box, c is the confidence of the Softmax function for each class, α is a parameter for adjusting the ratio between the localization loss and the confidence loss, and x is the center coordinate.
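The formula referred to in this claim is not reproduced in the text. A form consistent with the symbols defined above is the standard SSD training objective, given here as an assumption rather than as the wording of the claim:

```latex
L(x, c, l, g) = \frac{1}{N}\left( L_{conf}(x, c) + \alpha \, L_{loc}(x, l, g) \right)
```

In this form the confidence loss and the localization loss are summed over the matched prior boxes, weighted against each other by α, and normalized by N.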

9. The high-speed rail scene perception method based on deep learning and structural information according to claim 3, characterized in that the perception of the rail in step 5 is as follows:

S21: taking the upper-left corner of the image as the coordinate origin, with the abscissa increasing to the right and the ordinate increasing downward, and using the x coordinate as the feature, clustering the targets obtained in step 4 into n classes;

S22: according to the clustering result of step S21 and the position information of the fasteners and shoulder blocks obtained in step 4, connecting, within each class, the left-side or right-side abscissa positions of the boxes with the minimum and maximum y coordinates; the region enclosed between the resulting lines is the rail position.
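Step S22 can be sketched as follows, assuming each detected box already carries a cluster label obtained from its x coordinate. The function name `cluster_side_line`, the `(x_min, y_min, x_max, y_max)` box layout, the use of the box center as "the y coordinate" of a box, and the sample boxes are all assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def center_y(box: Box) -> float:
    """Vertical center of a box, used here as the box's y coordinate."""
    return (box[1] + box[3]) / 2.0


def cluster_side_line(boxes: List[Box], labels: List[int],
                      cluster: int, side: str) -> Tuple[Point, Point]:
    """Connect the chosen side of the topmost and bottommost boxes of one cluster."""
    members = [b for b, lab in zip(boxes, labels) if lab == cluster]
    top = min(members, key=center_y)       # box with the minimum y coordinate
    bottom = max(members, key=center_y)    # box with the maximum y coordinate
    idx = 2 if side == "right" else 0      # x_max for the right side, x_min for the left
    return (top[idx], center_y(top)), (bottom[idx], center_y(bottom))


# Illustrative detections: two columns of fasteners on either side of one rail.
boxes = [(90, 50, 110, 80), (92, 300, 112, 330), (91, 550, 111, 580),
         (190, 52, 210, 82), (191, 302, 211, 332), (189, 552, 209, 582)]
labels = [0, 0, 0, 1, 1, 1]

left_line = cluster_side_line(boxes, labels, cluster=0, side="right")
right_line = cluster_side_line(boxes, labels, cluster=1, side="left")
print(left_line, right_line)
```

With these sample boxes, the rail is taken to lie in the strip between `left_line` (the right edge of the left column) and `right_line` (the left edge of the right column).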

10. The high-speed rail scene perception method based on deep learning and structural information according to claim 3, characterized in that the perception of the sleeper in step 5 is as follows:

S31: taking the upper-left corner of the image as the coordinate origin, with the abscissa increasing to the right and the ordinate increasing downward, and using the y coordinate as the feature, clustering the targets obtained in step 4 into m classes;

S32: according to the clustering result of step S31 and the prior information, selecting one row of sleepers and taking the region extending from k pixels above or below the ordinate of each fastener to the boundary;

S33: projecting the region cropped in step S32 and, according to the prior information, finding the points in the projection whose gradient is greater than a certain threshold to obtain the boundary points, completing the perception of the sleeper position.
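The projection step S33 can be sketched as below. The sketch assumes the cropped region is a grayscale image stored as a list of pixel rows, that the projection is the mean intensity of each row, and that the gradient is the difference between adjacent projected values; these choices, the function names, and the sample values are assumptions, since the claim does not specify them.

```python
from typing import List


def row_projection(region: List[List[float]]) -> List[float]:
    """Project the region onto the vertical axis: mean intensity of each pixel row."""
    return [sum(row) / len(row) for row in region]


def boundary_rows(region: List[List[float]], grad_threshold: float) -> List[int]:
    """Row indices where the projection changes by more than the threshold."""
    profile = row_projection(region)
    return [i for i in range(1, len(profile))
            if abs(profile[i] - profile[i - 1]) > grad_threshold]


# Illustrative region: dark ballast (50), a bright sleeper band (200), dark ballast again.
region = ([[50.0] * 8] * 3) + ([[200.0] * 8] * 4) + ([[50.0] * 8] * 2)
edges = boundary_rows(region, grad_threshold=100.0)
print(edges)  # [3, 7]
```

Rows 3 and 7 are where the projected intensity jumps, so they are taken as the upper and lower boundary of the sleeper in this synthetic region.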