CN116912661A - Target track prediction method and system with domain generalization capability - Google Patents
Target track prediction method and system with domain generalization capability
- Publication number
- CN116912661A (application CN202310892764.5A)
- Authority
- CN
- China
- Prior art keywords
- track
- loss
- target
- end point
- road
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V10/82 — Image or video recognition using neural networks
- G06N3/0442 — Recurrent networks characterised by memory or gating, e.g. LSTM or GRU
- G06N3/0455 — Auto-encoder networks; encoder-decoder networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/049 — Temporal neural networks
- G06N3/08 — Learning methods
- G06V10/764 — Recognition using classification
- G06V10/774 — Generating sets of training patterns
- G06V10/806 — Fusion of extracted features
- G06V20/54 — Surveillance or monitoring of traffic
- Y02T10/40 — Engine management systems
Abstract
The invention discloses a target track prediction method and system with domain generalization capability. First, target track information and the vector map information of the surrounding scene are obtained. The map information is processed to extract map features. Adjacent target tracks are extracted, and the target's historical track data together with the adjacent target tracks are input into a temporal neural network encoder to extract features. The vehicle endpoint is then predicted based on the invariance principle to obtain a predicted track endpoint. The predicted endpoint encoding, the interaction features, the map features, and the historical track features are concatenated and input into a temporal neural network decoder to complete the remaining future track points. Multiple source domains are trained jointly, and the average of each domain's endpoint reconstruction loss, track deviation loss, and KL divergence loss is superposed with the invariance-principle loss to form the final loss. The method thereby realizes target track prediction and improves both the safety of the model and its accuracy in unknown domains.
Description
Technical Field
The invention belongs to the technical field of track prediction, and in particular relates to a target track prediction method and system with domain generalization capability.
Background
Deep learning models are an effective approach to the target track prediction problem. However, they depend heavily on the distribution of the training data and cannot adapt when the distribution gap between the source domain and the target domain is large. Because track data differ in acquisition region, acquisition method, and processing pipeline, distribution shift between data domains is unavoidable, so a model's weak generalization capability poses a serious safety hazard.
Several target track prediction methods exist in the prior art. For example, Chinese patent application CN202011493671.8 discloses a track prediction method based on a BEV bird's-eye view: track and scene information are converted into image data, and the future track of the target to be predicted is obtained by encoding and decoding the image data. Chinese patent application CN202010658245.9 discloses a graph-network-based track prediction method: global scene features are extracted from multiple tracks and local point sets of the environment, and predicted tracks with corresponding probabilities are estimated from those features. Chinese patent application CN202210657220.6 proposes a dual-head attention mechanism, constructing a graph attention network of nodes from a target relation graph and historical motion tracks. However, these deep-learning approaches to track prediction mostly ignore the generalization problem caused by data distribution differences: the training and test domains are assumed to have similar distributions, and performance in unknown domains is neglected.
In the design of scene-context extraction networks, convolutional networks have a limited receptive field when the scene is rendered as a high-definition image, while graph-neural-network approaches to map modeling focus on the topological relations among lanes and underemphasize the directionality of lanes and the temporal ordering that constrains tracks.
Disclosure of Invention
Aiming at the problems in the prior art of poor prediction performance and weak generalization of target track models in unknown domains, the invention provides a target track prediction method and system with domain generalization capability. First, target track information and the vector map information of the surrounding scene are obtained. The map information is processed to extract map features. Adjacent target tracks are extracted, and the target's historical track data together with the adjacent target tracks are input into a temporal neural network encoder to extract features. The vehicle endpoint is then predicted based on the invariance principle to obtain a predicted track endpoint. The predicted endpoint encoding, the interaction features, the map features, and the historical track features are concatenated and input into a temporal neural network decoder to complete the remaining future track points. Multiple source domains are trained jointly, and the average of each domain's endpoint reconstruction loss, track deviation loss, and KL divergence loss is superposed with the invariance-principle loss to form the final loss, realizing target track prediction and improving both model safety and accuracy in unknown domains.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows: a target track prediction method with domain generalization capability comprises the following steps:
s1, acquiring training data: acquiring target track information and vector map information of a scene where the target track information is located;
s2, map information processing: inputting the map information obtained in the step S1 into a map information processing module to extract road center line information, constructing road polygons, searching adjacent roads, sorting by map-related distance, and inputting the map information into a long short-term memory (LSTM) network to extract the map information features;
s3, time sequence network coding: extracting adjacent target tracks, inputting target history track data and the adjacent target tracks into a time sequence neural network encoder to extract characteristics;
s4, predicting the vehicle endpoint based on the invariance principle: concatenating the map features, the historical track features, and the track endpoint features output in step S3 and inputting them to an invariance-principle endpoint prediction module to obtain a predicted track endpoint; this step involves the invariance-principle loss, the KL divergence loss, and the endpoint reconstruction loss;
s5, completing the remaining track points: extracting interaction features from the adjacent-vehicle target track features output by the temporal neural network encoder through a social pooling module; concatenating the track endpoint encoding predicted in step S4 with the interaction features, the map features, and the historical track features and inputting them into the temporal neural network decoder to complete the remaining future track points;
s6, parameter optimization: training a plurality of source domains together to obtain the average of the endpoint reconstruction loss, the track deviation loss, and the KL divergence loss of each data domain, and superposing this average with the invariance-principle loss to form the final loss;
s7, target track prediction: finally, inputting the historical track data into the model, sampling the distribution of the endpoint prediction module to generate track endpoints, and then completing the remaining track points to finish predicting the vehicle's future track.
As an improvement of the present invention, in the step S1, the target track information includes at least a set of coordinate points of the position where the target is located, and the periphery of the target is adjacent to the set of coordinate points of the target; the vector map information at least comprises map coordinate points, lane information formed by the coordinate points and topological relations among lanes.
As an improvement of the present invention, the step S2 further includes:
s21, extracting a road center line: since the lane-line directions of the vector map may be inconsistent, it must first be judged whether the two lane lines run in the same direction, and the left lane line is reversed if necessary; the lane line with fewer nodes is expanded using first-order (linear) spline interpolation so that the left and right lane lines have the same number of nodes, the interpolation formulas being:

f[k_r, k_{r+1}] = (j_{r+1} − j_r) / (k_{r+1} − k_r)

L_{1,r}(k) = j_r + f[k_r, k_{r+1}] (k − k_r)

where (k, j) are the lane-line coordinates, r is the node index, L_{1,r}(k) is the first-order spline interpolant between nodes (k_r, j_r) and (k_{r+1}, j_{r+1}), and f[k_r, k_{r+1}] is the slope between the two points; applying it segment by segment yields a first-order spline interpolation over all points of the lane line;
s22, constructing a road polygon: the minimum polygon covering the road is obtained from the left and right lane lines, whose coordinates (k, j) form the vertices of the road polygon;
s23, searching adjacent roads: determining the coordinate range of the road search according to the endpoint of the current vehicle's historical track and a preset search-range hyperparameter; traversing the road-range polygons and, for each road center line whose polygon overlaps the search range, normalizing its coordinates with the historical track endpoint as the origin;
s24, sequencing the center lines: sorting the normalized road center-line set of step S23 by center-line distance;
s25, inputting the processed center-line set into an encoder composed of a long short-term memory (LSTM) network to extract the feature V_m.
As another improvement of the present invention, the center-line distance d in step S24 is the Euclidean distance to the origin; specifically, with (a, b) the road center-line coordinates, d is computed from the Euclidean distance ‖(a, b)‖₂ = √(a² + b²) of the center line's start and end points from the origin (the historical track endpoint after normalization).
As a further improvement of the present invention, in the step S4 the invariance-principle loss L_irm takes the invariant-risk-minimization penalty form

L_irm = (1/n) Σ_e ‖∇_{w|w=1} R_e(w · Φ)‖²,

where w is a (dummy) linear classifier, Φ is the endpoint prediction module, R_e is the deviation between the predicted and true endpoints in environment e, and n is the number of source domains.

The KL divergence loss L_kl is:

L_kl = −(1/2) Σ (1 + log σ² − μ² − σ²),

where Z is the hidden variable of the conditional variational auto-encoder; the conditional distribution of Z is assumed normal with mean μ and variance σ², and the loss pushes it toward N(0, 1).

The endpoint reconstruction loss L_rec is defined as the l2 norm between the endpoint truth and prediction:

L_rec = (1/m) Σ_{i=1}^{m} ‖ŷ^e_{i,α} − y^e_{i,α}‖₂,

where m is the number of endpoints, α is the predicted future step, ŷ^e_{i,α} is the predicted endpoint of the i-th node at time frame α in environment e, and y^e_{i,α} is the corresponding ground-truth endpoint.
As a further improvement of the present invention, the track deviation loss of the remaining track points in the step S5 is the average l2 norm between the predicted and ground-truth future track points:

L_traj = (1/(m·T)) Σ_{i=1}^{m} Σ_{α=1}^{T} ‖ŷ_{i,α} − y_{i,α}‖₂,

where T is the number of predicted future time frames.
As a further improvement of the present invention, the final loss in step S6 is:

L = (1/n) Σ_{e=1}^{n} (γ L^e_rec + δ L^e_traj + η L^e_kl) + λ L_irm,

where L_irm is the invariant risk minimization loss term, n is the number of data domains, the loss coefficients are γ = δ = η = 1, and λ is an adjustable parameter.
In order to achieve the above purpose, the invention also adopts the following technical scheme: a target track prediction system with domain generalization capability, comprising a computer program which, when executed by a processor, performs the steps of any of the methods described above.
Compared with the prior art, the invention has the beneficial effects that:
1. the endpoint prediction module based on the invariance principle combines the invariance principle with a conditional variational auto-encoder, so it can cope with large distribution gaps between the source domains and an unknown domain, strengthening the generalization capability of the model, adapting it to more complex prediction scenes, and improving the safety of the target track prediction model;
2. the map information processing module captures the directional characteristics of lanes, provides prior knowledge for track prediction, constrains predicted future tracks to avoid scene boundaries, and improves the accuracy of the model in an unknown domain.
Drawings
FIG. 1 is a flow chart of a method for predicting a target track with domain generalization capability according to the present invention;
FIG. 2 is a flowchart of a map information processing module in step S2 of the method of the present invention;
FIG. 3 is a flowchart of the social pooling module in step S5 of the method of the present invention.
Detailed Description
The present invention is further illustrated in the following drawings and detailed description, which are to be understood as being merely illustrative of the invention and not limiting the scope of the invention.
Example 1
The method can be used in the fields of automatic driving, auxiliary driving and the like. The present embodiment describes a case where the present method is applied to the field of automatic driving.
The present embodiment will be specifically described with reference to fig. 1 to 3, which is a target trajectory prediction method with domain generalization capability, and as shown in fig. 1, the method includes the following steps:
step S1: acquiring training data
The target track and a high-definition vector map of the surrounding scene are obtained. The acquired data comprise: the set of coordinate points of the target's positions, the coordinate point sets of the adjacent targets around it, and the vector map of the scene, which contains map coordinate points, lane information composed of those points, and the topological relations among lanes.
In the training preparation phase, sufficient training data must be acquired: at least three different data domains, each containing vehicle history tracks, vehicle future tracks, and a high-definition vector map of that domain.
Step S2: map information processing
As shown in fig. 2, the high-definition vector map is input into the map information processing module to extract road center line information, construct road polygons, search adjacent roads, and sort by map-related distance, and the map information is then input into a long short-term memory network to extract the map information feature V_m.
S21: extracting the center line of the road
The invention uses the road center line as the map input, which reduces model parameters; the center line is computed from the endpoints of the road's left and right lane lines. Since the lane-line directions of the vector map may be inconsistent, it must first be judged whether the two lane lines run in the same direction, and the left lane line is reversed if necessary. Because a mismatch in the number of nodes on the two sides of the lane would introduce error into the center-line calculation, first-order spline interpolation is used to expand the coordinates of the lane line with fewer nodes. With lane-line coordinates (k, j):

f[k_r, k_{r+1}] = (j_{r+1} − j_r) / (k_{r+1} − k_r)

L_{1,r}(k) = j_r + f[k_r, k_{r+1}] (k − k_r)
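As a concrete illustration, this first-order spline resampling can be sketched in plain Python. The function name is ours, and the sketch assumes the k coordinate increases monotonically along the lane line and that at least two nodes are given:

```python
def interpolate_polyline(points, n_nodes):
    """Resample a lane-line polyline to n_nodes points with piecewise-linear
    (first-order spline) interpolation. `points` is a list of (k, j) tuples
    whose k values increase monotonically; n_nodes >= 2."""
    ks = [p[0] for p in points]
    js = [p[1] for p in points]
    out = []
    for i in range(n_nodes):
        # evenly spaced query positions between the first and last k
        k = ks[0] + (ks[-1] - ks[0]) * i / (n_nodes - 1)
        # locate the segment [k_r, k_{r+1}] containing k
        r = 0
        while r < len(ks) - 2 and k > ks[r + 1]:
            r += 1
        slope = (js[r + 1] - js[r]) / (ks[r + 1] - ks[r])  # f[k_r, k_{r+1}]
        out.append((k, js[r] + slope * (k - ks[r])))       # L_{1,r}(k)
    return out
```

Densifying both lane lines to the same node count in this way allows the center line to be computed point-by-point from matching left/right node pairs.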
s22: constructing road polygons
A minimum polygon covering the road is obtained from the left and right lane lines, whose coordinates (k, j) form the vertices of the road polygon.
s23: searching for adjacent roads
The coordinate range of the road search is determined from the endpoint of the current vehicle's historical track and the preset search-range hyperparameter. The road-range polygons are traversed, and coordinate normalization is performed on each road center line whose polygon overlaps the search range, taking the historical track endpoint as the origin.
S24: ordering centerlines
The normalized road center-line set of step S23 is sorted by center-line distance, where (a, b) are road center-line coordinates. The center-line distance d defined by the invention is the Euclidean distance of the center line's start and end points from the historical track endpoint, i.e. the origin:

d = √(a_s² + b_s²) + √(a_e² + b_e²),

where (a_s, b_s) and (a_e, b_e) are the start and end points of the center line.
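A minimal sketch of this sorting step, assuming the distance sums the start- and end-point distances to the origin (one plausible reading of the definition above; the function name is ours):

```python
import math

def sort_centerlines(centerlines):
    """Sort normalized road center lines by their distance to the origin
    (the history-track endpoint after normalization). Each center line is
    a list of (a, b) points; the sort key sums the start- and end-point
    Euclidean distances."""
    def d(line):
        (a_s, b_s), (a_e, b_e) = line[0], line[-1]
        return math.hypot(a_s, b_s) + math.hypot(a_e, b_e)
    return sorted(centerlines, key=d)
```

Sorting puts the center lines nearest the vehicle's current position first, giving the LSTM encoder a consistent input ordering.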
s25: inputting the processed center line set into an encoder composed of a long-short-term memory network LSTM to extract the characteristic V m 。
Step S3: and (3) encoding the time sequence network, extracting adjacent target tracks, and inputting target history track data and the adjacent target tracks into a time sequence neural network encoder to extract features.
S31: according to the historical track of the vehicle to be predicted, according to a 13X 13 space grid structure, the vehicle processing the current space grid position in the same time frame is searched out, and each unit of the space grid is 5 meters long and 2 meters wide as the adjacent vehicle of the vehicle to be predicted.
S32: the vehicle history track data extracted from the data set and the adjacent vehicle tracks are sequentially input to the LSTM encoder to extract features.
Step S4: and (3) carrying out vehicle end point prediction based on an invariance principle, inputting map features, historical track features and track end point features output by the time sequence neural network encoder into an end point prediction module based on the invariance principle in a cascading manner, obtaining a predicted track end point, and improving generalization of end point prediction by using the invariance principle.
S41: the constant risk minimization principle is used for acting on the conditional variation self-encoder structure, the constant risk minimization loss promotes the distribution learned by the module to enhance the correlation with the invariance characteristic, and the generalization capability of the module is improved. Loss of invariance principleThe formula is as follows, wherein w is a linear classifier, phi is the endpoint prediction module, R e The deviation of the end point predicted value and the true value in the environment e is that n is the number of source domains:
s42: the penalty of the end-point prediction module based on invariance principles also includes constraint variations from the hidden variable Z distribution in the encoder. Assuming that the conditional distribution of the fitting Z of the invention meets the normal distribution, the mean value is mu, and the variance is sigma 2 KL divergence lossLet the distribution of hidden variable Z be closer to N (0, 1), the formula is as follows:
s43: the loss of the endpoint prediction module based on the invariance principle further comprises an endpoint reconstruction loss, wherein the endpoint reconstruction loss is defined as l2 norms of an endpoint true value and a predicted value, the formula is as follows, m represents the endpoint number, and alpha represents the predicted future step size:
step S5: and (3) complementing the remaining track points, and extracting interaction characteristics from the adjacent vehicle target track characteristics output by the time sequence neural network encoder through a social pooling module. And (3) inputting the predicted track end point codes, the interactive features, the map features and the historical track features into a time sequence neural network decoder in cascade to complement the rest future track points.
S51: the output of the LSTM of the adjacent vehicles is processed by a social pooling module to extract the interactive characteristic information to obtain the characteristic V n . As shown in FIG. 3, the social pooling tensor is organized by 13X 13 spatial grid structure, the history of neighboring vehiclesTrack coding fills in to corresponding positions of the spatial grid. The social pooling tensor passes through a layer of 3×3 convolution kernel and a layer of 3×1 convolution kernel, and a layer of pooling layer finally outputs the adjacent vehicle interaction characteristics V n . The social pooling module facilitates capturing interaction characteristics between neighboring vehicles and a predicted vehicle.
S52: s43, the predicted vehicle track end point value passes through the full connection layerGenerating a prediction end point code, and combining the prediction end point code with the adjacent vehicle interaction characteristic V n Map information feature V m Historical track feature V t The cascade inputs into the LSTM decoder, ultimately outputting the remaining future track points. The track deviation loss formula of the track residual point is as follows
Step S6: to play the role of the constant risk minimization, a plurality of source domains are trained together to calculate a constant risk minimization loss item L irm . The overall loss function of the model is as follows, n represents the number of data fields, the coefficient gamma=delta=eta=1 of the loss function, lambda is an adjustable parameter, and the action strength of the risk-free minimum loss term is determined:
step S7: and finally, inputting the historical track data into a model, sampling the distribution of the end point prediction module to generate track points, and then complementing the rest track points to complete the future track prediction of the vehicle.
To verify the correctness and rationality of this embodiment, two road conditions from the public INTERACTION dataset, roundabout and intersection, are used for testing. Each map scene is defined as a data domain; one map scene is selected as the test set and the rest serve as training sets, the test domain being chosen according to the amount of track data. The test-set data are invisible to the model during training. This embodiment divides each track into 5 s segments, comprising 2 s of history and 3 s of future. The original sampling frequency of the vehicle tracks is 10 Hz; since the tracks are smooth, they are downsampled by a factor of 2 to reduce model parameters, giving a history-track step length of 10 and a future-track step length of 15.
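The step lengths quoted above follow directly from the segment lengths and the downsampling factor; a small sketch (function and argument names are ours):

```python
def split_track(track, hist_s=2, fut_s=3, hz=10, ds=2):
    """Split a 5 s track sampled at 10 Hz into history and future parts
    and downsample both by a factor of 2."""
    assert len(track) == (hist_s + fut_s) * hz  # 50 raw frames
    n_hist = hist_s * hz      # 20 raw history frames
    hist = track[:n_hist:ds]  # every 2nd frame -> 10 history points
    fut = track[n_hist::ds]   # remaining 30 frames -> 15 future points
    return hist, fut
```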
The test evaluation metric measures the generalization capability of the vehicle track model using the minimum average displacement error (mADE). mADE is defined as the average L2 norm between the predicted track points and the true track points, taken for the candidate among the k generated end points whose end point is closest to the ground truth; the formula is as follows:
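Since the formula image is not reproduced above, the following sketch implements mADE as the textual definition describes it: among k candidate tracks, select the one whose generated end point is closest to the true end point, then average the L2 displacement over its track points. The array shapes are assumptions for illustration:

```python
import numpy as np

def min_ade(pred, gt):
    """Minimum average displacement error.

    pred: (k, T, 2) -- k candidate future tracks, one per sampled end point.
    gt:   (T, 2)    -- ground-truth future track.
    Selects the candidate whose end point is closest to the true end point,
    then averages the L2 norm over that candidate's T track points."""
    end_err = np.linalg.norm(pred[:, -1, :] - gt[-1, :], axis=-1)
    best = int(np.argmin(end_err))
    return float(np.mean(np.linalg.norm(pred[best] - gt, axis=-1)))
```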
the test results are shown in tables 1 and 2:
Table 1. Roundabout scene test results

| Time | ERM | MMD | DANN | IRM | Rex | Proposed |
|---|---|---|---|---|---|---|
| 1 s | 0.32 | 0.15 | 0.40 | 0.22 | 0.18 | 0.18 |
| 2 s | 0.73 | 0.55 | 0.84 | 0.58 | 0.67 | 0.41 |
| 3 s | 1.51 | 1.31 | 1.71 | 1.30 | 1.46 | 0.72 |
Table 2. Intersection scene test results

| Time | ERM | MMD | DANN | IRM | Rex | Proposed |
|---|---|---|---|---|---|---|
| 1 s | 0.19 | 0.13 | 0.24 | 0.24 | 0.12 | 0.12 |
| 2 s | 0.39 | 0.32 | 0.56 | 0.45 | 0.32 | 0.21 |
| 3 s | 0.78 | 0.70 | 1.16 | 0.85 | 0.73 | 0.36 |
The comparison methods are introduced in Table 3:
Table 3. Introduction of the comparison methods
According to the experimental results, the proposed method improves generalization capability and provides a new approach for improving model generalization and for addressing the generalization problem of vehicle track prediction in deep learning.
It should be noted that the foregoing merely illustrates the technical idea of the present invention and is not intended to limit the scope of the present invention, and that a person skilled in the art may make several improvements and modifications without departing from the principles of the present invention, which fall within the scope of the claims of the present invention.
Claims (8)
1. A target track prediction method with domain generalization capability, characterized by comprising the following steps:
S1, acquiring training data: acquiring target track information and vector map information of the scene where the target is located;
S2, map information processing: inputting the map information obtained in step S1 into a map information processing module to extract road center line information, construct road polygons, search for adjacent roads, sort them according to the map-related distance, and input them into a long short-term memory network to extract map information features;
S3, time-series network encoding: extracting adjacent target tracks, and inputting the target history track data and the adjacent target tracks into a time-series neural network encoder to extract features;
S4, predicting the vehicle end point based on the invariance principle: cascading the map features, the historical track features and the track end point features output in step S3 into an end point prediction module based on the invariance principle to obtain a predicted track end point; this step involves the invariance-principle loss, the KL divergence loss and the end point reconstruction loss;
S5, filling in the remaining track points: extracting interaction features from the adjacent-vehicle target track features output by the time-series neural network encoder through a social pooling module; cascading the track end point code predicted in step S4, the interaction features, the map features and the historical track features into a time-series neural network decoder to complete the remaining future track points;
S6, parameter optimization: training a plurality of source domains together to obtain the end point reconstruction loss, the track deviation loss and the KL divergence loss of each data domain, and superimposing them on the invariance-principle loss to form the final loss;
S7, target track prediction: finally, inputting the historical track data into the model, sampling from the distribution of the end point prediction module to generate track end points, and then completing the remaining track points to finish the future track prediction of the vehicle.
2. The method for predicting a target trajectory with domain generalization capability as claimed in claim 1, wherein: in the step S1, the target track information at least includes the set of coordinate points of the target's positions and the coordinate point sets of the targets adjacent to it; the vector map information at least includes map coordinate points, lane information formed by the coordinate points, and the topological relations among lanes.
3. The method for predicting a target trajectory with domain generalization capability as claimed in claim 2, wherein the step S2 further includes:
S21, extracting the road center line: since the directions of the vector-map lane lines may be inconsistent, it is first judged whether the two lane line directions agree, and the left lane line is reversed if necessary; the lane line with fewer nodes is expanded using a first-order (linear) spline interpolation formula so that the left and right lane lines have the same number of nodes; the interpolation formula is:
f[k_r, k_{r+1}] = (j_{r+1} − j_r) / (k_{r+1} − k_r)
L_{1,r}(k) = j_r + f[k_r, k_{r+1}](k − k_r)
where (k, j) are the lane line coordinates, r is the node index of the lane line, L_{1,r}(k) is the linear spline interpolant between nodes (k_r, j_r) and (k_{r+1}, j_{r+1}), f[k_r, k_{r+1}] is the slope between the two points, and L_1(k) denotes the piecewise linear spline interpolant over all points of the lane line;
S22, constructing a road polygon: the minimal polygon covering the road is obtained from the left and right lane lines; the construction formula is as follows, where (k, j) are the coordinates of the left and right lane lines and the resulting point set is the road polygon:
S23, searching adjacent roads: determining the coordinate range of the road search according to the end point of the current vehicle's history track and a preset search-range hyperparameter; traversing the road polygons within the search range, and normalizing the coordinates of the road center lines whose range overlaps, taking the history-track end point as the reference;
S24, sorting the center lines: sorting the normalized road center line set from step S23 according to the center line distance;
S25, inputting the processed center line set into an encoder composed of a long short-term memory (LSTM) network to extract the map information feature V_m.
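Steps S21 and S22 above can be sketched as follows; the function names and the use of np.interp (which assumes the k-coordinates are increasing) are illustrative choices, not the patent's actual implementation:

```python
import numpy as np

def expand_lane_line(k, j, n_nodes):
    """First-order (linear) spline interpolation of a lane line (S21):
    resample the polyline (k, j) onto n_nodes evenly spaced k-values so
    that the left and right lane lines end up with the same node count.
    Assumes k is increasing (reverse the line first if necessary)."""
    k_new = np.linspace(k[0], k[-1], n_nodes)
    j_new = np.interp(k_new, k, j)   # piecewise linear, matches L_{1,r}(k)
    return np.stack([k_new, j_new], axis=1)

def road_polygon(left, right):
    """Minimal polygon covering the road (S22): walk along the left lane
    line, then back along the reversed right lane line."""
    return np.concatenate([left, right[::-1]], axis=0)
```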
4. The method for predicting a target trajectory with domain generalization capability as claimed in claim 3, wherein: the center line distance d in the step S24 is the Euclidean distance to the origin, specifically:
d = √(a² + b²)
where (a, b) are the road center line coordinates.
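A minimal sketch of the normalization and sorting in steps S23 and S24 follows; taking the nearest point of each normalized center line as its distance d is an assumption made here, since the claim only states that d is the Euclidean distance of the center line coordinates (a, b) to the origin:

```python
import numpy as np

def sort_centerlines(centerlines, ref_point):
    """Normalize each road center line to the history-track end point (S23)
    and sort the lines by the Euclidean distance d of their nearest
    normalized point to the origin (S24)."""
    normalized = [cl - ref_point for cl in centerlines]

    def dist(cl):
        # d = sqrt(a^2 + b^2), minimized over the line's points (assumption).
        return float(np.min(np.linalg.norm(cl, axis=1)))

    return sorted(normalized, key=dist)
```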
5. The method for predicting a target trajectory with domain generalization capability as claimed in claim 1, wherein: in the step S4, the invariance-principle loss L_irm is given by the following formula, where w is a linear classifier, Φ is the end point prediction module, R_e is the deviation between the end point predicted value and the true value in environment e, and n is the number of source domains:
L_irm = Σ_{e=1}^{n} ‖∇_{w|w=1.0} R_e(w·Φ)‖²
the KL divergence loss L_KL is:
L_KL = KL(N(μ, σ²) ‖ N(0, 1)) = (1/2)(μ² + σ² − ln σ² − 1)
where Z is the latent variable of the conditional variational autoencoder, and the conditional distribution of Z follows a normal distribution with mean μ and variance σ²;
the end point reconstruction loss L_rec is defined as the L2 norm between the end point true value and the predicted value, given by:
L_rec = (1/m) Σ_{i=1}^{m} ‖Ŷ^e_{i,α} − Y^e_{i,α}‖₂
where m represents the number of end points, α represents the predicted future step size, Ŷ^e_{i,α} is the end point predicted value of the i-th node at the α-th time frame in environment e, and Y^e_{i,α} is the corresponding end point true value.
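The KL divergence term above has the standard closed form for a Gaussian posterior against a standard normal prior; the sketch below assumes that form, parameterizing the variance by its logarithm (a common but here hypothetical choice):

```python
import numpy as np

def kl_loss(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over latent
    dimensions: 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1)."""
    return float(0.5 * np.sum(mu ** 2 + np.exp(log_var) - log_var - 1.0))
```

The loss is zero exactly when the posterior equals the standard normal prior (μ = 0, σ² = 1), which is what the constraint on Z in the claim enforces.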
6. The method for predicting a target trajectory with domain generalization capability as claimed in claim 5, wherein: the track deviation loss formula for the remaining track points in the step S5 is as follows
7. The method for predicting a target trajectory with domain generalization capability as claimed in claim 6, wherein: the final loss in the step S6 is as follows:
where L_irm is the invariant risk minimization loss term, n represents the number of data domains, γ = δ = η = 1 are the coefficients of the corresponding loss terms, and λ is a tunable parameter.
8. A target trajectory prediction system with domain generalization capability, comprising a computer program, characterized in that: the computer program, when executed by a processor, implements the steps of the method according to any one of the preceding claims.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310892764.5A CN116912661A (en) | 2023-07-20 | 2023-07-20 | Target track prediction method and system with domain generalization capability |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116912661A true CN116912661A (en) | 2023-10-20 |
Family
ID=88350717
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310892764.5A Pending CN116912661A (en) | 2023-07-20 | 2023-07-20 | Target track prediction method and system with domain generalization capability |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116912661A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117542004A (en) * | 2024-01-10 | 2024-02-09 | 杰创智能科技股份有限公司 | Offshore man-ship fitting method, device, equipment and storage medium |
CN117542004B (en) * | 2024-01-10 | 2024-04-30 | 杰创智能科技股份有限公司 | Offshore man-ship fitting method, device, equipment and storage medium |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |