CN116912661A - Target track prediction method and system with domain generalization capability - Google Patents

Target track prediction method and system with domain generalization capability

Info

Publication number
CN116912661A
CN116912661A (application CN202310892764.5A)
Authority
CN
China
Prior art keywords
track
loss
target
end point
road
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310892764.5A
Other languages
Chinese (zh)
Inventor
李煊鹏
卢一凡
薛启凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202310892764.5A
Publication of CN116912661A
Legal status: Pending

Links

Classifications

    • G06V10/82: Image or video recognition using pattern recognition or machine learning using neural networks
    • G06N3/0442: Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08: Learning methods
    • G06V10/764: Recognition using classification, e.g. of video objects
    • G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/806: Fusion of extracted features
    • G06V20/54: Surveillance or monitoring of traffic, e.g. cars on the road, trains or boats
    • Y02T10/40: Engine management systems

Abstract

The invention discloses a target track prediction method and system with domain generalization capability. First, target track information and vector map information of the scene where the target is located are obtained, and the map information is processed to extract map features. Adjacent target tracks are extracted, and the target historical track data together with the adjacent target tracks are input into a time sequence neural network encoder to extract features. The vehicle end point is then predicted based on the invariance principle to obtain a predicted track end point. The predicted track end-point encoding, the interaction features, the map features and the historical track features are concatenated and input into a time sequence neural network decoder to complement the remaining future track points. A plurality of source domains are trained together: the end point reconstruction loss, track deviation loss and KL divergence loss are averaged over the data domains and superposed with the invariance principle loss to form the final loss. Target track prediction is thereby realized, improving both the safety of the model and its accuracy in unknown domains.

Description

Target track prediction method and system with domain generalization capability
Technical Field
The invention belongs to the technical field of track prediction, and in particular relates to a target track prediction method and system with domain generalization capability.
Background
Deep learning models are an effective approach to the target track prediction problem. However, such models depend heavily on the distribution of the training data and cannot adapt when the distribution gap between the source domain and the target domain is large. Because track data differ in acquisition region, acquisition mode and processing method, distribution deviations between data domains are unavoidable, so the weak generalization capability of a model poses a serious potential safety hazard.
In the prior art, a number of target track prediction methods have been proposed. For example, Chinese patent application CN202011493671.8 discloses a track prediction method based on a BEV bird's-eye view, which converts track and scene information into image data and obtains the future track of the target to be predicted by encoding and decoding the image data. Chinese patent application CN202010658245.9 discloses a track prediction method based on a graph network, in which global scene features are extracted from a plurality of tracks and local point sets of the environment, and predicted tracks with their corresponding probabilities are finally estimated from the global scene features. Chinese patent application CN202210657220.6 proposes a method based on a dual-head attention mechanism that constructs a graph attention network of nodes from the target relation graph and the historical motion tracks. However, these deep-learning approaches to track prediction mostly ignore the generalization problem caused by data distribution differences: the training and testing domains are assumed to have similar distributions, and performance in unknown domains is neglected.
In the design of scene-context extraction networks, convolutional networks have a limited receptive field when high-definition scenes are rendered as images, while graph-neural-network approaches to map modeling focus on the topological relations among lanes and do not emphasize the directionality of the lanes or the temporal ordering that constrains the track.
Disclosure of Invention
Aiming at the problems of poor prediction performance and poor generalization capability of target track models in unknown domains in the prior art, the invention provides a target track prediction method and system with domain generalization capability. First, target track information and vector map information of the scene where the target is located are obtained, and the map information is processed to extract map features. Adjacent target tracks are extracted, and the target historical track data together with the adjacent target tracks are input into a time sequence neural network encoder to extract features. The vehicle end point is then predicted based on the invariance principle to obtain a predicted track end point. The predicted track end-point encoding, the interaction features, the map features and the historical track features are concatenated and input into a time sequence neural network decoder to complement the remaining future track points. A plurality of source domains are trained together: the end point reconstruction loss, track deviation loss and KL divergence loss are averaged over the data domains and superposed with the invariance principle loss to form the final loss. Target track prediction is thereby realized, improving both the safety of the model and its accuracy in unknown domains.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows: a target track prediction method with domain generalization capability comprises the following steps:
s1, acquiring training data: acquiring target track information and vector map information of a scene where the target track information is located;
s2, map information processing: inputting the map information obtained in the step S1 into a map information processing module to extract road center line information, construct road polygons, search adjacent roads and sort according to the map-related distance, and inputting the result into a long short-term memory network to extract the map information characteristics;
s3, time sequence network coding: extracting adjacent target tracks, inputting target history track data and the adjacent target tracks into a time sequence neural network encoder to extract characteristics;
s4, predicting the vehicle end point based on the invariance principle: cascading the map features, the historical track features and the track end point features output in the step S3 to an end point prediction module based on the invariance principle to obtain a predicted track end point; the step comprises the loss of invariance principle, the loss of KL divergence and the loss of end point reconstruction;
s5, filling the remaining track points: extracting interaction characteristics from the target track characteristics of the adjacent vehicles output by the time sequence neural network encoder through a social pooling module; inputting the track end point code predicted in the step S4, the interactive features, the map features and the historical track features into a time sequence neural network decoder in cascade to complement the rest future track points;
s6, parameter optimization: training a plurality of source domains together to obtain the average of the end point reconstruction loss, the track deviation loss and the KL divergence loss of each data domain, and superposing the average with the invariance principle loss to form a final loss;
s7, target track prediction: and finally, inputting the historical track data into a model, sampling the distribution of the end point prediction module to generate track points, and then complementing the rest track points to complete the future track prediction of the vehicle.
As an improvement of the present invention, in the step S1, the target track information includes at least a set of coordinate points of the position where the target is located, and the periphery of the target is adjacent to the set of coordinate points of the target; the vector map information at least comprises map coordinate points, lane information formed by the coordinate points and topological relations among lanes.
As an improvement of the present invention, the step S2 further includes:
s21, extracting a road center line: since the directions of the vector-map lane lines may be inconsistent, it is first judged whether the left and right lane-line directions agree, and the left lane line is reversed if necessary; the coordinates of the lane line with fewer nodes are expanded using a first-order spline interpolation formula so that the left and right lane lines have the same number of nodes. The interpolation formula, applied over all points of the lane line, is:

f[k_r, k_{r+1}] = (j_{r+1} − j_r) / (k_{r+1} − k_r)
L_{1,r}(k) = j_r + f[k_r, k_{r+1}] · (k − k_r)

where (k, j) are the lane-line coordinates, r indexes the lane-line nodes, L_{1,r}(k) is the first-order spline between nodes (k_r, j_r) and (k_{r+1}, j_{r+1}), and f[k_r, k_{r+1}] is the slope between the two points;
s22, constructing a road polygon: the minimum polygon covering the road is obtained from the left and right lane lines. With (k^l, j^l) and (k^r, j^r) the left and right lane-line coordinates, the road polygon P is constructed by traversing the left lane line and then the reversed right lane line:

P = [ (k^l_1, j^l_1), …, (k^l_n, j^l_n), (k^r_n, j^r_n), …, (k^r_1, j^r_1) ];
s23, searching adjacent roads: determining the coordinate range of the road search according to the end point of the current vehicle's historical track and the set search-range hyperparameter; traversing the road polygons, and for each road whose polygon overlaps the search range, normalizing the road center line coordinates with the historical track end point as reference;
s24, sequencing the central lines: sequencing the normalized road center line set in the step S23 according to the center line distance;
s25, inputting the processed center line set into an encoder composed of a long short-term memory network (LSTM) to extract the feature V_m.
As another improvement of the present invention, the center line distance d in the step S24 is the Euclidean distance to the origin, specifically:

d = √(a_s² + b_s²) + √(a_e² + b_e²)

where (a, b) are the road center-line coordinates and the subscripts s and e denote the start and end points of the center line; the origin is the normalized historical track end point.
As a further improvement of the present invention, in the step S4 the invariance principle loss L_inv is:

L_inv = (1/n) · Σ_{e=1}^{n} || ∇_{w|w=1.0} R_e(w · Φ) ||²

where w is a linear classifier, Φ is the end point prediction module, R_e is the deviation between the end-point predicted value and the true value in environment e, and n is the number of source domains.
the KL divergence lossThe method comprises the following steps:
wherein Z is a hidden variable in the constraint condition variation self-encoder, the condition distribution of Z meets the normal distribution, the mean value is mu, and the variance is sigma 2
The end point reconstruction loss L_rec is defined as the l2 norm between the end-point true and predicted values:

L_rec = (1/m) · Σ_{i=1}^{m} || Ŷ^e_{i,α} − Y^e_{i,α} ||₂

where m is the number of end points, α is the predicted future step, Ŷ^e_{i,α} is the predicted end point of the i-th node at time frame α in environment e, and Y^e_{i,α} is the corresponding true value.
As a further improvement of the present invention, the track deviation loss of the remaining track points in the step S5 is:

L_traj = (1/T) · Σ_{t=1}^{T} || ŷ^e_t − y^e_t ||₂

where ŷ^e_t and y^e_t are the predicted and true track points at future step t in environment e, and T is the number of remaining future steps.
As a further improvement of the present invention, the final loss in step S6 is:

L = (1/n) · Σ_{e=1}^{n} ( γ · L^e_rec + δ · L^e_traj + η · L^e_kl ) + λ · L_irm

where L_irm is the invariant risk minimization loss term, n is the number of data domains, the loss coefficients γ = δ = η = 1, and λ is an adjustable parameter.
In order to achieve the above purpose, the invention also adopts the following technical scheme: a target track prediction system with domain generalization capability, comprising a computer program which, when executed by a processor, performs the steps of any of the methods described above.
Compared with the prior art, the invention has the beneficial effects that:
1. the end point prediction module based on the invariance principle combines the invariance principle with the condition variation self-encoder, so that the situation that the distribution difference between a source domain and an unknown domain is large can be overcome, the generalization capability of the model is enhanced, the model is adapted to a more complex prediction scene, and the safety of a target track prediction model is improved;
2. the map information processing module is used for capturing the directional characteristics of the lanes, providing priori knowledge for track prediction, restraining the predicted future tracks from avoiding scene boundaries, and improving the accuracy of the model in an unknown domain.
Drawings
FIG. 1 is a flow chart of a method for predicting a target track with domain generalization capability according to the present invention;
FIG. 2 is a flowchart of a map information processing module in step S2 of the method of the present invention;
FIG. 3 is a flowchart of the social pooling module in step S5 of the method of the present invention.
Detailed Description
The present invention is further illustrated in the following drawings and detailed description, which are to be understood as being merely illustrative of the invention and not limiting the scope of the invention.
Example 1
The method can be used in the fields of automatic driving, auxiliary driving and the like. The present embodiment describes a case where the present method is applied to the field of automatic driving.
The present embodiment will be specifically described with reference to fig. 1 to 3, which is a target trajectory prediction method with domain generalization capability, and as shown in fig. 1, the method includes the following steps:
step S1: acquiring training data
Target track information and a high-definition vector map of the scene where the target is located are obtained. The acquired data comprise: the coordinate point set of the position where the target is located, the coordinate point sets of the adjacent targets around it, and a vector map of the scene, the map comprising coordinate points, lane information composed of those coordinate points, and the topological relations among lanes.
In the training preparation phase, enough training data needs to be acquired, at least three different data fields are needed first, and each data field comprises a vehicle history track, a vehicle future track and a high-definition vector map of the current data field.
Step S2: map information processing
As shown in fig. 2, the high-definition vector map is input into the map information processing module to extract road center line information, construct road polygons, search adjacent roads and sort according to the map-related distance, and the result is input into a long short-term memory network to extract the map information feature V_m.
S21: extracting the center line of the road
The invention uses the road center line as the map input information, which reduces the number of model parameters; the road center line is obtained by calculating the end points of the left and right lane lines of the road. Since the directions of the vector-map lane lines may be inconsistent, it is first judged whether the left and right lane-line directions agree, and the left lane line is reversed if necessary. Because a difference in the number of nodes on the two sides of the lane would cause errors in the center-line calculation, a first-order spline interpolation formula is used to expand the coordinates of the lane line with fewer nodes. With (k, j) the lane-line coordinates:

f[k_r, k_{r+1}] = (j_{r+1} − j_r) / (k_{r+1} − k_r)
L_{1,r}(k) = j_r + f[k_r, k_{r+1}] · (k − k_r)
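The node-equalization step of S21 can be sketched as below. This is an illustrative reading only: `np.interp` evaluates exactly the first-order (piecewise-linear) spline formula above, and the function and variable names are our own, not from the patent.

```python
import numpy as np

def resample_polyline(points, n_nodes):
    """Resample a lane-line polyline to n_nodes points with first-order
    (piecewise-linear) spline interpolation:
        f[k_r, k_{r+1}] = (j_{r+1} - j_r) / (k_{r+1} - k_r)
        L_{1,r}(k)      = j_r + f[k_r, k_{r+1}] * (k - k_r)
    `points` is a sequence of (k, j) lane-line coordinates, k increasing."""
    pts = np.asarray(points, dtype=float)
    k, j = pts[:, 0], pts[:, 1]
    k_new = np.linspace(k[0], k[-1], n_nodes)
    # np.interp evaluates the piecewise-linear formula above at k_new.
    j_new = np.interp(k_new, k, j)
    return np.stack([k_new, j_new], axis=1)

# Expand the sparser (left) lane line to match a denser right lane line.
left = [(0.0, 0.0), (2.0, 1.0), (4.0, 0.0)]   # 3 nodes
left_dense = resample_polyline(left, 5)        # now 5 nodes, same shape
```

With equal node counts on both sides, the center line is then simply the per-node midpoint of the two lane lines.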
s22: constructing road polygons
Obtaining the minimum polygon covering the road according to the left and right lane lines; with (k^l, j^l) and (k^r, j^r) the left and right lane-line coordinates, the polygon is constructed by traversing the left lane line and then the reversed right lane line:

P = [ (k^l_1, j^l_1), …, (k^l_n, j^l_n), (k^r_n, j^r_n), …, (k^r_1, j^r_1) ]
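One plausible construction of the road polygon is sketched below: walking up the left lane line and back down the reversed right lane line yields a closed ring covering the road. The exact formula is not rendered in the source text, so this ordering is an assumption.

```python
def road_polygon(left_line, right_line):
    """Close a ring by walking up the left lane line, then back down the
    reversed right lane line; the result is a simple polygon covering the
    road surface between the two lines."""
    return list(left_line) + list(reversed(right_line))

left = [(0, 0), (10, 0)]
right = [(0, 3), (10, 3)]
poly = road_polygon(left, right)
```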
s23: searching for adjacent roads
The coordinate range of the road search is determined from the end point of the current vehicle's historical track and the set search-range hyperparameter. The road polygons are traversed, and the center lines of roads whose range overlaps the search range are coordinate-normalized with the historical track end point as the reference origin.
S24: ordering centerlines
The normalized road center line set of step S23 is sorted according to the center line distance, where (a, b) are the road center-line coordinates. The center line distance d defined by the invention is the Euclidean distance from the center-line start and end points to the historical track end point, i.e. the origin:

d = √(a_s² + b_s²) + √(a_e² + b_e²)
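The ordering of S23/S24 can be sketched as follows. Treating d as the sum of the Euclidean distances of the center-line start and end points from the origin (the normalized historical track end point) is our reading of the definition, since the original formula is not rendered.

```python
import math

def centerline_distance(centerline):
    """d = ||start||2 + ||end||2: Euclidean distances of the center-line
    start and end points from the origin (the normalized historical track
    end point of step S23)."""
    (a0, b0), (a1, b1) = centerline[0], centerline[-1]
    return math.hypot(a0, b0) + math.hypot(a1, b1)

def sort_centerlines(centerlines):
    # Nearer center lines come first in the sequence fed to the encoder.
    return sorted(centerlines, key=centerline_distance)

lines = [[(5.0, 0.0), (9.0, 0.0)], [(1.0, 0.0), (2.0, 0.0)]]
ordered = sort_centerlines(lines)
```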
s25: inputting the processed center line set into an encoder composed of a long-short-term memory network LSTM to extract the characteristic V m
Step S3: time sequence network encoding. Adjacent target tracks are extracted, and the target historical track data and the adjacent target tracks are input into the time sequence neural network encoder to extract features.
S31: according to the historical track of the vehicle to be predicted, the vehicles occupying the surrounding spatial grid positions in the same time frame are searched out as the adjacent vehicles of the vehicle to be predicted, using a 13×13 spatial grid structure in which each cell is 5 meters long and 2 meters wide.
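The grid lookup of S31 might be sketched as below. The convention that the predicted vehicle occupies the center cell and the exact axis-to-cell mapping are assumptions not fixed by the text; only the 13×13 size and the 5 m × 2 m cell dimensions come from the patent.

```python
def grid_cell(ego_xy, other_xy, grid=13, cell_len=5.0, cell_wid=2.0):
    """Map a vehicle at other_xy into the 13x13 social grid centered on the
    vehicle to be predicted at ego_xy. Cells are 5 m long (x) by 2 m wide
    (y); returns (row, col) indices, or None when the vehicle lies outside
    the grid and is therefore not an adjacent vehicle."""
    dx = other_xy[0] - ego_xy[0]
    dy = other_xy[1] - ego_xy[1]
    row = int(dx // cell_len) + grid // 2   # ego sits in the center cell
    col = int(dy // cell_wid) + grid // 2
    if 0 <= row < grid and 0 <= col < grid:
        return (row, col)
    return None
```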
S32: the vehicle history track data extracted from the data set and the adjacent vehicle tracks are sequentially input to the LSTM encoder to extract features.
Step S4: vehicle end point prediction based on the invariance principle. The map features, the historical track features and the track end-point features output by the time sequence neural network encoder are concatenated and input into the invariance-principle end point prediction module to obtain the predicted track end point; the invariance principle improves the generalization of end point prediction.
S41: the invariant risk minimization principle is applied to the conditional-variational-auto-encoder structure; the invariant risk minimization loss encourages the distribution learned by the module to correlate with invariant features, improving the module's generalization capability. The invariance principle loss L_inv is:

L_inv = (1/n) · Σ_{e=1}^{n} || ∇_{w|w=1.0} R_e(w · Φ) ||²

where w is a linear classifier, Φ is the end point prediction module, R_e is the deviation between the end-point predicted value and the true value in environment e, and n is the number of source domains.
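A runnable sketch of the IRMv1-style penalty described in S41. It assumes the per-environment risk R_e is a mean squared end-point error, so that the gradient with respect to the frozen scalar classifier w = 1.0 has a closed form and no autograd framework is needed; all names are illustrative.

```python
import numpy as np

def irm_penalty(pred, true):
    """IRMv1-style penalty for one environment: the squared gradient of the
    risk R_e with respect to a frozen scalar classifier w at w = 1.0.
    Taking R_e(w) = mean((w * pred - true)^2), the gradient at w = 1 is
    mean(2 * (pred - true) * pred)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    grad = np.mean(2.0 * (pred - true) * pred)
    return grad ** 2

def invariance_loss(envs):
    """Average the penalty over the n source domains (environments);
    `envs` is a list of (predictions, truths) pairs, one per domain."""
    return sum(irm_penalty(p, t) for p, t in envs) / len(envs)
```

The penalty vanishes when the predictor is simultaneously optimal in every environment, which is what pushes the module toward environment-invariant features.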
s42: the penalty of the end-point prediction module based on invariance principles also includes constraint variations from the hidden variable Z distribution in the encoder. Assuming that the conditional distribution of the fitting Z of the invention meets the normal distribution, the mean value is mu, and the variance is sigma 2 KL divergence lossLet the distribution of hidden variable Z be closer to N (0, 1), the formula is as follows:
s43: the loss of the endpoint prediction module based on the invariance principle further comprises an endpoint reconstruction loss, wherein the endpoint reconstruction loss is defined as l2 norms of an endpoint true value and a predicted value, the formula is as follows, m represents the endpoint number, and alpha represents the predicted future step size:
step S5: and (3) complementing the remaining track points, and extracting interaction characteristics from the adjacent vehicle target track characteristics output by the time sequence neural network encoder through a social pooling module. And (3) inputting the predicted track end point codes, the interactive features, the map features and the historical track features into a time sequence neural network decoder in cascade to complement the rest future track points.
S51: the LSTM outputs of the adjacent vehicles are processed by the social pooling module to extract interaction feature information, yielding the feature V_n. As shown in fig. 3, the social pooling tensor is organized as a 13×13 spatial grid structure, and the historical track encoding of each neighboring vehicle fills the corresponding position of the grid. The social pooling tensor then passes through a layer of 3×3 convolution kernels, a layer of 3×1 convolution kernels and a pooling layer, finally outputting the adjacent-vehicle interaction feature V_n. The social pooling module helps capture the interaction characteristics between neighboring vehicles and the predicted vehicle.
S52: the predicted vehicle track end-point value from S43 is passed through a fully connected layer to generate a predicted end-point encoding, which is concatenated with the adjacent-vehicle interaction feature V_n, the map information feature V_m and the historical track feature V_t and input into the LSTM decoder, which finally outputs the remaining future track points. The track deviation loss of the remaining track points is:

L_traj = (1/T) · Σ_{t=1}^{T} || ŷ_t − y_t ||₂

where ŷ_t and y_t are the predicted and true positions at future step t, and T is the number of remaining future steps.
Step S6: to exploit invariant risk minimization, a plurality of source domains are trained together and the invariant risk minimization loss term L_irm is computed. The overall loss function of the model is:

L = (1/n) · Σ_{e=1}^{n} ( γ · L^e_rec + δ · L^e_traj + η · L^e_kl ) + λ · L_irm

where n is the number of data domains, the loss coefficients γ = δ = η = 1, and λ is an adjustable parameter that determines the strength of the invariant risk minimization term.
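The aggregation of step S6 can be sketched as below, with γ = δ = η = 1 as stated and λ left adjustable; the per-domain loss values are assumed to be precomputed elsewhere, and the function name is our own.

```python
def total_loss(per_domain, irm_term, gamma=1.0, delta=1.0, eta=1.0, lam=1.0):
    """L = (1/n) * sum_e(gamma*L_rec + delta*L_traj + eta*L_kl) + lam*L_irm.
    `per_domain` holds one (l_rec, l_traj, l_kl) tuple per source domain;
    `irm_term` is the invariant-risk penalty computed across the domains."""
    n = len(per_domain)
    avg = sum(gamma * r + delta * t + eta * k for r, t, k in per_domain) / n
    return avg + lam * irm_term
```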
step S7: and finally, inputting the historical track data into a model, sampling the distribution of the end point prediction module to generate track points, and then complementing the rest track points to complete the future track prediction of the vehicle.
In order to verify the correctness and rationality of this embodiment, the roundabout and intersection scenes of the public INTERACTION dataset are used for testing. Each map scene is defined as a data domain; one map scene is selected as the test set and the remaining scenes serve as training sets, the choice of test domain being determined by the amount of track data. The test-set data are not visible to the model during training. This embodiment divides each track into 5 s segments, comprising a 2 s historical track and a 3 s future track. The original sampling frequency of the vehicle tracks is 10 Hz; to smooth the tracks and reduce model parameters, the tracks are downsampled with a coefficient of 2, i.e. the historical track has 10 steps and the future track has 15 steps.
The test evaluation index is the minimum average displacement error (mADE), which measures the generalization capability of the vehicle track model. mADE is defined as the average l2 norm between the true track and the predicted track whose generated end point, among the k sampled end points, deviates least from the true value:

mADE = min_{i∈{1,…,k}} (1/T) · Σ_{t=1}^{T} || ŷ^i_t − y_t ||₂
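The mADE metric may be computed as follows over k sampled trajectories; the array shapes and 2-D coordinates are assumptions of this sketch.

```python
import numpy as np

def made(pred_trajs, true_traj):
    """Minimum average displacement error over k sampled trajectories:
        mADE = min_i (1/T) * sum_t ||pred[i, t] - true[t]||_2
    pred_trajs has shape (k, T, 2); true_traj has shape (T, 2)."""
    pred = np.asarray(pred_trajs, dtype=float)
    true = np.asarray(true_traj, dtype=float)
    ade = np.linalg.norm(pred - true, axis=-1).mean(axis=-1)  # (k,)
    return float(ade.min())
```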
the test results are shown in tables 1 and 2:
TABLE 1 Roundabout scene test results (mADE)

Time | ERM  | MMD  | DANN | IRM  | Rex  | Invention
1s   | 0.32 | 0.15 | 0.40 | 0.22 | 0.18 | 0.18
2s   | 0.73 | 0.55 | 0.84 | 0.58 | 0.67 | 0.41
3s   | 1.51 | 1.31 | 1.71 | 1.30 | 1.46 | 0.72
Table 2 Intersection scene test results (mADE)

Time | ERM  | MMD  | DANN | IRM  | Rex  | Invention
1s   | 0.19 | 0.13 | 0.24 | 0.24 | 0.12 | 0.12
2s   | 0.39 | 0.32 | 0.56 | 0.45 | 0.32 | 0.21
3s   | 0.78 | 0.70 | 1.16 | 0.85 | 0.73 | 0.36
Introduction of each comparative method is shown in table 3:
table 3 introduction to the comparative method
According to the experimental results, the method improves generalization capability and provides a new approach to improving model generalization and solving the domain generalization problem of deep-learning vehicle track prediction.
It should be noted that the foregoing merely illustrates the technical idea of the present invention and is not intended to limit the scope of the present invention, and that a person skilled in the art may make several improvements and modifications without departing from the principles of the present invention, which fall within the scope of the claims of the present invention.

Claims (8)

1. A target track prediction method with domain generalization capability, characterized by comprising the following steps:
S1, acquiring training data: acquiring target track information and the vector map information of the scene where the target is located;
S2, map information processing: inputting the map information obtained in step S1 into a map information processing module to extract road center line information, construct road polygons, search adjacent roads and sort them by the relevant map distance, and inputting the result into a long short-term memory network to extract map information features;
S3, time-sequence network encoding: extracting adjacent target tracks, and inputting the target history track data and the adjacent target tracks into a time-sequence neural network encoder to extract features;
S4, predicting vehicle end points based on the invariance principle: concatenating the map features with the historical track features and the track end point features output in step S3, and inputting them into an end point prediction module based on the invariance principle to obtain predicted track end points; this step involves the invariance principle loss, the KL divergence loss and the end point reconstruction loss;
S5, completing the remaining track points: extracting interaction features from the adjacent-vehicle track features output by the time-sequence neural network encoder through a social pooling module; concatenating the track end point encoding predicted in step S4 with the interaction features, the map features and the historical track features, and inputting them into a time-sequence neural network decoder to complete the remaining future track points;
S6, parameter optimization: training on a plurality of source domains jointly to obtain the end point reconstruction loss, the track deviation loss and the KL divergence loss of each data domain, and superposing them with the invariance principle loss to form the final loss;
S7, target track prediction: inputting the historical track data into the model, sampling the distribution of the end point prediction module to generate track end points, and then completing the remaining track points to finish the future track prediction of the vehicle.
2. The method for predicting a target trajectory with domain generalization capability as claimed in claim 1, wherein: in step S1, the target track information at least includes the set of coordinate points of the positions where the target is located and the sets of coordinate points of targets adjacent to the target; the vector map information at least comprises map coordinate points, lane information formed by the coordinate points, and the topological relations among lanes.
3. The method for predicting a target trajectory with domain generalization capability as claimed in claim 2, wherein step S2 further comprises:
S21, extracting the road center line: since the lane line directions of the vector map may be inconsistent, whether the directions of the lane lines are consistent is judged first, and the left lane line is reversed if necessary; the lane line with fewer nodes is expanded using a linear spline interpolation formula so that the numbers of nodes of the left and right lane lines are consistent; the interpolation formula is:
f[k_r, k_{r+1}] = (j_{r+1} − j_r) / (k_{r+1} − k_r)
L_{1,r}(k) = j_r + f[k_r, k_{r+1}](k − k_r)
wherein the coordinates of the lane line are (k, j), r is the index of the lane line node, L_{1,r}(k) represents the linear spline interpolation formula between nodes (k_r, j_r) and (k_{r+1}, j_{r+1}), f[k_r, k_{r+1}] is the slope between the two points, and the piecewise combination of the L_{1,r}(k) over all segments gives the linear spline interpolation formula for all points of the lane line;
s22, constructing a road polygon: the minimum polygon covering the road is obtained according to the left and right lane lines, the construction formula is as follows, (k, j) is the coordinates of the left and right lane lines,for the road polygon:
s23, searching adjacent roads: determining a coordinate range of a search road according to the endpoints of the current vehicle history track and the set search range superparameter; traversing road range polygons in accordance with search rangesCoordinate normalization is realized by taking a historical track endpoint as a reference on the road center line with the overlapped range;
s24, sequencing the central lines: sequencing the normalized road center line set in the step S23 according to the center line distance;
s25, inputting the processed center line set into an encoder composed of a long-short-term memory network LSTM to extract the map information feature V m
4. The method for predicting a target trajectory with domain generalization capability as claimed in claim 3, wherein: the center line distance d in step S24 is the Euclidean distance to the origin, specifically:

d = √(a^2 + b^2)

wherein (a, b) are the road center line coordinates.
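The normalization-and-sorting of steps S23–S24 might be sketched as follows; taking d as the distance from the nearest normalized center line point to the origin is an assumption, since the claim does not fix which point of the line is used:

```python
import math

def centerline_distance(line):
    """line: normalized (a, b) points; distance of the line to the origin."""
    return min(math.hypot(a, b) for a, b in line)

lines = [[(3.0, 4.0), (6.0, 8.0)],   # nearest point 5 m from the origin
         [(0.0, 1.0), (0.0, 2.0)]]   # nearest point 1 m from the origin
ordered = sorted(lines, key=centerline_distance)
# ordered[0] is the second line (distance 1.0)
```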
5. The method for predicting a target trajectory with domain generalization capability as claimed in claim 1, wherein: in step S4, the invariance principle loss L_inv has the following formula,
wherein w is a linear classifier, Φ is the end point prediction module, R_e is the deviation between the end point predicted value and the true value in environment e, and n is the number of source domains:

L_inv = Σ_{e=1}^{n} ||∇_{w|w=1.0} R_e(w · Φ)||^2
the KL divergence lossThe method comprises the following steps:
wherein Z is a hidden variable in the constraint condition variation self-encoder, the condition distribution of Z meets the normal distribution, the mean value is mu, and the variance is sigma 2
The end point reconstruction loss L_rec is defined as the L2 norm between the end point true value and the predicted value, given by:

L_rec = (1/m) Σ_{i=1}^{m} ||Ŷ_{i,α}^e − Y_{i,α}^e||_2

wherein m represents the number of end points, α represents the predicted future step at which the end point lies, Ŷ_{i,α}^e is the predicted end point value of the i-th node at time frame α in environment e, and Y_{i,α}^e is the true end point value of the i-th node at time frame α in environment e.
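The loss terms of this claim can be illustrated with a toy, pure-Python sketch (scalar predictions, one latent dimension per entry; the IRM penalty uses the closed-form gradient of a squared-error risk at w = 1, in the spirit of IRMv1 — all names and shapes are assumptions, not the patent's implementation):

```python
import math

def irm_penalty(env_preds, env_truths):
    """Sum over environments of the squared gradient of R_e w.r.t. a
    scalar classifier at w = 1. For R_e(w) = mean((w*p - y)^2), the
    gradient at w = 1 is 2 * mean((p - y) * p)."""
    pen = 0.0
    for preds, ys in zip(env_preds, env_truths):
        g = 2.0 * sum((p - y) * p for p, y in zip(preds, ys)) / len(ys)
        pen += g * g
    return pen

def kl_loss(mu, sigma2):
    """KL(N(mu, sigma2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + math.log(s) - m * m - s
                      for m, s in zip(mu, sigma2))

def endpoint_loss(pred_ends, true_ends):
    """Mean L2 distance between m predicted and true end points."""
    return sum(math.dist(p, t)
               for p, t in zip(pred_ends, true_ends)) / len(true_ends)

# kl_loss([0.0], [1.0]) == 0.0 for a standard-normal posterior
# endpoint_loss([(0, 3)], [(4, 0)]) == 5.0
```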
6. The method for predicting a target trajectory with domain generalization capability as claimed in claim 5, wherein: the track deviation loss L_traj of the remaining track points in step S5 is:

L_traj = (1/(m·T)) Σ_{i=1}^{m} Σ_{α=1}^{T} ||Ŷ_{i,α}^e − Y_{i,α}^e||_2

wherein T is the number of predicted future time frames and the remaining symbols are as defined in claim 5.
7. The method for predicting a target trajectory with domain generalization capability as claimed in claim 6, wherein: the final loss L in step S6 is:

L = Σ_{e=1}^{n} (γ L_rec^e + δ L_traj^e + η L_KL^e) + λ L_inv

wherein L_inv is the invariant risk minimization loss term, n represents the number of data domains, the coefficients of the loss terms are γ = δ = η = 1, and λ is an adjustable parameter.
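A sketch of how the final objective could be assembled (the composition follows the claim's coefficients; the per-domain tuple layout and function name are assumptions):

```python
def final_loss(per_domain, l_inv, lam=1.0, gamma=1.0, delta=1.0, eta=1.0):
    """per_domain: list of (l_rec, l_traj, l_kl) tuples, one per source
    domain; l_inv is the shared invariance-principle penalty."""
    total = sum(gamma * rec + delta * traj + eta * kl
                for rec, traj, kl in per_domain)
    return total + lam * l_inv

# two source domains plus the invariance penalty, lambda = 2
loss = final_loss([(0.5, 0.2, 0.1), (0.3, 0.4, 0.1)], l_inv=0.25, lam=2.0)
# loss ≈ 1.6 + 0.5 = 2.1
```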
8. A target track prediction system with domain generalization capability, comprising a computer program, characterized in that: the computer program, when executed by a processor, implements the steps of the method according to any one of the preceding claims.
CN202310892764.5A 2023-07-20 2023-07-20 Target track prediction method and system with domain generalization capability Pending CN116912661A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310892764.5A CN116912661A (en) 2023-07-20 2023-07-20 Target track prediction method and system with domain generalization capability

Publications (1)

Publication Number Publication Date
CN116912661A true CN116912661A (en) 2023-10-20

Family

ID=88350717

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117542004A (en) * 2024-01-10 2024-02-09 杰创智能科技股份有限公司 Offshore man-ship fitting method, device, equipment and storage medium
CN117542004B (en) * 2024-01-10 2024-04-30 杰创智能科技股份有限公司 Offshore man-ship fitting method, device, equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination