CN118025203A

CN118025203A - Automatic driving vehicle behavior prediction method and system integrating complex network and graph converter

Info

Publication number: CN118025203A
Application number: CN202410176460.3A
Authority: CN
Inventors: 刘擎超; 王林强; 赵晶娅; 蔡英凤; 王海; 陈龙; 熊晓夏; 刘泽
Original assignee: Jiangsu University
Current assignee: Jiangsu University
Priority date: 2023-07-12
Filing date: 2024-02-08
Publication date: 2024-05-14
Anticipated expiration: 2044-02-08
Also published as: CN116853272A; CN118025203B

Abstract

The invention provides a system and method for predicting the behavior of autonomous vehicles that fuses a complex network with a graph Transformer. First, a vehicle-interaction complex network is constructed, and the original map information data, the vehicle motion data, and the local interaction information are taken as the multi-level input of the model. Next, the complex relationships among vehicles are modeled and learned by means of graph convolution and the multi-head node attention mechanism of the graph Transformer to acquire the global and local correlations of the vehicles. Finally, through training of the prediction model, the graph Transformer model attains good generalization and robustness and can accurately output the behavior of the autonomous vehicle.

Description

Automatic driving vehicle behavior prediction method and system integrating complex network and graph converter

Technical Field

The application relates to the field of autonomous driving behavior prediction, and in particular to a method and a system for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph Transformer.

Background

Autonomous vehicle behavior prediction involves the fusion of many kinds of information, including the vehicle's sensor data (e.g., cameras, lidar, radar), the vehicle's dynamic state (e.g., speed, acceleration, direction), information about the surrounding environment (e.g., road structure, traffic flow), traffic regulations, and driving patterns. By analyzing and modeling this information, the driving behavior of a vehicle can be predicted and provided to the autonomous driving system as a basis for decision-making and control. With the development of autonomous driving technology, autonomous vehicles are now able to sense and understand the surrounding environment, including roads, traffic signs, pedestrians, and other vehicles. However, implementing advanced driving functions, such as active decision-making and planning, requires more accurate prediction of the future driving behavior of vehicles.

Disclosure of Invention

Based on the above technical requirements, the invention provides a method and system for predicting the behavior of autonomous vehicles that fuses a complex network and a graph Transformer. The system is built on a complex network and introduces a graph convolutional network into the Transformer to process graph data. In autonomous vehicle behavior prediction, the graph Transformer captures the complex relationships between vehicles by building a complex relationship network among them and applying a multi-head node attention mechanism. This information is then used to predict the behavior of the vehicle. The system can therefore use many kinds of interaction information to help the autonomous vehicle predict behavior more accurately and to reduce the uncertainty in behavior prediction, which improves the safety, comfort, and reliability of the vehicle and lays a foundation for more advanced autonomous driving systems.

The invention provides an automatic driving vehicle behavior prediction method and system integrating a complex network and a graph converter, comprising the following steps:

Step 1) constructing the multi-level input of the model: establishing a complex network of vehicles, and using the complex network, the original road information data, the motion data of the vehicles, and the local interaction information as layered inputs of the prediction model, with all of their features vectorized;

Step 2) building a prediction model by using a graph Transformer: establishing and training the prediction model with the multi-head node attention mechanism of the graph Transformer and graph convolution, so as to model the complex relationships between vehicles and acquire their global and local correlations;

Step 3) training the prediction model: training and validating the built prediction model by using the training set and validation set data;

Step 4) behavior prediction: for each test sample, inputting the features of the test sample into the trained prediction model, which performs forward propagation on the input features, calculates layer by layer, and generates an output result.

Further, the method also comprises step 5): visually displaying the prediction result.

Further, in step 1), the method for establishing the complex network of vehicles is as follows: first, each vehicle is represented as a node in the network; the node contains the vehicle data and is vectorized as [x₁, x₂, x₃, …, x_t]; connecting edges are established between the nodes to form the adjacency matrix A of vehicle interaction, in which the weight of an edge is the magnitude of the vehicle interaction;

The motion data of the vehicle comprises vehicle position and speed, acceleration and angular speed, path and track, direction and course angle, time stamp and motion state;

The original map data comprises geographic data and road network data; the geographic data are data describing geographic features of the earth surface, and comprise longitude, latitude, altitude, terrain and water system; the road network data comprises topological structure and attribute information of the road, and the topological structure and the attribute information comprise starting points and ending points of the road, road types, road names, road widths, the number of lanes and speed limit information.
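The construction of the vehicle complex network described above can be sketched as follows. This is a minimal illustration only: the text does not define how the "magnitude of the vehicle interaction" is computed, so the sketch assumes an inverse-distance weight within a cutoff radius. The function name `build_interaction_adjacency` and the `radius` parameter are hypothetical.

```python
import numpy as np

def build_interaction_adjacency(positions, radius=50.0):
    """Weighted adjacency matrix for a vehicle interaction network.

    positions: (N, 2) array of vehicle x/y coordinates in metres.
    Assumption (not from the source): the interaction magnitude is
    1 / distance for vehicles closer than `radius`, and 0 otherwise.
    """
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)        # pairwise distances
    adj = np.zeros_like(dist)
    mask = (dist > 0) & (dist < radius)         # no self-loops, no far pairs
    adj[mask] = 1.0 / dist[mask]
    return adj

# Four vehicles; the last one is too far away to interact with the others.
positions = [[0.0, 0.0], [10.0, 0.0], [0.0, 20.0], [200.0, 200.0]]
A = build_interaction_adjacency(positions)
```

Each row of `A` then holds the edge weights of one vehicle node; any other interaction measure (relative speed, time to collision) could be substituted for the inverse distance without changing the structure.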

Further, in the process of constructing the multi-level input of the model in step 1), firstly, data preprocessing is performed on the collected data, including:

(1) Data cleaning, namely cleaning noise, abnormal values and inconsistency in the data;

(2) Feature selection is to select the most relevant and important features from the original data, and feature scaling is to scale the data features to a proper range;

(3) Data set partitioning: dividing the data set into a training set, a verification set and a test set, wherein the training set is used for training a model, the verification set is used for tuning and selecting the model, and the test set is used for evaluating and verifying the performance of a final model; the training set, the verification set and the test set are divided according to the proportion of 70%, 15% and 15%;

(4) Data encoding: encoding the collected and processed data so that the behavior prediction model can process it;

(5) Data normalization: the data is normalized to ensure that the numerical ranges between the different features are consistent.
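The data set partitioning and normalization steps above can be sketched as follows. This is a sketch under assumptions: the text fixes only the 70%/15%/15% proportions, so the random shuffle, the min-max scaling, and the use of training-set statistics for scaling are choices made here for illustration. The name `split_and_normalize` is hypothetical.

```python
import numpy as np

def split_and_normalize(features, labels, seed=0):
    """Shuffle, split 70/15/15, and min-max scale with training statistics."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_train = round(0.70 * len(features))
    n_val = round(0.15 * len(features))
    idx = {
        "train": order[:n_train],
        "val": order[n_train:n_train + n_val],
        "test": order[n_train + n_val:],
    }
    # Scaling statistics come from the training set only (assumption),
    # so no information from validation or test data leaks into training.
    lo = features[idx["train"]].min(axis=0)
    hi = features[idx["train"]].max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)     # guard constant columns
    return {name: ((features[i] - lo) / scale, labels[i])
            for name, i in idx.items()}

X = np.random.default_rng(1).normal(size=(100, 4))
y = np.arange(100) % 5
splits = split_and_normalize(X, y)
```

Feature selection and data encoding are not shown, because the text does not specify which features or which encoding are used.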

Further, the specific steps of establishing the prediction model in the step 2) are as follows:

S2.1, performing information propagation and aggregation in the graph Transformer through a graph convolutional network (GCN);

S2.2, further capturing global and local correlations by combining a multi-head node attention module and a graph modeling module;

S2.3, the decoder receives the output of the encoder and the context information as input, generates an output sequence of driving behaviors by using the multi-head node attention mechanism and graph convolution, and uses a fully connected layer to make the dimension of the output sequence equal to the number of driving behavior classes; the SoftMax module normalizes the output of the decoder, converts the score of each class into a probability value, and takes the class with the largest probability value as the prediction result.

Further, the basic calculation steps and formulas for information propagation and aggregation by the graph convolutional network (GCN) in step S2.1 are as follows:

First, initializing node features: the input node feature matrix is $X \in \mathbb{R}^{N \times D}$, where $N$ represents the number of nodes and $D$ represents the feature dimension;

Second, aggregating neighbor features: for each node $i$, the features of its neighbor nodes are aggregated according to the following formula:

$$\mathrm{Agg}_i = \sum_{j \in Ne(i)} \frac{1}{\sqrt{\deg(i)\,\deg(j)}}\, X_j$$

wherein:

- $\mathrm{Agg}_i$ represents the aggregated features of node $i$;

- $Ne(i)$ represents the set of neighbor nodes of node $i$;

- $\deg(i)$ and $\deg(j)$ represent the degrees of node $i$ and node $j$, respectively, i.e. their numbers of neighbor nodes;

- $X_j$ represents the features of the neighbor node $j$;

Third, updating the node representation.

The fusion formula is:

$$H^{(l+1)} = \sigma\left(\hat{A}\, H^{(l)}\, W^{(l)}\right)$$

wherein:

- $H^{(l)} \in \mathbb{R}^{N \times F}$ is the node representation matrix of layer $l$, $F$ represents the representation dimension of each node, and $r$ is the index of a node (row $r$ of $H^{(l)}$);

- $W^{(l)}$ is the weight matrix of layer $l$;

- $\hat{A}$ is a regularized adjacency matrix (e.g., an adjacency matrix using symmetric normalization);

- $\sigma(\cdot)$ is an activation function;

Fourth, iterative aggregation and updating: the expressive power and the learning power of the predictive model are enhanced by repeating the second step and the third step a plurality of times.
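The four steps above can be sketched as a small stack of graph convolution layers. This is a minimal sketch, assuming ReLU as the activation function (the text only says "an activation function") and randomly initialized weights; `normalize_adjacency` and `gcn_forward` are hypothetical names. Following the aggregation step as described, only neighbor nodes are aggregated and no self-loops are added.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes stay zero."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcn_forward(x, adj, weights):
    """Repeated aggregation and update: H <- relu(A_hat @ H @ W) per layer."""
    a_hat = normalize_adjacency(adj)
    h = x
    for w in weights:
        h = np.maximum(a_hat @ h @ w, 0.0)      # ReLU is an assumption
    return h

# Path graph 0 - 1 - 2 with 4-dimensional input features.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
weights = [rng.normal(size=(4, 8)), rng.normal(size=(8, 2))]
out = gcn_forward(x, adj, weights)              # shape (3, 2)
```

Row `i` of `normalize_adjacency(adj) @ x` equals the per-node sum over neighbors weighted by `1 / sqrt(deg(i) * deg(j))`, so the matrix form and the per-node aggregation describe the same operation.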

Further, the process of capturing global and local correlations in step S2.2 by combining the multi-head node attention module and the graph modeling module is as follows:

First, the multi-head self-attention mechanism is extended to a multi-head node attention mechanism that captures the global correlation of connected nodes and the global dependence of non-connected nodes. Two new graph node attention modules are constructed, namely connected node attention (CNA) and non-connected node attention (NNA); the two modules share the same network structure;

The connected node attention (CNA) module models the correlation between connected global nodes: a matrix multiplication is performed between the node feature matrix and the transposed feature matrix of the connected nodes, and the connected node attention features are then calculated by a softmax function. The specific formula is as follows:

wherein: Card(N(r)) is the number of connected nodes in a training batch;

The connected node attention features measure the influence of a connected node on the other connected nodes;

Finally, a matrix multiplication is performed between the above result and the connected-node features, the products are summed, and the sum is multiplied by a scaling parameter α to obtain the output. The specific formula is as follows:

wherein: α is a learnable parameter, initialized to 0;

By doing so, each connected node in N (r) is a weighted sum of all connected nodes; therefore, the CNA can obtain a global view of the network, the vehicle can obtain sufficient vehicle interaction information, and the accuracy of the behavior prediction of the automatic driving vehicle is improved;
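The connected node attention described above can be sketched as a masked attention over node features. This is only one plausible reading: the feature symbols of the original formulas are not reproduced in this text, so the sketch assumes a single head, no learned projections, and dot-product scores between node features. The names `masked_node_attention`, `h`, and `mask` are hypothetical. The same function applied to the complementary mask gives a sketch of non-connected node attention, which the text says shares the same network structure.

```python
import numpy as np

def masked_node_attention(h, mask, scale):
    """scale * softmax(h @ h.T restricted to mask) @ h.

    h: (N, F) node features; mask: (N, N) boolean, True where node s may
    contribute to node r; scale: the learnable scalar (alpha or beta).
    """
    scores = np.where(mask, h @ h.T, -np.inf)
    row_max = np.max(scores, axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(scores - row_max)          # masked pairs become 0
    denom = weights.sum(axis=1, keepdims=True)
    weights = np.divide(weights, denom,
                        out=np.zeros_like(weights), where=denom > 0)
    return scale * (weights @ h)                # weighted sum of allowed nodes

adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 0]], dtype=float)
h = np.random.default_rng(0).normal(size=(4, 3))
connected = adj > 0
non_connected = ~connected & ~np.eye(4, dtype=bool)
cna = masked_node_attention(h, connected, scale=1.0)
nna = masked_node_attention(h, non_connected, scale=1.0)
```

With the scaling parameter at its initial value of 0 the output is all zeros, so at the start of training the attention branch leaves the node features unchanged and its contribution grows only as the parameter is learned.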

The non-connected node attention (NNA) module captures global relationships between non-connected nodes. Specifically, a matrix multiplication is performed between the node feature matrix and the transposed feature matrix of the non-connected nodes, and the attention features of the non-connected nodes are calculated by applying a softmax function:

wherein the cardinality of the non-connected node set is the number of non-connected nodes in a training batch;

The non-connected node attention features measure the influence of a non-connected node on the other non-connected nodes. Finally, a matrix multiplication is performed between the above result and the non-connected-node features. The result is then multiplied by the scaling parameter β to obtain the following output:

wherein: β is a learnable weight that is learned gradually from an initial value of 0;

By doing so, each non-connected node in N (r) is a weighted sum of all non-connected nodes;

Finally, the node features and the outputs of the two attention modules are summed element-wise to update the node features, capturing both connected and non-connected spatial relationships. This process can be represented as follows:

Although the connected node attention (CNA) and non-connected node attention (NNA) modules are useful for extracting long-range and global dependencies, they are inefficient at capturing fine-grained local information in complex internal data structures. To address this limitation, a new graph modeling module is proposed, specifically:

Given the above features, the local correlation is further improved by using graph convolution. The specific formula is as follows:

wherein:

- A is the adjacency matrix of the graph;

- GC(·) represents the graph convolution;

- P represents the trainable parameters;

- σ(·) is the Gaussian error linear unit (GeLU) activation function, which provides the network with nonlinearity.
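The graph modeling module can be sketched as two graph convolutions with a GeLU between them. This is a sketch under assumptions: the original formula is not reproduced in this text, so the nesting order, the form `A_hat @ h @ P` for GC(·), the use of separate parameters `p1` and `p2`, and the tanh approximation of GeLU are all choices made here. All function names are hypothetical.

```python
import numpy as np

def gelu(x):
    """Tanh approximation of the Gaussian error linear unit."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi)
                                    * (x + 0.044715 * x ** 3)))

def graph_conv(h, a_hat, p):
    """One graph convolution GC(h, A; P), assumed here to be A_hat @ h @ P."""
    return a_hat @ h @ p

def graph_modeling_block(h, a_hat, p1, p2):
    """Two graph convolutions with a GeLU nonlinearity in between."""
    return graph_conv(gelu(graph_conv(h, a_hat, p1)), a_hat, p2)

rng = np.random.default_rng(0)
a_hat = np.full((4, 4), 0.25)                   # toy normalized adjacency
h = rng.normal(size=(4, 3))
p1 = rng.normal(size=(3, 8))
p2 = rng.normal(size=(8, 3))
local = graph_modeling_block(h, a_hat, p1, p2)  # shape (4, 3)
```

Because each graph convolution mixes only adjacent nodes, two stacked convolutions reach the two-hop neighborhood, which is what lets this block refine local structure that the global attention modules treat coarsely.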

Further, in step S2.3, a fully connected layer is applied after the output of the last decoder layer; the output dimension of the fully connected layer equals the number of driving behavior classes. The output of the fully connected layer is then passed to a softmax function for normalization: the softmax function converts the score of each class into a probability value that represents the prediction confidence of the model for that class, so the output probability vector is the predicted probability distribution over the driving behavior classes. Finally, the class with the maximum probability is selected as the final driving behavior.
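The output stage of step S2.3 can be sketched as a linear layer followed by softmax and an arg-max. This is an illustration only: the class list is taken from the behaviors named elsewhere in this description, while the feature size of 16, the random weights, and the names `BEHAVIORS` and `predict_behavior` are hypothetical.

```python
import numpy as np

BEHAVIORS = ["keep straight", "turn left", "turn right",
             "change lane left", "change lane right", "stop"]

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_behavior(decoder_out, w, b):
    """Fully connected layer, softmax, then the most probable class."""
    logits = decoder_out @ w + b                # one score per behavior class
    probs = softmax(logits)
    return probs, probs.argmax(axis=-1)

rng = np.random.default_rng(0)
decoder_out = rng.normal(size=(2, 16))          # two vehicles, 16-dim features
w = rng.normal(size=(16, len(BEHAVIORS)))
b = np.zeros(len(BEHAVIORS))
probs, classes = predict_behavior(decoder_out, w, b)
labels = [BEHAVIORS[c] for c in classes]
```

Each row of `probs` sums to 1 and can be read as the model's confidence over the behavior classes for one vehicle.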

Further, the specific steps of step 3) training the predictive model are as follows:

a. Inputting the training data set into the prediction model: the input features in the training data set are provided to the prediction model as input;

b. Forward propagation: the prediction model processes the input features through the graph convolutional network, the encoder, the decoder, the SoftMax module, and the activation functions to obtain an output result, namely the model's predicted value for the input data;

c. Calculating the prediction error: the predicted value of the prediction model is compared with the true label, and the difference between them is calculated with the cross-entropy loss function;

d. Back propagation: the gradient of the cross-entropy loss with respect to the prediction model parameters is calculated with the back-propagation algorithm; the gradient is propagated from the output layer to the input layer through the chain rule, giving the gradient of each parameter;

e. Parameter updating: the parameters of the model are updated by an optimizer according to the calculated parameter gradients; the optimizer adjusts the values of the prediction model parameters according to the gradients and the learning-rate hyperparameter so as to gradually reduce the prediction error;

f. Iteration: steps b to e are repeated to train the model iteratively on different training batches; each training batch updates the parameters of the model, gradually optimizing its performance;

g. Validation and tuning: during training, the validation set is used periodically to evaluate the performance of the prediction model; the generalization capability and overfitting of the prediction model are monitored by calculating the prediction error on the validation set; and according to the validation result, the hyperparameters of the prediction model are adjusted or training is stopped early to obtain the best model performance.
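Steps a to g can be sketched with a deliberately small stand-in model. In this sketch a single softmax layer replaces the full graph Transformer, plain mini-batch gradient descent replaces the unspecified optimizer, and the data are synthetic; the function name `train` and all hyperparameter values are hypothetical. Only the loop structure (forward pass, cross-entropy gradient, update, validation, early stopping) follows the steps above.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def train(x_tr, y_tr, x_val, y_val, n_classes, lr=0.1, batch=32,
          max_epochs=200, patience=10, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros((x_tr.shape[1], n_classes))
    best_w, best_val, stale = w.copy(), np.inf, 0
    for _ in range(max_epochs):
        order = rng.permutation(len(x_tr))
        for start in range(0, len(order), batch):       # f. iterate batches
            idx = order[start:start + batch]
            probs = softmax(x_tr[idx] @ w)              # b. forward pass
            probs[np.arange(len(idx)), y_tr[idx]] -= 1  # c/d. dLoss/dlogits
            grad = x_tr[idx].T @ probs / len(idx)       # d. chain rule to w
            w -= lr * grad                              # e. parameter update
        val = cross_entropy(softmax(x_val @ w), y_val)  # g. validation error
        if val < best_val - 1e-6:
            best_w, best_val, stale = w.copy(), val, 0
        else:
            stale += 1
            if stale >= patience:
                break                                   # g. early stopping
    return best_w, best_val

# Synthetic two-class data standing in for encoded vehicle features.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.repeat([0, 1], 100)
order = rng.permutation(200)
x, y = x[order], y[order]
w, val_loss = train(x[:150], y[:150], x[150:], y[150:], n_classes=2)
val_acc = np.mean(softmax(x[150:] @ w).argmax(axis=1) == y[150:])
```

Keeping the weights with the lowest validation loss, rather than the last weights, is one common way to realize the early stopping mentioned in step g.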

Further, the specific steps of the behavior prediction in step 4) are as follows. A test sample is input into the prediction model. In the prediction model, information propagation and aggregation are first carried out through the graph convolutional network; the sample is then encoded through the multi-head node attention mechanism, the features are updated by using the interaction relationships among nodes, and the node representations are updated by the fusion formula of step S2.1. Global and local correlations are then further captured by the multi-head node attention module and the graph modeling module of the graph Transformer. First, the multi-head self-attention mechanism is extended to the multi-head node attention mechanism, with the goal of capturing the global dependencies of connected nodes and of non-connected nodes, as described in step S2.2. Second, the graph modeling module uses graph convolution to further improve the local correlation, using the formula given in step S2.2. Finally, the decoder receives the output of the encoder and the context information as inputs and generates an output sequence of driving behaviors by using the multi-head node attention mechanism, the encoder-decoder attention mechanism, and graph convolution. After the output of the last decoder layer, the fully connected layer is applied to transform the features linearly and non-linearly. A softmax function is then applied; it outputs the predicted probability distribution over the driving behavior classes, and the class with the highest probability is selected as the final prediction result.

An automatic driving vehicle behavior prediction system integrating a complex network and a graph converter is characterized by comprising a data acquisition device, a data preprocessing module, a prediction model and a visualization module,

The data acquisition device comprises a sensor and monitoring equipment, and is used for collecting motion data and original road data of a vehicle;

The data preprocessing module is used for cleaning, feature selection and scaling, data set division, data feature vectorization, and data normalization of the original data;

a prediction model comprising a graph convolutional network, an encoder, a decoder, and a SoftMax module,

The graph convolutional network is used for information propagation and aggregation;

The encoder comprises a multi-head node attention module and a graph modeling module, wherein the multi-head node attention module comprises a connected node attention (CNA) module and a non-connected node attention (NNA) module, and the two modules share the same network structure; the connected node attention (CNA) module is used for obtaining the correlation between connected global nodes, and the non-connected node attention (NNA) module is used for capturing global relationships between non-connected nodes; the graph modeling module uses two graph convolutions to carry out information propagation and aggregation, so that the local correlation is further improved;

The decoder takes the output of the encoder and the context information as input and generates an output sequence of driving behaviors by using the multi-head node attention mechanism and graph convolution; the fully connected layer transforms the features linearly and non-linearly so that the dimension of the output sequence equals the number of driving behavior classes;

The SoftMax module normalizes the output of the decoder, converts the score of each class into a probability value, and takes the class with the largest probability value as the prediction result;

And the visualization module is used for displaying the prediction result.

The invention provides an autonomous vehicle behavior prediction system fusing a complex network and a graph Transformer. The graph Transformer model adopted is a Transformer-based graph neural network for processing graph data: a graph convolutional network is introduced into the Transformer, extending the graph neural network so that it can process graph data. A graph Transformer can establish links between nodes and edges, capture the complex relationships between them, and perform tasks on that basis. In autonomous vehicle behavior prediction, the graph Transformer captures the complex relationships between vehicles by building a complex relationship network among them and applying a multi-head node attention mechanism. This information is then used to predict the behavior of a vehicle, such as whether it will turn, change lanes, cut in, or park. The system can therefore help the autonomous vehicle plan paths and make behavior decisions more effectively, which improves safety, comfort, and reliability and lays a foundation for more advanced autonomous driving systems.

In the method for predicting the behavior of autonomous vehicles by fusing a complex network and a graph Transformer, first, on the basis of the autonomous vehicle's perception of its complex environment, a vehicle-interaction complex network is constructed to represent the interactions between vehicles, and the original road information data and the motion data of the vehicles are vectorized to serve as multi-level input. Second, information propagation and aggregation are carried out with the multi-head node attention mechanism and graph convolution of the graph Transformer module, and the complex relationships among vehicles are modeled and learned to obtain the representations and correlations of the vehicles. Then, the training data set is input into the model for training, so that the model attains better generalization capability and reliability. Finally, for each test sample, the features of the test sample are input into the trained model for prediction, and the prediction results, such as turning left or right, changing lanes to the left or right, keeping straight, and stopping, are displayed visually as needed. The system provides more accurate predictions for the autonomous vehicle and improves the safety and efficiency of traffic.

The beneficial effects of the invention are as follows:

1. The invention constructs a vehicle-interaction complex network, which can capture the complexity and nonlinear relationships of vehicle behavior. The vehicle-interaction complex network has a rich topological structure and can therefore express vehicle behavior more accurately. In addition, it allows vehicle behavior prediction models to be built at different levels. At the microscopic level, the complex network can establish an interaction model among vehicles that considers factors such as their relative position, speed, and acceleration. At the macroscopic level, the complex network can establish the topological structure of the road network and a traffic flow model, and analyze the influence of factors such as traffic congestion and road-section bottlenecks on vehicle behavior. Multi-level modeling can thus provide more comprehensive prediction results. Finally, the vehicle-interaction complex network has strong robustness and generalization capability and can adapt to different traffic environments and scenes.

2. The invention captures the interaction relationships between vehicles through graph convolution. In vehicle behavior prediction, vehicles typically interact with and influence one another. Graph convolution models the interactions between vehicles as a graph structure and propagates and aggregates information on the graph through graph convolution operations, thereby predicting vehicle behavior more accurately. Graph convolution can also process unstructured data directly and extract useful features from it. By constructing the vehicle behavior data as a graph, graph convolution can take full advantage of the local structure and global relevance of the data to better understand and predict vehicle behavior. Finally, different graph structures can be constructed for graph convolution according to actual requirements, and graph convolution is combined with the Transformer to form a hybrid model, which further improves the prediction performance.

3. The invention builds a graph Transformer model and further captures global and local correlations by combining a multi-head node attention module and a graph modeling module. First, the multi-head node attention mechanism includes connected node attention (CNA) and non-connected node attention (NNA), and the two modules share the same network structure. Connected node attention (CNA) and non-connected node attention (NNA) capture the global dependencies of connected nodes and of non-connected nodes, respectively. To improve the efficiency of capturing fine-grained local information in complex internal data structures, a graph modeling module is proposed, which further improves the local correlation by using graph convolution. Finally, the graph Transformer module is highly flexible and extensible: the system can handle various types of vehicles, different traffic environments, and different road conditions, and is suitable for autonomous driving scenes of different types and scales. The model therefore has good generalization capability.

Drawings

In order that the application may be well understood, various forms thereof will now be described, by way of example, with reference to the accompanying drawings, in which:

Fig. 1 is a block diagram of the driving behavior prediction system fusing a complex network and a graph Transformer according to the present invention.

Fig. 2 is a flowchart of the driving behavior prediction method fusing a complex network and a graph Transformer according to the present invention.

Fig. 3 is a block diagram of the graph Transformer model according to the present invention.

Fig. 4 is a flowchart of the training and testing of the graph Transformer model of the present invention.

Fig. 5 (a) is an example of a vehicle global-interaction complex network according to the present invention.

Fig. 5 (b) is an example of a vehicle local-interaction complex network according to the present invention.

Fig. 6 is an example of an autonomous vehicle behavior prediction scenario according to the present invention.

Fig. 7 is an illustration of original road vehicles of the present invention.

Fig. 8 is a display diagram of the behavior prediction result for the autonomous vehicle according to the present invention.

Fig. 9 is an example of an application scenario of the autonomous vehicle behavior prediction system of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings, but the protection of the invention is not limited thereto.

The autonomous vehicle in this embodiment is equipped with a cooperative adaptive cruise control system to enable vehicle-to-vehicle and vehicle-to-infrastructure communication, as well as cooperative driving between vehicles.

In the method for predicting the behavior of autonomous vehicles by fusing a complex network and a graph Transformer, first, a vehicle-interaction complex network is constructed, and the original road information data, the vehicle motion data, and the local interaction information are taken as the multi-level input of the model. Next, the complex relationships among vehicles are modeled and learned by means of graph convolution and the multi-head node attention mechanism of the graph Transformer to acquire the global and local correlations of the vehicles. Finally, through training of the prediction model, the graph Transformer model attains good generalization and robustness and can accurately output the behavior of the autonomous vehicle.

As shown in fig. 1 and fig. 2, the driving behavior prediction system fusing a complex network and a graph Transformer according to the invention comprises a data acquisition device, a data preprocessing module, a prediction model, and a visualization module.

The data acquisition device comprises a sensor and monitoring equipment and is used for collecting motion data and original road data of a vehicle. Specifically, the collected motion data of the vehicle includes vehicle position and speed, acceleration and angular velocity, path and trajectory, direction and heading angle, time stamp, motion state, etc.; the original road data includes geographic data, road network data, the geographic data describing data of geographic features of the earth's surface, including longitude, latitude, etc., and the road network data includes topology and attribute information of the road.

The data preprocessing module performs cleaning, feature selection and scaling, data set division, data feature vectorization, and data normalization on the original data, so as to improve data quality, reduce noise, handle missing values and outliers, and convert the data into a format suitable for model training and analysis. The specific content of the data preprocessing is as follows:

(1) Data cleaning: this step mainly handles noise, outliers, and inconsistencies in the data. Statistical methods and domain knowledge are used to identify and handle outliers, and operations such as data deduplication and missing-value filling are performed.

(2) Feature selection and scaling: feature selection is to select features with the highest relevance and importance from the original data so as to improve the performance and efficiency of the model. Feature scaling is the scaling of data features to a suitable range to avoid excessive impact of certain features on model training.

(3) Data set partitioning: the data set is divided into a training set, a validation set and a test set. The training set is used for training the model, the verification set is used for tuning and selecting the model, and the test set is used for evaluating the final model and verifying the performance.

(4) Data encoding: the collected and processed data are encoded so that the behavior prediction model can process them.

(5) Data normalization: the data is normalized to ensure consistent numerical ranges between different features, which can avoid excessive impact of certain features on model training.
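The data cleaning in item (1) can be sketched as follows. This is a sketch under assumptions: the text mentions only "statistical methods", so filling missing values with the column median and clipping outliers to the Tukey fences (1.5 times the interquartile range) are choices made here, and the function name `clean` is hypothetical.

```python
import numpy as np

def clean(records, k=1.5):
    """Fill missing values, drop duplicate rows, and clip outliers."""
    data = np.asarray(records, dtype=float)
    # Missing-value filling: replace NaN with the column median.
    medians = np.nanmedian(data, axis=0)
    data = np.where(np.isnan(data), medians, data)
    # Deduplication that keeps the first occurrence and the original order.
    _, first = np.unique(data, axis=0, return_index=True)
    data = data[np.sort(first)]
    # Outlier handling: clip each column to [q1 - k*iqr, q3 + k*iqr].
    q1, q3 = np.percentile(data, [25, 75], axis=0)
    iqr = q3 - q1
    return np.clip(data, q1 - k * iqr, q3 + k * iqr)

# Columns: speed (m/s), acceleration (m/s^2). One duplicate row, one
# missing acceleration, and one implausible speed reading.
records = [[10.0, 0.1], [11.0, 0.2], [11.0, 0.2],
           [10.5, np.nan], [250.0, 0.0], [10.2, 0.3]]
cleaned = clean(records)
```

Clipping keeps the row and its timestamp position while bounding the implausible value; dropping the row instead would be an equally valid choice when gaps in the track are acceptable.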

The prediction model is constructed based on a graph Transformer and comprises a graph rolling network, an encoder, a decoder and a softMax module. The convolutional information delivery network is used for information delivery and aggregation. The encoder includes a multi-headed node attention module including a Connected Node Attention (CNA) module, a non-connected node attention (NNA) module, and a graph modeling module, both modules sharing the same network structure. The Connected Node Attention (CNA) module is used for obtaining correlation between connected global nodes, and the non-connected node attention (NNA) module is used for capturing global relations between non-connected nodes. The graph modeling module utilizes two graph convolution to conduct information transfer and aggregation, and further improves local correlation. The decoder takes the output of the encoder and the context information as input, generates an output sequence of driving behaviors by using the attention mechanism of a multi-head node and graph convolution, and carries out linear transformation and nonlinear transformation on the output characteristics by the full-connection layer so that the dimension of the output sequence is the same as the category number of the driving behaviors. And the softMax module performs normalization operation on the output of the decoder, converts the score of each category into a probability value, and takes the category with the largest probability value as a prediction result. The prediction model is used for predicting driving behaviors after learning, training and optimizing.

The invention discloses an automatic driving behavior prediction method fusing a complex network and a graph Transformer, in which a prediction model built on the graph Transformer predicts the driving behavior of a vehicle. The prediction model is constructed as follows:

Step 1) Constructing the multi-level input of the model: first, the complex network of vehicles, the original road information data, the motion data of the vehicles, and the local interaction information are established as layered inputs of the model; then all of their features are vectorized.

Specifically, in building the complex network of vehicles, as shown in fig. 5 (a), each vehicle is first represented as a node in the network. The node contains the vehicle data and is vectorized as $[x_1, x_2, x_3, \ldots, x_t]$. Connecting edges are then established between the nodes, and the vehicle interactions are described by an adjacency matrix in which the weight of an edge is the magnitude of the vehicle interaction.
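The construction of a weighted adjacency matrix can be sketched as follows. The description does not say how the magnitude of vehicle interaction is computed, so this sketch assumes, purely for illustration, that it is the inverse of the Euclidean distance and that vehicles farther apart than a cutoff radius are not connected. The function name, the `radius` parameter, and the positions are hypothetical.

```python
import math


def build_interaction_adjacency(positions, radius=50.0):
    """Return a symmetric weighted adjacency matrix for the vehicle network.

    ASSUMPTION: interaction magnitude = 1 / distance, and vehicles more than
    `radius` metres apart are not connected. The source does not specify this.
    """
    n = len(positions)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distance = math.dist(positions[i], positions[j])
            if 0.0 < distance <= radius:
                weight = 1.0 / distance
                adjacency[i][j] = weight
                adjacency[j][i] = weight
    return adjacency


# Hypothetical (x, y) positions in metres for three vehicles.
positions = [(0.0, 0.0), (3.0, 4.0), (100.0, 0.0)]
adjacency = build_interaction_adjacency(positions)
for row in adjacency:
    print(row)
```

Only the first two vehicles end up connected here, because the third lies outside the assumed cutoff radius of both.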

As shown in fig. 5 (b), because the interaction between a neighboring vehicle and the target vehicle is strong, a local interaction graph between the neighboring vehicles and the target vehicle is built as one input of the model. The local interaction graph provides more detailed information for understanding and predicting vehicle behavior. By capturing the local interaction relations among vehicles, extracting associated features, modeling spatio-temporal evolution, and performing behavior reasoning, it improves the accuracy and reliability of behavior prediction and gives the automatic driving prediction system a more accurate decision basis.

As shown in fig. 6, the motion data of the vehicle includes vehicle position and speed, acceleration and angular velocity, path and trajectory, direction and heading angle, time stamp, motion state. The position and speed of the vehicle are the most basic information in the vehicle motion data. Acceleration and angular velocity data may help analyze acceleration and deceleration behavior, cornering behavior, etc. of the vehicle. The path and trajectory describe the trajectory of the movement of the vehicle during travel. The direction and heading angle data may help analyze the direction of travel and steering behavior of the vehicle. The time stamp records the time point of vehicle motion data acquisition and can be used for analyzing the motion speed, acceleration and the like of the vehicle. The motion state includes a running state of the vehicle such as stopping, starting, running at a constant speed, turning, etc. The motion state data may be used to analyze the driving behavior and driving pattern of the vehicle. Therefore, the vehicle motion data plays an important role in behavior prediction. By analyzing and utilizing the vehicle motion data, the behavior mode, trend and rule of the vehicle can be revealed, so that more accurate prediction of future behaviors is realized.
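As a small illustration of how timestamps and positions can be turned into speed and heading, the sketch below uses finite differences between consecutive samples. The record layout, field names, and units are assumptions made for this example and are not taken from the description.

```python
import math
from dataclasses import dataclass


@dataclass
class MotionSample:
    timestamp: float  # seconds (assumed unit)
    x: float          # metres (assumed unit)
    y: float          # metres (assumed unit)


def speeds_from_samples(samples):
    """Estimate speed (m/s) between consecutive samples by finite differences."""
    speeds = []
    for previous, current in zip(samples, samples[1:]):
        dt = current.timestamp - previous.timestamp
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        distance = math.hypot(current.x - previous.x, current.y - previous.y)
        speeds.append(distance / dt)
    return speeds


def headings_from_samples(samples):
    """Estimate heading (degrees, counter-clockwise from the +x axis)."""
    return [
        math.degrees(math.atan2(current.y - previous.y, current.x - previous.x))
        for previous, current in zip(samples, samples[1:])
    ]


# Hypothetical track: the vehicle accelerates and turns towards +y.
track = [MotionSample(0.0, 0.0, 0.0), MotionSample(1.0, 3.0, 4.0), MotionSample(2.0, 3.0, 14.0)]
speeds = speeds_from_samples(track)
headings = headings_from_samples(track)
print(speeds, headings)
```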

As shown in fig. 7, the original road information includes geographic data and road network data. The geographic data describe geographic features of the earth's surface, including longitude and latitude. The road network data include the topology and attribute information of the road, such as the start point and end point of the road, the road type (expressway, urban road, rural road, etc.), the road name, the road width, the number of lanes, and speed limit information. These data describe the structure and characteristics of the road network for applications such as navigation, path planning, and behavior prediction.

Step 2) Building the prediction model with the graph Transformer: the complex relations among vehicles are modeled with the multi-head node attention mechanism and the graph convolution of the graph Transformer, and the global and local correlations of the vehicles are learned.

S2.1 The graph Transformer first performs information transfer and aggregation through a graph convolutional network (GCN). The basic calculation steps and formulas of the GCN are as follows:

First, initializing node characteristics:

Input: node feature matrix $X \in \mathbb{R}^{N \times D}$, where $N$ represents the number of nodes and $D$ represents the feature dimension.

Second, aggregating neighbor features:

Neighbor aggregation: for each node $i$, the features of its neighbor nodes are aggregated.

Neighbor node feature aggregation formula:

$$Agg_i = \sum_{j \in Ne(i)} X_j$$

wherein:

- $Agg_i$ represents the aggregated feature of node $i$;

- $Ne(i)$ represents the set of neighbor nodes of node $i$;

- $X_j$ represents the feature of the neighbor node $j$.

Third step, updating node representation:

Fusion formula:

$$X_i^{(l+1)} = \sigma\left(W^{(l)} \cdot Agg_i\right)$$

wherein:

- $W^{(l)}$ is the weight matrix of the $l$-th layer;

- $\sigma(\cdot)$ is the activation function.

Fourth, iterative aggregation and updating:

the expressive power and learning power of the model can be enhanced by repeating the second step and the third step a plurality of times.
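The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the aggregation is taken to be a plain sum over the neighbor set, the activation is taken to be ReLU, and the function name `gcn_layer` and all numeric values are hypothetical.

```python
def relu(value):
    return max(0.0, value)


def gcn_layer(features, neighbors, weight):
    """One aggregation + update step of a graph convolution (assumed form).

    features:  list of N node feature vectors, each of length D
    neighbors: neighbors[i] is the list of neighbor indices Ne(i)
    weight:    D x D_out weight matrix given as a list of rows
    """
    d_in = len(weight)
    d_out = len(weight[0])
    updated = []
    for i in range(len(features)):
        # Second step: Agg_i is the sum of the neighbor features X_j, j in Ne(i).
        agg = [0.0] * d_in
        for j in neighbors[i]:
            for k in range(d_in):
                agg[k] += features[j][k]
        # Third step: apply the weight matrix, then the activation (ReLU assumed).
        updated.append([
            relu(sum(agg[k] * weight[k][c] for k in range(d_in)))
            for c in range(d_out)
        ])
    return updated


# First step: initialize node features for a 3-node path graph 0 - 1 - 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
weight = [[1.0, -1.0], [0.0, 1.0]]

hidden = gcn_layer(features, neighbors, weight)
# Fourth step: repeat aggregation and update (weights shared here for brevity).
hidden_2 = gcn_layer(hidden, neighbors, weight)
print(hidden, hidden_2)
```

Each additional layer lets a node see information from one hop farther away, which is why repeating the second and third steps increases the expressive power of the model.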

S2.2 The graph Transformer further captures global and local correlations by combining the multi-head node attention module with the graph modeling module. The processing steps are as follows:

First, the multi-head self-attention mechanism is extended to a multi-head node attention mechanism in order to capture the global dependencies of connected nodes and of non-connected nodes. To this end, two new graph node attention modules are provided, a connected node attention (CNA) module and a non-connected node attention (NNA) module, which share the same network structure.

The connected node attention (CNA) module models the correlations between connected global nodes. Specifically, a matrix multiplication is performed between one input matrix and the transpose of another, and the connected node attention features are then computed by a softmax function. The specific formula is as follows:

wherein:

- $\mathrm{card}(N(r))$ is the number of connected nodes in a training batch.

The connected node attention feature measures the influence of a connected node on the other connected nodes. Finally, a matrix multiplication is performed between the above result and the corresponding matrix, the products are summed, and the sum is multiplied by the scaling parameter α to obtain the output. The specific formula is as follows:

wherein:

alpha is a learnable parameter initialized to 0.

In this way, each connected node in $N(r)$ becomes a weighted sum of all connected nodes. The connected node attention (CNA) module therefore obtains a global view of the network, the vehicle obtains sufficient vehicle interaction information, and the accuracy of the behavior prediction of the automatic driving vehicle is improved.

Similarly, the non-connected node attention (NNA) module captures the global relations between non-connected nodes. Specifically, a matrix multiplication is performed between one input matrix and the transpose of another, and the attention features of the non-connected nodes are computed by applying a softmax function:

wherein:

- the cardinality term is the number of non-connected nodes in a training sample batch.

The non-connected node attention feature measures the influence of a non-connected node on the other non-connected nodes. Finally, a matrix multiplication is performed between the above result and the corresponding matrix. The result is then multiplied by the scaling parameter β to obtain the following output:

wherein: β is a weight that is gradually learned starting from 0.

In this way, each non-connected node in $N(r)$ becomes a weighted sum of all non-connected nodes. Finally, the resulting node features are summed element-wise to update the node features, so that both connected and non-connected spatial relationships are captured. This process can be represented as follows:

Although CNA and NNA are useful for extracting long-term and global dependencies, they are inefficient at capturing fine-grained local information in complex internal data structures. To address this limitation, a new building block is proposed, specifically:

wherein:

- $A$ is the adjacency matrix of the graph;

- $GC(\cdot)$ represents the graph convolution;

- $P$ represents the trainable parameters.
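The encoder processing described in S2.2 can be sketched as follows. This is a hedged illustration, not the patented implementation: it assumes that attention scores are dot products of node features, that CNA attends only over connected nodes and NNA only over non-connected nodes, that both outputs are added element-wise to the original features, and that the graph convolution has the form ReLU(A · X · P). All function names and numeric values are hypothetical.

```python
import math


def masked_node_attention(features, mask, scale):
    """For node i: scale * sum_j softmax_j(x_i . x_j) * x_j over j with mask[i][j]."""
    dim = len(features[0])
    outputs = []
    for i, x_i in enumerate(features):
        allowed = [j for j in range(len(features)) if mask[i][j]]
        if not allowed:
            outputs.append([0.0] * dim)
            continue
        scores = [sum(a * b for a, b in zip(x_i, features[j])) for j in allowed]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        out = [0.0] * dim
        for j, e in zip(allowed, exps):
            for k in range(dim):
                out[k] += scale * (e / total) * features[j][k]
        outputs.append(out)
    return outputs


def fuse_node_attention(features, adjacency, alpha, beta):
    """Element-wise sum of the input features, the CNA output and the NNA output."""
    n = len(features)
    connected = [[adjacency[i][j] != 0 for j in range(n)] for i in range(n)]
    non_connected = [[i != j and not connected[i][j] for j in range(n)] for i in range(n)]
    cna = masked_node_attention(features, connected, alpha)
    nna = masked_node_attention(features, non_connected, beta)
    return [[x + c + m for x, c, m in zip(x_row, c_row, m_row)]
            for x_row, c_row, m_row in zip(features, cna, nna)]


def graph_conv(adjacency, features, params):
    """ASSUMED form of GC(A, X; P): ReLU(A @ X @ P)."""
    n, d_in, d_out = len(features), len(params), len(params[0])
    aggregated = [[sum(adjacency[i][j] * features[j][k] for j in range(n))
                   for k in range(d_in)] for i in range(n)]
    return [[max(0.0, sum(aggregated[i][k] * params[k][c] for k in range(d_in)))
             for c in range(d_out)] for i in range(n)]


def graph_modeling_block(features, adjacency, params_1, params_2):
    """Two stacked graph convolutions, as in the graph modeling module."""
    return graph_conv(adjacency, graph_conv(adjacency, features, params_1), params_2)


adjacency = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # vehicles 0 and 1 interact
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

at_init = fuse_node_attention(features, adjacency, alpha=0.0, beta=0.0)
fused = fuse_node_attention(features, adjacency, alpha=1.0, beta=0.0)
local = graph_modeling_block(fused, adjacency, identity, identity)
print(at_init, fused, local)
```

With α and β both at their initial value of 0 the fused features equal the input features, so the attention branches contribute only as the two weights are learned. In this toy graph the isolated third node receives zeros from the graph convolutions because its adjacency row is empty; adding self-loops to the adjacency matrix would be one way to keep its own features.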

S2.3 The decoder has the same layers as the encoder; in addition, a fully connected layer is applied after the output of the last decoder layer. The decoder receives the encoder output and the context information as input and generates an output sequence of driving behaviors using the multi-head node attention mechanism and graph convolution, and the fully connected layer makes the dimension of the output sequence equal to the number of driving behavior categories. The output of the fully connected layer is then input into the softmax function for normalization. The softmax function converts the score of each category into a probability value representing the model's prediction confidence for that category, so the output probability vector represents the model's predicted probability distribution over the driving behavior categories. Finally, the category with the highest probability value is taken as the predicted driving behavior.
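The final normalization and selection step can be sketched as below. The category names follow the behaviors mentioned in this description, but their order and the raw scores are hypothetical values chosen for illustration.

```python
import math

# Category order is an assumption; the scores stand in for the fully connected layer output.
BEHAVIORS = [
    "turn left", "turn right", "change lane left",
    "change lane right", "keep straight", "stop",
]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_behavior(scores):
    """Return the most probable behavior and the full probability vector."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return BEHAVIORS[best], probabilities


label, probabilities = predict_behavior([0.1, -1.2, 0.3, 0.0, 2.5, -0.7])
print(label, [round(p, 3) for p in probabilities])
```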

As shown in fig. 3, the proposed graph Transformer encoder consists of two parts: the multi-head node attention module and the graph modeling module. It captures global and local correlations for generating automatic driving behavior predictions. In the connected node attention module, two matrices are taken as input and the connected node attention features are computed by a softmax function; a matrix multiplication is then performed between this result and the corresponding matrix, the products are summed, and the sum is multiplied by the scaling parameter α to obtain the output. The connected node attention (CNA) module thus models the correlations between connected global nodes and captures long-term relations between connected nodes. The proposed non-connected node attention (NNA) module has the same structure as the CNA module but takes non-connected nodes as input, capturing long-term relations between non-connected nodes.

Model training plays a critical role in the prediction of autonomous vehicle behavior: it is the process by which the model learns and extracts useful patterns and regularities from the data. The model is trained through the steps of feeding data into the model, forward propagation, loss function calculation, back propagation, parameter updating, repeated iteration, and validation and tuning. During training, the generalization ability and the overfitting of the model are monitored to obtain the best model performance.

Step 3) Training the prediction model. As shown in fig. 4, the data preprocessing module divides the dataset into a training set, a validation set, and a test set in the proportions 70%, 15%, and 15%. The training set is used for model training and parameter adjustment. The validation set is used for model selection, tuning, and hyperparameter adjustment. The test set is used for the final evaluation of the performance and generalization ability of the model. The training and validation sets are then used for model training and validation. Through the three steps of model training, model tuning, and evaluation, the trained model reaches the expected performance. Finally, the model is applied to predict and visualize the behavior of the autonomous vehicle in complex scenarios. In addition, factors such as model security and privacy protection need to be considered when the model is deployed.
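A 70% / 15% / 15% partition can be sketched as follows. Shuffling before splitting and the fixed random seed are assumptions added for reproducibility in this example; the function name is hypothetical.

```python
import random


def split_dataset(samples, train_ratio=0.70, val_ratio=0.15, seed=0):
    """Shuffle and split samples into training, validation and test subsets.

    The test set receives whatever remains after the first two subsets.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

For time-ordered driving recordings a chronological or per-scenario split may be preferable to random shuffling, since neighboring frames of the same trajectory are highly correlated.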

The specific steps are as follows:

a. Inputting the training dataset into the model: the input features in the training dataset are provided as input to the model.

b. Forward propagation: the model processes the input features through a series of modules and activation functions to obtain an output result, which is the model's predicted value for the input data.

c. Calculating the prediction error: the predicted value of the prediction model is compared with the true label, and the difference between the predicted value and the true value is calculated with the cross-entropy loss function.

d. Back propagation: the gradient of the cross-entropy loss function with respect to the prediction model parameters is calculated by the back propagation algorithm. The gradient is propagated from the output layer to the input layer by the chain rule, yielding the gradient of each parameter.

e. Parameter updating: the parameters of the model are updated by an optimizer according to the calculated parameter gradients. The optimizer adjusts the values of the model parameters according to the gradients and hyperparameters such as the learning rate, so that the loss function gradually decreases.

f. Repeated iteration: steps b to e are repeated, and the model is trained iteratively on different training sample batches. The parameters of the model are updated in each training batch, and the performance of the model is gradually optimized.

g. Validation and tuning: during training, the performance of the model is periodically evaluated on the validation set. The generalization ability and the overfitting of the model are monitored by calculating the loss function value or other performance indexes on the validation set. According to the validation result, the hyperparameters of the model are adjusted or training is stopped early to obtain the best model performance.

h. Model saving and deployment: during training, the parameters of the model or the whole model are saved regularly for subsequent testing and prediction.
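Steps a to h can be illustrated with a deliberately small stand-in model. The sketch below trains a single linear layer followed by softmax (not the graph Transformer itself) with mini-batch gradient descent, cross-entropy loss, early stopping on the validation loss, and an in-memory checkpoint of the best weights. The toy data, learning rate, batch size, and patience value are all hypothetical.

```python
import math
import random


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, x):
    # b. forward propagation: one linear layer followed by softmax
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def mean_cross_entropy(weights, data):
    # c. cross-entropy between the predicted probabilities and the true labels
    return -sum(math.log(forward(weights, x)[y]) for x, y in data) / len(data)


def train(train_set, val_set, n_classes, lr=0.5, epochs=200, batch_size=2,
          patience=10, seed=0):
    rng = random.Random(seed)
    n_features = len(train_set[0][0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    best_loss, best_weights, stale = float("inf"), None, 0
    for _ in range(epochs):
        order = list(train_set)  # a. feed the training data to the model
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):  # f. iterate over batches
            batch = order[start:start + batch_size]
            grads = [[0.0] * n_features for _ in range(n_classes)]
            for x, y in batch:
                probs = forward(weights, x)
                for c in range(n_classes):
                    # d. gradient of the loss w.r.t. the class-c score: p_c - 1[c == y]
                    error = probs[c] - (1.0 if c == y else 0.0)
                    for k in range(n_features):
                        grads[c][k] += error * x[k] / len(batch)
            for c in range(n_classes):  # e. gradient-descent parameter update
                for k in range(n_features):
                    weights[c][k] -= lr * grads[c][k]
        val_loss = mean_cross_entropy(weights, val_set)  # g. validation
        if val_loss < best_loss - 1e-6:
            best_loss, stale = val_loss, 0
            best_weights = [row[:] for row in weights]  # h. keep the best checkpoint
        else:
            stale += 1
            if stale >= patience:
                break  # early stopping
    return best_weights, best_loss


# Toy two-class data; the last feature is a constant bias input.
train_set = [([1.0, 0.0, 1.0], 0), ([0.9, 0.1, 1.0], 0),
             ([0.0, 1.0, 1.0], 1), ([0.1, 0.9, 1.0], 1)]
val_set = [([0.8, 0.2, 1.0], 0), ([0.2, 0.8, 1.0], 1)]
best_weights, best_loss = train(train_set, val_set, n_classes=2)
print(round(best_loss, 4))
```

In a real setting the gradients would be obtained by automatic differentiation through the full prediction model and the checkpoint would be written to storage; the control flow of the loop is the part this sketch is meant to show.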

Step 4) Behavior prediction: the trained behavior prediction model is used to predict or infer on new input data. For each test sample, its features are input into the trained prediction model, which performs forward propagation on the input features, calculating layer by layer and generating an output result.

In the graph Transformer model, information transfer and aggregation are first performed by the graph convolutional network, and the nodes are updated according to the fusion formula of step S2.1. Encoding is then performed by the multi-head node attention mechanism, which updates the features using the interaction relations among nodes: the multi-head node attention module and the graph modeling module of the graph Transformer further capture global and local correlations. First, the multi-head self-attention mechanism is extended to the multi-head node attention mechanism in order to capture the global dependencies of connected nodes and of non-connected nodes, as described in step S2.2. Second, the graph modeling module uses graph convolution to further improve local correlation, also as described in step S2.2. Finally, the decoder receives the encoder output and the context information as input and generates an output sequence of driving behaviors using the multi-head node attention mechanism, the encoder-decoder attention mechanism, and graph convolution. After the output of the last decoder layer, the fully connected layer applies linear and nonlinear transformations to the features. A softmax function is then applied; it outputs the predicted probability distribution over the driving behavior categories, and the category with the highest probability is selected as the final prediction result. The prediction result, such as turning left or right, changing lanes to the left or right, keeping straight, or stopping, is displayed visually as required, so that the predictive effect of the model can be understood more intuitively.

Fig. 8 shows a display of the automatic driving behavior prediction result. It should first be noted that fig. 8 is only an exemplary interface of the behavior prediction system showing its necessary functions; it may be modified later according to specific requirements and does not limit the present application.

Referring to FIG. 8, the interface consists of four parts. Frame ① shows the current operating state of the vehicle, such as running, stopping, braking, accelerating, or starting; frame ② shows the current time, the signal state of the vehicle, and the remaining fuel at the current time; frame ③ shows the four display interface buttons of the behavior prediction system (vehicle interaction information, original road information, vehicle motion information, and vehicle behavior prediction), each of which opens the corresponding display; frame ④ shows the vehicle-related driving information for the button selected in frame ③.

In detail, the "vehicle interaction information" interface displays the local interaction graph of the vehicles at the current moment; it provides local interaction information among the vehicles and captures in detail the relations and influences among them. The "original road information" interface displays the road information at the current moment, including geographic data and road network data. The "vehicle motion information" interface displays data on the motion state and behavior of the vehicle at the current moment, which may include position information, speed information (m/s or km/h), acceleration information (m/s²), and the like. The "vehicle behavior prediction" interface displays the behavior of the automatic driving vehicle, such as turning left, going straight, or changing lanes, predicted from the above information input at the current moment.

Fig. 9 shows an exemplary application scenario of the autonomous vehicle behavior prediction system. It should be understood that fig. 9 is presented by way of example only and is not intended to limit the scope of the present application.

Referring to fig. 9, the traffic scenario in the figure is taken as an example to study the driving behavior of the target vehicle. The red vehicle in the lower right corner is the own vehicle, which predicts the driving behavior of the target vehicle (the yellow vehicle). The own vehicle collects various information, including vehicle interaction information, original road information, and vehicle motion information; it then processes this information and passes it to the trained prediction model, which outputs the driving behavior of the target vehicle. Finally, all prediction results are displayed on the central control screen of the automobile, showing the driver the real-time driving behavior of the vehicle and providing related suggestions.

The examples are preferred embodiments of the present invention, but the present invention is not limited to the above-described embodiments, and any obvious modifications, substitutions or variations that can be made by one skilled in the art without departing from the spirit of the present invention are within the scope of the present invention.

Claims

1. An automatic driving vehicle behavior prediction method integrating a complex network and a graph converter is characterized by comprising the following steps of:

2. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph Transformer of claim 1, further comprising: step 5) visually displaying the prediction result.

3. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph converter according to claim 1, wherein in the step 1), the method for establishing the complex network of the vehicle is as follows: first, a vehicle is represented as a node in the network, the node comprises vehicle data, the node is vectorized as $[x_1, x_2, x_3, \ldots, x_t]$, connecting edges are established between the nodes, and the vehicle interactions are described by an adjacency matrix in which the weight of an edge is the magnitude of the vehicle interaction;

The original road information comprises geographic data and road network data; the road network data comprises topological structure and attribute information of the road, and the topological structure and the attribute information comprise starting points and ending points of the road, road types, road names, road widths, the number of lanes and speed limit information.

4. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph converter according to claim 1, wherein in the process of constructing the multi-level input of the model in step 1), the data preprocessing is performed on the collected data, and the method comprises the following steps:

5. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph converter according to claim 1, wherein the specific steps of establishing the prediction model in the step 2) are as follows:

6. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph Transformer according to claim 5, wherein the basic calculation steps and formulas for information transfer and aggregation performed by the graph convolutional network (GCN) in step S2.1 are as follows:

wherein:

- $Agg_i$ represents the aggregated feature of node $i$;

- $Ne(i)$ represents the set of neighbor nodes of node $i$;

- $X_j$ represents the feature of the neighbor node $j$;

Third step, updating node representation:

The fusion formula is:

$$X_i^{(l+1)} = \sigma\left(W^{(l)} \cdot Agg_i\right)$$

wherein:

- $W^{(l)}$ is the weight matrix of the $l$-th layer;

- $\sigma(\cdot)$ is an activation function;

7. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and graph Transformer as recited in claim 5, wherein the process of capturing global and local correlations in step S2.2 in combination with the multi-headed node attention module and the graph modeling module is as follows:

The connected node attention (CNA) module models the correlations between connected global nodes, namely: a matrix multiplication is performed between one input matrix and the transpose of another, and the connected node attention features are then computed by a softmax function; the specific formula is as follows:

wherein: alpha is a learnable parameter, initialized to 0;

by doing so, each connected node in N (r) is a weighted sum of all connected nodes;

Wherein:

- the cardinality term is the number of non-connected nodes in a training sample batch;

The non-connected node attention feature measures the influence of a non-connected node on the other non-connected nodes; finally, a matrix multiplication is performed between the above result and the corresponding matrix; then, the result is multiplied by the scaling parameter β to obtain the following output:

wherein: beta is a weight that gradually learns from 0;

Given the above features, the local correlation is further improved by using graph convolution, and the specific formula is as follows:

wherein:

- $A$ is the adjacency matrix of the graph;

- $GC(\cdot)$ represents the graph convolution;

- $P$ represents the trainable parameters;

8. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph Transformer according to claim 5, wherein in step S2.3, after the output of the last layer of the decoder, a fully connected layer is applied, and the output dimension of the fully connected layer is the same as the number of categories of driving behaviors; then, inputting the output of the full connection layer into a softmax function, and carrying out normalization operation, wherein the softmax function converts the score of each category into a probability value to represent the prediction confidence of the prediction model on the category; the output probability vector represents the predicted probability distribution of the model for each driving behavior class; and finally, selecting the final driving behavior according to the maximum probability value.

9. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph Transformer as recited in claim 1, wherein the specific steps of training the prediction model in the step 3) are as follows:

g. Verification and tuning: during training, periodically using the validation set to evaluate the performance of the predictive model; the generalization capability and the overfitting condition of the prediction model are monitored by calculating the error on the verification set; and according to the verification result, adjusting the hyper-parameters of the prediction model or stopping training in advance to obtain the optimal model performance.

10. The method for predicting the behavior of an autonomous vehicle by fusing a complex network and a graph converter according to claim 9, wherein the specific steps of predicting the behavior in the step 4) are as follows: a test sample is input into the prediction model; in the prediction model, information transfer and aggregation are first performed by the graph convolutional network and the nodes are updated; encoding is then performed by the multi-head node attention mechanism, which updates the features using the interaction relations among nodes, the multi-head node attention module and the graph modeling module of the graph Transformer further capturing global and local correlations; first, the multi-head self-attention mechanism is extended to the multi-head node attention mechanism in order to capture the global dependencies of connected nodes and of non-connected nodes; second, the graph modeling module uses graph convolution to further improve local correlation; finally, the decoder receives the output of the encoder and the context information as inputs and generates an output sequence of driving behaviors using the multi-head node attention mechanism, the encoder-decoder attention mechanism, and graph convolution; after the output of the last decoder layer, the fully connected layer applies linear and nonlinear transformations to the features; a softmax function is then applied, which outputs the predicted probability distribution over the driving behavior categories, and the category with the highest probability is selected as the final prediction result.

11. An automatic driving vehicle behavior prediction system integrating a complex network and a graph converter is characterized by comprising a data acquisition device, a data preprocessing module, a prediction model and a visualization module,

The data acquisition device comprises a sensor and monitoring equipment, and is used for collecting motion data and original road information data of a vehicle;

The graph convolutional network is used for information transfer and aggregation;

And the visualization module is used for displaying the prediction result.