CN116244281A - Lane traffic flow data completion and model training method and device thereof


Info

Publication number
CN116244281A
CN116244281A (application CN202211705577.3A)
Authority
CN
China
Prior art keywords
traffic flow
lane
representation
flow data
data
Prior art date
Legal status
Granted
Application number
CN202211705577.3A
Other languages
Chinese (zh)
Other versions
CN116244281B (en)
Inventor
张乐
明靖祠
梅雨
凌玮岑
田楚杰
祝恒书
熊辉
陈尚义
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to KR1020230037873A priority Critical patent/KR20230048259A/en
Publication of CN116244281A publication Critical patent/CN116244281A/en
Application granted granted Critical
Publication of CN116244281B publication Critical patent/CN116244281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125Traffic data processing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems


Abstract

The application discloses a lane traffic flow data completion method and device, relating to the fields of artificial intelligence, intelligent transportation, deep learning, and the like. The implementation scheme is as follows: acquire attribute information of the lanes in an intersection and the traffic flow data observed on those lanes during a first time period; generate a comprehensive lane traffic flow spatial representation from the attribute information and the traffic flow data observed during the first time period; obtain an embedded representation of the global traffic flow distribution of the lanes from the traffic flow data observed during the first time period; obtain a lane traffic flow temporal relationship representation from that embedded representation; and complete the missing data in the lane traffic flow of the intersection according to the comprehensive spatial representation and the temporal relationship representation. The method and device can complete missing data in urban multi-lane traffic flow and effectively improve the accuracy of traffic flow data completion.

Description

Lane traffic flow data completion and model training method and device thereof
Technical Field
The application relates to the field of data processing, in particular to the fields of artificial intelligence, intelligent transportation, deep learning, and the like, and specifically to a lane traffic flow data completion method, a training method and device for a lane traffic flow data completion model, an electronic device, and a storage medium, which can be applied to urban multi-intersection traffic flow data completion scenarios.
Background
With the rapid development of Internet technology and traffic informatization, the scale of traffic data keeps growing, and in an intelligent transportation system, complete and valid traffic data is significant for traffic management. In practice, however, unavoidable events (such as equipment damage or bad weather) interrupt data collection and cause partial data loss, which reduces the validity of the data set and constrains the development of intelligent transportation. To address this problem, research has proposed various methods that complete the missing data based on the observed traffic flow data. The task has important research significance both in theory and in practice, benefits downstream applications such as traffic flow prediction and intelligent traffic management, and remains highly challenging.
Disclosure of Invention
The application provides a lane traffic flow data completion method, a training method for a lane traffic flow data completion model, a lane traffic flow data completion device, an electronic device, and a storage medium.
According to a first aspect of the present application, there is provided a lane traffic flow data completion method, including:
acquiring attribute information of lanes in an intersection and traffic flow data observed on the lanes during a first time period;
generating a comprehensive lane traffic flow spatial representation from the attribute information of the lanes and the traffic flow data observed during the first time period;
acquiring an embedded representation of the global traffic flow distribution of the lanes from the traffic flow data observed during the first time period;
performing time-series prediction on the embedded representation of the global traffic flow distribution with a pre-trained bidirectional long short-term memory (LSTM) neural network model to obtain a lane traffic flow temporal relationship representation;
and completing the missing data in the lane traffic flow of the intersection according to the comprehensive spatial representation and the temporal relationship representation.
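Purely for illustration (the helper operations below are toy stand-ins, not the patent's actual modules), the five steps of the first aspect can be sketched in Python:

```python
import numpy as np

def complete_lane_flow(adj, flow, mask):
    """Toy stand-in for the five claimed steps.
    adj:  (N, N) fused lane spatial-relation matrix
    flow: (T, N) observed flow, arbitrary values where mask == 0
    mask: (T, N) 1 = observed, 0 = missing
    """
    filled = flow * mask                                    # step 1: observations
    # step 2: crude spatial representation -- neighbor average over adj
    spatial = filled @ adj / np.maximum(adj.sum(axis=0), 1.0)
    # step 3: global distribution of all lanes at each time step
    emb = filled.mean(axis=1, keepdims=True)
    # step 4: crude temporal representation -- running mean over time
    temporal = np.cumsum(emb, axis=0) / np.arange(1, len(emb) + 1)[:, None]
    # step 5: fuse, then keep observed values and fill only the gaps
    estimate = 0.5 * spatial + 0.5 * temporal
    return np.where(mask == 1, flow, estimate)
```

The averaging operations here merely mark where the claimed spatial, global, and temporal representations plug in; the patent's modules are learned networks.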
According to a second aspect of the present application, there is provided a training method for a lane traffic flow data completion model, the model including a multi-spatial-relationship fusion module, a pre-trained self-encoding module, and a bidirectional long short-term memory (LSTM) module, wherein the method includes:
acquiring attribute information of lanes in an intersection and traffic flow data observed on the lanes during a third time period;
inputting the attribute information of the lanes and the traffic flow data observed during the third time period into the multi-spatial-relationship fusion module to obtain a comprehensive lane traffic flow spatial representation;
inputting the traffic flow data observed during the third time period into the self-encoding module to obtain an embedded representation of the global traffic flow distribution of the lanes;
inputting the comprehensive spatial representation and the embedded representation of the global distribution into the bidirectional LSTM module to obtain the lane traffic flow data of the intersection output by the module;
calculating a model loss value from the traffic flow data observed during the third time period, the comprehensive spatial representation, and the output lane traffic flow data of the intersection;
and training the lane traffic flow data completion model according to the model loss value.
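As an illustrative sketch only: one plausible ingredient of the claimed loss computation is an error restricted to observed entries (the patent's exact loss form is not reproduced here):

```python
import numpy as np

def masked_loss(pred, target, mask):
    """Mean squared error over observed entries only (mask == 1).
    Hypothetical: the claimed loss combines observed flows, the spatial
    representation, and the model output in an unspecified exact form."""
    m = mask.astype(bool)
    diff = pred[m] - target[m]
    return float((diff ** 2).mean())
```

Restricting the loss to observed entries is what lets such a model train on incomplete data without ground truth for the missing cells.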
According to a third aspect of the present application, there is provided another lane traffic flow data completion method, comprising:
acquiring attribute information of lanes in an intersection and traffic flow data observed on the lanes during a fourth time period;
inputting the attribute information of the lanes and the traffic flow data observed during the fourth time period into a pre-trained lane traffic flow data completion model, the model having been trained by the method according to the second aspect of the application;
and acquiring the lane traffic flow data output by the model, the output data comprising the observed traffic flow values of the lanes and estimates of the missing values.
According to a fourth aspect of the present application, there is provided a lane traffic flow data completion apparatus comprising:
a first acquisition module configured to acquire attribute information of lanes in an intersection and traffic flow data observed on the lanes during a first time period;
a generation module configured to generate a comprehensive lane traffic flow spatial representation from the attribute information of the lanes and the traffic flow data observed during the first time period;
a second acquisition module configured to acquire an embedded representation of the global traffic flow distribution of the lanes from the traffic flow data observed during the first time period;
a third acquisition module configured to perform time-series prediction on the embedded representation of the global distribution with a pre-trained bidirectional long short-term memory (LSTM) neural network model to obtain a lane traffic flow temporal relationship representation;
and a completion module configured to complete the missing data in the lane traffic flow of the intersection according to the comprehensive spatial representation and the temporal relationship representation.
According to a fifth aspect of the present application, there is provided a training apparatus for a lane traffic flow data completion model, the model including a multi-spatial-relationship fusion module, a pre-trained self-encoding module, and a bidirectional long short-term memory (LSTM) module, wherein the apparatus includes:
a first acquisition module configured to acquire attribute information of lanes in an intersection and traffic flow data observed on the lanes during a third time period;
a second acquisition module configured to input the attribute information and the traffic flow data observed during the third time period into the multi-spatial-relationship fusion module to obtain a comprehensive lane traffic flow spatial representation;
a third acquisition module configured to input the traffic flow data observed during the third time period into the self-encoding module to obtain an embedded representation of the global traffic flow distribution of the lanes;
a fourth acquisition module configured to input the comprehensive spatial representation and the embedded representation of the global distribution into the bidirectional LSTM module to obtain the lane traffic flow data of the intersection output by the module;
a calculation module configured to calculate a model loss value from the traffic flow data observed during the third time period, the comprehensive spatial representation, and the output lane traffic flow data of the intersection;
and a training module configured to train the lane traffic flow data completion model according to the model loss value.
According to a sixth aspect of the present application, there is provided another lane traffic flow data completion apparatus comprising:
a first acquisition module configured to acquire attribute information of lanes in an intersection and traffic flow data observed on the lanes during a fourth time period;
an input module configured to input the attribute information and the traffic flow data observed during the fourth time period into a pre-trained lane traffic flow data completion model, the model having been trained by the apparatus according to the fifth aspect;
and a second acquisition module configured to acquire the lane traffic flow data output by the model, the output data comprising the observed traffic flow values of the lanes and estimates of the missing values.
According to a seventh aspect of the present application, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect, or to perform the method of the second aspect, or to perform the method of the third aspect.
According to an eighth aspect of the present application, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of the first aspect, or to perform the method of the second aspect, or to perform the method of the third aspect.
According to a ninth aspect of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of the first aspect, the second aspect, or the third aspect described above.
According to the technology of the present application, the complex spatio-temporal correlations of lane traffic flow are integrated and the missing urban multi-lane traffic flow data are completed, thereby improving the accuracy and efficiency of subsequent intelligent traffic management.
It should be understood that the description of this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a flowchart of a lane traffic flow data completion method according to an embodiment of the present application;
FIG. 2 is a flowchart of another lane traffic flow data completion method according to an embodiment of the present application;
FIG. 3 is a flowchart of another lane traffic flow data completion method according to an embodiment of the present application;
FIG. 4 is a flowchart of a training method of a self-encoding model according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a lane traffic flow data completion model according to an embodiment of the present application;
FIG. 6 is a flowchart of a training method of a lane traffic flow data completion model according to an embodiment of the present application;
FIG. 7 is a flowchart of another lane traffic flow data completion method according to an embodiment of the present application;
FIG. 8 is a structural block diagram of a lane traffic flow data completion apparatus according to an embodiment of the present application;
FIG. 9 is a structural block diagram of another lane traffic flow data completion apparatus according to an embodiment of the present application;
FIG. 10 is a structural block diagram of a training apparatus for a lane traffic flow data completion model according to an embodiment of the present application;
FIG. 11 is a structural block diagram of another lane traffic flow data completion apparatus according to an embodiment of the present application;
FIG. 12 is a structural block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
With the rapid development of Internet technology and traffic informatization, the scale of traffic data keeps growing, and in an intelligent transportation system, complete and valid traffic data is significant for traffic management. In practice, however, unavoidable events (such as equipment damage or bad weather) interrupt data collection and cause partial data loss, which reduces the validity of the data set and constrains the development of intelligent transportation. To address this problem, research has proposed various methods that complete the missing data based on the observed traffic flow data. The task has important research significance both in theory and in practice, benefits downstream applications such as traffic flow prediction and intelligent traffic management, and remains highly challenging.
In the related art, traffic flow data over a period of time can be completed, and the methods in use mainly fall into the following three categories: traditional interpolation methods; methods based on matrix and tensor decomposition; and methods based on deep learning.
The first category, traditional interpolation:
Some conventional statistical methods were first applied to this traffic completion problem, such as linear interpolation, which fills a gap from the data observed at adjacent time points, and ARIMA (Autoregressive Integrated Moving Average), which exploits historical data and observed periodic characteristics. Another similar family of methods is based on KNN (K-Nearest Neighbors); instead of exploiting temporal similarity, it exploits spatial similarity, interpolating from the values observed on neighboring roads at the same time.
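A minimal sketch of the first category, assuming a one-dimensional flow series in which NaN marks the missing time points:

```python
import numpy as np

def linear_interpolate(series):
    """Fill NaN gaps in a 1-D flow series from the neighboring observed
    time points -- the traditional linear-interpolation baseline above."""
    s = series.astype(float).copy()
    idx = np.arange(len(s))
    obs = ~np.isnan(s)
    # np.interp clamps to the nearest observed value outside the range
    s[~obs] = np.interp(idx[~obs], idx[obs], s[obs])
    return s
```

As the text notes, such a baseline only uses obvious temporal adjacency and ignores structure shared across lanes.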
The second category, methods based on matrix and tensor decomposition:
To improve completion efficiency and accuracy, newer research has proposed applying matrix and tensor decomposition to traffic flow completion. In the matrix factorization approach, the traffic flow data is mapped into a matrix in which each row represents the traffic flow observations of one road at different times. Matrix and tensor decomposition methods mainly discover inherent spatial or temporal structure in the traffic flow data and use it to complete the missing entries.
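A minimal gradient-descent sketch of matrix-factorization imputation (illustrative only; the rank, learning rate, and iteration count are arbitrary choices, not values from any cited work):

```python
import numpy as np

def mf_impute(X, mask, rank=2, iters=200, lr=0.05, seed=0):
    """Impute missing entries of a road-by-time flow matrix X by a
    low-rank factorization X ~ U @ V, fit only on observed entries
    (mask == 1). NaN marks missing cells in X."""
    rng = np.random.default_rng(seed)
    n, t = X.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((rank, t))
    for _ in range(iters):
        E = mask * (U @ V - np.nan_to_num(X))   # error on observed cells only
        U -= lr * E @ V.T
        V -= lr * U.T @ E
    Xhat = U @ V
    return np.where(mask == 1, np.nan_to_num(X), Xhat)
```

The low-rank assumption captures shared temporal patterns across roads, which is exactly the structure the text says these methods exploit.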
The third category, methods based on deep learning:
In recent years, machine learning methods have also been widely used for traffic flow completion, with the built model learning the inherent nonlinear spatio-temporal correlations in the observed traffic flow data. The way a deep learning model is constructed determines how it learns; a model can be built around temporal correlation or spatial correlation, so the construction of the model is a key determinant of its effectiveness.
However, the three categories each have disadvantages. Traditional interpolation relies too heavily on obvious, subjectively chosen spatio-temporal correlations, tends to ignore connections inherent in the data, and thus increases estimation error. Matrix- and tensor-decomposition methods fail to fully exploit the spatial correlation between roads. Existing deep-learning methods do not build a model specifically for lane-level urban traffic flow completion that fully reflects its spatio-temporal complexity.
To address these problems, the present application provides a lane-level traffic flow data completion method: multiple spatial relationships between lanes are fused to obtain a comprehensive inter-lane representation that fully reflects the spatial correlation of lane traffic flow; the temporal correlation of the same lane's traffic flow at different time points is then learned and fused with the spatial correlation obtained before, yielding the completed data. To further improve the model, global distribution information of the traffic flow of all lanes over a period of time is added to cope with data sparsity.
The lane traffic flow data completion method, the training method of the lane traffic flow data completion model, the apparatus, the electronic device, and the storage medium of the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a lane traffic flow data completion method according to an embodiment of the present application. It should be noted that the method may be executed by an electronic device. As shown in fig. 1, the lane traffic flow data completion method may include, but is not limited to, the following steps.
In step 101, attribute information of a lane in an intersection and traffic flow data observed by the lane in a first period of time are acquired.
Optionally, the attribute information of all lanes within the intersection and the traffic flow data observed on all lanes during the first time period may be acquired. The traffic flow data observed during the first time period can be understood as the data observed by a vision device (e.g., a camera) on the lane over that period. When the vision device observes traffic flow data, it sends the observed data to the electronic device so that the electronic device can complete the missing entries and/or perform subsequent operations on the lane's traffic flow data.
In some embodiments of the present application, the attribute information of a lane may include, but is not limited to, the driving direction of the lane, the geographic position of the lane, and the like. In one implementation, the driving directions and geographic positions of all lanes within the intersection and the traffic flow data observed on all lanes during the first time period may be acquired. The traffic flow data observed on all lanes over the first time period T can be written as a matrix X = {x_1, x_2, …, x_T}, where the vector x_t ∈ R^N represents the traffic flow observed on all lanes at time t, N is the number of lanes, and all time intervals are equal.
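The notation above can be made concrete with a toy observation matrix and mask (the sizes and the outage position are invented for illustration):

```python
import numpy as np

T, N = 4, 3                      # 4 time steps, 3 lanes (toy sizes)
rng = np.random.default_rng(1)
# X stacks the row vectors x_1 .. x_T; x_t is the flow of all N lanes at time t
X = rng.integers(0, 20, size=(T, N)).astype(float)
mask = np.ones((T, N))
mask[2, 1] = 0.0                 # a camera outage: lane 1 missing at t = 2
X[mask == 0] = np.nan            # missing observations marked as NaN
```

Keeping an explicit 0/1 observation mask alongside X is what lets the later fusion step fill only the missing cells while preserving every observed value.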
In step 102, a lane traffic flow integrated spatial representation is generated from the attribute information of the lane and traffic flow data observed by the lane during a first period of time.
For example, if the attribute information includes the driving directions and geographic positions of the lanes, the comprehensive lane traffic flow spatial representation may be generated from the driving directions and geographic positions of all lanes within the intersection and the traffic flow data observed on all lanes during the first time period.
In one implementation, the comprehensive lane traffic flow spatial representation is obtained by fusing multiple lane spatial relationship matrices. In another implementation, it is obtained by dynamically fusing the multiple lane spatial relationship matrices at different times using an attention mechanism. In some embodiments of the present application, the multiple spatial relationships between lanes may include, but are not limited to, reachability relationships, adjacency relationships, and similarity relationships.
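A hedged sketch of the attention-based fusion in the second implementation, assuming the attention scores are supplied externally (in the model they would be produced by a learned query at each time step):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fuse_relations(matrices, scores):
    """Weight the K lane-relation matrices (e.g. reachable, adjacent,
    similar) by softmax-normalized attention scores and sum them."""
    w = softmax(np.asarray(scores, dtype=float))
    return sum(wi * M for wi, M in zip(w, matrices))
```

Because the scores can differ per time step, the fused matrix can emphasize, say, reachability at rush hour and similarity off-peak, which is the point of the dynamic variant.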
In step 103, an embedded representation of the global distribution of traffic flow of the lane is obtained from traffic flow data observed by the lane during a first period of time.
Optionally, the traffic flow data observed on all lanes within the intersection during the first time period may be encoded by a self-encoding technique to obtain an embedded representation of the global traffic flow distribution of all lanes over the first time period.
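An illustrative, untrained self-encoding forward pass (random weights stand in for the pre-trained module; the encoder output plays the role of the embedded representation of the global distribution):

```python
import numpy as np

rng = np.random.default_rng(0)

def autoencode(x, d_hidden=8):
    """Encoder/decoder forward pass: z is the embedding, x_rec the
    reconstruction. Weights are random here; the patent pre-trains
    the self-encoding module on observed flows."""
    d_in = x.shape[-1]
    W_enc = 0.1 * rng.standard_normal((d_in, d_hidden))
    W_dec = 0.1 * rng.standard_normal((d_hidden, d_in))
    z = np.tanh(x @ W_enc)        # embedded representation
    x_rec = z @ W_dec             # reconstruction (training target)
    return z, x_rec
```

Pre-training would minimize the reconstruction error on observed flows so that z compresses the global flow distribution despite sparse inputs.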
In step 104, a lane traffic time relationship representation is obtained from the embedded representation of the lane traffic global distribution.
Optionally, a pre-trained bidirectional long short-term memory (LSTM) neural network model performs time-series prediction on the embedded representation of the global traffic flow distribution to obtain the lane traffic flow temporal relationship representation. Because the bidirectional LSTM model has learned the temporal correlations in the lane traffic flow, it can be used to obtain a representation of the flow's temporal relationships.
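A plain-numpy sketch of the bidirectional recurrence: an LSTM cell run forward and backward over the embedding sequence with its hidden states concatenated. The weights here are supplied by the caller and untrained; the claimed model would use pre-trained weights.

```python
import numpy as np

def bilstm(xs, W, U, b):
    """Bidirectional LSTM sketch. xs: (T, d_in); W: (d_in, 4*d_h);
    U: (d_h, 4*d_h); b: (4*d_h,). Returns (T, 2*d_h) hidden states."""
    def one_dir(seq):
        d_h = U.shape[0]
        h, c = np.zeros(d_h), np.zeros(d_h)
        sig = lambda z: 1.0 / (1.0 + np.exp(-z))
        out = []
        for t in range(seq.shape[0]):
            i, f, o, g = np.split(seq[t] @ W + h @ U + b, 4)  # gates
            c = sig(f) * c + sig(i) * np.tanh(g)
            h = sig(o) * np.tanh(c)
            out.append(h)
        return np.array(out)
    fwd = one_dir(xs)
    bwd = one_dir(xs[::-1])[::-1]          # backward pass, re-aligned in time
    return np.concatenate([fwd, bwd], axis=1)
```

Running both directions lets the representation at time t draw on both past and future observations, which matters when a gap must be filled from context on either side.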
In step 105, the missing data in the lane traffic flow of the intersection is completed according to the comprehensive lane traffic flow spatial representation and the lane traffic flow temporal relationship representation.
In one implementation, the comprehensive spatial representation and the temporal relationship representation can be fused, and the fused result is taken as the lane traffic flow data of the intersection, comprising both the observed values and estimates of the missing values, thereby completing urban traffic data at fine-grained lane granularity. Because the method fuses the nonlinear spatio-temporal correlations in the data rather than targeting a single aspect, it can complete the missing part of the lane traffic flow data and effectively improves completion accuracy. Moreover, by jointly exploiting spatial and temporal correlation information, it can estimate missing data effectively and accurately even under continuous or simultaneous missing patterns.
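A sketch of the fusion-and-fill step, assuming (hypothetically) that a single linear layer W performs the fusion of the two representations:

```python
import numpy as np

def complete_intersection(spatial_rep, temporal_rep, observed, mask, W):
    """Fuse the spatial and temporal representations into a full flow
    estimate, then keep every observed value and fill only the gaps.
    spatial_rep, temporal_rep: (T, d); W: (2*d, N); observed, mask: (T, N)."""
    fused = np.concatenate([spatial_rep, temporal_rep], axis=1) @ W
    return np.where(mask == 1, observed, fused)
```

The final `np.where` is the key property claimed: observed entries pass through unchanged, and the model's estimates appear only where observations are missing.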
In the embodiments of the present application, multiple spatial relationships between lanes are fused to obtain a comprehensive inter-lane representation that fully reflects the spatial correlation of lane traffic flow; the temporal correlations of the same lane's traffic flow at different time points are then learned and fused with the previously obtained spatial correlations to produce the completed data. To further improve the model, global distribution information of all lanes' traffic flow over a period of time is added to cope with data sparsity. The embodiments thus handle the complex spatio-temporal correlations of lane traffic flow, complete the missing urban multi-lane traffic flow data, and improve the accuracy and efficiency of subsequent intelligent traffic management.
To address lane-level traffic flow loss in a targeted manner, the spatial correlations among multiple lanes can be fused to make full use of the complex spatial correlation information between lanes. In an embodiment of the present application, as shown in fig. 2, the lane traffic flow data completion method may include, but is not limited to, the following steps.
In step 201, attribute information of a lane in an intersection and traffic flow data observed by the lane in a first period of time are acquired.
Optionally, step 201 may be implemented in any of the manners described in the embodiments of the present application; this embodiment does not limit it, and the details are not repeated here.
In step 202, a plurality of spatial relationship matrices for the lanes are generated based on the attribute information of the lanes and the traffic flow data observed by the lanes during the first period of time.
In the embodiment of the application, the attribute information of the lane includes the driving direction of the lane and geographic position information. The multiple lane spatial relationship matrices may include a reachability relationship adjacency matrix, an adjacency relationship matrix, and a similarity relationship matrix of the lanes.
In one implementation, whether two lanes located at adjacent intersections have a direct reachability relationship can be determined according to the driving direction of the lanes, and the reachability relationship adjacency matrix of the lanes is established according to the reachability relationships between the lanes, where the element in the i-th row and j-th column of the matrix, $A^{r}_{ij}$, can be expressed as:

$$A^{r}_{ij} = \begin{cases} 1, & \text{if lane } l_i \text{ is directly reachable to lane } l_j \\ 0, & \text{otherwise} \end{cases}$$

where $l_i$ and $l_j$ represent two different lanes, respectively.
The pairwise geometric distance between lanes can be calculated according to the geographic position information of the lanes. Given a distance threshold $\epsilon$, an adjacency relationship is considered to exist between two lanes when the distance between them is smaller than the threshold, and an adjacency relationship matrix is constructed whose element in the i-th row and j-th column, $A^{a}_{ij}$, can be expressed as:

$$A^{a}_{ij} = \begin{cases} 1, & \text{if } \mathrm{dist}(l_i, l_j) < \epsilon \\ 0, & \text{otherwise} \end{cases}$$

where $\mathrm{dist}(l_i, l_j)$ denotes the geometric distance between lane $l_i$ and lane $l_j$.
The pairwise similarity between lanes can be calculated according to the traffic flow data observed by the lanes; for each lane, the k lanes with the highest similarity are connected to it, the connected lanes are considered to have a similarity relationship, and a similarity relationship matrix is constructed whose element in the i-th row and j-th column, $A^{s}_{ij}$, can be expressed as:

$$A^{s}_{ij} = \begin{cases} 1, & \text{if lane } l_j \text{ is among the } k \text{ lanes most similar to lane } l_i \\ 0, & \text{otherwise} \end{cases}$$
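The three relationship matrices described above can be sketched as follows. This is an illustrative sketch only: the lane positions, the distance threshold, the use of cosine similarity between observed flow series, and the value of k are all assumptions not fixed by the text.

```python
import numpy as np

def reachability_matrix(reachable_pairs, n):
    """A_r[i, j] = 1 if lane i can directly reach lane j (from driving directions)."""
    A = np.zeros((n, n))
    for i, j in reachable_pairs:
        A[i, j] = 1.0
    return A

def adjacency_matrix(positions, eps):
    """A_a[i, j] = 1 if the geometric distance between lanes i and j is below eps."""
    n = len(positions)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and np.linalg.norm(positions[i] - positions[j]) < eps:
                A[i, j] = 1.0
    return A

def similarity_matrix(flows, k):
    """A_s[i, j] = 1 for the k lanes j whose observed flow series is most similar to lane i."""
    n = flows.shape[0]
    A = np.zeros((n, n))
    # cosine similarity between observed flow series (assumed similarity measure)
    norm = flows / (np.linalg.norm(flows, axis=1, keepdims=True) + 1e-8)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)  # exclude each lane itself
    for i in range(n):
        for j in np.argsort(sim[i])[-k:]:  # top-k most similar lanes
            A[i, j] = 1.0
    return A
```

Note that the reachability matrix is directed (reaching j from i does not imply the reverse), while adjacency is symmetric by construction.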
in step 203, a lane traffic flow integrated spatial representation is generated based on the lane multiple spatial relationship matrices.
In one implementation, the multiple spatial relationship matrices of the lanes are respectively processed by aggregating multi-order spatial relationships through graph convolution operations, so as to obtain multiple lane traffic flow representations each based on a single spatial relationship; the lane traffic flow representations based on single spatial relationships are then fused to generate the lane traffic flow comprehensive spatial representation.
Alternatively, based on each single spatial relationship, the lane traffic flow representation based on that spatial relationship can be obtained by aggregating multi-order spatial relationships by means of graph convolution, which can be expressed as:

$$h_t^{(A)} = \sigma\left(\sum_{k} A^{(k)} x_t W^{(k)}\right) \quad (1)$$

where the value k represents the order, $W^{(k)}$ represents a trainable parameter matrix, $x_t$ is the traffic flow data observed by all lanes at time t, $\sigma$ represents an activation function, and $A^{(k)}$ represents the k-th order of a single spatial relationship matrix.
That is, $A^{(k)}$ in the above formula (1) can be replaced by the reachability relationship adjacency matrix, the adjacency relationship matrix, and the similarity relationship matrix in turn, so that formula (1) yields the lane traffic flow representation based on the reachability relationship adjacency matrix, the lane traffic flow representation based on the adjacency relationship matrix, and the lane traffic flow representation based on the similarity relationship matrix. These three representations can then be fused to obtain the lane traffic flow comprehensive spatial representation.
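As a concrete illustration, the aggregation in formula (1) can be sketched for one relationship matrix as follows. Interpreting $A^{(k)}$ as the k-th matrix power (with $A^{(0)} = I$) and choosing tanh as the activation $\sigma$ are assumptions; the weight matrices stand in for trained parameters.

```python
import numpy as np

def graph_conv_representation(A, x_t, W_list):
    """h_t = sigma( sum_k A^k @ x_t @ W_k ), aggregating multi-order neighbourhoods.

    A:      (n, n) single spatial relationship matrix
    x_t:    (n, d) traffic flow observed by all n lanes at time t
    W_list: one trainable (d, d_out) weight matrix per order k = 0..K
    """
    n = A.shape[0]
    out = np.zeros((n, W_list[0].shape[1]))
    A_pow = np.eye(n)          # A^0 = I: each lane's own observation
    for W_k in W_list:
        out += A_pow @ x_t @ W_k
        A_pow = A_pow @ A      # move to the next order
    return np.tanh(out)        # assumed activation sigma
```

Running this once per relationship matrix (reachability, adjacency, similarity) gives the three single-relationship representations that are fused in the next step.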
Optionally, in one possible implementation, an attention mechanism is used to fuse the multiple lane traffic flow representations based on single spatial relationships into the lane traffic flow comprehensive spatial representation. In order to fuse the multiple spatial relationships of the lanes (such as the reachability, adjacency, and similarity relationships) dynamically at different times, the present application adopts an attention mechanism, and the fusion can be expressed as:

$$e^{(i)}_t = W_\alpha h^{(i)}_t$$
$$\alpha^{(i)}_t = \frac{\exp\left(e^{(i)}_t\right)}{\sum_{j}\exp\left(e^{(j)}_t\right)}$$
$$\hat{h}_t = \sum_{i} \alpha^{(i)}_t h^{(i)}_t$$

where the $h^{(i)}_t$ respectively represent the lane traffic flow representations obtained from the different lane spatial relationships, $W_\alpha$ is a trainable parameter matrix, $\alpha_t$ represents the attention parameter, and $\hat{h}_t$ is the resulting lane traffic flow comprehensive spatial representation.
Alternatively, the present application may also use other fusion processing manners, such as a weighted summation manner, which is not specifically limited herein.
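The attention-based fusion above can be sketched as follows. The exact scoring form (a learned vector dotted with each representation's mean over lanes) is an assumption about the unrendered formulas; only the softmax-weighted combination of the single-relationship representations is taken from the text.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def fuse_representations(reps, w_alpha):
    """reps: list of (n_lanes, d) single-relationship representations.

    Scores each relationship, normalises the scores with softmax to get
    attention weights, and returns the weighted sum as the comprehensive
    spatial representation.
    """
    # assumed scoring form: learned vector w_alpha applied to each representation
    scores = np.array([r.mean(axis=0) @ w_alpha for r in reps])
    alpha = softmax(scores)                    # one attention weight per relationship
    fused = sum(a * r for a, r in zip(alpha, reps))
    return fused, alpha
```

Because the weights depend on the current representations, the relative importance of reachability, adjacency, and similarity can change from one time step to the next, which is the point of using attention rather than fixed weights.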
In step 204, an embedded representation of the global distribution of traffic flow of the lane is obtained from traffic flow data observed by the lane over a first period of time.
Alternatively, step 204 may be implemented in any implementation manner of embodiments of the present application, which is not limited to this embodiment, and is not repeated herein.
In step 205, a lane traffic flow time relationship representation is obtained from the embedded representation of the lane traffic flow global distribution.
Alternatively, step 205 may be implemented in any implementation manner of embodiments of the present application, which is not limited to this embodiment, and is not repeated herein.
In step 206, the data missing from the traffic flow of the intersection is complemented according to the traffic flow of the lane comprehensive spatial representation and the traffic flow of the lane temporal relationship representation.
Alternatively, step 206 may be implemented in any implementation manner of embodiments of the present application, which is not limited to this embodiment, and is not repeated herein.
By implementing the embodiments of the application, the complex spatial correlations among lanes can be better utilized for fine-grained lane-level traffic flow: by fusing the spatial correlations among multiple lanes and fully utilizing the complex spatial correlation information among them, the problem of lane-level traffic flow missing is solved in a targeted manner.
In order to obtain the global distribution of traffic flow of all lanes in a period of time, the application can obtain the information of the global distribution by introducing a pre-trained self-coding model. In some embodiments of the present application, as shown in fig. 3, the lane traffic flow data completion method may include, but is not limited to, the following steps.
In step 301, attribute information of a lane in an intersection and traffic flow data observed by the lane in a first period of time are acquired.
Alternatively, step 301 may be implemented in any implementation manner in various embodiments of the present application, which is not limited to this embodiment, and is not described in detail herein.
In step 302, a plurality of spatial relationship matrices for the lanes are generated based on the attribute information of the lanes and the traffic flow data observed by the lanes during the first period of time.
Alternatively, step 302 may be implemented in any implementation manner of embodiments of the present application, which is not limited to this embodiment, and is not repeated herein.
In step 303, a lane traffic flow integrated spatial representation is generated based on the lane multiple spatial relationship matrix.
Alternatively, step 303 may be implemented in any implementation manner of embodiments of the present application, which is not limited to this embodiment, and is not described in detail herein.
In step 304, the traffic flow data observed by the lane in the first time period is encoded to obtain the lane traffic flow code.
In one implementation, traffic flow data observed by the lane in a first time period is encoded based on a pre-trained self-encoding model, so as to obtain lane traffic flow encoding.
Wherein the self-coding model is pre-trained, and the global distribution of traffic flow can be learned through self-coding. In some embodiments of the present application, as shown in fig. 4, the training method of the self-coding model may include the following steps:
Step 401, obtaining traffic flow data observed by a lane in an intersection in a second time period; wherein the second time period is earlier than the first time period.
Step 402, inputting traffic flow data observed by the lane in a second time period into a self-coding learning model; the self-coding learning model includes an encoder and a decoder.
Step 403, obtaining coding information which is output after the coder codes the traffic flow data observed by the lane in the second time period for a plurality of times.
In one implementation, the encoder's formula is expressed as follows:
$$z_{e,1} = \sigma\left(X W_{e,1} + b_{e,1}\right)$$
$$z_e = \sigma\left(z_{e,1} W_{e,2} + b_{e,2}\right)$$

where X represents the traffic flow observed by all lanes over a period of time (e.g., the second time period), $W_{e,1}$ and $W_{e,2}$ represent trainable parameter matrices, and $b_{e,1}$ and $b_{e,2}$ represent trainable bias vectors.
Optionally, the traffic flow data X observed by the lane in the second time period is input into the self-coding learning model to obtain the coding information $z_e$ output after the encoder encodes it multiple times.
Step 404, inputting the encoded information output by the encoder to a decoder to obtain lane traffic flow reconstruction data output after the decoder decodes the encoded information for a plurality of times in sequence; wherein the number of encodings of the encoder is the same as the number of decodings of the decoder.
In one implementation, the decoder is formulated as follows:
$$z_{d,1} = \sigma\left(z_e W_{d,1} + b_{d,1}\right)$$
$$z_d = \sigma\left(z_{d,1} W_{d,2} + b_{d,2}\right)$$

where $W_{d,1}$ and $W_{d,2}$ represent trainable parameter matrices, $b_{d,1}$ and $b_{d,2}$ represent trainable bias vectors, and $z_d$ represents the reconstructed traffic flow of all lanes obtained after self-encoding, which can be rewritten as $X_r = z_d$, i.e., the lane traffic flow reconstruction data.
Step 405, obtaining the position information of the traffic flow observation data of the lane at the intersection; the lane traffic flow observation data position information comprises position information of data existing in the lane traffic flow observation and missing data.
Alternatively, since the collected lane traffic flow data has missing entries, the loss value may be calculated in combination with the lane traffic flow observation data position information in order to improve the training effect of the model. To this end, in this step, the lane traffic flow observation data position information of the intersection may be acquired for the subsequent loss calculation. In the embodiments of the present application, a matrix $M$ may be used to represent the positions of the observed data and the missing data in the lane traffic flow observations, where the element $m_{ti} = 1$ if $x_{ti}$ (the i-th lane at time t) is observed without missing, and $m_{ti} = 0$ if it is missing.
Step 406, calculating a loss value according to the traffic flow data, the lane traffic flow reconstruction data and the lane traffic flow observation data position information observed by the lane in the second time period.
Alternatively, the loss value may be calculated from traffic flow data observed by the lane in the second period, lane traffic flow reconstruction data, and lane traffic flow observation data position information using a preset loss function. Wherein, in the embodiments of the present application, the formula of the loss function is expressed as follows:
$$\mathcal{L} = \left\| M \odot \left(X_r - X\right) \right\|_1$$

where $\|\cdot\|_1$ denotes the $\ell_1$ norm, $\odot$ is the Hadamard product, M denotes the lane traffic flow observation data position information, $X_r$ denotes the lane traffic flow reconstruction data, and X denotes the traffic flow observed by all lanes over the period (e.g., the second time period).
Step 407, training a self-coding learning model according to the loss value, and determining an encoder in the trained self-coding learning model as the self-coding model.
Alternatively, the self-coding learning model may be trained by minimizing the above loss function, and the encoder in the trained self-coding learning model is then determined as the self-coding model.
Thus, the self-coding model can be trained through steps 401-407 described above, so that the traffic flow global distribution can be learned from coding using the self-coding model.
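The forward pass of this self-coding model and its masked reconstruction loss can be sketched as follows. The layer sizes, the sigmoid activation, and the random data are illustrative assumptions, and no gradient step is shown; in practice the parameters would be learned by minimizing the loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(X, We1, be1, We2, be2):
    """Two-layer encoder: X -> z_e."""
    return sigmoid(sigmoid(X @ We1 + be1) @ We2 + be2)

def decode(z_e, Wd1, bd1, Wd2, bd2):
    """Two-layer decoder: z_e -> X_r (reconstructed traffic flow)."""
    return sigmoid(sigmoid(z_e @ Wd1 + bd1) @ Wd2 + bd2)

def masked_l1_loss(X, X_r, M):
    """L = || M * (X_r - X) ||_1: only observed positions (M == 1) contribute."""
    return np.abs(M * (X_r - X)).sum()

# toy dimensions: T time steps, n_lanes lanes, assumed hidden/latent sizes
n_lanes, T, hidden, latent = 4, 6, 8, 3
X = rng.random((T, n_lanes))                         # observed flows (with gaps)
M = (rng.random((T, n_lanes)) > 0.3).astype(float)   # 1 = observed, 0 = missing
We1, We2 = rng.normal(size=(n_lanes, hidden)), rng.normal(size=(hidden, latent))
Wd1, Wd2 = rng.normal(size=(latent, hidden)), rng.normal(size=(hidden, n_lanes))
be1, be2 = np.zeros(hidden), np.zeros(latent)
bd1, bd2 = np.zeros(hidden), np.zeros(n_lanes)

z_e = encode(X, We1, be1, We2, be2)      # coding information
X_r = decode(z_e, Wd1, bd1, Wd2, bd2)    # lane traffic flow reconstruction data
loss = masked_l1_loss(X, X_r, M)
```

After training, only the encoder is kept: its output $z_e$ is what the later steps pool into the embedded representation of the global traffic flow distribution.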
In step 305, lane traffic codes are processed based on a preset function to obtain an embedded representation of the global distribution of lane traffic.
Wherein, in some embodiments of the present application, the preset function may be an average function. For example, the embedded representation of the lane traffic flow global distribution can be obtained as $h_e = g(z_e)$, where $g(\cdot)$ represents a function; an average function may be chosen in the embodiments of the present application.
In step 306, a lane traffic time relationship representation is obtained from the embedded representation of the lane traffic flow global distribution.
In one implementation, time series prediction is performed on the embedded representation of the lane traffic flow global distribution based on a pre-trained bidirectional LSTM (Long Short-Term Memory) artificial neural network model to obtain the lane traffic flow time relationship representation. To learn the time dependence in lane traffic flow, the present application learns based on a bidirectional LSTM model. First, the initial hidden state $h_0$ in the bidirectional LSTM model is initialized with the obtained embedded representation of the lane traffic flow global distribution. In order to solve the lane traffic flow completion problem, the present application changes some model elements of the conventional bidirectional LSTM model; for example, at each input step, a linear operation on the hidden state obtained by the previous update yields the time relationship estimate of the current step, and this process is expressed as:

$$\hat{x}_t = W_x h_{t-1} + b_x$$

where $\hat{x}_t$ is the time relationship estimate in a unidirectional LSTM, $W_x$ is a parameter matrix, $b_x$ is a bias vector, and $h_{t-1}$ is the hidden state at time t−1.
The above calculation formula only obtains the time relationship estimate in a unidirectional LSTM; the time relationship estimates obtained in the two directions of the bidirectional LSTM are then combined to obtain the bidirectional time relationship estimate, and the process is expressed as:

$$\hat{x}_t = \frac{1}{2}\left(\overrightarrow{\hat{x}}_t + \overleftarrow{\hat{x}}_t\right)$$

where $\overrightarrow{\hat{x}}_t$ is the lane traffic flow time relationship estimate in the first direction at time t, $\overleftarrow{\hat{x}}_t$ is the lane traffic flow time relationship estimate in the second direction at time t, and $\hat{x}_t$ represents the lane traffic flow time relationship at time t.
Then, the time relationship estimate and the spatial relationship estimate of the current time step are fused through the following formula:

$$\hat{c}_t = \beta_t \odot \hat{h}_t + (1 - \beta_t) \odot \hat{x}_t \quad (3)$$

where $\beta_t$ represents a trainable parameter vector, $\hat{h}_t$ is the lane traffic flow comprehensive spatial representation (the spatial relationship estimate) at time t, and $\hat{x}_t$ is the time relationship estimate at time t. Because of data missing, the input data at the current step is sometimes absent, so $\hat{c}_t$ can be fused with the current input $x_t$ to obtain a new input $x^c_t$ with which to update the hidden state $h_t$ of the LSTM. This process can be expressed as:

$$x^c_t = m_t \odot x_t + (1 - m_t) \odot \hat{c}_t$$
$$h_t = \mathrm{LSTM}\left(x^c_t,\ \gamma_t \odot h_{t-1}\right)$$

where $m_t$ is the traffic flow observation data position information of all lanes at time t, and $\gamma_t$ is a time decay coefficient used to control the influence caused by continuous data missing: when data are continuously missing, it reduces the influence of the missing run on the hidden state update, and it is calculated as:

$$\gamma_t = \exp\left\{-\max\left(0,\ W_\gamma \delta_t + b_\gamma\right)\right\}$$

where $W_\gamma$ and $b_\gamma$ represent trainable parameters, and $\delta_t$ represents the length of time the data have been continuously missing before the current time step.
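The decay coefficient and the mask-based input splicing can be sketched as follows. The scalar parameterisation of $W_\gamma$ and $b_\gamma$ is an assumption (in the model they are trainable and may be vector-valued), and the LSTM cell update itself is omitted.

```python
import numpy as np

def decay_coefficient(delta_t, W_gamma, b_gamma):
    """gamma_t = exp{-max(0, W_gamma * delta_t + b_gamma)}.

    Shrinks toward 0 as the run of consecutive missing steps delta_t grows,
    so a long gap weakens the carried-over hidden state.
    """
    return np.exp(-np.maximum(0.0, W_gamma * delta_t + b_gamma))

def fuse_with_mask(x_t, c_hat_t, m_t):
    """x^c_t = m_t * x_t + (1 - m_t) * c_hat_t.

    Keeps observed entries (m_t == 1) and fills missing entries (m_t == 0)
    from the fused spatio-temporal estimate c_hat_t.
    """
    return m_t * x_t + (1.0 - m_t) * c_hat_t
```

With no gap ($\delta_t = 0$ and zero bias) the coefficient is 1 and the hidden state passes through unchanged; the longer the gap, the closer the coefficient gets to 0.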
Therefore, the embedded representation of the traffic flow global distribution of the lane can be predicted in time sequence through the pre-trained bidirectional LSTM model, and the traffic flow time relation representation of the lane can be obtained.
In step 307, the data missing from the traffic flow of the intersection is complemented according to the traffic flow integrated space representation and the traffic flow time relationship representation.
Optionally, the lane traffic flow comprehensive space representation and the lane traffic flow time relation representation are subjected to fusion processing, the data obtained after the fusion processing are determined to be lane traffic flow data of the intersection, and the lane traffic flow data comprise data for observing existence and missing data estimated values. In one implementation, the time relation estimation value and the space relation estimation value of the time step are fused through the formula (3) so as to obtain the lane traffic flow data of the current time step.
By implementing the embodiment of the application, the global distribution information of the traffic flow of all lanes in a period can be obtained by introducing a pre-trained self-coding model, and the data sparsity problem can be solved by introducing the global distribution information of the traffic flow of all lanes in a period, so that the accuracy of traffic flow data complementation can be further improved, and the problem of continuous deletion or large-area simultaneous deletion can be solved.
The application also provides a training method of the lane traffic flow data complement model, so that lane-level traffic flow data complement can be realized by using the trained lane traffic flow data complement model. In an embodiment of the present application, as shown in fig. 5, the lane traffic flow data complement model may include multiple spatial relationship fusion modules, a pre-trained self-encoding module, and a bidirectional LSTM module. As shown in fig. 6, the training method of the lane traffic flow data complement model may include, but is not limited to, the following steps.
In step 601, attribute information of a lane in an intersection and traffic flow data observed by the lane in a third period of time are acquired.
Alternatively, step 601 may be implemented in any implementation manner of embodiments of the present application, which is not limited to this embodiment, and is not described in detail herein.
In step 602, attribute information of the lane and traffic flow data observed by the lane in a third time period are input into a plurality of spatial relationship fusion modules to obtain a lane traffic flow comprehensive spatial representation.
In one implementation, attribute information of the lane and traffic flow data observed by the lane in a third time period are input into a plurality of spatial relationship fusion modules; calculating the attribute information of the lane and the traffic flow data observed by the lane in a third time period based on the multiple spatial relationship fusion modules to obtain multiple spatial relationship matrixes of the lane; based on a plurality of spatial relationship fusion modules, adopting graph convolution operation to aggregate multi-order spatial relationships to respectively process a plurality of spatial relationship matrixes of the lanes so as to obtain a plurality of lane traffic flow representations based on a single spatial relationship; and carrying out fusion processing on a plurality of lane traffic flow representations based on the single spatial relationship based on the multiple spatial relationship fusion modules to generate a lane traffic flow comprehensive spatial representation.
In the embodiment of the application, the attribute information of the lane includes the driving direction of the lane and geographic position information. The multiple lane spatial relationship matrices may include a reachability relationship adjacency matrix, an adjacency relationship matrix, and a similarity relationship matrix of the lanes.
In one implementation, whether two lanes located at adjacent intersections have a direct reachability relationship can be determined according to the driving direction of the lanes, and the reachability relationship adjacency matrix of the lanes is established according to the reachability relationships between the lanes, where the element in the i-th row and j-th column of the matrix, $A^{r}_{ij}$, can be expressed as:

$$A^{r}_{ij} = \begin{cases} 1, & \text{if lane } l_i \text{ is directly reachable to lane } l_j \\ 0, & \text{otherwise} \end{cases}$$

where $l_i$ and $l_j$ represent two different lanes, respectively.
The pairwise geometric distance between lanes can be calculated according to the geographic position information of the lanes. Given a distance threshold, an adjacency relationship is considered to exist between two lanes when the distance between them is smaller than the threshold, and an adjacency relationship matrix is constructed whose element in the i-th row and j-th column, $A^{a}_{ij}$, can be expressed as:

$$A^{a}_{ij} = \begin{cases} 1, & \text{if } \mathrm{dist}(l_i, l_j) < \epsilon \\ 0, & \text{otherwise} \end{cases}$$

where $\mathrm{dist}(l_i, l_j)$ denotes the geometric distance between lane $l_i$ and lane $l_j$.
The pairwise similarity between lanes can be calculated according to the traffic flow data observed by the lanes; for each lane, the k lanes with the highest similarity are connected to it, the connected lanes are considered to have a similarity relationship, and a similarity relationship matrix is constructed whose element in the i-th row and j-th column, $A^{s}_{ij}$, can be expressed as:

$$A^{s}_{ij} = \begin{cases} 1, & \text{if lane } l_j \text{ is among the } k \text{ lanes most similar to lane } l_i \\ 0, & \text{otherwise} \end{cases}$$
Alternatively, based on each single spatial relationship, the lane traffic flow representation based on that spatial relationship can be obtained by aggregating multi-order spatial relationships by means of graph convolution, which can be expressed as:

$$h_t^{(A)} = \sigma\left(\sum_{k} A^{(k)} x_t W^{(k)}\right) \quad (1)$$

where the value k represents the order, $W^{(k)}$ represents a trainable parameter matrix, $x_t$ is the traffic flow data observed by all lanes at time t, $\sigma$ represents an activation function, and $A^{(k)}$ represents the k-th order of a single spatial relationship matrix.
That is, $A^{(k)}$ in the above formula (1) can be replaced by the reachability relationship adjacency matrix, the adjacency relationship matrix, and the similarity relationship matrix in turn, so that formula (1) yields the lane traffic flow representation based on the reachability relationship adjacency matrix, the lane traffic flow representation based on the adjacency relationship matrix, and the lane traffic flow representation based on the similarity relationship matrix. These three representations can then be fused to obtain the lane traffic flow comprehensive spatial representation.
In an alternative implementation, based on the multiple spatial relationship fusion modules, an attention mechanism is adopted to fuse the multiple lane traffic flow representations based on single spatial relationships into the lane traffic flow comprehensive spatial representation. In order to fuse the multiple spatial relationships of the lanes (such as the reachability, adjacency, and similarity relationships) dynamically at different times, the present application adopts an attention mechanism, and the fusion can be expressed as:

$$e^{(i)}_t = W_\alpha h^{(i)}_t$$
$$\alpha^{(i)}_t = \frac{\exp\left(e^{(i)}_t\right)}{\sum_{j}\exp\left(e^{(j)}_t\right)}$$
$$\hat{h}_t = \sum_{i} \alpha^{(i)}_t h^{(i)}_t$$

where the $h^{(i)}_t$ respectively represent the lane traffic flow representations obtained from the different lane spatial relationships, $W_\alpha$ is a trainable parameter matrix, $\alpha_t$ represents the attention parameter, and $\hat{h}_t$ is the resulting lane traffic flow comprehensive spatial representation.
In step 603, traffic flow data observed by the lane during a third time period is input to the self-encoding module to obtain an embedded representation of the global distribution of traffic flow of the lane.
In one implementation, the traffic flow data observed by the lane in the third time period is encoded based on the pre-trained self-coding model to obtain the lane traffic flow encoding, and the lane traffic flow encoding is processed based on a preset function to obtain the embedded representation of the lane traffic flow global distribution. The self-coding model is a pre-trained model, and for its training manner reference may be made to the description of the embodiment shown in fig. 4, which is not repeated here.
In step 604, the lane traffic flow integrated spatial representation and the embedded representation of the lane traffic flow global distribution are input to the bi-directional LSTM module to obtain lane traffic flow data for the intersection output by the bi-directional LSTM module.
In one implementation, an embedded representation of the global distribution of lane traffic flow is input to a bi-directional LSTM module, obtaining an estimate of the lane traffic flow time relationship in a first direction and an estimate of the lane traffic flow time relationship in a second direction; processing the lane traffic flow time relation estimation value in the first direction and the lane traffic flow time relation estimation value in the second direction based on the bidirectional LSTM module to obtain a lane traffic flow time relation representation; the lane traffic flow comprehensive space representation is input to a bidirectional LSTM module, and the lane traffic flow time relation representation and the lane traffic flow comprehensive space representation are fused based on the bidirectional LSTM module to obtain lane traffic flow data of an intersection output by the bidirectional LSTM module.
In step 605, a model loss value is calculated from traffic flow data observed by the lane during a third time period, the lane traffic flow integrated space representation, and the lane traffic flow data of the intersection.
In one implementation, a first loss value in a first direction in a bidirectional LSTM module is calculated according to traffic flow data observed by a lane in a third time period, an estimate of a lane traffic flow time relationship in the first direction, a lane traffic flow integrated spatial representation, and lane traffic flow data at an intersection; calculating a second loss value in a second direction in the bidirectional LSTM module according to traffic flow data observed by the lane in a third time period, the lane traffic flow time relation estimation value in the second direction, the lane traffic flow comprehensive space representation and the lane traffic flow data of the intersection; calculating a third loss value according to the lane traffic flow time relation estimation in the first direction and the lane traffic flow time relation estimation in the second direction; and calculating a model loss value according to the first loss value, the second loss value and the third loss value.
To learn the time dependence in lane traffic flow, the present application learns based on a bidirectional LSTM. First, the initial hidden state $h_0$ in the bidirectional LSTM module is initialized with the obtained embedded representation of the lane traffic flow global distribution. In order to solve the lane traffic flow completion problem, the present application changes some model elements of the conventional bidirectional LSTM module; for example, at each input step, a linear operation on the hidden state obtained by the previous update yields the time relationship estimate of the current step, and this process is expressed as:

$$\hat{x}_t = W_x h_{t-1} + b_x$$

where $\hat{x}_t$ is the time relationship estimate in a unidirectional LSTM, $W_x$ is a parameter matrix, $b_x$ is a bias vector, and $h_{t-1}$ is the hidden state at time t−1.
The above calculation formula only obtains the time relationship estimate in a unidirectional LSTM; the time relationship estimates obtained in the two directions of the bidirectional LSTM are then combined to obtain the bidirectional time relationship estimate, and the process is expressed as:

$$\hat{x}_t = \frac{1}{2}\left(\overrightarrow{\hat{x}}_t + \overleftarrow{\hat{x}}_t\right)$$

where $\overrightarrow{\hat{x}}_t$ is the lane traffic flow time relationship estimate in the first direction at time t, $\overleftarrow{\hat{x}}_t$ is the lane traffic flow time relationship estimate in the second direction at time t, and $\hat{x}_t$ represents the lane traffic flow time relationship at time t.
Then, the time relationship estimate and the spatial relationship estimate of the current time step are fused through the following formula:

$$\hat{c}_t = \beta_t \odot \hat{h}_t + (1 - \beta_t) \odot \hat{x}_t \quad (3)$$

where $\beta_t$ represents a trainable parameter vector, $\hat{h}_t$ is the lane traffic flow comprehensive spatial representation (the spatial relationship estimate) at time t, and $\hat{x}_t$ is the time relationship estimate at time t. Because of data missing, the input data at the current step is sometimes absent, so $\hat{c}_t$ can be fused with the current input $x_t$ to obtain a new input $x^c_t$ with which to update the hidden state $h_t$ of the LSTM. This process can be expressed as:

$$x^c_t = m_t \odot x_t + (1 - m_t) \odot \hat{c}_t$$
$$h_t = \mathrm{LSTM}\left(x^c_t,\ \gamma_t \odot h_{t-1}\right)$$

where $m_t$ is the traffic flow observation data position information of all lanes at time t, and $\gamma_t$ is a time decay coefficient used to control the influence caused by continuous data missing: when data are continuously missing, it reduces the influence of the missing run on the hidden state update, and it is calculated as:

$$\gamma_t = \exp\left\{-\max\left(0,\ W_\gamma \delta_t + b_\gamma\right)\right\}$$

where $W_\gamma$ and $b_\gamma$ represent trainable parameters, and $\delta_t$ represents the length of time the data have been continuously missing before the current time step.
For a single direction LSTM, one can train by minimizing the following loss function:
$$\mathcal{L}_{uni} = \sum_{t} \ell\left(x_t,\ \hat{c}_t\right) \quad (5)$$

where $\ell$ denotes a first-order ($\ell_1$) norm loss function, which for given $x_t$ and $y_t$ is further expressed as:

$$\ell(x_t, y_t) = \left\| m_t \odot \left(x_t - y_t\right) \right\|_1$$
can be calculated by using the above formula (5)
Figure BDA0004024421990000176
And->
Figure BDA0004024421990000177
To further unify the time estimates obtained in the different directions of the bidirectional LSTM module, the present application proposes a new loss function to reduce the gap between the two estimates, defined as:

L_cons = Σ_t |x̂_t^f − x̂_t^b|  (6)

where x̂_t^f and x̂_t^b denote the temporal estimates in the first and second directions at the time t.
The final model loss function is defined as:

L = L^f + L^b + L_cons  (7)
wherein L^f and L^b respectively represent the loss functions of the two directions in the bidirectional LSTM; according to the loss function of the unidirectional LSTM model, the corresponding loss values L^f and L^b (i.e., the first loss value and the second loss value described above) can be obtained. The third loss value is calculated by using the above formula (6), and the model loss value is calculated by using the above formula (7).
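The three loss values above can be combined as in formulas (6) and (7); the sketch below (NumPy) assumes the model loss is a plain sum of the first, second, and third loss values, which is one natural reading of the text:

```python
import numpy as np

def consistency_loss(fwd_estimates, bwd_estimates):
    """Third loss value (formula (6)): L1 gap between directional estimates."""
    return float(np.sum(np.abs(fwd_estimates - bwd_estimates)))

def model_loss(loss_fwd, loss_bwd, fwd_estimates, bwd_estimates):
    """Model loss value (formula (7)), assuming a plain sum of the three terms."""
    return loss_fwd + loss_bwd + consistency_loss(fwd_estimates, bwd_estimates)

fwd = np.array([1.0, 2.0, 3.0])
bwd = np.array([1.5, 2.0, 2.0])
total = model_loss(0.8, 1.1, fwd, bwd)  # 0.8 + 1.1 + (0.5 + 0.0 + 1.0) = 3.4
```

When the two directional estimates agree everywhere, the third term vanishes and only the directional reconstruction losses remain.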
In step 606, a lane traffic flow data completion model is trained based on the model loss values.
Optionally, the lane traffic flow data complement model is trained by minimizing the model loss function, so that estimates of the missing data are obtained.
By implementing the embodiments of the present application, multiple spatial relationships between lanes (including the reachable relationship, the adjacent relationship, the similar relationship, and the like) are fused to obtain a comprehensive representation of the lanes, which fully reflects the spatial correlation of lane traffic flow. The temporal correlation of the traffic flow of the same lane at different time points is then learned and fused with the previously obtained spatial correlation to obtain the completed data. To further improve the model effect, global distribution information of the traffic flow of all lanes over a period of time is added to cope with the data sparsity problem.
The application also provides another lane traffic flow data complement method, which can realize lane-level traffic flow data complement by using the lane traffic flow data complement model in the embodiment. As shown in fig. 7, the lane traffic flow data complement method may include, but is not limited to, the following steps.
In step 701, attribute information of a lane in an intersection and traffic flow data observed by the lane in a fourth period of time are acquired.
In step 702, attribute information of the lane and traffic flow data observed by the lane in a fourth period of time are input to a pre-trained lane traffic flow data complement model.
In the embodiment of the application, the lane traffic flow data complement model is obtained through training by the training method of any embodiment of the application. The input of the lane traffic flow data complement model is the attribute information of the lane and the traffic flow data observed by the lane in a period of time, and the output of the lane traffic flow data complement model is the completed lane traffic flow data.
In step 703, lane traffic flow data output by the lane traffic flow data complement model is obtained; the lane traffic flow data includes data of the existence of the lane traffic flow observation and missing data estimation values.
By implementing the embodiment of the application, the lane-level traffic flow data complement can be realized through the pre-trained lane traffic flow data complement model, so that the accuracy of traffic flow data complement can be effectively improved, and the problem of continuous deletion or large-area simultaneous deletion can be solved.
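The completed output described here keeps observed values and fills only the missing positions with model estimates; a minimal sketch of that combination rule (the function name and the 0/1 mask convention are assumptions):

```python
import numpy as np

def complete_traffic_flow(observed, mask, estimates):
    """Combine observed data with model estimates: keep an observation where
    it exists (mask == 1), use the estimate where the value is missing."""
    return mask * observed + (1.0 - mask) * estimates

obs = np.array([10.0, 0.0, 8.0])   # second lane's value is missing
m = np.array([1.0, 0.0, 1.0])
est = np.array([9.5, 6.0, 7.0])
out = complete_traffic_flow(obs, m, est)  # [10.0, 6.0, 8.0]
```

Note that the observed entries pass through unchanged, so completion never overwrites real measurements.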
In order to achieve the above embodiment, the present application further provides a lane traffic flow data complement device. Fig. 8 is a block diagram of a lane traffic flow data complement device according to an embodiment of the present application. As shown in fig. 8, the lane traffic flow data complement apparatus may include: a first acquisition module 801, a generation module 802, a second acquisition module 803, a third acquisition module 804, and a complement module 805.
The first obtaining module 801 is configured to obtain attribute information of a lane in an intersection and traffic flow data observed by the lane in a first period of time.
The generating module 802 is configured to generate a lane traffic flow comprehensive spatial representation according to attribute information of a lane and traffic flow data observed by the lane in a first period of time. In one implementation, the generating module 802 includes: the first generation unit is used for generating a plurality of spatial relation matrixes of the lanes according to the attribute information of the lanes and the traffic flow data observed by the lanes in the first time period; the second generation unit is used for generating a lane traffic flow comprehensive space representation based on the lane multiple space relation matrixes.
In one possible implementation manner, the second generating unit is specifically configured to: process the multiple spatial relationship matrices of the lanes respectively by aggregating multi-order spatial relationships based on graph convolution operations, so as to obtain multiple lane traffic flow representations each based on a single spatial relationship; and perform fusion processing on the lane traffic flow representations based on a single spatial relationship to generate the lane traffic flow comprehensive spatial representation.
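A minimal sketch of aggregating multi-order spatial relationships for one relationship matrix via graph convolution (NumPy; the row normalization, the absence of learned weight matrices, and the hop count are simplifying assumptions):

```python
import numpy as np

def multi_order_gcn(A, X, num_orders=2):
    """Aggregate multi-order spatial relationships for one relationship matrix.

    A: lane-to-lane spatial relationship matrix (e.g. reachability), [n, n].
    X: lane traffic flow features, [n, d].
    Returns the sum of 0th..num_orders-hop propagated features (no learned
    weights here, to keep the sketch minimal).
    """
    # Row-normalize so each lane averages over its related lanes.
    deg = A.sum(axis=1, keepdims=True)
    A_norm = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)
    H, out = X.copy(), X.copy()
    for _ in range(num_orders):
        H = A_norm @ H          # propagate one more hop
        out = out + H
    return out

A = np.array([[0.0, 1.0], [1.0, 0.0]])   # two mutually related lanes
X = np.array([[1.0], [3.0]])
rep = multi_order_gcn(A, X, num_orders=1)  # X + A_norm @ X = [[4], [4]]
```

Running this once per relationship matrix (reachable, adjacent, similar, and so on) yields the multiple single-relationship representations that the fusion step then combines.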
In one possible implementation manner, the second generating unit is specifically configured to: and carrying out fusion processing on a plurality of lane traffic flow representations based on a single spatial relationship by adopting an attention mechanism so as to generate a lane traffic flow comprehensive spatial representation.
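A minimal sketch of attention-based fusion of several single-relationship representations (NumPy; the scoring scheme, a dot product with a trainable vector, is an assumption about how the attention weights are produced):

```python
import numpy as np

def attention_fuse(reps, score_vec):
    """Fuse single-spatial-relationship representations with attention weights.

    reps:      list of representations, each of shape [n, d].
    score_vec: scoring vector of shape [d] (trainable in the full model).
    """
    stacked = np.stack(reps)                       # [r, n, d]
    scores = stacked.mean(axis=1) @ score_vec      # one scalar per relationship
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()              # softmax over relationships
    return np.tensordot(weights, stacked, axes=1)  # weighted sum -> [n, d]

r1 = np.ones((2, 2))
r2 = np.zeros((2, 2))
fused = attention_fuse([r1, r2], score_vec=np.zeros(2))
# Equal scores give equal weights, so the result is the element-wise average.
```

The softmax lets the model weight each spatial relationship by how informative it is, rather than averaging all relationships uniformly.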
The second obtaining module 803 is configured to obtain an embedded representation of the traffic flow global distribution of the lane according to traffic flow data observed by the lane in the first period.
In one implementation, the second obtaining module 803 is specifically configured to: carrying out coding processing on traffic flow data observed by a lane in a first time period to obtain lane traffic flow codes; and processing the lane traffic flow codes based on a preset function to obtain embedded representation of the lane traffic flow global distribution.
In one possible implementation, the second obtaining module 803 is specifically configured to: and carrying out coding processing on traffic flow data observed by the lane in a first time period based on a pre-trained self-coding model to obtain lane traffic flow codes.
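A minimal sketch of the two steps handled by this module (NumPy; the linear encoder stand-in and the choice of a softmax as the "preset function" are assumptions):

```python
import numpy as np

def encode(flows, W_enc):
    """Toy stand-in for the pre-trained self-coding model's encoder."""
    return flows @ W_enc

def global_distribution_embedding(codes):
    """Assumed 'preset function': a softmax that turns lane traffic flow codes
    into a distribution-like embedding (each row sums to 1)."""
    e = np.exp(codes - codes.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

flows = np.array([[2.0, 0.0, 1.0]])   # one lane's flow over 3 time steps
W = np.eye(3)                         # identity encoder, for illustration only
emb = global_distribution_embedding(encode(flows, W))
```

The normalized rows behave like a distribution over the encoded dimensions, which is one way the global distribution information could be made comparable across lanes.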
The third obtaining module 804 is configured to obtain a time relationship representation of the traffic flow of the lane according to the embedded representation of the global traffic flow distribution of the lane. In one implementation, the third obtaining module 804 is specifically configured to: and carrying out time sequence prediction on embedded representation of the traffic flow global distribution of the lane based on a pre-trained two-way long-short-term memory artificial neural network LSTM model to obtain the traffic flow time relation representation of the lane.
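The bidirectional LSTM yields one temporal estimate per direction; the sketch below unifies them with a simple mean (an assumption for illustration; the trained model may learn this combination, and the consistency loss described later keeps the two estimates close):

```python
import numpy as np

def unify_directional_estimates(fwd, bwd):
    """Unify forward and backward temporal estimates into one time
    relationship representation (simple mean)."""
    return 0.5 * (fwd + bwd)

fwd = np.array([1.0, 2.0])
bwd = np.array([3.0, 2.0])
rep = unify_directional_estimates(fwd, bwd)  # [2.0, 2.0]
```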
The completion module 805 is configured to perform a completion operation on missing data in the lane traffic flow of the intersection according to the lane traffic flow comprehensive space representation and the lane traffic flow time relationship representation. In one implementation, the complement module 805 is specifically configured to: perform fusion processing on the lane traffic flow comprehensive space representation and the lane traffic flow time relationship representation, and determine the data obtained after the fusion processing as the lane traffic flow data of the intersection, wherein the lane traffic flow data includes the data for which observations exist and estimated values of the missing data.
In some embodiments of the present application, as shown in fig. 9, the lane traffic flow data completing apparatus may further include: a pre-training module 906. The pre-training module 906 is configured to pre-train the self-coding model; the pre-training module 906 is specifically configured to: acquiring traffic flow data observed by a lane in an intersection in a second time period; wherein the second time period is earlier than the first time period; inputting traffic flow data observed by the lane in a second time period into a self-coding learning model; the self-coding learning model comprises an encoder and a decoder; the method comprises the steps of obtaining coding information which is output after the coder codes traffic flow data observed by a lane in a second time period for a plurality of times; inputting the encoded information output by the encoder to a decoder to obtain lane traffic flow reconstruction data which are output after the decoder decodes the encoded information for a plurality of times in sequence; wherein the number of codes of the encoder is the same as the number of decodes of the decoder; acquiring the position information of traffic flow observation data of a lane at an intersection; the lane traffic flow observation data position information comprises position information of data existing in the lane traffic flow observation and missing data; calculating a loss value according to the traffic flow data observed by the lane in the second time period, the lane traffic flow reconstruction data and the lane traffic flow observation data position information; and training a self-coding learning model according to the loss value, and determining an encoder in the trained self-coding learning model as the self-coding model.
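A minimal sketch of the pre-training loop described above (NumPy; a single-layer linear encoder/decoder and plain gradient descent are simplifying assumptions, whereas the patent's encoder and decoder perform several coding/decoding passes):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_self_coding(flows, mask, dim_code=2, lr=0.01, steps=200):
    """Pre-train a toy self-coding model: a one-layer linear encoder/decoder
    trained to reconstruct only the observed traffic flow entries."""
    n = flows.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n, dim_code))
    W_dec = rng.normal(scale=0.1, size=(dim_code, n))
    for _ in range(steps):
        code = flows @ W_enc
        recon = code @ W_dec
        err = mask * (recon - flows)            # loss only where data exist
        # Gradients of 0.5 * sum(err**2) w.r.t. the two weight matrices.
        W_dec -= lr * code.T @ err
        W_enc -= lr * flows.T @ (err @ W_dec.T)
    return W_enc, W_dec

flows = np.array([[1.0, 0.0, 2.0], [2.0, 1.0, 4.0]])
mask = np.array([[1.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
W_enc, W_dec = train_self_coding(flows, mask)
final_err = float(np.sum((mask * ((flows @ W_enc) @ W_dec - flows)) ** 2))
```

After training, only the encoder is kept as the self-coding model, matching the last step of the procedure above.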
Wherein 901-905 in fig. 9 and 801-805 in fig. 8 have the same function and structure.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
The application also provides a training device of the lane traffic flow data complement model. In an embodiment of the present application, the lane traffic flow data complement model includes a plurality of spatial relationship fusion modules, a pre-trained self-coding module, and a two-way long-short-term memory artificial neural network LSTM module, where, as shown in fig. 10, the training device of the lane traffic flow data complement model may include: a first acquisition module 1001, a second acquisition module 1002, a third acquisition module 1003, a fourth acquisition module 1004, a calculation module 1005, and a training module 1006.
The first obtaining module 1001 is configured to obtain attribute information of a lane in an intersection and traffic flow data observed by the lane in a third period of time.
The second obtaining module 1002 is configured to input attribute information of the lane and traffic flow data observed by the lane in a third time period into a plurality of spatial relationship fusion modules, so as to obtain a comprehensive spatial representation of traffic flow of the lane. In one implementation, the second obtaining module 1002 is specifically configured to: inputting the attribute information of the lane and the traffic flow data observed by the lane in a third time period into a plurality of spatial relationship fusion modules; calculating the attribute information of the lane and the traffic flow data observed by the lane in a third time period based on the multiple spatial relationship fusion modules to obtain multiple spatial relationship matrixes of the lane; based on a plurality of spatial relationship fusion modules, adopting graph convolution operation to aggregate multi-order spatial relationships to respectively process a plurality of spatial relationship matrixes of the lanes so as to obtain a plurality of lane traffic flow representations based on a single spatial relationship; and carrying out fusion processing on a plurality of lane traffic flow representations based on the single spatial relationship based on the multiple spatial relationship fusion modules to generate a lane traffic flow comprehensive spatial representation.
In one possible implementation, the second obtaining module 1002 is specifically configured to: based on the multiple spatial relationship fusion modules, fuse the multiple lane traffic flow representations based on a single spatial relationship by using an attention mechanism, so as to generate the lane traffic flow comprehensive spatial representation.
The third obtaining module 1003 is configured to input traffic flow data observed by the lane in a third period of time to the self-encoding module, and obtain an embedded representation of the traffic flow global distribution of the lane.
The fourth obtaining module 1004 is configured to input the lane traffic flow comprehensive space representation and the embedded representation of the lane traffic flow global distribution to the bidirectional LSTM module, and obtain lane traffic flow data of the intersection output by the bidirectional LSTM module.
In one implementation, the fourth obtaining module 1004 is specifically configured to: the embedded representation of the overall distribution of the traffic flow of the lane is input to a bidirectional LSTM module, and an estimated value of the traffic flow time relationship of the lane in the first direction and an estimated value of the traffic flow time relationship of the lane in the second direction are obtained; processing the lane traffic flow time relation estimation value in the first direction and the lane traffic flow time relation estimation value in the second direction based on the bidirectional LSTM module to obtain a lane traffic flow time relation representation; the lane traffic flow comprehensive space representation is input to a bidirectional LSTM module, and the lane traffic flow time relation representation and the lane traffic flow comprehensive space representation are fused based on the bidirectional LSTM module to obtain lane traffic flow data of an intersection output by the bidirectional LSTM module.
The calculating module 1005 is configured to calculate a model loss value according to traffic flow data observed by the lane in the third time period, the lane traffic flow integrated space representation, and the lane traffic flow data of the intersection.
In one implementation, the computing module 1005 is specifically configured to: calculating a first loss value of the first direction in the bidirectional LSTM module according to traffic flow data observed by the lane in a third time period, the lane traffic flow time relation estimation value of the first direction, the lane traffic flow comprehensive space representation and the lane traffic flow data of the intersection; calculating a second loss value in a second direction in the bidirectional LSTM module according to traffic flow data observed by the lane in a third time period, the lane traffic flow time relation estimation value in the second direction, the lane traffic flow comprehensive space representation and the lane traffic flow data of the intersection; calculating a third loss value according to the lane traffic flow time relation estimation in the first direction and the lane traffic flow time relation estimation in the second direction; and calculating a model loss value according to the first loss value, the second loss value and the third loss value.
The training module 1006 is configured to train the lane traffic flow data complement model according to the model loss value.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
In order to implement the above embodiment, another lane traffic flow data complement device is also provided. As shown in fig. 11, the lane traffic flow data complement apparatus may include: a first acquisition module 1101, an input module 1102, and a second acquisition module 1103. The first obtaining module 1101 is configured to obtain attribute information of a lane in an intersection and traffic flow data observed by the lane in a fourth time period.
The input module 1102 is configured to input attribute information of a lane and traffic flow data observed by the lane in a fourth time period to a pre-trained lane traffic flow data complement model; the lane traffic flow data complement model is obtained through training by the training device in any embodiment.
The second obtaining module 1103 is configured to obtain lane traffic flow data output by the lane traffic flow data complement model; the lane traffic flow data includes data of the existence of the lane traffic flow observation and missing data estimation values.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
As shown in fig. 12, is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 12, the electronic device includes: one or more processors 1201, a memory 1202, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used with multiple memories, if desired. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 1201 is illustrated in fig. 12.
Memory 1202 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the lane traffic data complement method and/or the training method of the lane traffic data complement model provided by the application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method of any one of the embodiments provided herein.
The memory 1202 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the lane traffic flow data complement method and/or the training method of the lane traffic flow data complement model in the embodiments of the present application. The processor 1201 performs various functional applications of the server and data processing, i.e., implements the lane traffic data complement method and/or the training method of the lane traffic data complement model in the above-described method embodiments by running non-transitory software programs, instructions, and modules stored in the memory 1202.
Memory 1202 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created according to the use of the electronic device, etc. In addition, memory 1202 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 1202 optionally includes memory remotely located relative to processor 1201, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device 1203 and an output device 1204. The processor 1201, the memory 1202, the input device 1203, and the output device 1204 may be connected by a bus or otherwise, for example in fig. 12.
The input device 1203 may receive entered numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and like input devices. The output device 1204 may include a display apparatus, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computing programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service ("Virtual Private Server" or simply "VPS") are overcome. The server may also be a server of a distributed system or a server that incorporates a blockchain.
According to the technical scheme of the embodiment of the application, the spatial correlation among various lanes is fused, and the complex spatial correlation information among the lanes is fully utilized to pointedly solve the traffic flow loss problem at the lane level; in addition, the time-space correlation information is fused, so that the correlation performance in two aspects can be effectively utilized simultaneously, and the missing data can be effectively and accurately estimated aiming at the problems of continuous missing and simultaneous missing.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved, and are not limited herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (31)

1. A lane traffic flow data completion method comprising:
acquiring attribute information of a lane in an intersection and traffic flow data observed by the lane in a first time period;
generating a lane traffic flow comprehensive space representation according to the attribute information of the lane and the traffic flow data observed by the lane in the first time period;
acquiring embedded representation of traffic flow global distribution of a lane according to traffic flow data observed by the lane in a first time period;
performing time sequence prediction on the embedded representation of the traffic flow global distribution of the lane based on a pre-trained two-way long-short-term memory artificial neural network LSTM model to obtain a traffic flow time relationship representation of the lane;
and carrying out completion operation on missing data in the lane traffic flow of the intersection according to the lane traffic flow comprehensive space representation and the lane traffic flow time relation representation.
2. The method of claim 1, wherein the generating a lane traffic flow integrated spatial representation from attribute information of the lane and traffic flow data observed by the lane over a first period of time comprises:
generating a plurality of spatial relation matrixes of the lanes according to the attribute information of the lanes and the traffic flow data observed by the lanes in a first time period;
and generating the lane traffic flow comprehensive spatial representation based on the lane multiple spatial relationship matrices.
3. The method of claim 2, wherein the generating the lane traffic flow integrated spatial representation based on the lane multiple spatial relationship matrices comprises:
processing the multiple spatial relationship matrices of the lanes respectively by aggregating multi-order spatial relationships based on graph convolution operations, so as to obtain multiple lane traffic flow representations based on a single spatial relationship;
And carrying out fusion processing on the lane traffic flow representations based on the single spatial relationship to generate the lane traffic flow comprehensive spatial representation.
4. The method of claim 3, wherein the fusing the plurality of single spatial relationship-based lane traffic flow representations to generate the lane traffic flow integrated spatial representation comprises:
and carrying out fusion processing on the lane traffic flow representations based on the single spatial relationship by adopting an attention mechanism so as to generate the lane traffic flow comprehensive spatial representation.
5. The method of claim 1, wherein the obtaining an embedded representation of a global distribution of traffic flow of a lane from traffic flow data observed by the lane over a first period of time comprises:
coding the traffic flow data observed by the lane in a first time period to obtain lane traffic flow codes;
and processing the lane traffic flow codes based on a preset function to obtain the embedded representation of the lane traffic flow global distribution.
6. The method of claim 5, wherein the encoding the traffic flow data observed by the lane during the first time period to obtain a lane traffic flow code comprises:
And carrying out coding processing on traffic flow data observed by the lane in a first time period based on a pre-trained self-coding model to obtain lane traffic flow codes.
7. The method of claim 6, wherein the self-encoding model is pre-trained by:
acquiring traffic flow data observed by a lane in the intersection in a second time period; wherein the second time period is earlier than the first time period;
inputting traffic flow data observed by the lane in a second time period into a self-coding learning model; the self-coding learning model comprises an encoder and a decoder;
obtaining coding information which is output after the coder codes traffic flow data observed by the lane in a second time period for a plurality of times;
inputting the encoded information output by the encoder to the decoder to obtain lane traffic flow reconstruction data which are output after the decoder decodes the encoded information for a plurality of times in sequence; wherein the number of codes of the encoder is the same as the number of decodes of the decoder;
acquiring the position information of the traffic flow observation data of the lane of the intersection; the lane traffic flow observation data position information comprises position information of data existing in the lane traffic flow observation and missing data;
Calculating a loss value according to the traffic flow data observed by the lane in the second time period, the lane traffic flow reconstruction data and the lane traffic flow observation data position information;
and training the self-coding learning model according to the loss value, and determining the encoder in the trained self-coding learning model as the self-coding model.
8. The method of claim 1, wherein the performing a completion operation on the missing data in the traffic flow of the intersection according to the lane traffic flow comprehensive spatial representation and the lane traffic flow temporal relationship representation comprises:
fusing the lane traffic flow comprehensive spatial representation with the lane traffic flow temporal relationship representation, and determining the fused data as the lane traffic flow data of the intersection, wherein the lane traffic flow data comprises the data present in the observations and estimated values of the missing data.
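The fusion step of claim 8 ultimately yields output that keeps the observed data and supplies estimates for the missing slots. One common concrete realization — a sketch under that assumption, not the patent's stated formula — is a mask-gated combination:

```python
import numpy as np

def complete(observed, mask, estimated):
    """Keep observed entries; fill the missing slots with estimates."""
    return mask * observed + (1 - mask) * estimated

obs  = np.array([5.0, 0.0, 3.0])   # 0.0 stands in for a missing reading
mask = np.array([1.0, 0.0, 1.0])   # 1 = observed, 0 = missing
est  = np.array([9.0, 9.0, 9.0])   # hypothetical fused spatio-temporal estimate
result = complete(obs, mask, est)
```

The gate guarantees that the completion never overwrites data that was actually observed, which matches the claim's requirement that the output contain both the observed data and estimates of the missing data.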
9. A method for training a lane traffic flow data completion model, wherein the lane traffic flow data completion model comprises a multiple spatial relationship fusion module, a pre-trained autoencoder module, and a bidirectional long short-term memory (LSTM) module, the method comprising:
acquiring attribute information of a lane in an intersection and traffic flow data observed by the lane during a third time period;
inputting the attribute information of the lane and the traffic flow data observed by the lane during the third time period into the multiple spatial relationship fusion module to obtain a lane traffic flow comprehensive spatial representation;
inputting the traffic flow data observed by the lane during the third time period into the autoencoder module to obtain an embedded representation of the global distribution of the lane traffic flow;
inputting the lane traffic flow comprehensive spatial representation and the embedded representation of the global distribution of the lane traffic flow into the bidirectional LSTM module to obtain the lane traffic flow data of the intersection output by the bidirectional LSTM module;
calculating a model loss value according to the traffic flow data observed by the lane during the third time period, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection;
and training the lane traffic flow data completion model according to the model loss value.
10. The method of claim 9, wherein the inputting the attribute information of the lane and the traffic flow data observed by the lane during the third time period into the multiple spatial relationship fusion module to obtain the lane traffic flow comprehensive spatial representation comprises:
inputting the attribute information of the lane and the traffic flow data observed by the lane during the third time period into the multiple spatial relationship fusion module;
calculating, by the multiple spatial relationship fusion module, multiple spatial relationship matrices of the lane from the attribute information of the lane and the traffic flow data observed by the lane during the third time period;
processing, by the multiple spatial relationship fusion module, each of the multiple spatial relationship matrices of the lane with a graph convolution operation that aggregates multi-order spatial relationships, to obtain a plurality of lane traffic flow representations each based on a single spatial relationship;
and fusing, by the multiple spatial relationship fusion module, the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation.
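The graph convolution step of claim 10 can be sketched for one relation matrix as follows. This is a generic k-order neighbourhood aggregation with invented names; the patent does not specify the normalization or the propagation rule, so both are our assumptions.

```python
import numpy as np

def gcn_representation(adj, features, order=2):
    """Single-relation lane representation via k-order aggregation.

    adj      : (n, n) spatial relationship matrix for one relation type
    features : (n, d) lane traffic flow features
    order    : how many hops of related lanes to aggregate
    """
    a_hat = adj + np.eye(adj.shape[0])                 # add self-loops
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)   # row-normalise
    out = features
    for _ in range(order):                             # aggregate order-k relations
        out = a_hat @ out
    return out

adj_turn = np.array([[0.0, 1.0], [1.0, 0.0]])   # e.g. one hypothetical relation type
feats = np.array([[4.0], [8.0]])
rep = gcn_representation(adj_turn, feats, order=1)
```

Running this once per relation matrix yields the "plurality of lane traffic flow representations based on single spatial relationships" that the next step fuses.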
11. The method of claim 10, wherein the fusing the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation comprises:
fusing, by the multiple spatial relationship fusion module using an attention mechanism, the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation.
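A hedged sketch of the attention fusion in claim 11: each single-relation representation receives a scalar attention weight, and the comprehensive representation is their weighted sum. The scoring function here (a dot product with a learned vector) is our assumption; the claim only requires that an attention mechanism perform the fusion.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_fuse(reps, score_w):
    """Weight each single-relation representation and sum them.

    reps    : list of (n, d) single-relation lane representations
    score_w : (d,) scoring vector (learned in a real model; fixed here)
    """
    scores = np.array([float(np.mean(r @ score_w)) for r in reps])
    weights = softmax(scores)          # one attention weight per relation type
    fused = sum(w * r for w, r in zip(weights, reps))
    return fused, weights

r1 = np.array([[1.0, 2.0]])   # e.g. a turning-relation representation
r2 = np.array([[1.0, 2.0]])   # e.g. an adjacency-relation representation
fused, weights = attention_fuse([r1, r2], score_w=np.array([1.0, 0.0]))
```

With identical inputs the softmax assigns equal weight to each relation; in training, the scoring vector would learn to favour whichever spatial relationship is most informative for a given lane.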
12. The method of claim 9, wherein the inputting the lane traffic flow comprehensive spatial representation and the embedded representation of the global distribution of the lane traffic flow into the bidirectional LSTM module to obtain the lane traffic flow data of the intersection output by the bidirectional LSTM module comprises:
inputting the embedded representation of the global distribution of the lane traffic flow into the bidirectional LSTM module to obtain a lane traffic flow temporal relationship estimate in a first direction and a lane traffic flow temporal relationship estimate in a second direction;
processing, by the bidirectional LSTM module, the lane traffic flow temporal relationship estimate in the first direction and the lane traffic flow temporal relationship estimate in the second direction to obtain a lane traffic flow temporal relationship representation;
and inputting the lane traffic flow comprehensive spatial representation into the bidirectional LSTM module, and fusing, by the bidirectional LSTM module, the lane traffic flow temporal relationship representation with the lane traffic flow comprehensive spatial representation to obtain the lane traffic flow data of the intersection output by the bidirectional LSTM module.
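As a rough intuition for the two directions in claim 12 — this is not an LSTM, just a toy recurrent carry-forward standing in for it — each direction produces its own temporal estimate over the sequence, and the two are then combined into a single temporal relationship representation. All names, the decay constant, and the averaging rule are our assumptions.

```python
import numpy as np

def directional_estimate(seq, mask, decay=0.7):
    """One direction: carry the last observed value, decaying over gaps."""
    est, carry = [], 0.0
    for x, m in zip(seq, mask):
        carry = x if m else decay * carry   # observed value resets the carry
        est.append(carry)
    return np.array(est)

def bidirectional_estimate(seq, mask):
    fwd = directional_estimate(seq, mask)                    # first direction
    bwd = directional_estimate(seq[::-1], mask[::-1])[::-1]  # second direction
    return 0.5 * (fwd + bwd)   # combined temporal relationship representation

flow = [5.0, 0.0, 7.0]   # middle reading missing
mask = [1, 0, 1]
est = bidirectional_estimate(flow, mask)
```

The benefit of running both directions is visible at the gap: the forward pass only knows the value 5.0, the backward pass only knows 7.0, and the combination uses both neighbours.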
13. The method of claim 12, wherein the calculating a model loss value according to the traffic flow data observed by the lane during the third time period, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection comprises:
calculating a first loss value for the first direction of the bidirectional LSTM module according to the traffic flow data observed by the lane during the third time period, the lane traffic flow temporal relationship estimate in the first direction, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection;
calculating a second loss value for the second direction of the bidirectional LSTM module according to the traffic flow data observed by the lane during the third time period, the lane traffic flow temporal relationship estimate in the second direction, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection;
calculating a third loss value according to the lane traffic flow temporal relationship estimate in the first direction and the lane traffic flow temporal relationship estimate in the second direction;
and calculating the model loss value according to the first loss value, the second loss value, and the third loss value.
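The three loss terms of claim 13 can be sketched as below, with the spatial-representation terms omitted for brevity; the equal weighting of the terms and the squared-error metric are our assumptions, as the claim only names the inputs of each loss.

```python
import numpy as np

def model_loss(obs, mask, est_fwd, est_bwd):
    """Simplified three-term loss; spatial terms of claim 13 omitted."""
    def masked_mse(pred):
        # supervise only at observed positions
        return float(np.sum(((obs - pred) * mask) ** 2) / max(mask.sum(), 1.0))
    first = masked_mse(est_fwd)                        # first-direction loss
    second = masked_mse(est_bwd)                       # second-direction loss
    third = float(np.mean((est_fwd - est_bwd) ** 2))   # consistency of the two directions
    return first + second + third

obs  = np.array([2.0, 4.0, 6.0])
mask = np.array([1.0, 0.0, 1.0])   # middle value missing
fwd  = np.array([2.0, 5.0, 6.0])   # first-direction temporal estimate
bwd  = np.array([2.0, 3.0, 6.0])   # second-direction temporal estimate
loss = model_loss(obs, mask, fwd, bwd)
```

The third term is what ties the two directions together: even at missing positions, where no supervision exists, the forward and backward estimates are pushed toward agreeing with each other.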
14. A lane traffic flow data completion method, comprising:
acquiring attribute information of a lane in an intersection and traffic flow data observed by the lane during a fourth time period;
inputting the attribute information of the lane and the traffic flow data observed by the lane during the fourth time period into a pre-trained lane traffic flow data completion model, wherein the lane traffic flow data completion model is trained by the method of any one of claims 9 to 13;
and acquiring the lane traffic flow data output by the lane traffic flow data completion model, wherein the lane traffic flow data comprises the data present in the traffic flow observations of the lane and estimated values of the missing data.
15. A lane traffic flow data completion apparatus, comprising:
a first acquisition module configured to acquire attribute information of a lane in an intersection and traffic flow data observed by the lane during a first time period;
a generation module configured to generate a lane traffic flow comprehensive spatial representation according to the attribute information of the lane and the traffic flow data observed by the lane during the first time period;
a second acquisition module configured to obtain an embedded representation of the global distribution of the lane traffic flow according to the traffic flow data observed by the lane during the first time period;
a third acquisition module configured to perform time-series prediction on the embedded representation of the global distribution of the lane traffic flow based on a pre-trained bidirectional long short-term memory (LSTM) model to obtain a lane traffic flow temporal relationship representation;
and a completion module configured to perform a completion operation on the missing data in the traffic flow of the intersection according to the lane traffic flow comprehensive spatial representation and the lane traffic flow temporal relationship representation.
16. The apparatus of claim 15, wherein the generation module comprises:
a first generation unit configured to generate multiple spatial relationship matrices of the lane according to the attribute information of the lane and the traffic flow data observed by the lane during the first time period;
and a second generation unit configured to generate the lane traffic flow comprehensive spatial representation based on the multiple spatial relationship matrices of the lane.
17. The apparatus of claim 16, wherein the second generation unit is specifically configured to:
process each of the multiple spatial relationship matrices of the lane with a graph convolution operation that aggregates multi-order spatial relationships, to obtain a plurality of lane traffic flow representations each based on a single spatial relationship;
and fuse the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation.
18. The apparatus of claim 17, wherein the second generation unit is specifically configured to:
fuse, using an attention mechanism, the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation.
19. The apparatus of claim 15, wherein the second acquisition module is specifically configured to:
encode the traffic flow data observed by the lane during the first time period to obtain a lane traffic flow code;
and process the lane traffic flow code based on a preset function to obtain the embedded representation of the global distribution of the lane traffic flow.
20. The apparatus of claim 19, wherein the second acquisition module is specifically configured to:
encode, based on a pre-trained autoencoder model, the traffic flow data observed by the lane during the first time period to obtain the lane traffic flow code.
21. The apparatus of claim 20, further comprising:
a pre-training module configured to pre-train the autoencoder model, the pre-training module being specifically configured to:
acquire traffic flow data observed by a lane in the intersection during a second time period, wherein the second time period is earlier than the first time period;
input the traffic flow data observed by the lane during the second time period into an autoencoder learning model, wherein the autoencoder learning model comprises an encoder and a decoder;
obtain encoded information output by the encoder after the encoder sequentially encodes, a plurality of times, the traffic flow data observed by the lane during the second time period;
input the encoded information output by the encoder into the decoder to obtain lane traffic flow reconstruction data output by the decoder after the decoder sequentially decodes the encoded information a plurality of times, wherein the number of encoding operations of the encoder is the same as the number of decoding operations of the decoder;
acquire position information of the lane traffic flow observation data of the intersection, wherein the position information indicates the positions of both the data present in the lane traffic flow observations and the missing data;
calculate a loss value according to the traffic flow data observed by the lane during the second time period, the lane traffic flow reconstruction data, and the position information of the lane traffic flow observation data;
and train the autoencoder learning model according to the loss value, and determine the encoder of the trained autoencoder learning model as the autoencoder model.
22. The apparatus of claim 15, wherein the completion module is specifically configured to:
fuse the lane traffic flow comprehensive spatial representation with the lane traffic flow temporal relationship representation, and determine the fused data as the lane traffic flow data of the intersection, wherein the lane traffic flow data comprises the data present in the observations and estimated values of the missing data.
23. An apparatus for training a lane traffic flow data completion model, wherein the lane traffic flow data completion model comprises a multiple spatial relationship fusion module, a pre-trained autoencoder module, and a bidirectional long short-term memory (LSTM) module, the apparatus comprising:
a first acquisition module configured to acquire attribute information of a lane in an intersection and traffic flow data observed by the lane during a third time period;
a second acquisition module configured to input the attribute information of the lane and the traffic flow data observed by the lane during the third time period into the multiple spatial relationship fusion module to obtain a lane traffic flow comprehensive spatial representation;
a third acquisition module configured to input the traffic flow data observed by the lane during the third time period into the autoencoder module to obtain an embedded representation of the global distribution of the lane traffic flow;
a fourth acquisition module configured to input the lane traffic flow comprehensive spatial representation and the embedded representation of the global distribution of the lane traffic flow into the bidirectional LSTM module to obtain the lane traffic flow data of the intersection output by the bidirectional LSTM module;
a calculation module configured to calculate a model loss value according to the traffic flow data observed by the lane during the third time period, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection;
and a training module configured to train the lane traffic flow data completion model according to the model loss value.
24. The apparatus of claim 23, wherein the second acquisition module is specifically configured to:
input the attribute information of the lane and the traffic flow data observed by the lane during the third time period into the multiple spatial relationship fusion module;
calculate, by the multiple spatial relationship fusion module, multiple spatial relationship matrices of the lane from the attribute information of the lane and the traffic flow data observed by the lane during the third time period;
process, by the multiple spatial relationship fusion module, each of the multiple spatial relationship matrices of the lane with a graph convolution operation that aggregates multi-order spatial relationships, to obtain a plurality of lane traffic flow representations each based on a single spatial relationship;
and fuse, by the multiple spatial relationship fusion module, the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation.
25. The apparatus of claim 24, wherein the second acquisition module is specifically configured to:
fuse, by the multiple spatial relationship fusion module using an attention mechanism, the plurality of lane traffic flow representations based on single spatial relationships to generate the lane traffic flow comprehensive spatial representation.
26. The apparatus of claim 23, wherein the fourth acquisition module is specifically configured to:
input the embedded representation of the global distribution of the lane traffic flow into the bidirectional LSTM module to obtain a lane traffic flow temporal relationship estimate in a first direction and a lane traffic flow temporal relationship estimate in a second direction;
process, by the bidirectional LSTM module, the lane traffic flow temporal relationship estimate in the first direction and the lane traffic flow temporal relationship estimate in the second direction to obtain a lane traffic flow temporal relationship representation;
and input the lane traffic flow comprehensive spatial representation into the bidirectional LSTM module, and fuse, by the bidirectional LSTM module, the lane traffic flow temporal relationship representation with the lane traffic flow comprehensive spatial representation to obtain the lane traffic flow data of the intersection output by the bidirectional LSTM module.
27. The apparatus of claim 26, wherein the calculation module is specifically configured to:
calculate a first loss value for the first direction of the bidirectional LSTM module according to the traffic flow data observed by the lane during the third time period, the lane traffic flow temporal relationship estimate in the first direction, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection;
calculate a second loss value for the second direction of the bidirectional LSTM module according to the traffic flow data observed by the lane during the third time period, the lane traffic flow temporal relationship estimate in the second direction, the lane traffic flow comprehensive spatial representation, and the lane traffic flow data of the intersection;
calculate a third loss value according to the lane traffic flow temporal relationship estimate in the first direction and the lane traffic flow temporal relationship estimate in the second direction;
and calculate the model loss value according to the first loss value, the second loss value, and the third loss value.
28. A lane traffic flow data completion apparatus, comprising:
a first acquisition module configured to acquire attribute information of a lane in an intersection and traffic flow data observed by the lane during a fourth time period;
an input module configured to input the attribute information of the lane and the traffic flow data observed by the lane during the fourth time period into a pre-trained lane traffic flow data completion model, wherein the lane traffic flow data completion model is trained by the apparatus of any one of claims 23 to 27;
and a second acquisition module configured to acquire the lane traffic flow data output by the lane traffic flow data completion model, wherein the lane traffic flow data comprises the data present in the traffic flow observations of the lane and estimated values of the missing data.
29. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 8, the method of any one of claims 9 to 13, or the method of claim 14.
30. A non-transitory computer-readable storage medium storing computer instructions that, when executed, cause a computer to perform the method of any one of claims 1 to 8, the method of any one of claims 9 to 13, or the method of claim 14.
31. A computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of any one of claims 1 to 8, the method of any one of claims 9 to 13, or the method of claim 14.
CN202211705577.3A 2022-09-28 2022-12-28 Lane traffic flow data complement and model training method and device thereof Active CN116244281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020230037873A KR20230048259A (en) 2022-09-28 2023-03-23 Method and apparatus for complementing lane traffic flow data, and method and apparatus for training model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2022111950106 2022-09-28
CN202211195010.6A CN115563093A (en) 2022-09-28 2022-09-28 Lane traffic flow data completion and model training method and device thereof

Publications (2)

Publication Number Publication Date
CN116244281A true CN116244281A (en) 2023-06-09
CN116244281B CN116244281B (en) 2023-11-21

Family

ID=84743962

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202211195010.6A Pending CN115563093A (en) 2022-09-28 2022-09-28 Lane traffic flow data completion and model training method and device thereof
CN202211705577.3A Active CN116244281B (en) 2022-09-28 2022-12-28 Lane traffic flow data complement and model training method and device thereof

Country Status (1)

Country Link
CN (2) CN115563093A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070715A (en) * 2019-04-29 2019-07-30 浙江工业大学 A kind of road traffic flow prediction method based on Conv1D-NLSTMs neural network structure
CN110555018A (en) * 2019-07-29 2019-12-10 浙江工业大学 traffic flow completion and prediction method
CN112988723A (en) * 2021-02-09 2021-06-18 北京工业大学 Traffic data restoration method based on space self-attention-diagram convolution cyclic neural network
CN113094357A (en) * 2021-04-23 2021-07-09 大连理工大学 Traffic missing data completion method based on space-time attention mechanism
CN113129585A (en) * 2021-03-05 2021-07-16 浙江工业大学 Road traffic flow prediction method based on graph aggregation mechanism of reconstructed traffic network
CN113240182A (en) * 2021-05-19 2021-08-10 广州广电运通金融电子股份有限公司 Short-term traffic flow prediction method, storage medium and system under complex road network
CN114154622A (en) * 2021-12-06 2022-03-08 江苏中威软件技术有限公司 Algorithm model for traffic operation system flow data acquisition missing completion
CN114445252A (en) * 2021-11-15 2022-05-06 南方科技大学 Data completion method and device, electronic equipment and storage medium
WO2022129421A1 (en) * 2020-12-18 2022-06-23 Imec Vzw Traffic prediction
CN114819360A (en) * 2022-04-29 2022-07-29 嘉兴学院 Traffic flow prediction method, device and equipment
CN114944057A (en) * 2022-04-21 2022-08-26 中山大学 Road network traffic flow data restoration method and system

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
PYTHON AND DATA MINING: "A step-by-step tutorial on multivariate time-series forecasting with LSTM", page 1, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/470803137> *
UQI-LIUWJ: "Paper notes: Traffic Data Reconstruction via Adaptive Spatial-Temporal Correlations", page 1, Retrieved from the Internet <URL:https://blog.csdn.net/qq_40206371/article/details/122203267> *
YANG WANG et al.: "Traffic Data Reconstruction via Adaptive Spatial-Temporal Correlation", IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, pages 1531 - 1534 *
ZHIYUAN ZHANG et al.: "A tensor train approach for internet traffic data completion", ANNALS OF OPERATIONS RESEARCH, pages 1 - 19 *
DAI Liang et al.: "Road network traffic flow data restoration based on a symmetric residual U-shaped network", Journal of Transportation Systems Engineering and Information Technology, no. 05, pages 93 - 99 *
KANG Yan et al.: "Urban traffic flow completion fusing a multi-view attention mechanism", Computer Science, pages 177 - 184 *
CHEN Jian: "Research and implementation of traffic flow prediction and analysis methods for expressway networks", China Masters' Theses Full-text Database, Engineering Science and Technology II, no. 1, pages 034 - 1268 *

Also Published As

Publication number Publication date
CN116244281B (en) 2023-11-21
CN115563093A (en) 2023-01-03

Similar Documents

Publication Publication Date Title
CN109858390B (en) Human skeleton behavior identification method based on end-to-end space-time diagram learning neural network
Ivanovic et al. Generative modeling of multimodal multi-human behavior
CN110633797B (en) Network model structure searching method and device and electronic equipment
CN111612243B (en) Traffic speed prediction method, system and storage medium
JP2023510879A (en) Route planning method, device, equipment, and computer storage medium
Hessel et al. On inductive biases in deep reinforcement learning
KR20230048259A (en) Method and apparatus for complementing lane traffic flow data, and method and apparatus for training model
CN112686281A (en) Vehicle track prediction method based on space-time attention and multi-stage LSTM information expression
Li et al. Pedestrian trajectory prediction combining probabilistic reasoning and sequence learning
CN111310987B (en) Method and device for predicting free parking space of parking lot, electronic equipment and storage medium
CN111582030A (en) Traffic light identification method and device, electronic equipment and computer storage medium
CN111000492A (en) Intelligent sweeper behavior decision method based on knowledge graph and intelligent sweeper
Zhou et al. CSR: cascade conditional variational auto encoder with socially-aware regression for pedestrian trajectory prediction
CN110569973A (en) Network structure searching method and device and electronic equipment
CN116244281B (en) Lane traffic flow data complement and model training method and device thereof
CN112819497B (en) Conversion rate prediction method, conversion rate prediction device, conversion rate prediction apparatus, and storage medium
Dan Spatial-temporal block and LSTM network for pedestrian trajectories prediction
CN111782709A (en) Abnormal data determination method and device
Zuo et al. Incremental Dense Reconstruction from Monocular Video with Guided Sparse Feature Volume Fusion
Chen et al. Goal-Guided and Interaction-Aware State Refinement Graph Attention Network for Multi-Agent Trajectory Prediction
CN111753037B (en) Information characterization method, information characterization device, electronic equipment and storage medium
Zang et al. Heterogeneous fusion graph neural network for traffic prediction
Liang et al. Road Traffic Prediction based on Multi-Feature BP Neural Networks
Yu et al. Attention-based BiLSTM Model for Rainfall Prediction
CN114372526A (en) Data recovery method, system, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant