CN114566045B

CN114566045B - Method and device for training scheduling model and method and device for realizing cooperative driving

Info

Publication number: CN114566045B
Application number: CN202210187529.3A
Authority: CN
Inventors: 李力; 张嘉玮; 常成; 彭心宇
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2022-02-28
Filing date: 2022-02-28
Publication date: 2023-01-17
Anticipated expiration: 2042-02-28
Also published as: CN114566045A

Abstract

Embodiments of the invention disclose a method and a device for training a scheduling model, and a method and a device for implementing cooperative driving. Sample vehicle state information is embedded so that it is mapped to a high-dimensional state vector for each first vehicle; the high-dimensional state vectors are processed to obtain association relation information for each first vehicle; passing order information of the first vehicles is determined from the association relation information; the sum of the delays of all first vehicles passing through the non-signalized intersection is calculated from the passing order information; and a deep learning model for vehicle scheduling is determined according to the calculated delay sum. In the embodiments, a scheduling model usable for vehicle scheduling is obtained by training on offline sample data; the scheduling model shortens the time needed for vehicle scheduling and improves its efficiency and quality.

Description

Method and device for training scheduling model and method and device for realizing cooperative driving

Technical Field

The present disclosure relates to, but is not limited to, neural network technologies, and in particular to a method and an apparatus for training a scheduling model, and a method and an apparatus for implementing cooperative driving.

Background

An intelligent vehicle-infrastructure cooperative system uses advanced wireless communication, fast edge computing, and related technologies to share information comprehensively between vehicles and roadside equipment. Cooperative driving technology uses the collected real-time traffic and vehicle information together with centralized decision-making and control; it can ensure safety while vehicles are driving, can markedly improve the efficiency of the traffic system, and offers a new technical route to automated driving.

At a non-signalized intersection, the core of the cooperative driving decision problem is right-of-way allocation, which determines how long vehicles take to pass through the intersection and therefore directly affects its traffic efficiency. Because right-of-way allocation at an intersection is a non-deterministic polynomial-time hard (NP-hard) problem, methods in the related art can only handle scenes with few vehicles (fewer than 15) and cannot provide a real-time, efficient right-of-way allocation decision for more general scenes with many vehicles (more than 20). In addition, most methods in the related art depend on the topology of the intersection; for a new intersection, the right-of-way allocation decision often has to be adjusted and adapted through a large number of exploratory experiments, so these methods cannot be deployed efficiently and quickly to urban road networks that contain many kinds of intersections.

Disclosure of Invention

The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.

The embodiment of the invention provides a method and a device for training a scheduling model and a method and a device for realizing cooperative driving, which can improve the scheduling efficiency and quality of vehicles.

The embodiment of the invention provides a method for training a scheduling model, which comprises the following steps:

a scheduling model to be trained performs embedding processing on input sample vehicle state information of each first vehicle that is to pass through a non-signalized intersection, to obtain a high-dimensional state vector corresponding to each first vehicle, the dimensionality of the high-dimensional state vector being a preset dimensionality;

processing the obtained high-dimensional state vectors to obtain association relation information of each first vehicle, wherein the association relation information includes, for each first vehicle: the high-dimensional state vector, and information on the conflict and coupling relations between the first vehicle and the other first vehicles, the other first vehicles being the first vehicles other than that first vehicle;

determining passing order information of the first vehicles according to the obtained association relation information;

calculating, according to the obtained passing order information, the sum of the delays of all first vehicles that are to pass through the non-signalized intersection;

and determining the parameters of the scheduling model to be trained according to the calculated delay sum.

In another aspect, an embodiment of the present invention further provides a computer storage medium in which a computer program is stored; when the computer program is executed by a processor, the method for training a scheduling model described above is implemented.

In another aspect, an embodiment of the present invention further provides a terminal, including: a memory and a processor, the memory having a computer program stored therein; wherein,

the processor is configured to execute the computer program in the memory;

the computer program, when executed by the processor, implements a method of training a scheduling model as described above.

In another aspect, an embodiment of the present invention further provides a method for implementing collaborative driving, including:

receiving real-time vehicle state information of a second vehicle that is to pass through the non-signalized intersection, the information being acquired in real time by roadside equipment;

inputting the received real-time vehicle state information into a preset scheduling model to obtain passing order information of the second vehicle;

carrying out running control on the second vehicle according to the obtained passing sequence information of the second vehicle;

the scheduling model is trained by the method for training the scheduling model.

In still another aspect, an embodiment of the present invention further provides a computer storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for implementing collaborative driving is implemented.

In another aspect, an embodiment of the present invention further provides a terminal, including: a memory and a processor, the memory having a computer program stored therein; wherein,

the processor is configured to execute the computer program in the memory;

the computer program, when executed by the processor, implements the method for implementing cooperative driving as described above.

In another aspect, an embodiment of the present invention further provides a device for training a scheduling model, including: an embedding unit, an association information determining unit, a passing order determining unit, a delay sum calculating unit, and a model determining unit; wherein,

the embedding unit is configured to: perform, by a scheduling model to be trained, embedding processing on input sample vehicle state information of each first vehicle that is to pass through a non-signalized intersection, to obtain a high-dimensional state vector corresponding to each first vehicle; the dimensionality of the high-dimensional state vector is a preset dimensionality;

the association information determining unit is configured to: process the obtained high-dimensional state vectors to obtain association relation information of each first vehicle; wherein the association relation information includes, for each first vehicle: the high-dimensional state vector, and information on the conflict and coupling relations between the first vehicle and the other first vehicles, the other first vehicles being the first vehicles other than that first vehicle;

the passing order determining unit is configured to: determine passing order information of the first vehicles according to the obtained association relation information;

the delay sum calculating unit is configured to: calculate, according to the obtained passing order information, the sum of the delays of all first vehicles that are to pass through the non-signalized intersection;

the model determining unit is configured to: determine the parameters of the scheduling model to be trained according to the calculated delay sum.

In another aspect, an embodiment of the present invention further provides a device for implementing cooperative driving, including: a receiving unit, a right-of-way allocation unit, and a control processing unit; wherein,

the receiving unit is configured to: receive real-time vehicle state information of a second vehicle that is to pass through the non-signalized intersection, the information being acquired in real time by roadside equipment;

the right-of-way allocation unit is configured to: input the received real-time vehicle state information into a preset scheduling model to obtain passing order information of the second vehicle;

the control processing unit is configured to: control the driving of the second vehicle according to the obtained passing order information of the second vehicle;

wherein the scheduling model is trained by the device for training the scheduling model.

The technical solution of the present application includes: a scheduling model to be trained performs embedding processing on input sample vehicle state information of each first vehicle that is to pass through a non-signalized intersection, to obtain a high-dimensional state vector corresponding to each first vehicle, the dimensionality of the high-dimensional state vector being a preset dimensionality; processing the obtained high-dimensional state vectors to obtain association relation information of each first vehicle, wherein the association relation information includes, for each first vehicle: the high-dimensional state vector, and information on the conflict and coupling relations between the first vehicle and the other first vehicles, the other first vehicles being the first vehicles other than that first vehicle; determining passing order information of the first vehicles according to the obtained association relation information; calculating, according to the obtained passing order information, the sum of the delays of all first vehicles that are to pass through the non-signalized intersection; and determining the parameters of the scheduling model to be trained according to the calculated delay sum. According to the embodiment of the invention, a scheduling model usable for vehicle scheduling is obtained by training on offline data; the scheduling model shortens the time needed for vehicle scheduling and improves its efficiency and quality.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.

FIG. 1 is a flow chart of a method of training a scheduling model according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for implementing cooperative driving according to an embodiment of the present invention;

FIG. 3 is a block diagram of an apparatus for training a scheduling model according to an embodiment of the present invention;

FIG. 4 is a block diagram of an apparatus for implementing cooperative driving according to an embodiment of the present invention;

FIG. 5 is a schematic illustration of an exemplary non-signalized intersection in which the present invention is applied;

FIG. 6 is a schematic diagram of an exemplary deep neural network in which the present invention may be implemented;

FIG. 7 is a flowchart of a method for implementing cooperative driving according to an exemplary embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.

The steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.

Fig. 1 is a flowchart of a method for training a scheduling model according to an embodiment of the present invention, as shown in fig. 1, including:

Step 101, a scheduling model to be trained performs embedding processing on input sample vehicle state information of each first vehicle that is to pass through a non-signalized intersection, to obtain a high-dimensional state vector corresponding to each first vehicle; the dimensionality of the high-dimensional state vector is a preset dimensionality;

In an exemplary embodiment, the dimensionality of the high-dimensional state vector is greater than that of the sample vehicle state information; for example, the dimensionality of the sample vehicle state information is about 10 and the dimensionality of the high-dimensional state vector is greater than or equal to 100. It should be noted that the embedding processing may be implemented by any neural network capable of performing it; through the embedding processing, the low-dimensional sample vehicle state information is mapped to a high-dimensional state vector.

The first vehicle in the embodiment of the present invention refers to a sample vehicle used for training the scheduling model.

Step 102, processing the obtained high-dimensional state vectors to obtain association relation information of each first vehicle; the association relation information includes, for each first vehicle: the high-dimensional state vector, and information on the conflict and coupling relations between the first vehicle and the other first vehicles, the other first vehicles being the first vehicles other than that first vehicle;

Step 103, determining passing order information of the first vehicles according to the obtained association relation information;

Step 104, calculating, according to the obtained passing order information, the sum of the delays of all first vehicles that are to pass through the non-signalized intersection;

Step 105, determining the parameters of the scheduling model to be trained according to the calculated delay sum.

According to the embodiment of the invention, the scheduling model which can be used for vehicle scheduling is obtained through offline sample data training, the vehicle scheduling time is shortened through the scheduling model, and the efficiency and the quality of vehicle scheduling are improved.
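The embedding processing of step 101 can be illustrated with a minimal sketch. It assumes a single linear layer, a raw state dimensionality of 10, and a preset dimensionality of 128; the names `STATE_DIM`, `EMBED_DIM`, `make_linear_layer`, and `embed` are hypothetical and chosen only for illustration, and the weights are random rather than trained.

```python
import random

STATE_DIM = 10    # assumed size of one vehicle's raw state vector ("about 10")
EMBED_DIM = 128   # assumed preset dimensionality (the text requires >= 100)


def make_linear_layer(in_dim, out_dim, seed=0):
    """Create randomly initialised weights and zero biases for one linear layer."""
    rng = random.Random(seed)
    scale = 1.0 / in_dim ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def embed(state, weights, bias):
    """Map one low-dimensional state vector to a high-dimensional state vector."""
    return [sum(w * x for w, x in zip(row, state)) + b
            for row, b in zip(weights, bias)]


weights, bias = make_linear_layer(STATE_DIM, EMBED_DIM)
# Three dummy vehicles with made-up state values, for illustration only.
vehicle_states = [[float(i + k) for k in range(STATE_DIM)] for i in range(3)]
embedded = [embed(s, weights, bias) for s in vehicle_states]
```

Each vehicle is embedded independently with the same weights, so the sketch works for any number of vehicles in the input.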

In one illustrative example, the sample vehicle state information in the embodiment of the invention includes: pre-stored vehicle state information of sample vehicles for scheduling model training; in one illustrative example, the sample vehicle state information may be stored in an offline database.

In an illustrative example, processing the obtained high-dimensional state vectors includes:

processing the obtained high-dimensional state vectors through a preset first recurrent neural network.

In an illustrative example, determining the passing order information of the first vehicles according to the obtained association relation information includes:

inputting the association relation information into a preset second recurrent neural network, and obtaining the passing order information of the first vehicles through training of the second recurrent neural network.

In one illustrative example, the sample vehicle state information in the embodiment of the invention includes information on one or any combination of the following of the first vehicle:

location, priority, speed, turn, and route.

In one illustrative example, calculating the sum of the delays of all first vehicles that are to pass through the non-signalized intersection according to the obtained passing order information includes calculating the delay sum by the following objective function:

$$J = \sum_{i=1}^{N} d_i$$

where $d_i$ denotes the delay incurred by first vehicle i when it passes through the non-signalized intersection according to the passing order information, N denotes the number of first vehicles that are to pass through the non-signalized intersection, and J denotes the delay sum.
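As a hedged illustration of how a delay sum could be evaluated for a given passing order, the sketch below assumes a simple delay model that the text does not specify: each vehicle has an earliest (free-flow) arrival time at the conflict area, and a vehicle that conflicts with a higher-priority vehicle must enter at least a fixed headway after it. The function name `delay_sum`, the `headway` parameter, and the example data are all hypothetical.

```python
def delay_sum(order, arrival, conflicts, headway=2.0):
    """Return (J, entry_times) for a passing order under the assumed delay model.

    order     -- list of vehicle ids, highest priority first
    arrival   -- dict: id -> earliest arrival time at the conflict area (s)
    conflicts -- set of frozenset({a, b}) pairs whose paths conflict
    headway   -- assumed minimum gap (s) between two conflicting vehicles
    """
    entry = {}
    for v in order:
        t = arrival[v]
        for u in entry:  # vehicles that already hold a higher priority
            if frozenset((u, v)) in conflicts:
                t = max(t, entry[u] + headway)
        entry[v] = t
    total = sum(entry[v] - arrival[v] for v in order)  # J = sum of per-vehicle delays
    return total, entry


arrival = {"A": 0.0, "B": 1.0, "D": 1.5}
conflicts = {frozenset(("B", "D"))}

j_adb, _ = delay_sum(["A", "D", "B"], arrival, conflicts)  # D before B
j_abd, _ = delay_sum(["A", "B", "D"], arrival, conflicts)  # B before D
print(j_adb, j_abd)  # prints 2.5 1.5
```

Under these assumed numbers, letting B pass before D gives the smaller delay sum, which shows how different passing orders lead to different values of J.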

In an exemplary embodiment, determining parameters of a scheduling model to be trained according to the calculated delay sums includes:

adjusting the parameters of the scheduling model through a policy gradient, so that the delay sum calculated by the scheduling model after parameter adjustment converges;

and when the delay sum calculated by the scheduling model has converged, keeping the parameters of the scheduling model unchanged.

It should be noted that, while the parameters of the scheduling model are being adjusted through the policy gradient, the delay sum keeps decreasing as long as it has not converged; once the delay sum stays unchanged or fluctuates only slightly within a certain range, it is considered to have reached the convergence state, that is, convergence is satisfied.
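One possible way to test the convergence condition described above is sketched below. The window length and the relative tolerance are assumptions, since the text only requires that the delay sum stay unchanged or fluctuate slightly within a certain range; the function name `has_converged` is hypothetical.

```python
def has_converged(history, window=5, tolerance=0.05):
    """True when the last `window` delay sums stay within a relative band.

    history   -- delay sums recorded after successive parameter updates
    window    -- assumed number of recent values to inspect
    tolerance -- assumed relative width of the allowed fluctuation band
    """
    if len(history) < window:
        return False
    recent = history[-window:]
    lo, hi = min(recent), max(recent)
    return (hi - lo) <= tolerance * max(abs(hi), 1e-9)


still_falling = [10.0, 8.0, 6.0, 5.0, 4.0]
settled = [10.0, 8.0, 6.0, 5.0, 4.1, 4.0, 4.05, 4.02, 4.0]
print(has_converged(still_falling), has_converged(settled))  # prints False True
```

Once such a check returns true, training would stop and the parameters of the scheduling model would be kept unchanged.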

In an illustrative example, the embodiment of the present invention may use the policy gradient to adjust the parameters of one or any combination of the following: the neural network that performs the embedding processing, the first recurrent neural network, and the second recurrent neural network. It should be noted that the policy gradient is an existing algorithm in the related art and is not described in detail here.

The embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and the computer program is executed by a processor to realize the method for training the scheduling model.

An embodiment of the present invention further provides a terminal, including: a memory and a processor, the memory having stored therein a computer program; wherein,

the processor is configured to execute the computer program in the memory;

the computer program, when executed by a processor, implements a method of training a scheduling model as described above.

Fig. 2 is a flowchart of a method for implementing cooperative driving according to an embodiment of the present invention, as shown in fig. 2, including:

Step 201, receiving real-time vehicle state information of a second vehicle that is to pass through a non-signalized intersection, the information being acquired in real time by roadside equipment;

Here, the second vehicle in the embodiment of the present invention refers to a vehicle to be scheduled by the scheduling model in real time.

Step 202, inputting the received real-time vehicle state information into a preset scheduling model to obtain passing order information of the second vehicle;

Step 203, controlling the driving of the second vehicle according to the obtained passing order information of the second vehicle;

wherein, the scheduling model is obtained by training in the steps 101 to 105.

According to the embodiment of the invention, the vehicle dispatching time length is shortened through the dispatching model, and the efficiency and the quality of vehicle dispatching are improved.

In an exemplary example, the real-time vehicle status information in the embodiment of the present invention includes one or any combination of the following information of the second vehicle:

location, priority, speed, turn, and route.

The embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and when being executed by a processor, the computer program realizes the method for realizing the cooperative driving.

An embodiment of the present invention further provides a terminal, including: a memory and a processor, the memory having stored therein a computer program; wherein,

the processor is configured to execute the computer program in the memory;

the computer program, when executed by a processor, implements the method for implementing cooperative driving as described above.

Fig. 3 is a block diagram of a device for training a scheduling model according to an embodiment of the present invention, as shown in fig. 3, including: an embedding unit, an association information determining unit, a passing order determining unit, a delay sum calculating unit, and a model determining unit; wherein,

the association information determining unit is configured to: process the obtained high-dimensional state vectors to obtain association relation information of each first vehicle; wherein the association relation information includes, for each first vehicle: the high-dimensional state vector, and information on the conflict and coupling relations between the first vehicle and the other first vehicles, the other first vehicles being the first vehicles other than that first vehicle;

In an illustrative example, the association information determining unit of the embodiment of the present invention is configured to: process the obtained high-dimensional state vectors through a preset first recurrent neural network.

In an illustrative example, the passing order determining unit of the embodiment of the present invention is configured to: input the association relation information into a preset second recurrent neural network, and obtain the passing order information of the first vehicles through training of the second recurrent neural network.

In an illustrative example, the sample vehicle state information in the embodiment of the present invention includes one or any combination of the following information of the first vehicle:

location, priority, speed, turn, and route.

In an illustrative example, the delay sum calculating unit of the embodiment of the present invention is configured to calculate the delay sum by the following objective function:

$$J = \sum_{i=1}^{N} d_i$$

where $d_i$ denotes the delay incurred by first vehicle i when it passes through the non-signalized intersection according to the passing order information, N denotes the number of first vehicles that are to pass through the non-signalized intersection, and J denotes the delay sum.

In an illustrative example, the model determining unit of the embodiment of the present invention is configured to:

when the delay sum is greater than or equal to a preset delay threshold value, adjust the parameters of the scheduling model through a policy gradient, so that the delay sum calculated by the scheduling model after parameter adjustment converges;

Fig. 4 is a block diagram of a device for implementing cooperative driving according to an embodiment of the present invention, as shown in fig. 4, including: a receiving unit, a right-of-way allocation unit, and a control processing unit;

wherein the scheduling model is trained by the apparatus for training a scheduling model illustrated in fig. 3.

According to the embodiment of the invention, the vehicle scheduling time is shortened through the scheduling model, and the efficiency and the quality of vehicle scheduling are improved.

The following is a brief description of the embodiments of the present invention by way of application examples, which are only used to illustrate the embodiments of the present invention and are not used to limit the scope of the present invention.

Application example

Fig. 5 is a schematic diagram of an exemplary non-signalized intersection to which the present invention is applied. As shown in fig. 5, the shaded area is the conflict area, where vehicle collisions may occur, and the circular area is the control area, whose radius is determined by the range of reliable communication between vehicles and roadside equipment. At the non-signalized intersection, the objective of the cooperative driving decision is to minimize the sum of the delays of all vehicles passing through the intersection by determining the priority relation with which the vehicles enter the conflict area. This priority relation can be represented simply by an arrangement of vehicle symbols, which this application example calls the "passing order"; for example, at the non-signalized intersection shown in fig. 5, ADBCE means that vehicle A has the highest priority, vehicle D the second highest, and so on. When two vehicles are in conflict, such as vehicle B and vehicle D, vehicle D obtains the right of way first, and vehicle B needs to decelerate or even stop and wait. Different passing orders result in different delay sums, which this application example calculates by the following formula:

$$J = \sum_{i=1}^{N} d_i$$

where $d_i$ denotes the delay of vehicle i when it passes through the non-signalized intersection according to the passing order information, N is the number of vehicles, and J is the delay sum. The solution space of this vehicle scheduling problem grows exponentially with the number of vehicles N, and methods in the related art have difficulty finding a passing order with a small delay sum J within a limited computation time.
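The growth of the solution space can be made concrete with a short calculation. The sketch assumes that every permutation of the N vehicles is a candidate passing order, which gives N! as an upper bound; a real intersection may admit fewer orders, for example when vehicles in the same lane cannot overtake one another. The function name `count_passing_orders` is hypothetical.

```python
import math


def count_passing_orders(n):
    """Upper bound on the number of passing orders for n vehicles (all permutations)."""
    return math.factorial(n)


counts = {n: count_passing_orders(n) for n in (5, 10, 15, 20)}
for n, c in counts.items():
    print(f"N = {n:2d}: up to {c:,} passing orders")
```

With 20 vehicles the bound already exceeds 2 x 10^18, which illustrates why exhaustive search within a real-time budget is impractical.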

Given the above description of the cooperative driving decision problem at a non-signalized intersection, this application example converts the problem into a sequence-to-sequence mapping problem, in which the input sequence consists of all vehicles within the control area of the intersection and the output sequence is the passing order formed by these vehicles. The objective of the cooperative driving decision problem is therefore to find a better mapping, so that the passing order given by the output sequence has a smaller delay sum J.

The input sequence of this application example (the sample vehicle state information of all first vehicles that are to pass through the non-signalized intersection) may be expressed as $V = \{V_1, V_2, \ldots, V_N\}$, where N is the number of vehicles and $V_i$ ($i = 1, 2, \ldots, N$) is the vehicle state information of vehicle i, including the speed, position, steering, route, and so on of vehicle i.

In order to output, end to end and in real time, a passing order that is good enough (that is, one with a small delay sum J), this application example designs the deep neural network shown in fig. 6 to learn the latent mapping between sample vehicle state information and passing order information. The core idea of the deep neural network is to assign a probability of occurrence to each passing order. After training, the deep learning model assigns a higher probability to passing orders with a smaller delay sum J and a lower probability to passing orders with a larger delay sum J. To this end, the deep neural network of the deep learning model, shown in fig. 6, includes a state embedding, an encoder, and a decoder. The state embedding consists of a neural network for embedding processing, which may be a single linear layer, and maps the low-dimensional sample vehicle state information contained in the input sequence to high-dimensional state vectors, which makes it easier to extract the key association relation information. The encoder is the first recurrent neural network and produces the hidden state representation sequence (the association relation information) of the high-dimensional state vectors. The decoder is the second recurrent neural network and constructs the output sequence, that is, the passing order information. Training yields the scheduling model for vehicle scheduling; the scheduling model outputs, end to end, a passing order close to the optimal one and avoids searching from scratch in a huge solution space.
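The idea of assigning a probability of occurrence to each passing order can be sketched as a step-by-step sampling procedure: at each step a softmax over the vehicles not yet placed selects the next vehicle, and the probability of the whole order is the product of the step probabilities. The scoring function `scores_fn` below is a hypothetical stand-in for the encoder and decoder of fig. 6, which are not reproduced here; the names `sample_order` and `rng` are likewise assumptions.

```python
import math
import random


def sample_order(scores_fn, n, rng):
    """Sample a passing order one vehicle at a time; return (order, log_prob).

    scores_fn(prefix, v) -- hypothetical score of placing vehicle v next,
                            given the vehicles already placed in `prefix`
    """
    remaining = list(range(n))
    order, log_prob = [], 0.0
    while remaining:
        scores = [scores_fn(order, v) for v in remaining]
        m = max(scores)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        probs = [e / z for e in exps]         # softmax over vehicles not yet placed
        idx = rng.choices(range(len(remaining)), weights=probs)[0]
        log_prob += math.log(probs[idx])
        order.append(remaining.pop(idx))      # a placed vehicle cannot be chosen again
    return order, log_prob


rng = random.Random(0)
# With constant scores every order is equally likely: probability 1 / 4!.
order, log_prob = sample_order(lambda prefix, v: 0.0, 4, rng)
```

Removing each chosen vehicle from the candidate list guarantees that the output is a valid permutation of the input vehicles.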

This application example uses an unsupervised training paradigm based on reinforcement learning: the model is trained and optimized using the objective value J of the passing order output by the scheduling model as performance feedback. For the end-to-end output to approach the optimal passing order, the parameters of the scheduling model must be optimized, that is, the model must be trained to obtain a better mapping from "vehicle state" to "passing order". However, because the solution space of the cooperative driving decision problem at a non-signalized intersection is huge, obtaining optimal-solution labels for a large number of training samples is very time-consuming, so supervised learning is difficult to apply to this problem. For this reason, this application example uses an unsupervised, self-improving training paradigm based on reinforcement learning, which can compute the parameter gradient of the pointer network from the objective value of the passing order. In an exemplary embodiment, the gradient of the scheduling model is computed with a policy-gradient reinforcement learning algorithm from the related art, and the model parameters are updated continually through iterative optimization. The trained network can output a near-optimal passing order, end to end and in real time, for any cooperative driving scenario at the intersection.
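The training loop described above can be illustrated with a deliberately simplified policy-gradient (REINFORCE) sketch. It is not the network of fig. 6: in place of the embedding, encoder, and decoder it uses one learnable logit per vehicle (a Plackett-Luce policy), and it reuses an assumed headway-based delay model. All names (`delay_sum`, `sample_with_grad`, `train`), the learning rate, the moving-average baseline, and the example scenario are assumptions made for illustration.

```python
import math
import random


def delay_sum(order, arrival, conflicts, headway=2.0):
    """Delay sum J under an assumed headway-based delay model."""
    entry = {}
    for v in order:
        t = arrival[v]
        for u in entry:
            if frozenset((u, v)) in conflicts:
                t = max(t, entry[u] + headway)
        entry[v] = t
    return sum(entry[v] - arrival[v] for v in order)


def sample_with_grad(theta, rng):
    """Sample an order from the policy; return it with d(log p)/d(theta)."""
    remaining = list(range(len(theta)))
    order, grad = [], [0.0] * len(theta)
    while remaining:
        m = max(theta[v] for v in remaining)
        exps = [math.exp(theta[v] - m) for v in remaining]
        z = sum(exps)
        probs = [e / z for e in exps]
        pick = rng.choices(range(len(remaining)), weights=probs)[0]
        for p, v in zip(probs, remaining):   # gradient of log-softmax: one-hot minus probs
            grad[v] -= p
        grad[remaining[pick]] += 1.0
        order.append(remaining.pop(pick))
    return order, grad


def train(arrival, conflicts, steps=2000, lr=0.05, seed=0):
    """REINFORCE with a moving-average baseline, minimising the delay sum J."""
    rng = random.Random(seed)
    theta = [0.0] * len(arrival)
    baseline = None
    for _ in range(steps):
        order, grad = sample_with_grad(theta, rng)
        j = delay_sum(order, arrival, conflicts)
        baseline = j if baseline is None else 0.9 * baseline + 0.1 * j
        advantage = j - baseline             # positive when this order was worse than usual
        theta = [t - lr * advantage * g for t, g in zip(theta, grad)]
    return theta


arrival = [0.0, 1.0, 1.5]                    # vehicles 0, 1, 2 (made-up values)
conflicts = {frozenset((1, 2))}              # only vehicles 1 and 2 conflict
theta = train(arrival, conflicts)
greedy_order = sorted(range(len(theta)), key=lambda v: -theta[v])
```

No optimal-order labels are needed: the only feedback is the delay sum of each sampled order, so orders with below-average delay become more probable over the iterations. In a full model the per-vehicle logits would be replaced by network outputs conditioned on the vehicle states.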

This application example learns the latent mapping between vehicle state and passing order through offline training and uses it directly to solve the online cooperative driving decision problem, which avoids having to search from scratch in a huge solution space at run time. Based on the experience gained from offline training, the scheduling model can produce an approximately optimal passing order in real time for an online cooperative driving scenario at the intersection, which markedly reduces the time vehicles take to pass through the intersection and improves the efficiency of the traffic system.

The scheduling model for solving the cooperative driving decision problem at a non-signalized intersection, and the training of that model, are independent of the topology of the intersection; the scheduling model of this application example therefore applies to non-signalized intersections of arbitrary topology without modification. Moreover, because the latent optimal mappings of different intersections are similar in nature, this application example can deploy the scheduling model rapidly, through efficient transfer learning, to an urban road network containing non-signalized intersections of various topologies.

In a vehicle-road cooperative environment, vehicle-to-infrastructure (V2I) communication equipment is installed on the vehicles and on the roadside equipment of the intersection. A vehicle can send its real-time state information to the roadside equipment, and the roadside equipment can perform cooperative driving decision planning for the vehicles and send the decision results to them. Each vehicle then plans and executes its own motion control according to the decision result, completes the cooperative driving task, and passes through the signalless intersection safely and efficiently.

Fig. 7 is a flowchart of a method for implementing cooperative driving according to an application example of the present invention. As shown in Fig. 7, the method includes:

Step 701, the vehicles in the control area send their real-time state information to the roadside equipment; the real-time state information of a vehicle includes: position, speed, lane, steering, driving route, and the like;

Step 702, the roadside equipment calls the trained scheduling model and inputs the real-time state information of the vehicles into the scheduling model to obtain the passing sequence information of the vehicles;

Step 703, the roadside equipment generates a right-of-way allocation result according to the obtained passing sequence information and the conflict relationships between the vehicles; the right-of-way allocation result includes: the passing order of the vehicles and the time at which each vehicle enters the conflict area;

Step 704, the roadside equipment sends the right-of-way allocation result to each vehicle in the control area;

Step 705, the on-board computing unit of each vehicle plans and executes the motion control of the vehicle according to the received right-of-way allocation result, so that the vehicle enters the conflict area on time at the allocated time.
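The roadside part of steps 701 to 704 can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation of the application example: the message types `VehicleState` and `RightOfWay`, the north-south versus east-west conflict rule, the fixed 2-second headway, and the stub `fifo_stub_model` that stands in for the trained scheduling model are all hypothetical, and same-lane following constraints are ignored.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class VehicleState:                  # step 701: state reported by each vehicle
    vehicle_id: str
    distance_m: float                # distance to the conflict area
    speed_mps: float
    lane: int
    route: str                       # e.g. "N->S"

    @property
    def earliest_arrival_s(self) -> float:
        return self.distance_m / max(self.speed_mps, 0.1)

@dataclass
class RightOfWay:                    # step 704: allocation sent back to each vehicle
    vehicle_id: str
    rank: int                        # position in the passing order
    entry_time_s: float              # time at which the vehicle may enter the conflict area

def routes_conflict(a: str, b: str) -> bool:
    """Toy conflict rule (assumed): north-south and east-west movements conflict."""
    return (a[0] in "NS") != (b[0] in "NS")

def fifo_stub_model(states: Sequence[VehicleState]) -> List[str]:
    """Stand-in for the trained scheduling model (step 702): order by earliest arrival."""
    return [s.vehicle_id for s in sorted(states, key=lambda s: s.earliest_arrival_s)]

def allocate_right_of_way(
    states: Sequence[VehicleState],
    model: Callable[[Sequence[VehicleState]], List[str]] = fifo_stub_model,
    headway_s: float = 2.0,
) -> List[RightOfWay]:
    """Step 703: turn a passing order plus conflict relations into entry times."""
    by_id = {s.vehicle_id: s for s in states}
    result: List[RightOfWay] = []
    for rank, vid in enumerate(model(states)):
        t = by_id[vid].earliest_arrival_s
        for prev in result:          # wait for every conflicting vehicle scheduled earlier
            if routes_conflict(by_id[prev.vehicle_id].route, by_id[vid].route):
                t = max(t, prev.entry_time_s + headway_s)
        result.append(RightOfWay(vid, rank, t))
    return result

states = [
    VehicleState("A", 20.0, 10.0, 1, "N->S"),
    VehicleState("B", 30.0, 10.0, 2, "E->W"),
    VehicleState("C", 10.0, 10.0, 1, "S->N"),
]
for grant in allocate_right_of_way(states):
    print(grant)
```

Passing the model in as a callable keeps the allocation logic independent of how the passing order is produced, so the stub could be swapped for a trained model without touching step 703.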

This application example is a deep-learning-based cooperative driving decision method for vehicles at a signalless intersection. While meeting the real-time constraint, it outputs, end to end, a right-of-way allocation close to the optimal solution, and it avoids solving from scratch in a huge solution space. Because the cooperative driving decision model is independent of the intersection topology, its generality and applicability are markedly improved, and it can be deployed efficiently and quickly to signalless intersections of any topology.

The method provided by this application example can also be used to solve the space-time resource allocation problem in the cooperative motion of a multi-robot system; in particular, it improves the operating efficiency of the multi-robot system by scheduling the allocation of right-of-way resources.

With the method of this application example, the efficiency with which vehicles pass through a signalless intersection can be improved. Table 1 compares the delay sums: for cooperative driving scenarios at a signalless intersection containing different numbers of vehicles, it lists the average performance of the decisions obtained by the method of this application example and of those obtained by the conventional first-in first-out method. Accordingly, the delay of vehicles passing through the signalless intersection is reduced, and the traffic efficiency of the signalless intersection is improved.

TABLE 1
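Why a first-in first-out order can leave delay on the table is illustrated by the toy calculation below. It does not reproduce the figures of Table 1: the three arrival times, the two movement axes, the 2-second headway and the rule that only vehicles on different axes conflict are made-up assumptions, and the exhaustive search over orders is used only because the example is tiny.

```python
from itertools import permutations

def delay_sum(order, arrival, axis, headway=2.0):
    """Delay sum J for a passing order; only vehicles on different axes conflict (assumed)."""
    entry = {}
    for v in order:
        t = arrival[v]
        for u, t_u in entry.items():      # vehicles scheduled earlier in the order
            if axis[u] != axis[v]:
                t = max(t, t_u + headway)
        entry[v] = t
    return sum(entry[v] - arrival[v] for v in order)

arrival = [0.0, 1.0, 1.5]                 # hypothetical earliest arrival times in seconds
axis = ["NS", "EW", "NS"]                 # hypothetical movement axis of each vehicle

fifo = sorted(range(3), key=lambda v: arrival[v])
best = min(permutations(range(3)), key=lambda o: delay_sum(o, arrival, axis))

print("FIFO order", fifo, "J =", delay_sum(fifo, arrival, axis))
print("best order", list(best), "J =", delay_sum(best, arrival, axis))
```

In this toy scenario, letting the second north-south vehicle pass before the east-west vehicle avoids one headway of waiting, which is the kind of gain a learned passing order aims to capture without enumerating every order.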

It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, and the functional modules/units in the systems and devices disclosed above, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.

Claims

1. A method of training a scheduling model, comprising:

processing the obtained high-dimensional state vector to obtain the incidence relation information of each first vehicle; wherein the incidence relation information includes, for each first vehicle: the high-dimensional state vector and the information of the collision and coupling relation between the first vehicle and other first vehicles, wherein the other first vehicles are other vehicles except the first vehicle;

2. The method of claim 1, wherein the processing the obtained high-dimensional state vector comprises:

3. The method according to claim 1, wherein determining the traffic sequence information of the first vehicle according to the obtained association relation information comprises:

4. The method of claim 1, wherein the sample vehicle state information comprises one or any combination of the following information for the first vehicle:

location, priority, speed, turn, and route.

5. The method of claim 1, wherein calculating a sum of delays for all first vehicles to pass through the non-signalized intersection from the obtained traffic sequence information comprises calculating the sum of delays by an objective function of:

$$J = \sum_{i=1}^{N} d_i$$;

wherein $d_i$ represents the delay of the first vehicle i in passing through the non-signalized intersection according to the traffic sequence information, N represents the number of first vehicles to pass through the non-signalized intersection, and J represents the sum of the delays.

6. The method of any one of claims 1 to 5, wherein determining parameters of a scheduling model to be trained from the calculated sum of delays comprises:

adjusting parameters of the scheduling model through a policy gradient, so that the sum of delays calculated with the parameter-adjusted scheduling model converges;

7. A method of enabling coordinated driving, comprising:

controlling the second vehicle to run according to the obtained passing sequence information of the second vehicle;

wherein the scheduling model is trained by the method of training a scheduling model as claimed in any one of claims 1 to 6.

8. A computer storage medium having stored thereon a computer program which, when executed by a processor, implements the method of training a scheduling model according to any one of claims 1 to 6, or implements the method of enabling coordinated driving according to claim 7.

9. A terminal, comprising: a memory and a processor, the memory having a computer program stored therein; wherein,

the processor is configured to execute the computer program in the memory;

the computer program, when executed by the processor, implements the method of training a scheduling model according to any one of claims 1 to 6, or the method of enabling coordinated driving according to claim 7.

10. An apparatus for training a scheduling model, comprising: an embedding unit, a correlation information determining unit, a communication sequence determining unit, a delay sum calculating unit and a model determining unit; wherein,

the delay sum calculating unit is configured to: calculate the sum of the delays of all first vehicles to pass through the non-signalized intersection according to the obtained traffic sequence information;

11. An apparatus for implementing cooperative driving, comprising: a receiving unit, a right-of-way allocation unit and a control processing unit; wherein,

the control processing unit is configured to: control the second vehicle to run according to the obtained passing sequence information of the second vehicle;

wherein the scheduling model is trained by an apparatus for training a scheduling model according to claim 10.