WO2020199524A1

WO2020199524A1 - Method for matching ride-sharing travellers based on network representation learning

Info

Publication number: WO2020199524A1
Application number: PCT/CN2019/107011
Authority: WO
Inventors: Tang Lei (唐蕾); Zhao Yaling (赵亚玲); Liu Zihang (刘子航); Duan Zongtao (段宗涛)
Original assignee: Chang'an University (长安大学)
Priority date: 2019-04-02
Filing date: 2019-09-20
Publication date: 2020-10-08
Also published as: CN110009455A; CN110009455B

Abstract

A method for matching ride-sharing travellers based on network representation learning: on the basis of the relationship between the start point and end point of a passenger and the original route of a driver, dividing ride sharing into two types: the first type is end point ride-sharing and the other type is en route ride-sharing; the passenger needs to walk from the start point to the boarding point to implement ride-sharing, and then walk from the drop off point to the target destination, the ride-sharing route trajectory being part of the passenger trajectory; constructing a heterogeneous ride-sharing network, and using a network representation learning model to perform representation learning on the heterogeneous ride-sharing network to obtain a low-dimensional vector representation of user nodes; calculating the cosine similarity between driver and passenger nodes, sorting the calculated cosine similarity values from high to low, returning the top k passengers having the highest similarity value to the driver as passengers that can implement ride sharing, and implementing ride sharing. The present ride-sharing matching method is more reliable than traditional methods that only use distance recommendations, has intuitive semantic comprehensiveness, and can accurately discover potential ride-sharing users, providing same with a faster and more convenient service.

Description

A Method for Matching Ride-Sharing Travellers Based on Network Representation Learning

Technical field

The present invention belongs to the field of group recommendation and specifically relates to a method for matching online ride-sharing travellers based on network representation learning.

Background art

With the continuing growth of online car-hailing platforms and apps, shared travel is gradually being recognized and accepted by the public. With advances in related research, such as travel route matching, travel group discovery, route planning and travel behavior analysis, ride sharing has also become a convenient and feasible mode of travel. Research on ride-sharing matching can give users a better travel experience and higher travel efficiency.

The spread of network representation learning and the growing use and influence of ride sharing have drawn considerable attention to research on ride-sharing matching methods. The main problems in this research are how to allocate passengers to drivers accurately and how to optimize matching among different travel groups. Traditional matching methods rely only on the geographic distance between passenger and driver and ignore other relationships and characteristics, such as the relationship between a traveller and a destination, or time. A heterogeneous information network offers a more effective means of analysis for ride-sharing matching. By building a heterogeneous network from user, time and location information, learning latent semantics from ride-sharing information, and extracting features from user trajectories and emotions, more suitable ride-sharing matches can be provided to users.

Summary of the invention

To address the user matching problem of shared travel, the purpose of the present invention is to provide a method for matching online ride-sharing travellers based on network representation learning. The invention adopts meta-path theory from heterogeneous information networks, converts the matching between driver and passenger into a similarity measurement problem between nodes in a heterogeneous ride-sharing network, and establishes a ride-sharing behavior feature model based on the "driver-passenger" structure. According to the ride-sharing situation, the model distinguishes end-point ride sharing from en-route ride sharing. Starting from the departure times of the driver and the passenger and their boarding and alighting locations, the model uses the skip-gram model from machine learning to analyze the relationships between features and infer the likelihood of ride sharing between drivers and passengers, supporting high-quality ride-sharing services.

In order to achieve the above objectives, the present invention adopts the following technical solutions:

A method for matching online ride-sharing travellers based on network representation learning, comprising the following steps:

Step 1: Carpool classification

In the case that the driver's original route is determined, ride sharing is divided into two categories based on the relationship between the passenger's start and end points and the driver's original route. The first type is end-point ride sharing, where the passenger's start and end points are on the driver's original route. The other type is en-route ride sharing, where the passenger's start and end points are not on the driver's original route: the passenger walks from the start point to the boarding point, shares the ride, and then walks from the alighting point to the destination, so the shared route is only part of the passenger's trajectory;

Step 2: Build a heterogeneous ride-sharing network

The request information of the driver and the passenger is expressed as a heterogeneous ride-sharing network, in which passengers and drivers are connected through location and time information. The node types in the heterogeneous ride-sharing network include user, location, time period and activity;

Step 3: Use a network representation learning model to perform representation learning on the heterogeneous ride-sharing network and obtain low-dimensional vector representations of the user nodes;

Step 4: Calculate the cosine similarity between driver and passenger nodes from the low-dimensional vector representations of the user nodes, sort the cosine similarity values from high to low, and return the top k passengers most similar to the driver as the passengers who can share the ride, thereby achieving ride sharing.

A further improvement of the present invention is that, in step 1, the request information of the driver and the passenger includes the driver's start and end points, departure time and trajectory, the boarding and alighting positions, the passenger's start and end points, and the boarding and alighting times.

A further improvement of the present invention is that, in step 2, the node types in the heterogeneous ride-sharing network include users, locations, time periods and activities.

A further improvement of the present invention is that, in step 3, the process of representation learning specifically includes the following two steps:

1) Generate the node sequence set: a meta-path guides the walk of nodes in the heterogeneous ride-sharing network and generates a set of fixed-length node sequences;

2) Input the generated set of fixed-length node sequences into the skip-gram model for training to obtain the vector representations of the driver and passenger nodes.

A further improvement of the present invention is that, in step 1), for end-point ride sharing, a meta-path with the structure ULTLU is constructed; for en-route ride sharing, a meta-path with the structure ULU is constructed under the constraint of the same time period.

A further improvement of the present invention is that, in step 3, the specific process of representation learning is as follows:

First, a specific meta-path $\mathcal{P}$ is given, and $\mathcal{P}$ guides the random walk of nodes in the heterogeneous ride-sharing network to generate a set of fixed-length node sequences. Secondly, for any user node $v_u$ in the set of fixed-length node sequences, suppose its position in a sequence is $j$; the method selects the nodes $v_{j-c},\dots,v_{j+c}$ as its neighbor nodes, where $c$ is half of the window size in skip-gram. Given the user node $v_u$, the goal of the skip-gram model is therefore to maximize the conditional probability of its context of heterogeneous neighbor nodes:

$$\arg\max_{\theta}\sum_{v_u\in V}\sum_{a\in T_V}\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta),\qquad v_{j-c},\dots,v_{j+c}\in N_a(v_u)$$

where $N_a(v_u)$ is the set of type-$a$ neighbor nodes of $v_u$, $T_V$ is the set of node types, and $p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta)$ is the conditional probability of the context given the central node;

Under the assumption that the nodes are mutually independent, the log conditional probability of the context given the central node decomposes as

$$\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta)=\sum_{v_k\in N_a(v_u)}\log p(v_k\mid v_u;\theta)$$

where $p(v_k\mid v_u;\theta)$, the conditional probability of the context node $v_k$ given the node $v_u$, is defined by the softmax function:

$$p(v_k\mid v_u;\theta)=\frac{\exp(X_{v_k}\cdot X_{v_u})}{\sum_{v\in V}\exp(X_{v}\cdot X_{v_u})}$$

where $X_{v_u}$ is the representation vector of node $v_u$;

The representation vectors are generated according to the conditional probability of the context node $v_k$ given the node $v_u$, and negative sampling is then used to optimize them, giving the low-dimensional vector representation of each user node in the heterogeneous ride-sharing network.

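A further improvement of the present invention is that the walk probability $p(v^{i+1}\mid v_a^{i},\mathcal{P})$ of the random walk is as follows:

$$p(v^{i+1}\mid v_a^{i},\mathcal{P})=\begin{cases}\dfrac{1}{|N_{a+1}(v_a^{i})|}, & (v^{i+1},v_a^{i})\in\varepsilon,\ f_v(v^{i+1})=a+1\\[6pt] 0, & \text{otherwise}\end{cases} \tag{1}$$

$$p(v^{i+1}\mid v_a^{i})=p(v^{i+1}\mid v_1^{i}),\quad \text{if } a=m \tag{2}$$

In formula (1), $a$ denotes the node type, $v_a^{i}$ is a type-$a$ node on the path, $|N_{a+1}(v_a^{i})|$ is the number of neighbor nodes of node $v_a^{i}$ along the predefined meta-path $\mathcal{P}$, $\varepsilon$ is the set of links in the network, $(v^{i+1},v_a^{i})\in\varepsilon$ means that the two nodes form a link in the network, and $f_v(v^{i+1})=a+1$ means that the node $v^{i+1}$ is a node of type $a+1$;

Formula (2), in which $m$ is the index of the last node type in the meta-path, indicates that the selected meta-paths are all symmetric meta-paths.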
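As a small worked illustration of the walk probability (the node names and counts are hypothetical, and a uniform choice among eligible neighbors is assumed), suppose the walk follows the ULTLU meta-path and currently sits at a driver node $v^{i}$ of type U that is linked to exactly three location nodes $l_1, l_2, l_3$. The next node must be of type L, so

$$p(l_1\mid v^{i},\mathcal{P})=p(l_2\mid v^{i},\mathcal{P})=p(l_3\mid v^{i},\mathcal{P})=\frac{1}{3},\qquad p(v\mid v^{i},\mathcal{P})=0\ \text{ for every node } v \text{ that is not of type L or not linked to } v^{i}.$$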

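A further improvement of the present invention is that the specific process of using negative sampling to optimize the representation vector is as follows:

$$O(X)=\log\sigma(X_{v_k}\cdot X_{v_u})+\sum_{v'_u\in N_{neg}(v_u)}\mathbb{E}_{v'_u\sim p(v'_u)}\left[\log\sigma(-X_{v'_u}\cdot X_{v_u})\right]$$

where $\sigma$ is the sigmoid function and $N_{neg}(v_u)$ is a set of random negative node samples of $v_u$, sampled according to the noise distribution $p(v'_u)$.

The stochastic gradient descent method is then used to maximize the log-likelihood function $O(X)$ and update the vector representations of the nodes in equation (5), specifically as shown in equations (6) and (7), to obtain the low-dimensional vector representation of each user node in the heterogeneous ride-sharing network;

$$\frac{\partial O(X)}{\partial X_{v'_u}}=\left(\mathbb{1}_{v_k}[v'_u]-\sigma(X_{v'_u}\cdot X_{v_u})\right)X_{v_u} \tag{6}$$

$$\frac{\partial O(X)}{\partial X_{v_u}}=\sum_{v'_u\in N_{neg}(v_u)\cup\{v_k\}}\left(\mathbb{1}_{v_k}[v'_u]-\sigma(X_{v'_u}\cdot X_{v_u})\right)X_{v'_u} \tag{7}$$

where the indicator function $\mathbb{1}_{v_k}[v'_u]$ indicates whether $v'_u$ is the context neighbor node $v_k$.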

Compared with the prior art, the present invention has the following beneficial effects:

Unlike common ride-sharing recommendation mechanisms, the present invention uses the location and time information of passengers and drivers to construct a heterogeneous ride-sharing network and distinguishes two types of ride sharing. For both types, symmetric meta-paths are selected, and different constraints are added when generating the meta-path sequence sets for the different ride-sharing types. Skip-gram with negative sampling is applied to the sequence sets to generate the representation vectors, and finally cosine similarity is used to calculate the similarity between user representation vectors for ride-sharing recommendation. The proposed method is more reliable than traditional methods that recommend by distance alone, is semantically intuitive, and can accurately find potential ride-sharing users and provide them with faster and more convenient services.

Description of the drawings

Fig. 1 is a topological structure diagram of the heterogeneous ride-sharing network constructed by the present invention.

Detailed description

Hereinafter, the ride-sharing matching method proposed by the present invention is described in detail with reference to the accompanying drawings.

The method of the present invention for matching online ride-sharing travellers based on network representation learning includes the following steps:

Step 1: Carpool classification

The request information of the driver and the passenger includes the driver's start and end points, departure time and trajectory, the boarding and alighting positions, the passenger's start and end points, and the boarding and alighting times.

Step 2: Build a heterogeneous ride-sharing network

In step 3, the process of representation learning specifically includes the following two steps:

1) Generate the node sequence set: a meta-path guides the walk of nodes in the heterogeneous ride-sharing network to generate a set of fixed-length node sequences. For end-point ride sharing, a meta-path with the structure ULTLU is constructed; for en-route ride sharing, a meta-path with the structure ULU is constructed under the constraint of the same time period.

Different symmetric meta-paths are used for representation learning according to the type of ride sharing. For end-point ride sharing, the ULTLU meta-path is used, which means that a passenger and a driver arriving at the same place at the same time can share the ride. The first U in ULTLU is a user node representing the driver, and the second U is a user node representing the passenger; L represents the boarding or alighting location, and T represents the time period at the corresponding location. For en-route ride sharing, the departure place and destination of the driver and the passenger differ, so the passenger must spend extra time walking to the boarding point before the ride and to the destination after alighting. Therefore, based on the ULU meta-path and under the condition that the driver and the passenger are in the same time period, the best meeting point is obtained and used as the boarding location L (L represents the boarding or alighting location). This meta-path means that a driver and a passenger who have a best meeting point L within the same time period can share the ride.
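The two meta-path structures can be illustrated with a minimal Python sketch. The node identifiers and the convention that an identifier's first character encodes its type (`d`/`p` for users, `L` for locations, `T` for time periods) are assumptions made for this illustration only; the text does not prescribe a naming scheme.

```python
# Hypothetical naming convention (not from the text): the first character of
# a node id encodes its type. "d..." and "p..." are users (drivers and
# passengers), "L..." are locations, "T..." are time periods.
def node_type(node_id: str) -> str:
    if node_id[0] in ("d", "p"):
        return "U"
    return node_id[0]


def follows_meta_path(sequence, meta_path):
    """Return True if the node types in `sequence` spell out `meta_path`."""
    if len(sequence) != len(meta_path):
        return False
    return all(node_type(n) == t for n, t in zip(sequence, meta_path))


def is_symmetric(meta_path):
    """A symmetric meta-path reads the same forwards and backwards."""
    return meta_path == meta_path[::-1]


# End-point ride sharing: driver and passenger tied by location and time period.
end_point_walk = ["d1", "L7", "T17", "L7", "p3"]
# En-route ride sharing: driver and passenger tied by a shared meeting point.
en_route_walk = ["d1", "L9", "p4"]

print(follows_meta_path(end_point_walk, "ULTLU"))
print(follows_meta_path(en_route_walk, "ULU"))
```

Both `ULTLU` and `ULU` begin and end with a user node, which is what allows a walk to connect a driver to a passenger.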

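In step 3, the specific process of representation learning is as follows:

First, a specific meta-path $\mathcal{P}$ is given, and $\mathcal{P}$ guides the random walk of nodes in the heterogeneous ride-sharing network to generate a set of fixed-length node sequences. The walk probability $p(v^{i+1}\mid v_a^{i},\mathcal{P})$ of the random walk is as follows:

$$p(v^{i+1}\mid v_a^{i},\mathcal{P})=\begin{cases}\dfrac{1}{|N_{a+1}(v_a^{i})|}, & (v^{i+1},v_a^{i})\in\varepsilon,\ f_v(v^{i+1})=a+1\\[6pt] 0, & \text{otherwise}\end{cases} \tag{1}$$

$$p(v^{i+1}\mid v_a^{i})=p(v^{i+1}\mid v_1^{i}),\quad \text{if } a=m \tag{2}$$

In formula (1), $a$ denotes the node type, $v_a^{i}$ is a type-$a$ node on the path, $|N_{a+1}(v_a^{i})|$ is the number of neighbor nodes of node $v_a^{i}$ along the predefined meta-path $\mathcal{P}$, $(v^{i+1},v_a^{i})\in\varepsilon$ means that the two nodes form a link in the network, and $f_v(v^{i+1})=a+1$ means that the node $v^{i+1}$ is a node of type $a+1$. Formula (2), in which $m$ is the index of the last node type in the meta-path, indicates that the meta-path is symmetric.

Secondly, for any user node $v_u$ in the set of fixed-length node sequences, suppose its position in a sequence is $j$; the method selects the nodes $v_{j-c},\dots,v_{j+c}$ as its neighbor nodes, where $c$ is half of the window size in skip-gram. Given a user node $v_u$, the goal of the skip-gram model is therefore to maximize the conditional probability of its context of heterogeneous neighbor nodes:

$$\arg\max_{\theta}\sum_{v_u\in V}\sum_{a\in T_V}\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta),\qquad v_{j-c},\dots,v_{j+c}\in N_a(v_u)$$

where $N_a(v_u)$ is the set of type-$a$ neighbor nodes of $v_u$, $T_V$ is the set of node types, and $p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta)$ is the conditional probability of the context given the central node.

Under the assumption that the nodes are mutually independent, the log conditional probability of the context decomposes as

$$\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta)=\sum_{v_k\in N_a(v_u)}\log p(v_k\mid v_u;\theta)$$

where $p(v_k\mid v_u;\theta)$, the conditional probability of the context node $v_k$ given the node $v_u$, is defined by the softmax function:

$$p(v_k\mid v_u;\theta)=\frac{\exp(X_{v_k}\cdot X_{v_u})}{\sum_{v\in V}\exp(X_{v}\cdot X_{v_u})}$$

where $X_{v_u}$ is the representation vector of node $v_u$.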

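The specific process of using negative sampling to optimize the representation vector is as follows:

$$O(X)=\log\sigma(X_{v_k}\cdot X_{v_u})+\sum_{v'_u\in N_{neg}(v_u)}\mathbb{E}_{v'_u\sim p(v'_u)}\left[\log\sigma(-X_{v'_u}\cdot X_{v_u})\right]$$

where $\sigma$ is the sigmoid function and $N_{neg}(v_u)$ is a set of random negative node samples of $v_u$, sampled according to the noise distribution $p(v'_u)$.

The stochastic gradient descent (SGD) method is then used to maximize the log-likelihood function $O(X)$ and update the vector representations of the nodes in equation (5), specifically as shown in equations (6) and (7), to obtain the low-dimensional vector representation of each user node in the heterogeneous ride-sharing network.

$$\frac{\partial O(X)}{\partial X_{v'_u}}=\left(\mathbb{1}_{v_k}[v'_u]-\sigma(X_{v'_u}\cdot X_{v_u})\right)X_{v_u} \tag{6}$$

$$\frac{\partial O(X)}{\partial X_{v_u}}=\sum_{v'_u\in N_{neg}(v_u)\cup\{v_k\}}\left(\mathbb{1}_{v_k}[v'_u]-\sigma(X_{v'_u}\cdot X_{v_u})\right)X_{v'_u} \tag{7}$$

where the indicator function $\mathbb{1}_{v_k}[v'_u]$ indicates whether $v'_u$ is the context neighbor node $v_k$.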

Step 4: Calculate the cosine similarity between driver and passenger nodes from the low-dimensional vector representations of the user nodes, sort the cosine similarity values from high to low, and return the top k passengers most similar to the driver as the passengers who can share the ride, thereby achieving ride sharing. The size of k is determined by the maximum number of passengers that the driver can carry.
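Step 4 can be sketched in a few lines of Python. This is a minimal illustration that assumes the learned vectors are available as plain lists; the function names, passenger identifiers and two-dimensional toy vectors are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_x * norm_y)


def top_k_passengers(driver_vec, passenger_vecs, k):
    """Rank passengers by similarity to the driver, highest first, keep k."""
    scored = [(pid, cosine_similarity(driver_vec, vec))
              for pid, vec in passenger_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional vectors; in the method these would be the learned embeddings.
driver = [1.0, 0.0]
passengers = {
    "p1": [0.9, 0.1],
    "p2": [0.0, 1.0],
    "p3": [1.0, 1.0],
    "p4": [-1.0, 0.0],
}
# k would be the maximum number of passengers the driver can carry.
matches = top_k_passengers(driver, passengers, k=2)
print(matches)
```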

Example 1

Step 1: Data classification and extraction;

The experimental data of the present invention come from the Chengdu local-area data provided by the Didi Gaia Data Open Program, including driver GPS trajectory data and passenger order data. In the experiment, drivers and passengers are numbered, and the passenger's departure place and destination, together with the driver's trajectory and the corresponding times, are extracted. The first point of the driver's trajectory is used as the driver's starting point, and the last point as the driver's destination. According to the relationship between the passenger's start and end points and the driver's trajectory, the present invention divides ride sharing into end-point ride sharing and en-route ride sharing.

Specifically, in end-point ride sharing, the passenger's start and end points are on the driver's original route. In en-route ride sharing, the passenger's start and end points are not on the driver's original route: the passenger walks from the start point to the boarding point, shares the ride, and then walks from the alighting point to the destination, so the shared route is only part of the passenger's trajectory.
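The classification rule can be sketched as follows. This is a simplified illustration, assuming the driver's route is given as a sequence of road-network node identifiers and that "on the route" means exact membership in that sequence; real GPS data would need map matching or a distance tolerance, which the text does not specify. Treating every case other than "both points on the route" as en-route sharing is also an assumption of this sketch.

```python
def classify_ride_share(driver_route, passenger_origin, passenger_destination):
    """Classify a driver-passenger pair as 'end_point' or 'en_route'.

    driver_route: sequence of hypothetical road-network node ids.
    """
    route_nodes = set(driver_route)
    if passenger_origin in route_nodes and passenger_destination in route_nodes:
        # Both passenger points lie on the driver's original route.
        return "end_point"
    # Otherwise the passenger walks to a boarding point and from an alighting point.
    return "en_route"


route = ["n1", "n2", "n3", "n4"]
print(classify_ride_share(route, "n2", "n4"))
print(classify_ride_share(route, "x1", "x2"))
```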

In addition, both types of ride sharing must satisfy certain conditions, which are analyzed with the following notation. Let $d$ denote a driver and $p$ a passenger. Each driver and passenger has an origin $O$ and a destination $D$. For a shared ride, $x$ is the passenger's boarding position and $y$ is the passenger's alighting position. $TT(O,D)$ is the travel time required from the origin $O$ of passenger $p$ or driver $d$ to the destination $D$, $DTime_d$ is the time at which driver $d$ departs from a location, and $DTime_p$ is the time at which passenger $p$ departs from a location. For end-point ride sharing, driver $d$ can take $p$ as a shared passenger only when the following condition is met:

$$\max_{p} TT(O_p, D_p) \le TT(O_d, D_d) \tag{8}$$

In the formula, $TT(O_p,D_p)$ is the travel time required from the origin of passenger $p$ to that passenger's destination, and $TT(O_d,D_d)$ is the driving time required from the origin of driver $d$ to that driver's destination.
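Condition (8) translates directly into a one-line check. The sketch below assumes the travel times have already been computed and are expressed in the same unit (minutes here); the function name is hypothetical.

```python
def end_point_feasible(passenger_travel_times, driver_travel_time):
    """Condition (8): the longest passenger trip must not exceed the driver's trip.

    passenger_travel_times: TT(O_p, D_p) for each candidate passenger p.
    driver_travel_time: TT(O_d, D_d) for the driver d.
    """
    # default=0.0 makes an empty passenger list trivially feasible.
    return max(passenger_travel_times, default=0.0) <= driver_travel_time


print(end_point_feasible([12.0, 20.0], 25.0))
print(end_point_feasible([12.0, 30.0], 25.0))
```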

En-route ride sharing is a typical way of travel. The passenger walks from the departure point to the best meeting point, sets off at the departure time, shares the ride for a period of time, gets off at the location on the driver's trajectory closest to the passenger's destination, and then walks to the destination. $MT_p$ denotes the maximum walking time of passenger $p$, both from the departure point to the boarding point and from the alighting point to the final destination. For en-route ride sharing, the driver can therefore accept the passenger if the following conditions are met:

In the formula, $TT(x_p, y_p)$ is the travel time required from the boarding position $x_p$ of passenger $p$ to the alighting position $y_p$.

To select the best meeting point, the present invention uses the Dijkstra algorithm to obtain the best meeting point between the driver and the passenger, and uses this point as the passenger's boarding position.
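The text does not spell out how Dijkstra's algorithm yields the meeting point, so the following is only one plausible reading, sketched under stated assumptions: the walking network is a weighted adjacency dictionary, the driver's trajectory is a list of nodes on that network, and the "best" meeting point is the trajectory node with the shortest walking time from the passenger's origin. The graph, node names and the helper `best_meeting_point` are all hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path cost from `source` to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist


def best_meeting_point(graph, passenger_origin, driver_trajectory):
    """Assumed criterion: trajectory node with the least walking time."""
    dist = dijkstra(graph, passenger_origin)
    reachable = [n for n in driver_trajectory if n in dist]
    if not reachable:
        return None
    return min(reachable, key=lambda n: dist[n])


# Edge weights are walking times in minutes (illustrative values).
graph = {
    "O": {"a": 2.0, "b": 5.0},
    "a": {"O": 2.0, "r2": 4.0},
    "b": {"O": 5.0, "r1": 3.0},
    "r1": {"b": 3.0, "r2": 1.0},
    "r2": {"a": 4.0, "r1": 1.0, "r3": 2.0},
    "r3": {"r2": 2.0},
}
trajectory = ["r1", "r2", "r3"]
meeting = best_meeting_point(graph, "O", trajectory)
print(meeting)
```

A fuller treatment would also account for the driver's arrival time at each candidate node and the passenger's maximum walking time $MT_p$.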

Step 2: Preprocess the location and time data extracted in Step 1 to construct a heterogeneous ride-sharing information network:

Definition 1. A heterogeneous information network (HIN) is a network with multiple types of nodes and/or multiple types of links. It can be expressed as $H=(V,\varepsilon)$, where $V$ is the set of nodes and $\varepsilon$ is the set of links. Links can be weighted or unweighted, directed or undirected. A node type mapping function $f_v$ maps nodes to predefined node types, and a link type mapping function maps links to predefined link types.

A heterogeneous information network is composed of different but related nodes connected by the edges of the network. "Different" here means that the vertices of the network have different types, and "related" means that two nodes have a specific type of interaction or relationship.

The heterogeneous ride-sharing network constructed by the present invention is shown in Figure 1. The same network schema is constructed for both types of ride sharing, since both match drivers and passengers on the basis of their time and location constraints. The node types in this network schema are location (L), time (T), activity (A) and user (U), where a user is a passenger or a driver.

The user-type nodes (U) include drivers and passengers. Passengers are numbered sequentially from 1 with the prefix p, and drivers are numbered sequentially from 1 with the prefix d. The present invention discretizes time: the 24 hours of a day are divided into 48 time periods of half an hour each, starting from 00:00:00; the periods are numbered 1 to 48, and the number of each period serves as a time-type node (T). For the location-type nodes (L), the passenger's boarding and alighting locations are selected. In end-point ride sharing, the passenger's O and D are used as location-type nodes, because the passenger's O and D lie on the driver's trajectory and are therefore the passenger's boarding and alighting points. In en-route ride sharing, the best meeting point between the driver and the passenger is used as the location-type node, because both the passenger and the driver must reach the best meeting point for the ride to be shared. For the activity-type nodes (A), the present invention obtains the type of each location, such as real estate or educational institution, through conversion with the Baidu API.

The link types in the heterogeneous network include the occurrence of an activity at a location, the path between locations, and the range of time periods. For each location $l \in L$, there are links to a set of users, to activities and to a set of departure times. The links can also carry information about the route connecting two locations, as well as the time interval some passengers need to reach the meeting point for the ride. The network can form meta-paths such as ULU to show the relationships between different types of nodes. For example, a UL link indicates that a user departs from a location or intends to reach a destination, showing a staying relationship; an LT link indicates that a departure from or arrival at a place occurs during a certain time period;
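The time discretization and the network construction can be sketched as below. The half-hour numbering follows the description (periods 1 to 48 starting at 00:00:00); everything else is an assumption made for illustration: the request tuple layout, the `(type, id)` node encoding, the identifiers, and the restriction to U-L, L-T and L-A links.

```python
from collections import defaultdict


def time_period(hour, minute):
    """Map a clock time to its half-hour period number, 1..48."""
    return hour * 2 + minute // 30 + 1


def build_network(requests):
    """Build an undirected heterogeneous network as an adjacency dict.

    requests: iterable of (user_id, location_id, (hour, minute), activity).
    Nodes are (type, id) tuples with type in {"U", "L", "T", "A"}.
    """
    adjacency = defaultdict(set)

    def link(a, b):
        adjacency[a].add(b)
        adjacency[b].add(a)

    for user, location, (hour, minute), activity in requests:
        u, l = ("U", user), ("L", location)
        t, a = ("T", time_period(hour, minute)), ("A", activity)
        link(u, l)  # the user departs from or heads to this location
        link(l, t)  # the departure or arrival falls in this time period
        link(l, a)  # the kind of place this location is
    return adjacency


requests = [
    ("d1", "loc7", (8, 10), "education"),
    ("p3", "loc7", (8, 25), "education"),
    ("p5", "loc9", (8, 40), "real_estate"),
]
network = build_network(requests)
print(sorted(network[("L", "loc7")], key=str))
```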

Step 3: According to the type of ride sharing, select the corresponding meta-path, use the network representation learning model to perform representation learning on the heterogeneous ride-sharing network, and obtain low-dimensional vector representations of the user nodes;

The purpose of constructing a heterogeneous ride-sharing network is to establish a connection between drivers and passengers. To show that the two have the same purpose or requirements, a symmetric meta-path is used to express this relationship. The meta-path is defined as follows:

Definition 2 (Meta-path). A meta-path is a path defined on the network schema and written in the form

$$V_1 \xrightarrow{R_1} V_2 \xrightarrow{R_2} \cdots \xrightarrow{R_{m-1}} V_m$$

which represents the composite relation between two given node types. Meta-paths are usually used in a symmetric manner, that is, the first node type $V_1$ and the last node type $V_m$ are the same.

For end-point ride sharing, the ULTLU meta-path is used, which means that a passenger and a driver who arrive at the same place at the same time can share the ride. The two U nodes are the driver and the passenger respectively, L represents the boarding or alighting location, and T represents the time period of boarding or alighting at that location. En-route ride sharing is more complicated: since the departure place and destination of the driver and the passenger differ, the passenger must spend extra time walking to the boarding point before the ride and to the destination after alighting. It is based on the ULU meta-path, in which L represents the best meeting point, meaning that a driver and a passenger who can reach the same meeting point at the same time can share the ride. To obtain the best meeting point with Dijkstra's algorithm, the trajectory network containing the corresponding driver and passenger trajectories is first used as input, the shortest path between the two ODs on the road network is computed, and the best meeting point is determined from it.

Network representation learning represents the nodes of a network as low-dimensional, dense, real-valued vectors. The present invention takes the set of node sequences generated from the symmetric meta-paths as input, encodes the nodes with one-hot encoding as initial vectors, and then converts them into low-dimensional vectors.

In the present invention, network representation learning is divided into two stages. In the first stage, the meta-path guides the random walk of nodes in the heterogeneous ride-sharing network to generate a set of fixed-length node sequences.

Take the ULTLU meta-path used for end-point ride sharing as an example. If the current node is of the user (U) type, the next-hop node under the guidance of the ULTLU meta-path is of type L, with the jump probability given by formula (1); the node following the L-type node is of type T, and so on. Each node sequence generated by the present invention has length 5.

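The random walk probability $p(v^{i+1}\mid v_a^{i},\mathcal{P})$ is as follows:

$$p(v^{i+1}\mid v_a^{i},\mathcal{P})=\begin{cases}\dfrac{1}{|N_{a+1}(v_a^{i})|}, & (v^{i+1},v_a^{i})\in\varepsilon,\ f_v(v^{i+1})=a+1\\[6pt] 0, & \text{otherwise}\end{cases} \tag{1}$$

$$p(v^{i+1}\mid v_a^{i})=p(v^{i+1}\mid v_1^{i}),\quad \text{if } a=m \tag{2}$$

In formula (1), $a$ denotes the node type, $v_a^{i}$ is a type-$a$ node on the path, $|N_{a+1}(v_a^{i})|$ is the number of neighbor nodes of node $v_a^{i}$ along the predefined meta-path $\mathcal{P}$, $(v^{i+1},v_a^{i})\in\varepsilon$ means that the two nodes form a link in the network, and $f_v(v^{i+1})=a+1$ means that the node $v^{i+1}$ is of type $a+1$. Under the guidance of the meta-path $\mathcal{P}$, the random walk can therefore proceed only when the next node $v^{i+1}$ is a node of type $a+1$. Formula (2), in which $m$ is the index of the last node type in the meta-path, indicates that the meta-path $\mathcal{P}$ used is a symmetric meta-path.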
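The first stage can be sketched as a meta-path-guided random walk. This is a minimal illustration, assuming nodes are encoded as `(type, id)` tuples and the network is an adjacency dictionary of sets; the toy network and the function name are hypothetical. At each step the walker chooses uniformly among the neighbors whose type matches the next symbol of the meta-path, and stops if there are none.

```python
import random


def meta_path_walk(adjacency, start, meta_path, rng):
    """Return one walk whose node types follow `meta_path`, or None if stuck."""
    if start[0] != meta_path[0]:
        return None
    walk = [start]
    for next_type in meta_path[1:]:
        # Only neighbours of the required type are eligible; all others have
        # probability 0. Sorting makes the walk reproducible for a given seed.
        candidates = sorted(n for n in adjacency.get(walk[-1], ())
                            if n[0] == next_type)
        if not candidates:
            return None
        walk.append(rng.choice(candidates))  # uniform among eligible neighbours
    return walk


# Toy network: a driver and a passenger linked to the same location,
# which is linked to time period 17.
adjacency = {
    ("U", "d1"): {("L", "loc7")},
    ("U", "p3"): {("L", "loc7")},
    ("L", "loc7"): {("U", "d1"), ("U", "p3"), ("T", 17)},
    ("T", 17): {("L", "loc7")},
}
walk = meta_path_walk(adjacency, ("U", "d1"), "ULTLU", random.Random(0))
print(walk)
```

In this sketch a walk can end on the node it started from; whether such walks are filtered out is not stated in the text.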

In the second stage, the set of node sequences of length 5 is input into the skip-gram model for training to obtain the vector representations of the driver and passenger nodes.

Here, the skip-gram model is used to construct a feature vector for each node of user type U, and negative sampling is used within the skip-gram model to optimize the representation vectors. For any user node $v_u$ in the sequence set, suppose its position in a sequence is $j$; the method selects the nodes $v_{j-c},\dots,v_{j+c}$ of type $a$ as its neighbor nodes, where $c$ is half of the window size set in skip-gram. Given a user node $v_u$, the goal of the skip-gram model is therefore to maximize the conditional probability of its context of heterogeneous neighbor nodes:

$$\arg\max_{\theta}\sum_{v_u\in V}\sum_{a\in T_V}\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta),\qquad v_{j-c},\dots,v_{j+c}\in N_a(v_u)$$

where $N_a(v_u)$ is the set of type-$a$ neighbor nodes of $v_u$ and $T_V$ is the set of node types. Under the assumption that the nodes are mutually independent, $\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta)$ can be further decomposed into

$$\log p(v_{j-c},\dots,v_{j+c}\mid v_u;\theta)=\sum_{v_k\in N_a(v_u)}\log p(v_k\mid v_u;\theta),\qquad p(v_k\mid v_u;\theta)=\frac{\exp(X_{v_k}\cdot X_{v_u})}{\sum_{v\in V}\exp(X_{v}\cdot X_{v_u})}$$

where $X_{v_u}$ is the representation vector of node $v_u$.

In addition to using the relationships within the context node set to generate the vectors, negative sampling is used to optimize the representation vectors. This removes the influence of irrelevant nodes on the target vector and makes vectors of different categories more clearly distinguishable. The likelihood function of this method is as follows:

$$O(X)=\log\sigma(X_{v_k}\cdot X_{v_u})+\sum_{v'_u\in N_{neg}(v_u)}\mathbb{E}_{v'_u\sim p(v'_u)}\left[\log\sigma(-X_{v'_u}\cdot X_{v_u})\right]$$

where $\sigma$ is the sigmoid function and $N_{neg}(v_u)$ is the set of random negative node samples of $v_u$, drawn from the nodes other than $v_{j-c},\dots,v_{j+c}$. The negative nodes are sampled according to the noise distribution $p(v'_u)$; that is, the term $\mathbb{E}_{v'_u\sim p(v'_u)}$ means that the expectation is taken over random negative nodes that follow the probability density function $p(v'_u)$. Stochastic gradient descent (SGD) is then used to maximize the log-likelihood function:

$$\frac{\partial O(X)}{\partial X_{v'_u}}=\left(\mathbb{1}_{v_k}[v'_u]-\sigma(X_{v'_u}\cdot X_{v_u})\right)X_{v_u} \tag{6}$$

$$\frac{\partial O(X)}{\partial X_{v_u}}=\sum_{v'_u\in N_{neg}(v_u)\cup\{v_k\}}\left(\mathbb{1}_{v_k}[v'_u]-\sigma(X_{v'_u}\cdot X_{v_u})\right)X_{v'_u} \tag{7}$$

where the indicator function $\mathbb{1}_{v_k}[v'_u]$ indicates whether $v'_u$ is the context neighbor node $v_k$. In the present invention, the generated vectors have 128 dimensions, the window size $c$ used by skip-gram is set to 2, and the number of negative samples is 5.
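A single negative-sampling update can be sketched in pure Python. This is an illustrative sketch, not the implementation used in the experiments: the node names, learning rate, initialization scale and the use of a fixed list of negatives (rather than draws from a noise distribution) are assumptions. Only the dimension (128) and the number of negative samples (5) come from the text. The step performs gradient ascent on $\log\sigma(x_k\cdot x_u)+\sum_n\log\sigma(-x_n\cdot x_u)$ for one centre node $u$, one context node $k$ and its negatives.

```python
import math
import random

DIM = 128          # vector dimension stated in the text
NUM_NEGATIVES = 5  # negative samples per (centre, context) pair, as in the text


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def objective(vectors, centre, context, negatives):
    """log sigma(x_k . x_u) plus the sum over negatives of log sigma(-x_n . x_u)."""
    x_u = vectors[centre]
    value = math.log(sigmoid(dot(vectors[context], x_u)))
    for node in negatives:
        value += math.log(sigmoid(-dot(vectors[node], x_u)))
    return value


def sgd_step(vectors, centre, context, negatives, lr=0.025):
    """One gradient-ascent step on the objective above (lr is an assumed value)."""
    x_u = list(vectors[centre])      # snapshot so every gradient uses the old x_u
    grad_u = [0.0] * len(x_u)
    samples = [(context, 1.0)] + [(node, 0.0) for node in negatives]
    for node, label in samples:
        x_n = vectors[node]
        g = label - sigmoid(dot(x_n, x_u))  # indicator minus sigmoid
        for i in range(len(x_u)):
            grad_u[i] += g * x_n[i]         # accumulate before x_n[i] changes
            x_n[i] += lr * g * x_u[i]
    for i in range(len(x_u)):
        vectors[centre][i] += lr * grad_u[i]


rng = random.Random(0)
nodes = ["d1", "p3"] + [f"n{i}" for i in range(NUM_NEGATIVES)]
vectors = {n: [rng.uniform(-0.5, 0.5) / DIM for _ in range(DIM)] for n in nodes}
negatives = nodes[2:]  # fixed negatives for the demo; normally sampled from noise

before = objective(vectors, "d1", "p3", negatives)
for _ in range(20):
    sgd_step(vectors, "d1", "p3", negatives)
after = objective(vectors, "d1", "p3", negatives)
print(after > before)
```

In practice an off-the-shelf skip-gram implementation would normally be fed the walk sequences directly; the hand-written step here only makes the update rule concrete.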

Step 4: From the low-dimensional vector representations of the user nodes, the similarity between the driver and the passengers is calculated with the cosine similarity algorithm, and the results are ranked from high to low; the top k passengers most similar to the driver are taken as the passengers who can share the ride, thereby achieving ride sharing. The size of k is determined by the maximum number of passengers that the driver can carry.

The specific process of calculating the similarity between the driver and the passenger with the cosine similarity algorithm is as follows:

For any pair of user nodes $v_i$, $v_j$ with representation vectors $X_i$, $X_j$, the cosine similarity $Sim(v_i, v_j)$ is defined as follows:

$$Sim(v_i,v_j)=\frac{X_i\cdot X_j}{\|X_i\|\,\|X_j\|}$$

where $X_i$ is the vector of node $v_i$ and $X_j$ is the vector of node $v_j$. When $\|X_i\|=\|X_j\|=1$, ranking by cosine similarity is equivalent to ranking by Euclidean distance, so after normalization an approximate nearest-neighbor search can efficiently locate the top k nodes (passengers) most similar to a given node $v_i$. Thus, given the previously learned vector of a user (the driver), the cosine similarity between $X_i$ and $X_j$ is computed to find the potential passengers most similar to that driver. The k passengers with the greatest similarity are identified as ride-sharing participants; the candidates can then be ranked and the ride-sharing type observed.
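The relation between cosine similarity and Euclidean distance for unit-length vectors follows from expanding the squared distance; the short derivation below is standard linear algebra rather than something stated in the original text. For $\|X_i\|=\|X_j\|=1$,

$$\|X_i-X_j\|^2=\|X_i\|^2+\|X_j\|^2-2\,X_i\cdot X_j=2-2\,Sim(v_i,v_j),$$

so a larger cosine similarity always corresponds to a smaller Euclidean distance, and sorting by descending similarity gives the same order as sorting by ascending distance.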

The present invention uses the trajectory data set of Didi users to construct a heterogeneous ride-sharing network model and classifies the different types of ride sharing. Two types of meta-path are defined, meta-path sequence sets are generated in the ride-sharing network, and skip-gram with negative sampling is used to generate the users' representation vectors. Finally, the cosine similarity algorithm is used to calculate the similarity between users, and the top k most similar passengers are predicted as those who can share the ride. The ride-sharing recommendation method proposed by the present invention is more reliable than traditional methods that recommend by distance alone, is semantically intuitive, and can accurately find potential ride-sharing users and provide them with faster and more convenient services.

Claims

An online appointment sharing traveler matching method based on network representation learning is characterized in that it includes the following steps:

Step 1: Carpool classification

In the case that the driver's original route is determined, based on the relationship between the passenger's starting point and end point and the driver's original route, carpooling is divided into two categories: the first type is end-point carpooling, where the starting and ending points of passengers are on the driver's original route; The other type is shared rides along the way. The starting and ending points of passengers are not on the original path of the driver. Passengers need to walk from the starting point to the boarding point, then reach a shared ride, and then walk from the point of disembarkation to the destination. The shared path is Part of the passenger trajectory;

Step 2: Build a heterogeneous ride-sharing network

The request information of the driver and the passenger is expressed in the form of a heterogeneous ride-sharing network, and the passenger and the driver are connected by location and time information to construct a heterogeneous ride-sharing network;

Step 3: Use a network representation learning model to perform representation learning on the heterogeneous ride-sharing network to obtain a low-dimensional vector representation of the user nodes;

Step 4: Calculate the cosine similarity between the driver node and the passenger nodes from the low-dimensional vector representations of the user nodes, sort the calculated cosine similarity values from high to low, and return to the driver the top k passengers with the highest similarity as passengers who can share the ride, thereby achieving ride-sharing.
The method for matching online appointment shared travelers based on network representation learning according to claim 1, characterized in that, in step 1, the request information of the driver and the passenger includes the driver's start and end points, departure time, and trajectory, and the passengers' boarding and drop-off locations, start and end points, and boarding and drop-off times.
The method for matching online appointment shared travelers based on network representation learning according to claim 1, characterized in that, in step 2, the types of nodes in the heterogeneous ride-sharing network include users, locations, time periods, and activities.
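As a rough illustration of how such a typed network might be held in memory, the sketch below builds an adjacency map from request records. The record layout, the node-type letters U (user), L (location) and T (time period), and the choice of user-location and location-time edges are assumptions made for this example; activity nodes are omitted for brevity.

```python
from collections import defaultdict

def build_network(requests):
    """Build an undirected typed graph from (user, location, time_slot) records.

    Nodes are (type, id) tuples with type 'U' (user), 'L' (location) or
    'T' (time period). Edges link user-location and location-time period.
    """
    adjacency = defaultdict(set)
    for user, location, slot in requests:
        u, l, t = ("U", user), ("L", location), ("T", slot)
        for a, b in ((u, l), (l, t)):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

# Hypothetical request records: (user id, location id, time-slot id).
requests = [
    ("driver_1", "loc_A", "08:00-08:30"),
    ("passenger_7", "loc_A", "08:00-08:30"),
    ("passenger_9", "loc_B", "09:00-09:30"),
]
network = build_network(requests)
```

With this layout, a driver and a passenger who share a location node become reachable from each other through that node, which is what connects them in the network.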
The method for matching online appointment shared travelers based on network representation learning according to claim 1, characterized in that, in step 3, the process of representation learning specifically includes the following two steps:

1) Generate a node sequence set: meta-paths guide random walks over the nodes of the heterogeneous ride-sharing network to generate a set of fixed-length node sequences;

2) Input the generated fixed-length node sequence set into the skip-gram model for training, and obtain the vector representation of the driver and passenger nodes.
The method for matching online appointment shared travelers based on network representation learning according to claim 4, characterized in that, in step 1), for end-point ride-sharing, a meta-path with the structure ULTLU is constructed; for en-route ride-sharing, a meta-path with the structure ULU is constructed under the constraint of the same time period.
The method for matching online appointment shared travelers based on network representation learning according to claim 4, characterized in that, in step 3, the specific process of representation learning is as follows:

First, given a specific meta-path, random walks over the nodes of the heterogeneous ride-sharing network are guided by that meta-path to generate a set of fixed-length node sequences; secondly, for any user node v_u in the set of fixed-length node sequences, supposing the position of the node in its sequence is j, the method selects the nodes v_{j-c}, …, v_{j+c} as its neighbor nodes, where c is half the window size in skip-gram; therefore, given the user node v_u, the goal of the skip-gram model is to maximize the conditional probability of the context formed by its heterogeneous neighbor nodes:

Wherein N_a(v_u) is the set of type-a neighbor nodes of v_u, with a ranging over the set of node types, and p(v_{j-c}, …, v_{j+c} | v_u; θ) is the conditional probability of the context given the central node;

Under the assumption that the nodes are independent of each other, the log conditional probability of the context given the central node, log p(v_{j-c}, …, v_{j+c} | v_u; θ), is the sum of log p(v_k | v_u; θ) over the context nodes v_k, where p(v_k | v_u; θ), the conditional probability of the context node v_k given the node v_u, is defined by the softmax function applied to the representation vectors of the nodes;

The representation vectors are generated according to the conditional probability of the context node v_k given the node v_u, and negative sampling is then used to optimize them, yielding the low-dimensional vector representation of each user node in the heterogeneous ride-sharing network.
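For orientation, the skip-gram objective and softmax described in this claim take the following form in the standard heterogeneous skip-gram (metapath2vec-style) formulation. The symbols V (the node set), A (the set of node types) and X_v (the representation vector of node v) are assumed here for illustration; the notation of the claim's own formulas may differ.

```latex
\arg\max_{\theta} \sum_{v_u \in V} \sum_{a \in A} \sum_{v_k \in N_a(v_u)} \log p(v_k \mid v_u; \theta),
\qquad
p(v_k \mid v_u; \theta) = \frac{\exp\left(X_{v_k} \cdot X_{v_u}\right)}{\sum_{v \in V} \exp\left(X_{v} \cdot X_{v_u}\right)}
```

The denominator sums over every node in the network, which is what makes the exact softmax expensive and motivates the negative-sampling approximation.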
The method for matching online appointment shared travelers based on network representation learning according to claim 6, characterized in that the walk probability of the random walk is as follows:

In formula (1), a denotes the node type; the current node of the walk is a type-a node on the path, and the neighbor count in the formula is the number of neighbor nodes of that node along the predefined meta-path; ε denotes the set of links in the network, and membership in ε of the pair formed by v_{i+1} and the current node means that the two nodes form a link in the network; f_v(v_{i+1}) = a+1 means that the node v_{i+1} is a node of type a+1;

Formula (2) indicates that the selected meta-paths are all symmetric meta-paths.
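A meta-path-guided walk of this kind can be sketched as below. The sketch assumes the standard uniform rule, in which the next node is drawn with equal probability from the current node's neighbors of the required next type and every other node has probability zero; the graph layout, node names and function name are hypothetical. The same-time-period constraint on the ULU path is not modeled here.

```python
import random

def meta_path_walk(adjacency, start, meta_path, walk_length, rng):
    """Random walk that follows the node types in `meta_path` cyclically.

    `meta_path` must start and end with the same node type (as symmetric
    meta-paths such as "ULTLU" do) so that the pattern can repeat. At each
    step the next node is drawn uniformly from the neighbours whose type
    matches the next type in the pattern.
    """
    if meta_path[0] != meta_path[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if start[0] != meta_path[0]:
        raise ValueError("start node type must match the meta-path")
    cycle = meta_path[:-1]  # "ULTLU" repeats as U, L, T, L, U, L, T, L, ...
    walk = [start]
    while len(walk) < walk_length:
        next_type = cycle[len(walk) % len(cycle)]
        neighbours = adjacency.get(walk[-1], ())
        candidates = sorted(n for n in neighbours if n[0] == next_type)
        if not candidates:  # dead end: stop the walk early
            break
        walk.append(rng.choice(candidates))
    return walk

# Toy network: nodes are (type, id) tuples; purely illustrative.
adjacency = {
    ("U", "driver_1"): {("L", "loc_A")},
    ("U", "passenger_7"): {("L", "loc_A")},
    ("L", "loc_A"): {("U", "driver_1"), ("U", "passenger_7"), ("T", "slot_1")},
    ("T", "slot_1"): {("L", "loc_A")},
}
walk = meta_path_walk(adjacency, ("U", "driver_1"), "ULTLU", 9, random.Random(0))
```

Repeating the walk from every user node yields the fixed-length node sequences that are fed to skip-gram.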
The method for matching online appointment shared travelers based on network representation learning according to claim 6, characterized in that the specific process of using negative sampling to optimize the representation vectors is as follows:

wherein the negative sample node set is a set of random negative node samples of v_u, drawn according to the noise distribution p(v′_u);

then stochastic gradient descent is used to maximize the log-likelihood function and thereby update the vector representations of the nodes in equation (5), specifically as shown in equations (6) and (7), to obtain the low-dimensional vector representation of each user node in the heterogeneous ride-sharing network;

wherein the indicator function indicates whether v′_u is the context neighbor node v_k.
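The update described in this claim can be illustrated with the usual skip-gram negative-sampling step, in which the indicator takes the value 1 for the true context node and 0 for each negative sample. This is a sketch under that assumption rather than a transcription of equations (5) to (7); the function names, learning rate and toy vectors are hypothetical, and the negatives are passed in directly instead of being drawn from a noise distribution.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def negative_sampling_step(vectors, center, context, negatives, lr=0.1):
    """One stochastic-gradient ascent step on
    log sigmoid(X_context . X_center) + sum over negatives of
    log sigmoid(-X_negative . X_center). Updates `vectors` in place."""
    x_center = vectors[center]
    center_grad = [0.0] * len(x_center)
    for node in [context] + list(negatives):
        label = 1.0 if node == context else 0.0  # indicator: true context node?
        x_node = vectors[node]
        g = lr * (label - sigmoid(dot(x_node, x_center)))
        for d in range(len(x_center)):
            center_grad[d] += g * x_node[d]  # accumulate with the old x_node
            x_node[d] += g * x_center[d]     # update the context or negative vector
    for d in range(len(x_center)):
        x_center[d] += center_grad[d]        # update the center vector last

# Toy, made-up vectors purely for illustration.
vectors = {
    "driver_1": [0.1, -0.2],
    "passenger_7": [0.05, 0.1],  # true context neighbour
    "passenger_9": [0.2, 0.1],   # negative sample
}
before_pos = dot(vectors["passenger_7"], vectors["driver_1"])
before_neg = dot(vectors["passenger_9"], vectors["driver_1"])
for _ in range(200):
    negative_sampling_step(vectors, "driver_1", "passenger_7", ["passenger_9"])
after_pos = dot(vectors["passenger_7"], vectors["driver_1"])
after_neg = dot(vectors["passenger_9"], vectors["driver_1"])
```

Each step touches only the center node, its context node and the sampled negatives, instead of every node in the network as the full softmax would.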