CN113819916B

CN113819916B - Travel route planning method based on cultural genetic algorithm

Info

Publication number: CN113819916B
Application number: CN202110993353.6A
Authority: CN
Inventors: 王磊; 许向荣; 江巧永; 费蓉; 王彬; 张朔; 郑伟
Original assignee: Xian University of Technology
Current assignee: Xian University of Technology
Priority date: 2021-08-27
Filing date: 2021-08-27
Publication date: 2024-01-09
Anticipated expiration: 2041-08-27
Also published as: CN113819916A

Abstract

The invention discloses a travel route planning method based on a cultural genetic algorithm, which comprises the following specific steps: step 1, acquiring potential preference data of a user; step 2, obtaining and cleaning road network data; step 3, obtaining user interaction information data of the POIs and forming a POI attribute list; step 4, modeling the road network according to the potential preference data of the user, the road network data and the POI attribute list, and extracting the effective edges through which the end point can be reached from the starting point according to the query condition of the user; step 5, obtaining the landscape value of each effective edge according to the potential preference data of the user, the road network data and the POI attribute list; step 6, searching the effective edges and their landscape values in the road network by using a cultural genetic algorithm, and planning for the user a scenic travel route with a high landscape value and high user satisfaction. The invention solves the problem that user preference, path landscape value and personalized user demand are not considered when path planning is carried out in the prior art.

Description

Travel route planning method based on cultural genetic algorithm

Technical Field

The invention belongs to the technical field of route planning, and relates to a travel route planning method based on a cultural genetic algorithm.

Background

With rising living standards, more and more people travel for recreation, and they hope to reach their destination over the shortest distance and at the lowest cost so as to improve the overall travel experience. Route planning algorithms have therefore proliferated. Reasonable route planning saves travel cost for users, greatly increases their interest in traveling, and is of great significance to the development of the tourism industry.

Traditional route planning methods aim to plan a route from a source to a destination that is short in distance, time and cost. They consider only the travel efficiency of the user and neglect the user's personal preferences, the scenery value of the route and the user's personalized requirements. Yet when a user goes for a drive, scenery along the way that matches the user's preferences improves the travel experience to a certain extent, which matters greatly to the user.

Disclosure of Invention

The invention aims to provide a travel route planning method based on a cultural genetic algorithm, which solves the problem that user preference, path landscape value and user personalized requirements are not considered when path planning is carried out in the prior art.

The technical proposal adopted by the invention is that,

a travel route planning method based on cultural genetic algorithm comprises the following specific steps:

step 1, acquiring the potential preference of a user for POIs (Points of Interest) according to historical sign-in data of the user and POI type feature data;

step 2, obtaining road network data from OSM (OpenStreetMap) and preprocessing the data;

step 3, obtaining user interaction information data of the POI, including information such as photo number, comment number, grade, score and the like, and adding the information into a corresponding POI attribute list;

step 4, modeling the road network according to the user potential preference data, the preprocessed road network data and the attribute table of the POI, and extracting effective edges which can reach the end point from the starting point according to the user query condition;

step 5, calculating the landscape value of the effective edge according to the potential preference data of the user, the preprocessed road network data and the attribute list of the POI;

step 6, searching the effective edges and their landscape values in the road network by using a cultural genetic algorithm, and planning for the user a scenic travel route with a high landscape value and high user satisfaction.

The invention is also characterized in that:

wherein step 1 comprises:

step 1.1, acquiring historical sign-in data of users from a travel website, wherein the data is plain txt text and each row holds the sign-in information of one user, consisting of one or more POIs;

step 1.2, acquiring basic information of POIs signed by users from various tourist websites, wherein the data content comprises names, geographic positions, type characteristics and open time of tourist attractions;

step 1.3, extracting user preference from the information obtained in the step 1.1 and the step 1.2 by a label classification statistical method to obtain user potential preference data, wherein the specific steps are as follows:

step 1.3 comprises:

step 1.3.1, acquiring a user-POI sign-in matrix UC according to the user's historical sign-in data; each element $uc_{i,j}$ of the matrix is obtained from formula (1):

wherein $uc_{i,j}$ denotes the number of times user i has signed in at $POI_j$, and i and j are subscript variables;

step 1.3.2, acquiring a POI-type matrix PT according to the POI type feature data and the sign-in data; each element $pt_{i,j}$ of the matrix is obtained from formula (2):

wherein $pt_{i,j}$ indicates whether $POI_i$ has type feature $T_j$;

step 1.3.3, obtaining a user-type sign-in matrix UT according to the user-POI sign-in matrix and the POI-type matrix; each element $ut_{i,j}$ of the matrix is obtained from formula (3):

wherein $ut_{i,j}$ denotes the number of sign-ins of user i for type feature $T_j$, i.e. the number of POIs having type feature $T_j$ that the user has visited;

step 1.3.4, sorting the POI type features in the user-type sign-in matrix in decreasing order of the user's number of sign-ins, taking the TOP-N types with the most sign-ins as the user preference, and encoding them with one-hot encoding to obtain the user preference vector, namely the potential preference data of the user.
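Steps 1.3.1 to 1.3.4 can be illustrated with a minimal Python sketch. It rests on an assumption that the text does not confirm: that formula (3) amounts to the matrix product UT = UC × PT, with PT holding 0/1 indicators. The function names and the toy matrices are hypothetical.

```python
def user_type_matrix(uc, pt):
    """Assumed form of formula (3): ut[i][j] = sum_k uc[i][k] * pt[k][j]."""
    n_types = len(pt[0])
    return [
        [sum(uc_row[k] * pt[k][j] for k in range(len(pt))) for j in range(n_types)]
        for uc_row in uc
    ]


def preference_vector(ut_row, top_n):
    """Mark the top_n most signed-in types with 1, all other types with 0."""
    ranked = sorted(range(len(ut_row)), key=lambda j: ut_row[j], reverse=True)
    chosen = {j for j in ranked[:top_n] if ut_row[j] > 0}
    return [1 if j in chosen else 0 for j in range(len(ut_row))]


# Toy data (invented): 2 users, 3 POIs, 3 type features.
UC = [[2, 0, 1],   # sign-in counts of user 0 at POI 0..2
      [0, 3, 0]]   # sign-in counts of user 1 at POI 0..2
PT = [[1, 0, 0],   # POI 0 has type T0
      [0, 1, 1],   # POI 1 has types T1 and T2
      [1, 0, 1]]   # POI 2 has types T0 and T2

UT = user_type_matrix(UC, PT)
P_u = preference_vector(UT[0], top_n=2)  # preference vector of user 0
```

With N greater than 1 the resulting vector has several ones, so it is strictly a multi-hot encoding of the preferred types.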

The specific steps of step 2 are as follows: downloading urban road network data from OSM, retaining the national road, provincial road, county road, village road, university, park and POI data in the road network data, and cleaning out the remaining data to obtain the preprocessed urban road network data.

The specific steps of step 3 are as follows: extracting the users' score, number of comments, level and number of photos for each POI from the travel website, and marking this information on the corresponding POI to form a POI attribute list.

Wherein step 4 comprises:

step 4.1, road network modeling: the road network is modeled as a graph $G = (N, E)$, where N is the set of nodes (intersections and dead ends) and $E \subseteq N \times N$ is the set of directed edges;

step 4.2, path definition: a path is formed by sequentially connecting a plurality of edges in the road network; a path from a source point $n_0$ to a destination $n_k$ is denoted $R = (e_{0,1}, e_{1,2}, \ldots, e_{k-1,k})$, wherein $n_0, n_k \in N$, $e \in E$, and $e_{0,1}$ denotes the edge from node $n_0$ to node $n_1$;

step 4.3, user query definition: a user query is defined as a triple, denoted $Q = \langle n_0, n_k, d \rangle$, wherein $n_0$ and $n_k$ respectively denote the user-defined starting point and end point, and d is the maximum travel distance allowed by the user;

step 4.4, determining the effective area: a circle is drawn with the midpoint between the user-defined starting point and end point as its center and the maximum travel distance allowed by the user as its diameter; the area inside the circle is the effective area, and an edge inside the effective area is an effective edge;

step 4.5, obtaining a neighbor table: the landscape value of an effective edge is calculated from the landscape values of the POIs adjacent to the edge, so a neighbor distance needs to be set and a neighbor table of the edges obtained; the contents of the table include: the ID and name of the edge, the ID and name of each POI adjacent to the edge, and the neighbor distance.
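The effective-area test of step 4.4 can be sketched as follows. This is a minimal illustration under two assumptions not stated in the text: coordinates are planar and expressed in the same unit as d, and an edge counts as "inside" only when both of its end nodes lie in the circle. The function name and sample coordinates are hypothetical.

```python
import math


def effective_edges(edges, start, end, d):
    """Keep the edges lying inside the circle whose center is the midpoint of
    start and end and whose diameter is d (the user's maximum travel distance)."""
    cx, cy = (start[0] + end[0]) / 2, (start[1] + end[1]) / 2
    radius = d / 2

    def inside(point):
        return math.hypot(point[0] - cx, point[1] - cy) <= radius

    # Assumption: an edge is effective only if both end nodes are in the circle.
    return [edge for edge in edges if inside(edge[0]) and inside(edge[1])]


# Toy road network (invented): each edge is a pair of (x, y) nodes in km.
edges = [
    ((0.0, 0.0), (1.0, 0.0)),
    ((1.0, 0.0), (5.0, 0.0)),
    ((5.0, 0.0), (9.0, 0.0)),  # far node lies outside the circle
]
kept = effective_edges(edges, start=(0.0, 0.0), end=(5.0, 0.0), d=7.0)
```

For geographic coordinates, a projected coordinate system or a great-circle distance would be needed in place of `math.hypot`.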

Wherein step 5 comprises:

step 5.1, establishing a landscape value mathematical model, wherein the landscape value mathematical model comprises a co-visit probability function among POIs, a similarity function among POIs and a correlation function among POIs, and the formula of the co-visit probability function is as follows:

wherein Co_VP(i, j) denotes the co-visit probability of $POI_i$ and $POI_j$, $N_{i,j}$ denotes the number of users who have visited both $POI_i$ and $POI_j$, and $N_i$ denotes the number of users who have visited only $POI_i$ without visiting $POI_j$;

the similarity function formula between POIs is:

wherein Sim(i, j) denotes the similarity between the type features of $POI_i$ and $POI_j$, and $T_{i,d}$ is the component in dimension d of the type feature vector $T_i$ of $POI_i$;

the correlation function formula between POIs is as follows:

r(i,j)＝Co_VP(i,j)×Sim(i,j) (6)

wherein r(i, j) denotes the correlation between $POI_i$ and $POI_j$; the larger the value of r(i, j), the closer the relationship between $POI_i$ and $POI_j$, and the less landscape value is lost when the two are combined;
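The relationship model of step 5.1 can be sketched in Python. Only formula (6) is given explicitly in the text; the forms used here for the other two functions are assumptions consistent with the symbol definitions: Co_VP(i, j) is taken as $N_{i,j} / (N_{i,j} + N_i)$ and Sim(i, j) as the cosine similarity of the type feature vectors. The function names and sample data are hypothetical.

```python
import math


def co_visit_probability(visits, i, j):
    """Assumed form: N_ij / (N_ij + N_i), where N_ij counts users who visited
    both POIs and N_i counts users who visited POI i but not POI j."""
    n_ij = sum(1 for pois in visits.values() if i in pois and j in pois)
    n_i = sum(1 for pois in visits.values() if i in pois and j not in pois)
    total = n_ij + n_i
    return n_ij / total if total else 0.0


def cosine_similarity(a, b):
    """Assumed form of Sim: cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def relevance(visits, types, i, j):
    """Formula (6): r(i, j) = Co_VP(i, j) * Sim(i, j)."""
    return co_visit_probability(visits, i, j) * cosine_similarity(types[i], types[j])


# Toy data (invented): which POIs each user visited, and POI type vectors.
visits = {"u1": {"A", "B"}, "u2": {"A"}, "u3": {"A", "B"}, "u4": {"B"}}
types = {"A": [1, 0, 1], "B": [1, 0, 0]}
r_ab = relevance(visits, types, "A", "B")
```

Under this reading Co_VP is not symmetric: Co_VP(A, B) is 2/3 in the toy data while Co_VP(B, A) is also 2/3 only because one user visited B alone, so r(i, j) and r(j, i) may differ in general.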

step 5.2, calculating landscape value: the calculation of the landscape value comprises three parts: POI landscape value calculation, edge landscape value calculation and path landscape value calculation are specifically as follows:

the landscape value of a POI is mainly determined by the score (score), level (level), number of photos (picture) and number of comments (comCount) of the POI; the larger these values, the larger the landscape value of the POI, which is calculated as follows:

$scenic(i) = (score(i) + level(i) + picture(i) + comCount(i)) \times (1 + w_i)$ (7)

wherein scenic(i) is the landscape value of $POI_i$; score(i), level(i), picture(i) and comCount(i) respectively denote the score, level, number of photos and number of comments of $POI_i$ on the Ctrip website; $w_i$ is the cosine similarity between the feature vector $T_i$ of $POI_i$ and the user preference vector P(u), and describes the user's preference for the POI;
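Formula (7) can be illustrated with a short Python sketch. The numeric mapping of the level field (none, A to AAAAA mapped to 0 to 5) is an assumption made for this example, as are the field names and the sample POI.

```python
import math

# Assumed numeric mapping for the level field; the text does not specify one.
LEVELS = {"": 0, "A": 1, "AA": 2, "AAA": 3, "AAAA": 4, "AAAAA": 5}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def scenic(poi, preference):
    """Formula (7): (score + level + picture + comCount) * (1 + w_i),
    where w_i is the cosine similarity between T_i and P(u)."""
    w = cosine(poi["type_vector"], preference)
    base = poi["score"] + LEVELS[poi["level"]] + poi["pictures"] + poi["com_count"]
    return base * (1 + w)


# Toy POI (invented values) whose type vector matches the user preference.
poi = {"score": 4.5, "level": "AAAA", "pictures": 120, "com_count": 80,
       "type_vector": [1, 0, 1]}
value = scenic(poi, preference=[1, 0, 1])  # approximately 208.5 * 2
```

Because the four terms are summed as raw values, photo and comment counts dominate the score and level; whether the inputs are normalized beforehand is not stated in the text.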

the landscape value of an edge is determined by the landscape values of the POIs adjacent to the edge; for convenience of searching, an edge whose landscape value is greater than 0 is marked as a landscape edge, and the formula is as follows:

wherein m is the number of POIs adjacent to edge $e_{i,i+1}$;

the landscape value of the path is calculated from the landscape value of the edge included in the path by the following formula:

wherein step 6 comprises:

step 6.1, chromosome encoding: firstly, an empty chromosome is initialized; secondly, the landscape edge closest to the starting point is selected from the landscape edges obtained in step 5.2 and added to the end of the chromosome; thirdly, the value of d in the user query condition is updated, i.e. $d = d - dis(e_{i,j})$, wherein $dis(e_{i,j})$ is the length of edge $e_{i,j}$; finally, the above operations are repeated until $d \le 0$, and the resulting chromosome is encoded as a series of landscape edges;

step 6.2, chromosome decoding: for an encoded chromosome, the purpose of decoding is to fill the gap between each two consecutive landscape edges, i.e. to find the real path of the encoded chromosome on the road network; the landscape value of the chromosome is contributed by the landscape values of its landscape edges; formula (10) is defined as the fitness function of the chromosome, and the chromosome is decoded into the corresponding path to obtain the real travel distance;

wherein f(R) is the fitness value of the chromosome; $Sim(e_{i,i+1}, e_{j,j+1})$ is the feature similarity between edge $e_{i,i+1}$ and edge $e_{j,j+1}$ ($i, j = 0, 1, \ldots, k-1$ and $j \ne i$), the features of an edge being determined by the features of the POIs adjacent to the edge; k is the number of edges contained in path R;

step 6.3, local search, specifically as follows:

step 6.3.1, mutation: one landscape edge of a chromosome is randomly selected and replaced with another landscape edge; when the new landscape edge is selected, the maximum travel distance constraint allowed by the user must not be violated; finally, the best chromosome is selected for decoding;

step 6.3.2, crossover: two chromosomes are selected from all chromosomes by a game mechanism, one crossover position is selected, and the genes after the crossover position are exchanged between the two chromosomes;

step 6.4, taking the path with the highest final fitness value as the scenic travel route planned for the user.
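The encoding of step 6.1 and the crossover of step 6.3.2 can be sketched in Python as below. This is a simplified illustration, not the full algorithm: decoding, the fitness function of formula (10), mutation and parent selection are omitted. Measuring "closest to the starting point" by the distance to an edge's midpoint is an assumption, and all names and sample values are hypothetical.

```python
import math
import random


def encode(scenic_edges, start, d):
    """Step 6.1: append landscape edges in order of distance from the start,
    subtracting each edge length from d, until d <= 0."""
    ordered = sorted(scenic_edges, key=lambda e: math.dist(start, e["mid"]))
    chromosome = []
    for edge in ordered:
        if d <= 0:
            break
        chromosome.append(edge["id"])
        d -= edge["length"]
    return chromosome


def one_point_crossover(a, b, rng):
    """Step 6.3.2: pick one cut position and swap the genes after it.
    Both parents must contain at least two genes."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


# Toy landscape edges (invented): id, midpoint in km, length in km.
scenic_edges = [
    {"id": "e1", "mid": (1.0, 0.0), "length": 2.0},
    {"id": "e2", "mid": (3.0, 0.0), "length": 3.0},
    {"id": "e3", "mid": (2.0, 0.0), "length": 2.5},
    {"id": "e4", "mid": (6.0, 0.0), "length": 1.0},
]
chromosome = encode(scenic_edges, start=(0.0, 0.0), d=7.0)

rng = random.Random(0)  # seeded only to make the example repeatable
other = ["e4", "e2", "e1"]
child_a, child_b = one_point_crossover(chromosome, other, rng)
```

A child produced this way may contain the same landscape edge twice or exceed the distance budget, so a repair or feasibility check after crossover would likely be needed in practice.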

The beneficial effects of the invention are as follows:

First, the method introduces the concept of user preference: during route planning, the route is required not only to have a high landscape value but also to match the user's preferences as closely as possible, which makes the route planning more personalized and able to meet the user's personalized requirements. Second, when landscape values are computed, an edge is typically adjacent to several POIs; simply summing the landscape values of the adjacent POIs ignores the loss of landscape value that occurs when POIs are combined. Therefore, before the landscape value of an edge is calculated, the relationship between POIs is modeled using both the users' co-visit probability for the POIs and the feature similarity between the POIs. The closer the relationship between two POIs (the greater the co-visit probability and the greater the feature similarity), the greater their contribution to the landscape value; conversely, the smaller the contribution. This modeling yields a more accurate edge landscape value and therefore a better path. Finally, because one POI may be adjacent to several edges, the similarity of edge features is taken into account when the landscape value of a path is calculated, so that the scenery along the path is not repetitive: the greater the similarity, the more POIs are repeated and the smaller the contribution of the current edge to the landscape value of the path; conversely, the greater the contribution.

Drawings

FIG. 1 is a flowchart of the travel route planning method based on a cultural genetic algorithm according to the invention;

FIG. 2 is a flowchart of step 6.1 of the travel route planning method based on a cultural genetic algorithm according to the invention.

Detailed Description

The invention will be described in detail below with reference to the drawings and the detailed description.

The invention relates to a travel route planning method based on a cultural genetic algorithm which, as shown in FIG. 1, comprises steps 1 to 6 and their sub-steps as set out in the Disclosure of Invention above.


Example 1

The embodiment is a scenic tourism route planning method based on an improved cultural genetic algorithm, and the specific implementation process is as follows:

step 1, acquiring the potential preferences of a user according to the user's historical sign-in data, specifically as follows:

firstly, historical sign-in data of users is crawled from travel websites such as Ctrip, Tuniu and Qunar; the data obtained is txt text in which each row represents the POIs a user has historically signed in at, with at least one POI per row, and the text data is processed to obtain the user sign-in matrix; secondly, the category feature data of each POI is crawled; finally, a label classification statistical method is used to obtain the user's favorite scenic spot types, which are represented with one-hot encoding to obtain the preference vector P(u) of the user. Meanwhile, each POI is encoded according to its feature types to obtain the feature vector T of the POI. Here, one user is randomly selected from the sign-in users for preference extraction, and the user's preferences obtained are historical sites, religion and temples.

Step 2, obtaining road network data of the city of Xi'an from OSM and performing preliminary cleaning, specifically as follows:

firstly, the official OSM (OpenStreetMap) website is opened, the road network data of China is located, Xi'an is selected by extent, and the data is exported to obtain the road network data of Xi'an; secondly, the shp files in the obtained road network data are imported into ArcGIS, the attribute table of the layer data is opened, and useless records are deleted; finally, the processed road network data is exported and saved.

Step 3, obtaining user interaction data of the POIs, including the number of photos, number of comments, level and score, and adding the information to the attribute list of the corresponding POI, specifically as follows:

firstly, the travel guide page of the travel website is opened, the corresponding POI is searched for, and its number of photos, number of comments, level and score are looked up; then, the attribute list of the POI is opened for editing in ArcGIS, the attribute fields score (0.0-5.0), level (none, A-AAAAA), number of photos (picture) and number of comments (comCount) are added, and the corresponding data from Ctrip is filled in; finally, according to the POI category feature data obtained in step 1.2, a category field (type) is added to the POI attribute list (the method divides POIs into 24 types) and the corresponding data is filled in.

Step 4, modeling the road network according to the data obtained in steps 1, 2 and 3, and extracting the effective edges through which the end point can be reached from the starting point according to the query condition of the user, specifically as follows:

firstly, the processed road network data is modeled as a graph G = (N, E); then, assuming that the user query condition is Q = <south gate of Xi'an University of Technology, Lianhu (Lotus Lake) Park, 7.00 km>, the straight-line distance between the south gate of Xi'an University of Technology and Lianhu Park is measured in ArcGIS as 5.22 km; a rectangle is drawn with this line as one diagonal and the other diagonal is added, so that the intersection of the two diagonals is the midpoint of the line; a circle is drawn with this midpoint as its center and 7.00 km as its diameter, and the area inside the circle is the effective area; next, the data (points and lines) in the effective area is exported and saved as a workspace, and all subsequent data processing is carried out only within the effective area; finally, the newly saved road network data of the effective area is opened in ArcGIS, and the neighbor analysis tool is used to generate neighbor tables between the POIs and the national roads, county roads and village roads respectively (the neighbor distance is set to 300 m in the neighbor analysis, because the method regards POIs within 300 m as visible); part of a neighbor table is shown in Table 1.

TABLE 1 Neighbor table between village roads and POIs

OBJECTID*   IN_FID   NEAR_FID   NEAR_DIST (km)   NEAR_FC
1           1        45         0.012451         Other POIs
2           2        45         0.012892         Other POIs
3           4        2          0.181624         School
4           5        2          0.197818         School

In the table, the OBJECTID field is the ID of the record and gives its serial number; IN_FID is the ID, in its attribute table (here, the "village road" attribute table), of the edge on which the neighbor analysis is performed; NEAR_FID is the ID, in its attribute table (here, the "Other POIs" attribute table for rows 1 and 2 and the "School" attribute table for rows 3 and 4), of the POI point adjacent to IN_FID; the NEAR_DIST field is the distance from NEAR_FID to IN_FID in km; NEAR_FC is the name of the table to which the NEAR_FID adjacent to IN_FID belongs.
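As an illustration of how such a neighbor table could be turned into a per-edge list of adjacent POIs, the following Python sketch parses the four rows of Table 1 and keeps the POIs within the 300 m neighbor distance. The whitespace-separated text layout and the function name are assumptions made for this example; they do not describe an ArcGIS export format.

```python
from collections import defaultdict

# The four sample rows of Table 1: OBJECTID, IN_FID, NEAR_FID, NEAR_DIST (km), NEAR_FC.
ROWS = """\
1 1 45 0.012451 Other POIs
2 2 45 0.012892 Other POIs
3 4 2 0.181624 School
4 5 2 0.197818 School
"""


def neighbors_by_edge(text, max_dist_km=0.3):
    """Group adjacent POIs by edge ID (IN_FID), keeping those within max_dist_km."""
    table = defaultdict(list)
    for line in text.splitlines():
        # NEAR_FC may contain spaces, so split into at most five fields.
        _object_id, in_fid, near_fid, near_dist, near_fc = line.split(maxsplit=4)
        if float(near_dist) <= max_dist_km:
            table[int(in_fid)].append((near_fc, int(near_fid), float(near_dist)))
    return dict(table)


adjacent = neighbors_by_edge(ROWS)
```

Because NEAR_FID values are only unique within their own attribute table, the table name NEAR_FC is kept alongside each POI ID.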

Step 5, calculating the landscape values of the effective edges in the road network according to the data obtained in steps 1, 2 and 3, specifically as follows:

firstly, a user-POI sign-in matrix is constructed from the users' historical sign-in data; then, the POIs (points of interest) located in the effective area are extracted and a co-visit probability matrix is constructed for them, in which the first row and first column hold the POIs and each remaining element is the co-visit probability of the two POIs at the corresponding positions, calculated by combining formula (4) with the user-POI sign-in matrix; next, a similarity matrix is constructed whose elements represent the similarity of the category features of two POIs; then, the element-wise product of the co-visit probability matrix and the similarity matrix is taken to obtain the POI correlation matrix, whose elements represent how close the relationship between any two POIs is; finally, the landscape values of the POIs, edges and paths are calculated according to formulas (7), (8) and (9).

Step 6, based on the data obtained in the preceding steps, searching the effective edges in the road network according to their landscape values by using a cultural genetic algorithm, and planning for the user a scenic travel route with a high landscape value and high user satisfaction, specifically as follows:

firstly, an empty chromosome is initialized, the landscape edge closest to the starting point is selected from the landscape edges and added to the end of the chromosome, and the constraint condition is updated; then, the selection, insertion and update operations are repeated until the constraint condition is no longer met; next, the encoded chromosome is decoded to obtain the actual travel distance, and the fitness value of the chromosome (whose physical meaning is user satisfaction) is determined according to formula (10); finally, crossover and mutation operations are performed to improve the fitness value of the chromosome, and the chromosome with the highest fitness value is converted into an actual driving route that is returned to the user, the final route being: south gate of Xi'an University of Technology (starting point) → … → Lianhu Park (end point).

Claims

1. A travel route planning method based on cultural genetic algorithm is characterized by comprising the following specific steps:

step 1, acquiring potential preference of a user to POI according to historical sign-in data of the user and POI type feature data;

step 2, obtaining road network data from the OSM and carrying out data preprocessing;

step 3, obtaining user interaction information data of the POI, including the number of photos, the number of comments, the grade and grading information, and adding the user interaction information data into a corresponding POI attribute list;

the step 4 comprises the following steps:

step 4.1, road network modeling: the road network is modeled as a graph $G = (N, E)$, where N is the set of nodes and $E \subseteq N \times N$ is the set of directed edges;

step 4.3, user query definition: a user query is defined as a triple, denoted $Q = \langle n_0, n_k, d \rangle$, wherein $n_0$ and $n_k$ respectively denote the starting point and end point defined by the user, and d is the maximum travel distance allowed by the user;

step 4.5, obtaining a neighbor table: the landscape value of an effective edge is calculated from the landscape values of the POIs adjacent to the edge, so a neighbor distance needs to be set and a neighbor table of the edges obtained; the contents of the table include: the ID and name of the edge, the ID and name of each POI adjacent to the edge, and the neighbor distance;

the step 5 comprises the following steps:

wherein Co_VP(i, j) denotes the co-visit probability of $POI_i$ and $POI_j$, $N_{i,j}$ denotes the number of users who have visited both $POI_i$ and $POI_j$, and $N_i$ denotes the number of users who have visited only $POI_i$ without visiting $POI_j$;

the similarity function formula between the POIs is as follows:

wherein Sim(i, j) denotes the similarity between the features of $POI_i$ and $POI_j$, and $T_{i,d}$ is the component in dimension d of the feature vector $T_i$ of $POI_i$;

the formula of the correlation function between the POIs is as follows:

r(i,j)＝Co_VP(i,j)×Sim(i,j) (6)

the landscape value of a POI is mainly determined by the score, level, number of photos and number of comments of the POI; the larger these values, the larger the landscape value of the POI, which is calculated as follows:

$scenic(i) = (score(i) + level(i) + picture(i) + comCount(i)) \times (1 + w_i)$ (7)

the landscape value of an edge is determined by the landscape values of the POIs adjacent to the edge; for convenience of searching, an edge whose landscape value is greater than 0 is marked as a landscape edge, and the formula is as follows:

wherein m is the number of POIs adjacent to edge $e_{i,i+1}$;

the landscape value of the path is calculated by the landscape value of the edge contained in the path, and the calculation formula is as follows:

step 6, searching the effective edges and landscape values in the road network by using a cultural genetic algorithm, and planning a landscape travel route with high landscape values and high user satisfaction for the user;

the step 6 comprises the following steps:

step 6.1, chromosome encoding: firstly, an empty chromosome is initialized; secondly, the landscape edge closest to the starting point is selected from the landscape edges obtained in step 5.2 and added to the end of the chromosome; thirdly, the value of d in the user query condition is updated, i.e. $d = d - dis(e_{i,j})$, wherein $dis(e_{i,j})$ is the length of edge $e_{i,j}$; finally, the above operations are repeated until $d \le 0$, and the resulting chromosome is encoded as a series of landscape edges;

step 6.3, local search, specifically as follows:

2. The travel route planning method according to claim 1, wherein the step 1 includes:

step 1.3, extracting the user preference from the information obtained in step 1.1 and step 1.2 by a label classification statistical method to obtain the potential preference data of the user.

3. The travel route planning method according to claim 2, wherein the step 1.3 includes:

4. The travel route planning method based on a cultural genetic algorithm according to claim 1, wherein step 2 specifically comprises: downloading urban road network data from OSM, retaining the national road, provincial road, county road, village road, university, park and POI data in the road network data, and cleaning out the remaining data to obtain the preprocessed urban road network data.

5. The travel route planning method based on cultural genetic algorithm as defined in claim 1, wherein the specific steps of the step 3 are as follows: and extracting scoring, comment number, grade and photo number information of the POI from the user in the tourism website, and marking the scoring, comment number, grade and photo number information on the corresponding POI to form a POI attribute list.