CN117076786A - Cross-province travel hot line recommendation method based on roaming information - Google Patents

Cross-province travel hot line recommendation method based on roaming information

Info

Publication number
CN117076786A
Authority
CN
China
Prior art keywords
user
travel
information
roaming
hot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311119914.5A
Other languages
Chinese (zh)
Other versions
CN117076786B (en
Inventor
陈曦
潘建忠
王鹏亮
胡伟龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Richstone Technology Co ltd
Original Assignee
Guangzhou Richstone Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Richstone Technology Co ltd filed Critical Guangzhou Richstone Technology Co ltd
Priority to CN202311119914.5A priority Critical patent/CN117076786B/en
Publication of CN117076786A publication Critical patent/CN117076786A/en
Application granted granted Critical
Publication of CN117076786B publication Critical patent/CN117076786B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q10/047 Optimisation of routes or paths, e.g. travelling salesman problem
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/14 Travel agencies
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Game Theory and Decision Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of mobile internet, and in particular to a cross-province travel hot line recommendation method based on roaming information. The method comprises the following steps: S1, acquiring roaming track information of a user and storing it in a database, wherein the roaming track information comprises the user's number information, position information, time information and track information; S2, constructing a line recommendation algorithm project; S3, collecting user feedback information, comprising the user's satisfaction and score, and using the collected feedback to optimize and improve the line recommendation algorithm. The method automatically recommends travel routes that meet the user's requirements according to the user's travel preferences and historical roaming track.

Description

Cross-province travel hot line recommendation method based on roaming information
Technical Field
The invention relates to the technical field of mobile internet, and in particular to a cross-province travel hot line recommendation method based on roaming information.
Background
As living standards improve, more and more people travel to relax and to broaden their knowledge, and tourism has come to occupy an important position in China's national economy. In the tourism industry, a well-designed travel route brings more tourists, and therefore better economic returns, to travel agencies and other tourism operators. The travel route is an important component of tourism products and an important link connecting tourists, tourism enterprises, related departments and tourist destinations.
At present, many travel agencies, online travel agencies (OTAs) and other institutions provide travel route recommendation services. However, traditional recommendation relies on information such as the user's geographic location, travel time and budget; it is subjective rather than objective, and it cannot fully take the user's personalized needs and preferences into account.
In recent years, the popularization and development of mobile internet technology has greatly changed how users travel. Users can obtain information anytime and anywhere through mobile devices, share travel experiences and record their own roaming tracks while traveling. Based on this roaming data, the travel demands and preferences of users can be better understood and more personalized travel routes can be recommended.
Disclosure of Invention
To solve the technical problems in the prior art, the invention provides a cross-province travel hot line recommendation method based on roaming information.
To achieve this purpose, the technical scheme adopted by the invention is as follows:
A cross-province travel hot line recommendation method based on roaming information comprises the following steps:
S1, acquiring roaming track information of a user and storing it in a database, wherein the roaming track information comprises the user's number information, position information, time information and track information;
S2, constructing a line recommendation algorithm project, which specifically comprises the following steps:
S201, extracting the stay points and tracks from the user roaming track information, and analyzing and processing them with a geographic information system tool to obtain the stay time and frequency of the user at different scenic spots;
S202, analyzing the stay points and track information of users, identifying the hot tourist attractions and areas in each province, and determining their hot degree according to stay time and frequency;
S203, combining the hot scenic spots with the roaming track information of users, calculating an optimal route with a graph algorithm, removing cold scenic spots from the optimal route, adding hot scenic spots within a set distance of the optimal route, and connecting them to generate a hot line;
S204, matching the travel destination of the user with the scenic spots contained in the hot line to provide a travel line that meets the user's requirements;
S3, collecting user feedback information, comprising the user's satisfaction and score, and using the collected feedback to optimize and improve the line recommendation algorithm.
Further, in step S202, a K-means algorithm is used to perform cluster analysis on the hot degree of the scenic spots, dividing tourists into different clusters according to residence time and frequency, specifically as follows:
S2021, taking the residence time and frequency data of tourists from the cluster and normalizing the data;
S2022, determining the number of clusters K by the elbow method;
S2023, performing K-means cluster analysis, wherein the objective function of the K-means algorithm is:
$$J=\sum_{i}\left(d(x_i,c_i)\right)^2$$
where $x_i$ is a sample point, $c_i$ is the center point of the cluster to which the sample belongs, and $d$ is a distance metric function;
S2024, calculating the cluster sizes and the inter-cluster distances from the clustering result, and determining the scenic spot hot degree in combination with residence time and frequency.
Still further, the cluster size is calculated by the following formula, the sum running over the samples of the cluster:
$$J_i=\sum\left(d(x_i,c_i)\right)^2$$
where $x_i$ is a sample point, $c_i$ is the center point of the cluster to which the sample belongs, and $d$ is a distance metric function;
the scenic spot popularity is calculated by the following formula:
where popularity is the hot degree of the scenic spot, interpreted as follows:
when popularity ≥ 0.25, the scenic spot is a hot scenic spot;
when 0.1 ≤ popularity < 0.25, the scenic spot is a moderately hot scenic spot;
when popularity < 0.1, the scenic spot is a cold scenic spot.
Further, S2023 specifically comprises the following steps:
First, the center points of the K clusters are initialized randomly, the distance between each sample and each center point is calculated, and each sample is assigned to the cluster of its nearest center point;
the distance between a sample and a center point is calculated by:
$$d(x,y)=\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}$$
where $x_1$ and $y_1$ are the longitude and latitude of the sample, and $x_2$ and $y_2$ are the longitude and latitude of the center point;
Second, the center point of each cluster is updated to the mean of all samples in the cluster:
$$c_i=\frac{1}{n_i}\sum x_i$$
where $x_i$ denotes the sample points of the i-th cluster and $n_i$ is the number of samples in the i-th cluster;
Third, the two steps above are repeated until the stopping condition is reached, namely that the samples in each cluster no longer change.
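The three steps above, together with the per-cluster size $J_i$, can be sketched in plain Python as follows. This is a minimal illustration rather than the patented implementation: the function names `kmeans` and `cluster_sse`, the optional `init_centers` argument and the rule of keeping the old center for an empty cluster are assumptions made for the example.

```python
import math
import random

def euclidean(p, q):
    """Distance between two (x, y) points, matching d(x, y) above."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def kmeans(points, k, init_centers=None, max_iter=100, seed=0):
    """Return (labels, centers). init_centers is a hypothetical hook for reproducible runs."""
    rng = random.Random(seed)
    centers = list(init_centers) if init_centers else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 1: assign every sample to its nearest center.
        new_labels = [min(range(k), key=lambda c: euclidean(p, centers[c])) for p in points]
        # Step 3: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 2: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its previous center
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centers

def cluster_sse(points, labels, centers, c):
    """Cluster size J_i: sum of squared distances of cluster c's samples to its center."""
    return sum(euclidean(p, centers[c]) ** 2 for p, lab in zip(points, labels) if lab == c)

# Illustrative data: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2, init_centers=[(0, 0), (10, 10)])
size_0 = cluster_sse(points, labels, centers, 0)
```

Passing explicit initial centers only makes the run reproducible; with random initialization the cluster numbering may differ between runs.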
Further, the optimal route in step S203 is obtained by the following steps:
S2031, calculating the distances between all adjacent scenic spots in the relevant area, and obtaining the hot degree data of the scenic spots;
S2032, constructing a graph model in which each scenic spot is a node and the distance between nodes is the edge weight;
S2033, determining the starting point, the end point and the number of scenic spots to pass through;
S2034, calculating the optimal path with the Dijkstra algorithm.
Further, the Dijkstra algorithm calculates the optimal path as follows: starting from the starting point, the value of each node is calculated in turn, and where there is no route between two nodes the distance is taken as +∞. At each step the minimum value in the graph is fixed, and the calculation advances step by step until the final value is determined. The path that yields the minimum value is the optimal path, and the value at the end point is the length of the optimal path.
Further, S204 specifically comprises the following steps:
S2041, extracting roaming track data from the database, including the user's past travel destinations (scenic spots), stay times and travel tracks;
S2042, according to the user's historical roaming data and the travel hot line generated in step S203, matching the user's travel destinations with the scenic spots contained in the line, and calculating the hot travel lines that match the user;
S2043, generating travel routes of interest to the user, and recommending suitable travel routes by combining the user's preferences with a recommendation algorithm.
Further, S2043 is specifically performed as follows:
1) Calculating the similarity between users using cosine similarity; for two users denoted i and k, the similarity between them is s(i, k);
2) Calculating each user's score for each travel route; the score of user i for travel route j is r(i, j);
3) Predicting the travel route scores of a user k who is similar to user i, expressed by the following formula:
where p(k, j) is the predicted score of user k for travel route j, s(i, k) is the similarity between user i and user k, and r(i, j) is the score of user i for travel route j;
4) Recommending a travel route: the route with the highest predicted score is recommended; letting the recommended travel route of user k be $J^*$, then:
$$J^*=\arg\max_{j}\,p(k,j)$$
where the maximum is taken over the routes j, so that $J^*$ is the route with the highest predicted score for user k.
Further, the geographic information system tool analyzes and processes the stay points and tracks as follows: the track data in the roaming track data is input into a GIS to generate a Shapefile vector data file, the Shapefile is converted into GeoJSON data by Python code, the relevant information is extracted from the GeoJSON data, the corresponding scenic spots are found by longitude and latitude, and the stay time and frequency information is extracted.
Further, S3 specifically comprises the following steps:
S301, publishing a notice on the website of the travel platform informing users that they can provide feedback through a specific mailbox, telephone number or online form;
S302, formulating a questionnaire with questions about the recommended travel line, covering satisfaction, degree of recommendation, price reasonableness, scenic spot richness and tour guide explanation level;
S303, sorting and analyzing the collected user feedback data to find out users' preferences and demands regarding the recommended travel line, as well as its problems and areas for improvement;
S304, publishing the analysis result as a report on the website of the travel platform, so that users can understand the strengths, weaknesses and improvement direction of the recommended travel line.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention obtains the stay points and visit frequencies of tourists from roaming track information, identifies hot scenic spots with a clustering algorithm, and obtains an optimal line with a graph algorithm. Several top-ranked scenic spots on the optimal line are selected and connected to form the hot line, which is then matched against the user's destination, so that a travel line suited to the user is provided. This solves the problem of automatically recommending travel lines that meet the user's requirements according to the user's travel preferences and historical roaming track.
Drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of an example of the Dijkstra algorithm of the present invention calculating the shortest path.
Detailed Description
The technical solutions of the present invention are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person skilled in the art without inventive effort fall within the scope of protection of the present invention.
As shown in Fig. 1, the invention provides a cross-province travel hot line recommendation method based on roaming information, which comprises the following steps:
S1, collecting the roaming data of users. The roaming data is reported by the user's mobile device, and the user's roaming track information is acquired by purchasing an operator's roaming data packet interface. The roaming track information comprises the user's number information, position information, time information, track information and the like; its data structure is shown in Table 1 and a data sample is shown in Table 2. After the roaming track information is acquired, it is stored in a database.
Table 1. Data structure of roaming track information
Field name | Data type | Description
id | int | Primary key
mobile | varchar | Mobile phone number
timestamp | datetime | Time stamp
latitude | decimal | Latitude
longitude | decimal | Longitude
location_name | varchar | Place name
stay_time | int | Residence time (seconds)
trajectory | geometry | Movement track
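As a rough illustration of how one row of Table 1 might be held in application code, the following Python dataclass mirrors the fields. The class name `RoamingRecord`, the choice of Python types and the sample values are assumptions made for the example; in particular, representing the geometry column as a list of (longitude, latitude) pairs is only one possible choice, and the values are not taken from Table 2.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple

@dataclass
class RoamingRecord:
    """One row of the roaming track table (field names follow Table 1)."""
    id: int                     # primary key
    mobile: str                 # mobile phone number
    timestamp: datetime         # time stamp of the observation
    latitude: float
    longitude: float
    location_name: str          # place name
    stay_time: int              # residence time in seconds
    trajectory: List[Tuple[float, float]]  # assumed form: (longitude, latitude) pairs

# Made-up sample row, for illustration only.
record = RoamingRecord(
    id=1,
    mobile="13800000000",
    timestamp=datetime(2023, 8, 1, 9, 30),
    latitude=23.10,
    longitude=113.30,
    location_name="Example Scenic Spot",
    stay_time=5400,
    trajectory=[(113.29, 23.09), (113.30, 23.10)],
)
stay_hours = record.stay_time / 3600  # convert seconds to hours
```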
Table 2 data sample of roaming trajectory information
S2, constructing a line recommendation algorithm project, which specifically comprises the following steps:
S201, extracting roaming track data from the database and processing its geographic information: the main stay points and tracks of users in each province are extracted, then analyzed and processed with a Geographic Information System (GIS) tool.
Specifically, the track data in the roaming track data table is input into the GIS to generate a Shapefile vector data file, and the Shapefile is converted into GeoJSON data by Python code. The GeoJSON data retains the attribute information of the original Shapefile. The relevant information is extracted from the GeoJSON data, the corresponding scenic spots are found by longitude and latitude, the stay time and frequency information is extracted, and a chart is output for display.
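The extraction step can be sketched as follows, assuming the GeoJSON data is already loaded as a Python dictionary with Point features. The property name `stay_time`, the scenic spot coordinates, the 1 km matching radius and the function names are hypothetical; the source does not specify how a coordinate is matched to a scenic spot, so nearest-spot matching by great-circle distance is used here as one plausible choice.

```python
import math
from collections import defaultdict

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def stay_stats(geojson, spots, max_km=1.0):
    """Return {spot_name: (total_stay_seconds, visit_count)} for Point features."""
    totals = defaultdict(lambda: [0, 0])
    for feature in geojson["features"]:
        lon, lat = feature["geometry"]["coordinates"][:2]  # GeoJSON order is lon, lat
        name, dist = min(
            ((n, haversine_km(lat, lon, s_lat, s_lon)) for n, (s_lat, s_lon) in spots.items()),
            key=lambda pair: pair[1],
        )
        if dist <= max_km:  # ignore points that are not near any known spot
            totals[name][0] += feature["properties"]["stay_time"]
            totals[name][1] += 1
    return {name: tuple(v) for name, v in totals.items()}

def point(lon, lat, stay):
    """Build a minimal GeoJSON Point feature (helper for the example only)."""
    return {"type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"stay_time": stay}}

# Hypothetical scenic spots as (latitude, longitude) and made-up track points.
spots = {"Spot A": (23.10, 113.30), "Spot B": (23.60, 113.60)}
data = {"type": "FeatureCollection", "features": [
    point(113.301, 23.101, 3600),
    point(113.299, 23.099, 1800),
    point(113.600, 23.600, 7200),
    point(114.500, 24.500, 600),   # far from both spots, so it is skipped
]}
result = stay_stats(data, spots)
```

Here the frequency is simply the number of matched stay points per scenic spot; a real pipeline might instead count distinct visits or distinct users.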
S202, identifying hot scenic spots: the stay points and track information of users are analyzed through the GeoJSON data, the hot tourist attractions and areas in each province are identified, and the hot degree is determined according to stay time and frequency.
Specifically, the longitude and latitude of each scenic spot in the GeoJSON data mark a data point where a tourist is located, with stay time as one dimension and frequency as the other. The stay time and frequency data are standardized so that the two dimensions have the same scale. The K-means algorithm is then used for cluster analysis, dividing tourists into different clusters according to their stay time and frequency data; each cluster represents a group of tourists with similar stay time and frequency patterns. For each cluster, the averages of stay time and frequency are calculated as indicators of that cluster; a higher average indicates that the cluster corresponds to a popular attraction.
Specifically, the standardization is performed in a Python program with the StandardScaler class of the sklearn library, as follows:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
Y_scaled = scaler.fit_transform(Y)
where X is the residence time data set, Y is the frequency data set, X_scaled is the standardized residence time data set, and Y_scaled is the standardized frequency data set.
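For reference, the scaling applied here is the z-score, $z = (x - \mu)/\sigma$, where $\sigma$ is the population standard deviation. The sketch below reproduces that arithmetic with the standard library only, so it can be checked by hand; the function name `standardize` and the sample values are illustrative. Note that scikit-learn's `StandardScaler` expects two-dimensional input of shape (n_samples, n_features), so a one-dimensional list of stay times would need to be reshaped into a single column before calling `fit_transform`.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Z-score each value using the mean and the population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

stay_hours = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up residence times
visits = [1, 1, 2, 4, 7]                 # made-up visit frequencies
stay_scaled = standardize(stay_hours)
visits_scaled = standardize(visits)
```

After scaling, both dimensions have mean 0 and unit variance, which keeps either one from dominating the Euclidean distance used in the clustering step.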
The specific steps for calculating the scenic spot hot degree with the K-means algorithm, according to the residence time and frequency of tourists, are as follows:
(1) Taking the residence time and frequency data of tourists from the cluster and standardizing it with the method above to prepare the data;
(2) Determining a suitable number of clusters K using indicators such as the elbow method;
(3) Performing K-means cluster analysis, wherein the objective function of the K-means algorithm is:
$$J=\sum_{i}\left(d(x_i,c_i)\right)^2$$
where $x_i$ is a sample point, $c_i$ is the center point of the cluster to which the sample belongs, and $d$ is a distance metric function.
The analysis specifically comprises the following steps:
First, the center points of the K clusters are initialized randomly, the distance between each sample and each center point is calculated, and each sample is assigned to the cluster of its nearest center point.
The distance between a sample and a center point is calculated by:
$$d(x,y)=\sqrt{(x_1-x_2)^2+(y_1-y_2)^2}$$
where $x_1$ and $y_1$ are the longitude and latitude of the sample, and $x_2$ and $y_2$ are the longitude and latitude of the center point.
Second, the center point of each cluster is updated to the mean of all samples in the cluster:
$$c_i=\frac{1}{n_i}\sum x_i$$
where $x_i$ denotes the sample points of the i-th cluster and $n_i$ is the number of samples in the i-th cluster.
Third, the two steps above are repeated until a stopping condition is reached; the stopping condition includes the samples in each cluster no longer changing.
(4) Analyzing the clustering result: statistics such as the mean and variance of the residence time and frequency of each cluster are calculated, from which the cluster sizes and the inter-cluster distances are obtained.
Specifically, the cluster size is calculated by the following formula, the sum running over the samples of the cluster:
$$J_i=\sum\left(d(x_i,c_i)\right)^2$$
where $x_i$ is a sample point, $c_i$ is the center point of the cluster to which the sample belongs, and $d$ is the distance metric function.
(5) Determining the hot degree of the scenic spots from the cluster sizes and inter-cluster distances. Clusters with long stay time and high frequency are defined as high hot degree groups, and clusters with short stay time and low frequency as low hot degree groups. A stay time of more than 3 hours counts as long, otherwise short; a frequency of more than 3 visits counts as high, otherwise low.
Specifically, the scenic spot popularity is calculated from the cluster size $J_i$, the inter-cluster distance $d(x_i,c_i)$, the residence time $st$ and the frequency $fr$ by the following formula:
where popularity is the hot degree of the scenic spot, interpreted as follows:
when popularity ≥ 0.25, the scenic spot is a hot scenic spot;
when 0.1 ≤ popularity < 0.25, the scenic spot is a moderately hot scenic spot;
when popularity < 0.1, the scenic spot is a cold scenic spot.
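The three threshold bands above translate directly into a small classification helper. The sketch below covers only the thresholding; it takes the hot degree value as given and does not attempt to compute it. The function name and the label strings are illustrative choices.

```python
def classify_popularity(popularity: float) -> str:
    """Map a hot degree value to one of the three bands listed above."""
    if popularity >= 0.25:
        return "hot"
    if popularity >= 0.1:
        return "moderate"
    return "cold"

# Made-up hot degree values for three hypothetical scenic spots.
scores = {"Spot A": 0.31, "Spot B": 0.18, "Spot C": 0.04}
bands = {name: classify_popularity(value) for name, value in scores.items()}
```

Checking the upper threshold first means the boundary values 0.25 and 0.1 fall into the higher band, which matches the "greater than or equal to" wording of the thresholds.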
S203, generating a hot line: combining the hot scenic spots with the roaming data of users, an optimal route is calculated with a graph algorithm to generate a cross-province hot travel line.
The specific steps for calculating the optimal route with the graph algorithm are as follows:
(1) Calculating the distances between all adjacent scenic spots in the relevant range, and obtaining the hot degree data of the scenic spots.
(2) Constructing a graph model in which each scenic spot is a node and the distance between nodes is the edge weight.
(3) Determining the starting point, the end point and the number of scenic spots to pass through.
(4) Calculating the shortest path, i.e. the optimal path, with the Dijkstra algorithm: taking a given node as the starting point, the Dijkstra algorithm yields the shortest distance from that node to any other node.
Taking Guangzhou as an example, several scenic spots are selected as shown in Fig. 2, with the "Lotus Island" scenic spot as the starting point and the "Conghua Liuxi Hot Spring" as the end point, and five scenic spots A, B, C, D and E in between. The weight between each pair of scenic spots is confirmed and the Dijkstra algorithm diagram is constructed (Fig. 2); the number on each line segment in Fig. 2 is the distance weight between the scenic spots at its two ends. Once the weight on each line segment is confirmed, the value of each node is calculated starting from the starting point, with the distance between two points that have no route between them taken as +∞. At each step the minimum value in the graph is fixed, and the calculation advances step by step until the final value is confirmed. The path that yields the minimum value is the shortest path, and the value at the end point is the length of the shortest path.
By this method, the optimal route from Lotus Island to the Conghua Liuxi Hot Spring is calculated as Lotus Island → C → D → Conghua Liuxi Hot Spring, with a distance weight of 12, and several travel routes are generated and ranked by distance weight from high to low.
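The procedure described above can be sketched with a priority queue. The edge weights in Fig. 2 are not reproduced in the text, so the weights below are hypothetical, chosen only so that the shortest path runs start → C → D → end with a total weight of 12 as in the example; the node labels `S` (start) and `T` (end) and the function name are likewise illustrative.

```python
import heapq
from collections import defaultdict

def dijkstra(graph, start, end):
    """Return (distance, path) of the shortest route from start to end."""
    dist = {node: float("inf") for node in graph}  # unreachable nodes stay at +infinity
    dist[start] = 0
    prev = {}
    heap = [(0, start)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)   # node with the smallest tentative value
        if u in done:
            continue
        done.add(u)                  # its value is now fixed
        if u == end:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dist[end] == float("inf"):
        return float("inf"), []
    path = [end]
    while path[-1] != start:         # walk back along predecessor links
        path.append(prev[path[-1]])
    return dist[end], path[::-1]

# Hypothetical undirected edges (node, node, distance weight).
edges = [("S", "A", 4), ("S", "C", 3), ("A", "B", 5), ("B", "T", 6),
         ("C", "D", 4), ("C", "E", 7), ("D", "T", 5), ("E", "T", 8)]
graph = defaultdict(dict)
for a, b, w in edges:
    graph[a][b] = w
    graph[b][a] = w

distance, path = dijkstra(graph, "S", "T")
# → distance == 12, path == ["S", "C", "D", "T"]
```

Plain Dijkstra minimizes total distance only; the requirement in step (3) that the route pass through a given number of scenic spots is not handled by this sketch.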
Cold scenic spots on the optimal path are then removed and hot scenic spots near the optimal path are added; that is, hot scenic spots within 3 km of the scenic spots on the optimal path are connected in, generating the hot line.
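A minimal sketch of this post-processing step is given below. The source does not say where an added hot scenic spot is placed in the visiting order, so inserting it directly after its nearest remaining path spot is an assumption, as are the function names, the coordinates and the hot degree labels.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    r = 6371.0
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dp = p2 - p1
    dl = math.radians(b[1] - a[1])
    h = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))

def build_hot_route(best_path, coords, labels, radius_km=3.0):
    """Drop cold spots from best_path, then splice in nearby hot spots."""
    kept = [s for s in best_path if labels.get(s) != "cold"]
    if not kept:
        return []
    route = list(kept)
    for spot, label in labels.items():
        if label != "hot" or spot in route:
            continue
        # Nearest remaining path spot decides both eligibility and position.
        nearest = min(kept, key=lambda k: haversine_km(coords[spot], coords[k]))
        if haversine_km(coords[spot], coords[nearest]) <= radius_km:
            route.insert(route.index(nearest) + 1, spot)  # assumed placement rule
    return route

# Hypothetical coordinates (latitude, longitude) and hot degree labels.
coords = {"P1": (23.00, 113.00), "P2": (23.04, 113.00), "P3": (23.08, 113.00),
          "P4": (23.12, 113.00), "H1": (23.08, 113.01), "H2": (23.50, 113.50)}
labels = {"P1": "moderate", "P2": "cold", "P3": "hot",
          "P4": "moderate", "H1": "hot", "H2": "hot"}
hot_route = build_hot_route(["P1", "P2", "P3", "P4"], coords, labels)
```

In this example P2 is removed as a cold spot, H1 (about 1 km from P3) is added, and H2 is left out because it is far from every spot on the path.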
S204, making personalized recommendations from the generated hot lines according to the user's interests and preferences: the generated hot lines are matched against the user's historical roaming track data, the cross-province travel lines suitable for the user are calculated, and travel route recommendations that meet the user's requirements are finally provided, specifically as follows:
(1) Extracting roaming track data from the database, including the user's past travel destinations (scenic spots), stay times, travel tracks and the like.
(2) According to the user's historical roaming data and the travel hot line generated in step S203, matching the user's travel destinations with the scenic spots contained in the line, and calculating the hot travel lines that match the user.
(3) Generating travel routes of interest to the user, and recommending suitable travel routes by combining the user's preferences with a recommendation algorithm, the recommendation algorithm being a collaborative filtering algorithm. This specifically comprises the following steps:
1) Calculating the similarity between users using cosine similarity; for two users denoted i and k, the similarity between them is s(i, k).
2) Calculating each user's score for each travel route; the score of user i for travel route j is r(i, j).
3) Predicting the travel route scores of a user k who is similar to user i, expressed by the following formula:
where p(k, j) is the predicted score of user k for travel route j, s(i, k) is the similarity between user i and user k, and r(i, j) is the score of user i for travel route j.
4) Recommending a travel route: the route with the highest predicted score is recommended; letting the recommended travel route of user k be $J^*$, then:
$$J^*=\arg\max_{j}\,p(k,j)$$
where the maximum is taken over the routes j, so that $J^*$ is the route with the highest predicted score for user k.
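Steps 1) to 4) can be sketched as a small user-based collaborative filter. The prediction formula itself is not reproduced in the text, so the similarity-weighted average used below, $p(k,j)=\sum_i s(i,k)\,r(i,j)\,/\sum_i |s(i,k)|$, is an assumed, commonly used form rather than the patent's own expression. Treating a score of 0 as "not yet rated" and computing cosine similarity over the full score vectors are further assumptions, and the ratings are made up.

```python
import math

def cosine(u, v):
    """Cosine similarity s(i, k) between two users' score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def predict(ratings, k, j):
    """Predicted score p(k, j): similarity-weighted average (assumed form)."""
    num = den = 0.0
    for i, row in ratings.items():
        if i == k or row[j] == 0:    # skip user k and users who have not rated route j
            continue
        s = cosine(row, ratings[k])
        num += s * row[j]
        den += abs(s)
    return num / den if den else 0.0

def recommend(ratings, k):
    """Return the index of the unrated route with the highest predicted score."""
    unrated = [j for j, r in enumerate(ratings[k]) if r == 0]
    return max(unrated, key=lambda j: predict(ratings, k, j))

# Made-up scores r(i, j) for four routes; 0 means the user has not rated the route.
ratings = {
    "u1": [5, 4, 0, 0],
    "u2": [4, 5, 5, 1],
    "u3": [1, 1, 2, 5],
}
best_route = recommend(ratings, "u1")
p2 = predict(ratings, "u1", 2)
p3 = predict(ratings, "u1", 3)
```

Because u1's scores resemble u2's far more than u3's, the route that u2 rated highly (index 2) receives the higher predicted score and is the one recommended.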
S3, collecting user feedback on the recommended lines, such as user satisfaction and scores; the feedback is used to optimize and improve the recommendation algorithm so as to provide more accurate and personalized travel route recommendations, specifically as follows:
(1) A notice is published on the website of the travel platform informing users that they can provide feedback through a specific mailbox, telephone number or online form.
(2) A questionnaire is formulated with questions about the recommended travel line, such as satisfaction, degree of recommendation, price reasonableness, scenic spot richness and tour guide explanation level, to make it easy for users to provide feedback.
(3) The collected user feedback data is sorted and analyzed to find out users' preferences and demands regarding the recommended travel line, as well as its problems and areas for improvement.
(4) The analysis result is published as a report on the website of the travel platform, so that users can understand the strengths, weaknesses and improvement direction of the recommended travel line.
Through these steps, user feedback, including satisfaction and scores, is collected conveniently, so that the recommended travel lines can be better optimized and user satisfaction improved.
Finally, it should be noted that the above description only illustrates the technical solution of the present invention and does not limit its scope; those skilled in the art may make simple modifications and equivalent substitutions to the technical solution of the present invention without departing from its spirit and scope.

Claims (10)

1. The cross-province travel hot line recommending method based on roaming information is characterized by comprising the following steps of:
S1, acquiring roaming track information of a user, and storing the roaming track information in a database, wherein the roaming track information comprises number information, position information, time information and track information of the user;
S2, constructing a line recommendation algorithm project, which specifically comprises the following steps:
S201, extracting stay points and tracks from the user roaming track information in the data, and analyzing and processing the stay point and track information by using a geographic information system tool to obtain stay time and frequency information of the user at different scenic spots;
S202, analyzing the stay point and track information of the user through data analysis, identifying hot tourist attractions and areas in each province, and determining the hot degree according to stay time and frequency;
S203, combining the hot scenic spots and the roaming track information of the user, calculating an optimal route by using a graph algorithm, removing cold scenic spots on the optimal route, adding hot scenic spots within a set distance of the optimal route, and connecting them to generate a hot line;
S204, matching the travel destination of the user with the scenic spots contained in the hot line to provide a travel line meeting the user's requirements;
S3, collecting user feedback information, wherein the feedback information comprises satisfaction degree and score of the user, and using the collected user feedback information for optimizing and improving the line recommendation algorithm.
2. The method for recommending a cross-province travel hot line based on roaming information according to claim 1, wherein in step S202, a K-means algorithm is adopted to perform cluster analysis on the hot degree of scenic spots, and tourists are divided into different clusters according to residence time and frequency, and the method specifically comprises the following steps:
S2021, taking out residence time and frequency data of tourists from the cluster, and normalizing the data;
S2022, determining the number K of clusters by using the elbow method;
S2023, carrying out K-means cluster analysis, wherein the objective of the K-means algorithm is expressed as:
$J = \sum_{i} d(x_i, c_i)^2$
in the above, $x_i$ represents a sample point, $c_i$ represents the center point of the cluster to which the sample belongs, and d is a distance metric function;
S2024, calculating the size of the clusters and the distance between the clusters according to the clustering result, and determining the scenic spot hot degree by combining the residence time and the frequency.
3. The method for cross-province travel hot line recommendation based on roaming information according to claim 2, wherein the cluster size is calculated by the following formula:
$J_i = \sum d(x_i, c_i)^2$
in the above, $x_i$ represents a sample point, $c_i$ represents the center point of the cluster to which the sample belongs, and d represents a distance metric function;
the scenic spot popularity is calculated by the following formula:
in the above, polarity is the hot degree of the scenic spot, $J_i$ is the size of the cluster, $d(x_i, c_i)$ is the distance between clusters, st denotes the residence time, and fr denotes the frequency; specifically:
when polarity ≥ 0.25, the scenic spot is a hot scenic spot;
when 0.1 ≤ polarity < 0.25, the scenic spot is a common hot scenic spot;
when polarity < 0.1, the scenic spot is a cold scenic spot.
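As a non-limiting illustration, the three bands of claim 3 can be expressed as a small Python function. The function name and the returned labels are assumptions of this sketch, the middle band is read as 0.1 ≤ polarity < 0.25, and the hot degree value is taken as an input because its formula is not reproduced in the text.

```python
def classify_scenic_spot(polarity):
    """Map a hot degree value to one of the three bands of claim 3."""
    if polarity >= 0.25:
        return "hot"
    if polarity >= 0.1:
        return "common hot"
    return "cold"

# Hypothetical hot degree values for three scenic spots.
labels = [classify_scenic_spot(p) for p in (0.40, 0.18, 0.03)]
```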
4. The method for cross-province travel hot line recommendation based on roaming information according to claim 2, wherein S2023 specifically comprises the steps of:
step one, randomly initializing the center points of K clusters, calculating the distance between each sample and each center point, and assigning each sample to the cluster of its nearest center point;
the distance between each sample and a center point is calculated by:
$d(x, y) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
in the above formula, $x_1$ and $y_1$ respectively represent the longitude and latitude of the sample, and $x_2$ and $y_2$ respectively represent the longitude and latitude of the center point;
step two, updating the center point of each cluster by setting it to the average value of all samples in the cluster; the cluster center is updated by:
$c_i = \frac{1}{n_i} \sum x_i$
in the above, $c_i$ represents the center point of the i-th cluster, $x_i$ represents the sample points of the i-th cluster, over which the sum is taken, and $n_i$ represents the number of samples of the i-th cluster;
step three, repeating the above two steps until a stopping condition is reached, wherein the stopping condition is that the samples in each cluster no longer change.
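As a non-limiting illustration, the three steps of claim 4 can be sketched in plain Python. The function names, the `seed` and `max_iter` parameters, and the sample coordinates are assumptions of this sketch. The distance is the planar formula of the claim applied directly to longitude and latitude, and initial centers are drawn from the samples themselves, which is one common way to initialize randomly.

```python
import math
import random

def euclidean(p, q):
    """Planar distance between two (longitude, latitude) pairs, as in claim 4."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def k_means(samples, k, seed=0, max_iter=100):
    """Return (centers, assignment) after iterating until assignments stop changing."""
    rng = random.Random(seed)
    # Step one: random initial centers, drawn from the samples.
    centers = rng.sample(samples, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each sample to the cluster of its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(s, centers[c]))
            for s in samples
        ]
        # Step three: stop when no sample changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step two: move each center to the mean of its members.
        for c in range(k):
            members = [s for s, a in zip(samples, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centers, assignment

# Hypothetical stay points forming two well-separated groups.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = k_means(samples, k=2)
```

The `max_iter` bound is only a safeguard; the loop normally ends through the stopping condition of step three.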
5. The method for cross-province travel hot line recommendation based on roaming information according to claim 2, wherein the optimal route in step S203 is obtained by:
S2031, calculating the distances between all adjacent scenic spots in the relevant area, and simultaneously acquiring the hot degree data of the scenic spots;
S2032, constructing a graph model, wherein each scenic spot is used as a node in the graph, and the distance between nodes is used as the weight of the edge;
S2033, determining a starting point and an end point and the number of scenic spots required to pass through;
S2034, calculating the optimal path using the Dijkstra algorithm.
6. The method for recommending a cross-province travel hot line based on roaming information according to claim 5, wherein the method for calculating the optimal path by the Dijkstra algorithm is as follows: starting from the starting point, the distance value to each node is calculated; where no route exists between two nodes, the value is taken as +∞; at each step the node with the minimum value in the graph is fixed, and the calculation advances step by step until the final value is determined; the path producing the minimum value is the optimal path, and the value at the end point is the distance of the optimal path.
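As a non-limiting illustration, the procedure of claim 6 can be sketched in Python with a priority queue. The adjacency-dictionary representation, the function name, and the node names are assumptions of this sketch; it finds the shortest path between two scenic spots and does not enforce the number of scenic spots to pass through mentioned in S2033.

```python
import heapq
import math

def dijkstra(graph, start, end):
    """Return (distance, path) of the shortest path from start to end.

    graph maps each node to a dict {neighbor: edge_weight}; every node
    must appear as a key. Unreachable nodes keep the value +infinity.
    """
    dist = {node: math.inf for node in graph}
    dist[start] = 0.0
    prev = {}
    fixed = set()
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in fixed:
            continue  # stale queue entry
        fixed.add(node)  # the minimum value in the graph is now final
        if node == end:
            break
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist[neighbor]:
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if dist[end] == math.inf:
        return math.inf, []
    # Walk the predecessor links back from the end point.
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[end], path

# Hypothetical scenic spots A..E with distances as edge weights; E is isolated.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
    "E": {},
}
distance, path = dijkstra(graph, "A", "D")
```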
7. The method for cross-province travel hot line recommendation based on roaming information of claim 1, wherein S204 specifically comprises the steps of:
S2041, extracting roaming track data from a database, wherein the roaming track data comprise the past travel destinations (scenic spots), stay time and travel track of a user;
S2042, according to the historical roaming data of the user and the travel hot line generated in step S203, matching the travel destination of the user with the scenic spots contained in the line, and calculating the hot travel line matched with the user;
S2043, generating travel routes of interest to the user, and recommending suitable travel routes for the user by combining the preference of the user with a recommendation algorithm.
8. The method for cross-province travel hot line recommendation based on roaming information according to claim 7, wherein S2043 is specifically performed by:
1) Calculating the similarity between users using cosine similarity, wherein two users are denoted i and k, and the similarity between the two users is s(i, k);
2) Calculating the travel route scores: for each user, the score given to each travel route is calculated, and the score of user i for travel route j is denoted r(i, j);
3) Predicting the travel route scores of user k, who is similar to user i, expressed by:
in the above formula, p(k, j) represents the predicted score of user k for travel route j, s(i, k) represents the similarity between user i and user k, and r(i, j) represents the score of user i for travel route j;
4) Recommending a travel route: the travel route with the highest predicted score is recommended to the user, and the travel route recommended to user k is denoted J*; then:
$J^{*} = \max_{j}\, p(k, j)$
in the above equation, max(p(k, j)) represents the highest predicted score of user k over the travel routes j.
9. The method for recommending a cross-province travel hot line based on roaming information according to claim 1, wherein the method for analyzing and processing the stay points and the tracks by the geographic information system tool is as follows: the track data in the roaming track data are input into a GIS to generate a Shapefile vector data file, the Shapefile is converted into GeoJSON format data by Python code, relevant information is extracted from the GeoJSON data, the corresponding scenic spots are found by longitude and latitude, and the stay time and frequency information are extracted.
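As a non-limiting illustration, the last part of claim 9 (extracting stay time and frequency per scenic spot from GeoJSON data) can be sketched with the Python standard library. The property names `arrive` and `leave`, the matching radius `max_deg`, the scenic spot table, and the function names are all assumptions of this sketch; the text does not specify the attribute schema. The Shapefile-to-GeoJSON conversion itself is outside the sketch and could, for example, be done with a library such as geopandas.

```python
import json
import math
from datetime import datetime

def nearest_spot(lon, lat, spots, max_deg=0.01):
    """Name of the closest scenic spot within max_deg degrees, or None."""
    best, best_d = None, max_deg
    for name, (s_lon, s_lat) in spots.items():
        d = math.hypot(lon - s_lon, lat - s_lat)
        if d <= best_d:
            best, best_d = name, d
    return best

def stay_statistics(geojson_text, spots):
    """Total stay time (seconds) and visit count per scenic spot.

    Expects a GeoJSON FeatureCollection of Point features whose
    properties carry ISO 8601 'arrive' and 'leave' timestamps (assumed).
    """
    stats = {}
    for feature in json.loads(geojson_text)["features"]:
        lon, lat = feature["geometry"]["coordinates"][:2]
        spot = nearest_spot(lon, lat, spots)
        if spot is None:
            continue  # stay point is not near any known scenic spot
        props = feature["properties"]
        arrive = datetime.fromisoformat(props["arrive"])
        leave = datetime.fromisoformat(props["leave"])
        entry = stats.setdefault(spot, {"stay_seconds": 0.0, "visits": 0})
        entry["stay_seconds"] += (leave - arrive).total_seconds()
        entry["visits"] += 1
    return stats
```

GeoJSON stores coordinates in longitude, latitude order, which is why the sketch unpacks them in that order.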
10. The method for recommending a cross-province travel hot line based on roaming information according to claim 1, wherein S3 specifically comprises the following steps:
S301, publishing a notice on the website of the travel platform, informing users that feedback information may be provided through a specific mailbox, telephone or online form;
S302, formulating a questionnaire containing questions related to the recommended travel line, wherein the questionnaire covers satisfaction, degree of recommendation, price reasonableness, richness of scenic spots and quality of the tour guide's explanations;
S303, sorting and analyzing the collected user feedback data to find out users' preferences and demands regarding the recommended travel line, as well as problems and areas for improvement;
S304, publishing the analysis result as a report on the website of the travel platform, so that users can understand the advantages, disadvantages and direction of improvement of the recommended travel line.
CN202311119914.5A 2023-08-31 2023-08-31 Cross-province travel hot line recommendation method based on roaming information Active CN117076786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311119914.5A CN117076786B (en) 2023-08-31 2023-08-31 Cross-province travel hot line recommendation method based on roaming information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311119914.5A CN117076786B (en) 2023-08-31 2023-08-31 Cross-province travel hot line recommendation method based on roaming information

Publications (2)

Publication Number Publication Date
CN117076786A (en) 2023-11-17
CN117076786B (en) 2024-04-16

Family

ID=88702252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311119914.5A Active CN117076786B (en) 2023-08-31 2023-08-31 Cross-province travel hot line recommendation method based on roaming information

Country Status (1)

Country Link
CN (1) CN117076786B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120110031A1 (en) * 2010-10-28 2012-05-03 Tomi Lahcanski System for locating nearby picture hotspots
CN105956951A (en) * 2016-05-05 2016-09-21 杭州诚智天扬科技有限公司 Method of identifying hot tourist route based on mobile signaling
CN110598154A (en) * 2019-09-16 2019-12-20 新疆银狐数据科技有限公司 Tourism comprehensive statistics big data platform based on fusion of multi-channel data
CN110874780A (en) * 2018-09-01 2020-03-10 昆山炫生活信息技术股份有限公司 Scenic spot playing system and recommendation method based on big data statistics
CN111177572A (en) * 2020-01-16 2020-05-19 西北大学 Personalized tour route recommendation method based on dynamic interest of user
CN111429220A (en) * 2020-03-25 2020-07-17 西安交通大学 Travel route recommendation system and method based on operator big data
CN112084401A (en) * 2020-08-18 2020-12-15 桂林理工大学 Tour route customizing device and method
CN112991008A (en) * 2021-03-04 2021-06-18 北京嘀嘀无限科技发展有限公司 Position recommendation method and device and electronic equipment
CN113225260A (en) * 2021-04-25 2021-08-06 湖南大学 Mixed clustering opportunistic routing implementation method based on machine learning
US11120349B1 (en) * 2018-03-06 2021-09-14 Intuit, Inc. Method and system for smart detection of business hot spots

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
岳秋霞: "基于桂林景点数据的智能旅游推荐研究", 中国优秀硕士学位论文全文数据库信息科技辑, pages 36 - 53 *

Also Published As

Publication number Publication date
CN117076786B (en) 2024-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant