WO2016107417A1 - Method and device for mining travel routes based on a travel target area - Google Patents

Method and device for mining travel routes based on a travel target area

Info

Publication number
WO2016107417A1
Authority
WO
WIPO (PCT)
Prior art keywords
attraction
sequence
attractions
travel
client
Prior art date
Application number
PCT/CN2015/097599
Other languages
English (en)
French (fr)
Inventor
李天宁
Original Assignee
广州神马移动信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广州神马移动信息科技有限公司
Publication of WO2016107417A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/14 Travel agencies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • The present invention relates to the field of the Internet, and in particular to a method and device for mining travel routes based on a travel target area.
  • Some travel websites collect information on a large number of tourist attractions and have designed a number of travel routes. Users can enter conditions on the website, such as the places they want to visit and the travel time, and then design a travel route with the website's help.
  • On the one hand, collecting and organizing this attraction information and pre-designing the travel routes require a great deal of labor; on the other hand, some manually set parameters (such as the popularity of an attraction or its recommendation index) are set according to the staff's experience and impressions, and may deviate from the experience and impressions of actual tourists.
  • Traditional search engines can meet users' general needs by providing aggregated structured data. For example, for a given attraction or for the attractions of a given city, a traditional search engine can organize and present information as structured data, which largely lets users obtain information more conveniently. However, such structured data is still costly for users to use. When planning an itinerary, users must integrate and organize the information themselves from the hundreds of thousands of travel guides and travelogues returned by search engines, which does not meet their needs well.
  • One technical problem to be solved by the present invention is to provide a method and device for mining travel routes based on a travel target area, which can automatically extract and analyze high-quality travel routes.
  • In the present invention, travel routes are represented in the form of attraction sequences.
  • A method for mining travel routes based on a travel target area is provided.
  • The travel route is represented in the form of an attraction sequence, and the method comprises: retrieving travel articles related to the travel target area; obtaining, for each travel article, an attraction sequence composed of the attractions included in it; and screening attraction sequences containing one or more specific attractions as candidate attraction sequences for the travel target area.
  • The candidate attraction sequences can serve as travel routes that can be recommended to the user.
  • The method may further comprise: setting an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule, and sorting the plurality of candidate attraction sequences in descending order of attraction sequence score, so that they can be provided to the client in this order in response to a request from the client.
  • In this way, the efficiency of recommending travel routes to the user can be further improved according to actual needs.
  • The predetermined sequence scoring rule may be based on at least one of the following features: time rationality; whether there are duplicate attractions; the proportion of popular attractions; the proportion of unpopular attractions; travel intensity; and route length.
  • The method may further include: determining a travel target area in response to a search request from the client that includes a travel feature word indicating travel intention and a travel target area word indicating the travel target area; and, based on the determined travel target area, providing at least one of the candidate attraction sequences to the client.
  • In this way, travel routes can be recommended to the user by communicating with the client.
  • The method may further comprise: calculating a relevance score for each candidate attraction sequence based on travel condition information from the client, the number of specific attractions, and the attraction sequence score, wherein at least one of the candidate attraction sequences is provided to the client based on the relevance score.
  • In this way, travel routes can be recommended to the user in a more targeted manner.
  • The method may further comprise: filtering out attraction sequences that contain attractions on an attraction blacklist.
  • The step of obtaining an attraction sequence may include: searching the travel article for tour time information related to the tour time of the attractions; if no tour time information is found, estimating the tour time information corresponding to each attraction according to the distance between two attractions that appear consecutively in the travel article and/or tour time recommendations obtained from a third party; if tour time information is found, extracting the tour time information corresponding to each attraction from the travel article; and forming the attraction sequence by associating each attraction with its corresponding tour time information.
  • The method may further include: setting an attraction score for each attraction according to a predetermined attraction scoring rule, and setting the specific attractions based on the attraction scores; and/or providing the client with a list of at least some of the attractions included in the travel articles, and setting the attractions selected by the user from the list as specific attractions.
  • In this way, the recommended attraction sequences (travel routes) include the most recommended attractions and/or the attractions the user particularly wants to visit.
  • The predetermined attraction scoring rule may be based on at least one of the following features: the search page views of the attraction; the search volume for the attraction; the number of attraction sequences containing the attraction; and third-party ratings of the attraction.
  • A device for mining travel routes based on a travel target area is provided, comprising: a travel article retrieval device for retrieving travel articles related to the travel target area; an attraction sequence obtaining device for obtaining, for each travel article, an attraction sequence composed of the attractions included in it; and an attraction sequence screening device for screening attraction sequences containing one or more specific attractions as candidate attraction sequences for the travel target area.
  • The device may further comprise: an attraction sequence scoring device for setting an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule, and for sorting the plurality of candidate attraction sequences in descending order of attraction sequence score, so that they can be provided to the client in this order in response to a request from the client.
  • The device may further include: a target area determining device for determining a travel target area in response to a search request from the client that includes a travel feature word indicating travel intention and a travel target area word indicating the travel target area; and an attraction sequence providing device for providing the client with at least one of the candidate attraction sequences based on the travel target area determined by the target area determining device.
  • The device may further comprise: a relevance score calculation device for calculating a relevance score for each candidate attraction sequence based on travel condition information from the client, the number of specific attractions, and the attraction sequence score, wherein the attraction sequence providing device provides the client with at least one of the candidate attraction sequences based on the relevance score.
  • The device may further comprise: an attraction sequence filtering device for filtering out attraction sequences that contain attractions on an attraction blacklist.
  • The attraction sequence obtaining device may include: a search device for searching the travel article for tour time information related to the tour time of the attractions; a tour time information estimating device for estimating, if no tour time information is found, the tour time information corresponding to each attraction according to the distance between two attractions that appear consecutively in the travel article and/or tour time recommendations obtained from a third party; a tour time information extracting device for extracting, if tour time information is found, the tour time information corresponding to each attraction from the travel article; and an attraction sequence generating device for forming the attraction sequence by associating each attraction with its corresponding tour time information.
  • The device may further include: an attraction scoring device for setting an attraction score for each attraction according to a predetermined attraction scoring rule; and/or an attraction list providing device for providing the client with a list of at least some of the attractions included in the travel articles; and a specific attraction setting device for setting specific attractions based on the attraction scores, and/or setting the attractions selected by the user from the list as specific attractions.
  • FIG. 1 is a schematic flow chart of a method of mining travel routes in accordance with one embodiment of the present invention.
  • FIG. 2 is a schematic flow diagram of steps that may be further included in step S200 of FIG. 1.
  • FIGS. 3A and 3B are schematic flow diagrams of two ways of setting the specific attractions used in step S300 of FIG. 1.
  • FIG. 4 is a schematic flow diagram of a method of mining travel routes in accordance with a modified embodiment of the present invention.
  • FIG. 5 is a schematic flow diagram of a method of recommending travel routes to a user by communicating with a client.
  • FIG. 6 is a schematic block diagram of a device for mining travel routes in accordance with one embodiment of the present invention.
  • FIG. 7 is a schematic block diagram of devices that the attraction sequence obtaining device 200 of FIG. 6 may further include.
  • FIG. 8 is a schematic block diagram of devices that can be used to set specific attractions.
  • FIG. 9 is a schematic block diagram of a device for mining travel routes in accordance with a modified embodiment of the present invention.
  • FIG. 10 is a schematic block diagram of a device for recommending travel routes to a user by communicating with a client.
  • FIG. 1 is a schematic flow chart of a method of mining travel routes in accordance with one embodiment of the present invention.
  • At step S100, travel articles relating to a travel target area are retrieved.
  • The travel target area can be a geographical area of a certain spatial extent that includes several tourist attractions, among which people need to select the most worthwhile ones and visit them along an optimized route.
  • The travel target area can be a large scenic area, such as Yuanmingyuan, Jiuzhaigou, or the Bashang Grassland.
  • The travel target area may also be a city or province, or a part of a city or province, such as Hangzhou, Hainan, or northern Xinjiang.
  • The travel target area may also be a large region spanning several provinces, such as South China or Northeast China.
  • For overseas travel, it can also be one or several countries, such as Japan or the United States. It can even be a continent or a part of a continent, such as Africa or Eastern Europe.
  • Travel articles are travel-related articles on the Internet, such as the travel notes and travel guides written by tourists, as well as the itineraries on travel websites.
  • At step S200, for each retrieved travel article, an attraction sequence composed of the attractions included in that article is obtained.
  • Such an attraction sequence may represent a corresponding travel route.
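  • As an illustration of step S200, the following is a minimal sketch of turning one travel article into an attraction sequence. It assumes a list of known attraction names is available (`known_attractions` is hypothetical) and orders attractions by their first mention in the text; the source does not specify how attractions are recognized or how repeated mentions are handled.

```python
def extract_attraction_sequence(article_text, known_attractions):
    """Return known attractions in order of first appearance in the article.

    Assumption: simple substring matching against a gazetteer of names;
    each attraction is kept once, at its first mention.
    """
    hits = []
    for name in known_attractions:
        position = article_text.find(name)
        if position != -1:
            hits.append((position, name))
    # Sort by position in the text so the sequence follows the article's order.
    return [name for position, name in sorted(hits)]


article = ("Day 1: we started at West Lake, then walked to Leifeng Pagoda. "
           "Day 2: Lingyin Temple.")
known = ["Lingyin Temple", "West Lake", "Leifeng Pagoda", "Xixi Wetland"]
sequence = extract_attraction_sequence(article, known)
print(sequence)
```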
  • At step S200, in addition to identifying the attractions in a travel article, it is also possible to analyze the article for the tour time information of each attraction, for example, the time spent visiting each attraction, whether the visit is spread over several days, and on which day each attraction is visited. This information can also help users plan their itineraries.
  • FIG. 2 is a flow chart schematically illustrating steps that may be used to obtain tour time information.
  • First, the travel article is searched for tour time information related to the tour time of the attractions.
  • The tour time information includes, for example, the number of days spent on the attractions, the duration of the visit to each attraction, and other information related to the tour time of the attractions.
  • At step S215, it is judged whether tour time information has been found in the travel article.
  • If tour time information has been found, then at step S220 the tour time information corresponding to each attraction is extracted directly from the travel article.
  • If no tour time information has been found, then at step S230 the tour time information corresponding to each attraction is estimated according to the distance between two attractions that appear consecutively in the travel article and/or tour time recommendations obtained from a third party.
  • At step S240, an attraction sequence is formed by associating each attraction with its corresponding tour time information.
  • The attraction sequence thus formed not only lists the attractions on the travel route but also indicates the tour time information for each attraction, such as whether two attractions are visited on two separate days and how long the visit to an attraction lasts. This makes it possible to offer the user a more reasonable travel plan, and also helps the user choose a plan from the recommended travel routes (candidate attraction sequences).
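  • The extract-or-estimate logic of steps S220 to S240 can be sketched as follows. This is only an illustration under stated assumptions: the text pattern "Name (3 hours)", the `suggested_hours` table standing in for third-party recommendations, and the per-attraction fallback are all hypothetical, and the distance-based estimate mentioned above is not modelled.

```python
import re


def attach_tour_time(article_text, attractions, suggested_hours, default_hours=2.0):
    """Associate each attraction with a tour duration in hours.

    Durations written in the article as "<name> (<n> hours)" are extracted;
    otherwise a third-party suggestion (or a default) is used as the estimate.
    """
    sequence = []
    for name in attractions:
        pattern = re.escape(name) + r"\s*\((\d+(?:\.\d+)?)\s*hours?\)"
        match = re.search(pattern, article_text)
        if match:
            hours, source = float(match.group(1)), "extracted"
        else:
            hours, source = suggested_hours.get(name, default_hours), "estimated"
        sequence.append({"attraction": name, "hours": hours, "source": source})
    return sequence


text = "We spent the morning at West Lake (3 hours) and then saw Leifeng Pagoda."
timed = attach_tour_time(text, ["West Lake", "Leifeng Pagoda"],
                         suggested_hours={"Leifeng Pagoda": 1.5})
print(timed)
```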
  • After completion of step S200, a plurality of travel routes represented by attraction sequences are obtained.
  • At step S300, the attraction sequences may be screened based on one or more specific attractions; that is, attraction sequences containing one or more specific attractions are selected as candidate attraction sequences for the travel target area.
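  • A minimal sketch of this screening step, assuming each attraction sequence is a list of attraction names. The `min_hits` parameter is a hypothetical knob: with the default of 1, a sequence qualifies if it contains at least one specific attraction.

```python
def screen_sequences(sequences, specific_attractions, min_hits=1):
    """Keep sequences that contain at least `min_hits` specific attractions."""
    specific = set(specific_attractions)
    return [seq for seq in sequences
            if len(specific.intersection(seq)) >= min_hits]


all_sequences = [["A", "B"], ["C"], ["B", "D", "E"]]
candidates = screen_sequences(all_sequences, ["B", "E"])
print(candidates)
```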
  • FIGS. 3A and 3B are schematic flow diagrams of two ways of setting the specific attractions used in step S300 of FIG. 1.
  • FIG. 3A shows how specific attractions are set based on attraction scores.
  • At step S310, an attraction score is set for each attraction according to a predetermined attraction scoring rule.
  • The predetermined attraction scoring rule may be based on at least one of the following features: the search page views of the attraction; the search volume for the attraction; the number of attraction sequences containing the attraction; and a third party's rating of the attraction.
  • The search page views (PV) and the search volume for an attraction (which reflects its popularity) can be obtained by analyzing the search logs of a search site.
  • For example, if the search page views (PV) of an attraction are x1, the rating of the attraction (i.e., the third-party rating) is x2, the popularity of the attraction (the search volume for it) is x3, and the number of routes covering the attraction (the number of attraction sequences containing it) is x4, then the final attraction score Vspot is:
  • Vspot = a1*x1 + a2*x2 + a3*x3 + a4*x4
  • Here a1, a2, a3, and a4 are the weights of the respective features, and these weights can be obtained by training on training data.
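  • The weighted sum for Vspot can be worked through in a few lines. The feature values and weights below are made up for illustration (the source says the weights come from training and gives no numbers), and the features are assumed to be normalized to comparable ranges.

```python
def attraction_score(features, weights):
    """Compute Vspot = a1*x1 + a2*x2 + a3*x3 + a4*x4 for one attraction."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(a * x for a, x in zip(weights, features))


# Hypothetical normalized features: page views, third-party rating,
# search volume, number of routes covering the attraction.
x = (0.8, 0.9, 0.5, 0.3)
# Hypothetical weights a1..a4.
a = (0.4, 0.3, 0.2, 0.1)
v_spot = attraction_score(x, a)
print(round(v_spot, 2))
```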
  • Specific attractions are then set based on the attraction scores. For example, the one or several attractions with the highest attraction scores can be set as specific attractions.
  • This approach is useful when the user has not set any attractions they particularly want to visit, or has set too few (e.g., fewer than a predetermined number).
  • The attractions with the highest scores, that is, the most popular ones, are generally considered worth including in the candidate attraction sequences, so that the user does not miss attractions worth visiting.
  • FIG. 3B shows how specific attractions are set according to the user's selection.
  • First, the client is provided with a list of at least some of the attractions included in the travel articles.
  • At step S340, in response to the user's selection from the client, the attractions selected by the user from the list are set as specific attractions.
  • In this way, the user can take part in planning the travel route (attraction sequence), which makes the overall route planning scheme more flexible. For example, if the user particularly wants to visit a relatively unpopular attraction that few travel articles cover, the user is spared the considerable effort of finding travel routes (attraction sequences) that include it.
  • In addition, the one or several attractions with the highest attraction scores and the attractions selected by the user may together serve as the specific attractions considered in step S300.
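  • Combining the two ways of setting specific attractions might look like the sketch below. The function name, the `top_k` parameter, and the choice to list user picks first are assumptions made for illustration.

```python
def choose_specific_attractions(scores, user_selected=(), top_k=2):
    """Merge user-selected attractions with the top_k highest-scored ones.

    `scores` maps attraction name to attraction score. Duplicates are
    removed while preserving order (user picks first).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    merged = list(user_selected) + ranked[:top_k]
    return list(dict.fromkeys(merged))


scores = {"West Lake": 0.9, "Lingyin Temple": 0.7, "Xixi Wetland": 0.4}
specific = choose_specific_attractions(scores, user_selected=["Xixi Wetland"])
print(specific)
```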
  • FIG. 4 is a flow chart schematically illustrating a method of mining travel routes in accordance with a modified embodiment of the present invention.
  • Steps S100, S200, and S300 may be the same as those described above with reference to FIG. 1.
  • At step S500, an attraction sequence score may further be set for each candidate attraction sequence according to a predetermined sequence scoring rule. The resulting candidate attraction sequences may then be sorted in descending order of attraction sequence score, so that they can be provided to the client in this order in response to a request from the client.
  • The predetermined sequence scoring rule may be based on at least one of the following features: time rationality; whether there are duplicate attractions; the proportion of popular attractions; the proportion of unpopular attractions; travel intensity; and route length.
  • For example, if the time rationality of an attraction sequence is y1, whether the route contains duplicate attractions is y2 (0/1), the proportion of popular attractions covered by the route is y3, the proportion of unpopular attractions covered by the route is y4, the travel intensity is y5, and the route length is y6, then the attraction sequence score spotValue is:
  • spotValue = b1*y1 + b2*y2 + b3*y3 + b4*y4 + b5*y5 + b6*y6
  • Here b1, b2, b3, b4, b5, and b6 are the weights of the respective features, and these weights can be obtained by training on training data.
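  • A sketch of scoring and ranking candidate attraction sequences, assuming spotValue is a weighted sum of the six features y1 to y6 with weights b1 to b6, by analogy with Vspot. All numbers are invented for illustration; in particular the negative weights (for example, penalizing duplicate attractions) are an assumption, since the source only says the weights are trained.

```python
def sequence_score(y, b):
    """Compute spotValue as the weighted sum of the sequence features."""
    if len(y) != len(b):
        raise ValueError("features and weights must have the same length")
    return sum(bi * yi for bi, yi in zip(b, y))


def rank_sequences(candidates, weights):
    """Return route names sorted by descending spotValue."""
    return sorted(candidates,
                  key=lambda name: sequence_score(candidates[name], weights),
                  reverse=True)


# Hypothetical weights b1..b6.
b = (1.0, -2.0, 1.0, 0.5, -0.5, -0.1)
# Hypothetical features (y1..y6) per candidate route.
routes = {
    "route_a": (0.9, 0, 0.8, 0.1, 0.4, 0.3),
    "route_b": (0.9, 1, 0.8, 0.1, 0.4, 0.3),  # same as route_a but with a duplicate
    "route_c": (0.5, 0, 0.5, 0.2, 0.2, 0.2),
}
ranking = rank_sequences(routes, b)
print(ranking)
```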
  • In this way, the efficiency of recommending travel routes to the user can be further improved according to actual needs.
  • At step S400, attraction sequences that contain attractions on an attraction blacklist may further be filtered out.
  • The attraction blacklist can be set in various ways according to various criteria. For example, it may be set by the user and uploaded from the client to the server; it may be set according to feedback from a large number of users; it may be set according to the experience of staff; or it may be set based on an analysis of users' evaluations of attractions on the Internet.
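  • Blacklist filtering reduces to dropping every sequence that shares an attraction with the blacklist. A minimal sketch, assuming sequences are lists of attraction names:

```python
def filter_blacklisted(sequences, blacklist):
    """Drop every sequence that contains at least one blacklisted attraction."""
    banned = set(blacklist)
    return [seq for seq in sequences if banned.isdisjoint(seq)]


kept = filter_blacklisted([["A", "B"], ["C", "D"], ["E"]], blacklist=["B"])
print(kept)
```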
  • FIG. 4 shows step S400 performed after step S300, and step S500 performed after step S400. In fact, it is possible to perform only step S500 without step S400.
  • The order of steps S300, S400, and S500 is also adjustable (hence steps S400 and S500 are shown with dashed frames in FIG. 4).
  • For example, step S500 may be performed first, setting attraction sequence scores for and sorting the attraction sequences obtained in step S200, followed by the screening and/or filtering operations of steps S300 and/or S400.
  • Alternatively, step S400 may be performed first, filtering the attraction sequences according to the attraction blacklist, followed by step S300, which screens the attraction sequences containing specific attractions as candidate attraction sequences.
  • The above method may be performed in advance on the server side, thereby preparing candidate attraction sequences for a plurality of travel target areas.
  • When a user searches, the prepared candidate attraction sequences can then be provided directly, without performing operations such as retrieving travel articles, obtaining attraction sequences, and screening attraction sequences from scratch.
  • Further screening of the candidate attraction sequences is needed only when the user sets further specific attractions on the basis of the candidate attraction sequences already obtained.
  • The operations of FIG. 1 or FIG. 4 are performed from the beginning in response to the user's search request only if the server has not prepared candidate attraction sequences in advance for the travel target area the user searches for.
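  • The prepare-in-advance behaviour resembles a cache with a compute-on-miss fallback. The sketch below assumes a hypothetical `mine_fn(area)` callable that runs the full mining pipeline for one travel target area; the class and method names are illustrative only.

```python
class CandidateStore:
    """Holds prepared candidate attraction sequences per travel target area."""

    def __init__(self, mine_fn):
        self._mine_fn = mine_fn   # hypothetical full mining pipeline
        self._prepared = {}

    def prepare(self, area):
        """Run the pipeline ahead of time for a travel target area."""
        self._prepared[area] = self._mine_fn(area)

    def get(self, area):
        """Return prepared sequences, mining from scratch only on a miss."""
        if area not in self._prepared:
            self._prepared[area] = self._mine_fn(area)
        return self._prepared[area]


calls = []

def fake_mine(area):
    # Stand-in for the real pipeline; records how often it is invoked.
    calls.append(area)
    return [[area + " spot 1", area + " spot 2"]]

store = CandidateStore(fake_mine)
store.prepare("Hangzhou")
store.get("Hangzhou")   # served from the prepared data
store.get("Hainan")     # not prepared, mined on demand
store.get("Hainan")     # now served from the store
print(calls)
```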
  • A method of recommending travel routes to a user by communicating with a client is described below with reference to FIG. 5.
  • FIG. 5 is a flow diagram schematically illustrating a method of recommending travel routes to a user by communicating with a client.
  • The search query can first be segmented into words (i.e., the keywords entered by the user are split into individual words) to determine whether the segmentation result contains a travel feature word indicating travel intention and a travel target area word indicating a travel target area, such as a city or an attraction.
  • If not, the method ends. If so, the user's request is likely to be a search request for planning a travel route based on the travel target area, and the method described below can begin.
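  • The intent check described above can be sketched as follows. The word lists are hypothetical, and whitespace splitting stands in for a real word segmenter (Chinese queries would need a proper segmentation step, which the source does not specify).

```python
# Hypothetical vocabularies; a real system would use much larger lists.
TRAVEL_FEATURE_WORDS = {"travel", "trip", "itinerary", "route", "tour"}
KNOWN_AREAS = {"hangzhou", "hainan", "japan"}


def detect_travel_target(query):
    """Return the travel target area if the query shows travel intent, else None."""
    tokens = query.lower().split()  # stand-in for word segmentation
    has_intent = any(token in TRAVEL_FEATURE_WORDS for token in tokens)
    areas = [token for token in tokens if token in KNOWN_AREAS]
    if has_intent and areas:
        return areas[0]
    return None


print(detect_travel_target("Hangzhou 3 day itinerary"))
print(detect_travel_target("Hangzhou weather"))
```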
  • At step S600, the travel target area is determined in response to a search request from the client that includes a travel feature word indicating travel intention and a travel target area word indicating the travel target area.
  • If candidate attraction sequences have already been prepared for the travel target area, the server may obtain them directly and, at step S800, provide at least one of the candidate attraction sequences to the client.
  • The candidate attraction sequences can be provided to the client according to their attraction sequence scores.
  • For example, the client may be provided with the several candidate attraction sequences with the highest attraction sequence scores, or with the candidate attraction sequences whose attraction sequence scores are above a predetermined threshold.
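  • Both selection policies (highest-scoring few, or everything above a threshold) fit in one small helper. The function and parameter names are hypothetical, and each candidate is assumed to be a `(sequence, score)` pair.

```python
def select_for_client(scored_sequences, top_n=None, threshold=None):
    """Pick sequences to send: those above `threshold` and/or the best `top_n`."""
    ranked = sorted(scored_sequences, key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [item for item in ranked if item[1] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [sequence for sequence, score in ranked]


scored = [("r1", 0.4), ("r2", 0.9), ("r3", 0.7)]
print(select_for_client(scored, top_n=2))
print(select_for_client(scored, threshold=0.8))
```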
  • At step S700 (an optional step, indicated by a dashed box in FIG. 5), a relevance score is calculated for each candidate attraction sequence based on the travel condition information from the client (e.g., the length of the trip), the number of specific attractions, and the attraction sequence score.
  • The relevance score Score of a candidate attraction sequence can be calculated by a formula in which:
  • Score represents the relevance score of the attraction sequence;
  • Hit_spot indicates the number of specific attractions included in the attraction sequence;
  • Spot_of_line indicates the number of attractions included in the travel route (attraction sequence);
  • Expday indicates the number of days the user expects to travel (travel condition information);
  • Lineday indicates the number of days covered by the travel route (attraction sequence);
  • Abs(·) denotes taking the absolute value;
  • A1 is a parameter that can also be obtained through training;
  • spotValue is the attraction sequence score set in step S500.
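  • The relevance formula itself is not reproduced in this text, so it cannot be restated here. Purely to show how the listed quantities could interact, the sketch below uses an assumed combination (coverage of specific attractions times spotValue, minus a penalty on the gap in days); this is an illustrative guess, not the patent's formula, and the default value of `a1` is invented.

```python
def relevance_score(hit_spot, spot_of_line, expday, lineday, spot_value, a1=0.1):
    """Illustrative relevance score (ASSUMED form, not the source formula).

    coverage = Hit_spot / Spot_of_line rewards routes rich in specific attractions;
    a1 * Abs(Expday - Lineday) penalizes routes whose length differs from the
    number of days the user expects to travel.
    """
    if spot_of_line <= 0:
        raise ValueError("spot_of_line must be positive")
    coverage = hit_spot / spot_of_line
    day_gap = abs(expday - lineday)
    return coverage * spot_value - a1 * day_gap


same_length = relevance_score(2, 4, expday=3, lineday=3, spot_value=1.0)
too_long = relevance_score(2, 4, expday=3, lineday=5, spot_value=1.0)
print(same_length, too_long)
```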
  • At step S800, at least one of the candidate attraction sequences may be provided to the client based on the relevance score.
  • For example, the client may be provided with the several candidate attraction sequences with the highest relevance scores, or with the candidate attraction sequences whose relevance scores are above a predetermined threshold.
  • The relevance score thus takes into account the user's travel condition information (such as the expected number of days), the condition information of the travel route itself (for example, the number of days the route covers), the proportion of specific attractions included in the attraction sequence, and the score of the attraction sequence itself.
  • If no candidate attraction sequences have been prepared in advance, the method may proceed after step S600 to step S100 of FIG. 1 or FIG. 4 and begin obtaining candidate attraction sequences. After the method steps of FIG. 1 or FIG. 4 are completed, the process proceeds to step S700 or S800 to provide candidate attraction sequences to the client.
  • In this way, travel routes (attraction sequences) can be recommended to the user more efficiently in response to the user's search request, and the user can easily select and adjust them (for example, by adjusting the specific attractions and travel conditions).
  • FIG. 6 is a schematic block diagram of a device for mining travel routes in accordance with one embodiment of the present invention.
  • The device includes a travel article retrieval device 100, an attraction sequence obtaining device 200, and an attraction sequence screening device 300.
  • The travel article retrieval device 100 retrieves travel articles related to a travel target area.
  • The attraction sequence obtaining device 200 obtains, for each travel article, an attraction sequence composed of the attractions included in it.
  • The attraction sequence screening device 300 screens attraction sequences containing one or more specific attractions as candidate attraction sequences for the travel target area.
  • FIG. 7 is a schematic block diagram of devices that the attraction sequence obtaining device 200 of FIG. 6 may further include. Through these sub-devices, the attraction sequence obtaining device 200 can associate tour time information with the attractions.
  • The attraction sequence obtaining device 200 may include a search device 210, a tour time information extracting device 220, a tour time information estimating device 230, and an attraction sequence generating device 240.
  • The search device 210 searches the travel article for tour time information related to the tour time of the attractions.
  • The tour time information extracting device 220 extracts the tour time information corresponding to each attraction from the travel article if tour time information is found.
  • The tour time information estimating device 230 estimates, if no tour time information is found, the tour time information corresponding to each attraction according to the distance between two attractions that appear consecutively in the travel article and/or tour time recommendations obtained from a third party.
  • The attraction sequence generating device 240 forms the attraction sequence by associating each attraction with its corresponding tour time information.
  • FIG. 8 is a schematic block diagram of devices that can be used to set specific attractions. These devices may be included in the attraction sequence screening device 300 or may be located outside it.
  • The devices for setting specific attractions may include an attraction scoring device 310 and/or an attraction list providing device 320, and a specific attraction setting device 330.
  • The attraction scoring device 310 sets an attraction score for each attraction according to a predetermined attraction scoring rule.
  • The attraction list providing device 320 provides the client with a list of at least some of the attractions included in the travel articles.
  • The specific attraction setting device 330 sets specific attractions based on the attraction scores, and/or sets the attractions selected by the user from the list as specific attractions.
  • FIG. 9 is a schematic block diagram of a device for mining travel routes in accordance with a modified embodiment of the present invention.
  • The device may further include an attraction sequence filtering device 400 and an attraction sequence scoring device 500.
  • The attraction sequence filtering device 400 filters out attraction sequences that contain attractions on an attraction blacklist.
  • The attraction sequence scoring device 500 sets an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule.
  • The attraction sequence scoring device 500 may further sort the resulting candidate attraction sequences in descending order of attraction sequence score, so that they can be provided to the client in this order in response to a request from the client.
  • FIG. 10 is a schematic block diagram of an apparatus for recommending a travel route to a user by communicating with a client.
  • the apparatus may further include a target area determining device 600 and an attraction sequence providing device 800.
  • the target area determining means 600 is configured to determine the travel target area in response to a search request from the client including the travel feature word indicating the travel intention and the travel target area word indicating the travel target area.
  • the attraction sequence providing device 800 is configured to provide at least one of the candidate attraction sequences to the client based on the travel target area determined by the target area determining device 600.
  • when candidate attraction sequences for the travel target area have already been prepared on the server, the attraction sequence providing device 800 can provide them to the client directly.
  • when they have not yet been prepared, the devices shown in Figure 6 or 9 can be invoked to obtain candidate attraction sequences.
  • optionally, the apparatus may also include a relevance score calculation device 700 (shown with a dashed box in Figure 10).
  • the relevance score calculation device 700 is configured to calculate a relevance score of the candidate attraction sequence based on the travel condition information from the client, the number of specific attractions included, and the attraction sequence score.
  • the attraction sequence providing device 800 provides at least one of the candidate attraction sequences to the client based on the relevance score.
  • the method according to the invention may also be embodied as a computer program product comprising a computer-readable medium having processor-executable non-volatile program code, on which is stored a computer program that performs the above-described functions defined in the method of the present invention.
  • the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
  • each block of the flowchart or block diagram can represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function.
  • in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.

Abstract

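A method and apparatus for mining travel routes based on a travel target area are provided, where a travel route is represented as an attraction sequence. The method includes: retrieving travel articles relating to the travel target area; obtaining, for each travel article, an attraction sequence formed by the attractions the article contains; and screening the attraction sequences that include one or more specific attractions as candidate attraction sequences for the travel target area (travel routes that can be recommended to users). With this method and apparatus, travel routes worth recommending because many users have followed them can be obtained from the vast number of travel articles on the Internet.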

Description

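Method and Apparatus for Mining Travel Routes Based on a Travel Target Area
This application claims priority to Chinese patent application No. CN201410848598.X, entitled "Method and Apparatus for Mining Travel Routes Based on a Travel Target Area", filed with the Chinese Patent Office on December 29, 2014, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of the Internet, and in particular to a method and apparatus for mining travel routes based on a travel target area.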
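Background of the Invention
People's demand for travel is no longer satisfied by package tours. Compared with the many restrictions of a group tour, people prefer to design their own routes and travel independently.
Some travel websites have collected information on a large number of attractions and designed some travel routes of their own. A user can enter conditions on the website, such as the attractions to visit and the travel dates, and then design a route with the website's assistance.
However, collecting and organizing this attraction information and designing routes in advance requires a great deal of manual work. Moreover, the manually set parameters (such as attraction popularity or a recommendation index) are based on the experience and impressions of staff, and may deviate from those of actual travelers.
Therefore, people increasingly use search engines to find travel information that others have posted online, such as travelogues and travel guides, as well as related information on travel websites (collectively, "travel articles"). They expect to settle on their own route by browsing the retrieved pages or articles.
Indeed, travel-related searches account for a large share of search engine queries. In this age of information explosion, however, even the travel articles returned by a search engine are numerous and disorganized. Users must do a great deal of reading and reprocessing before they can decide on a route, which is time-consuming and laborious.
Analysis of search queries shows that some users are interested in a particular attraction, for example searching for "西湖旅游" (West Lake travel), while others are interested in a particular city, for example "杭州旅游" (Hangzhou travel). In such cases the user's destination (the "travel target area") is very clear.
By aggregating structured data, a conventional search engine can meet users' routine needs: for travel to a given attraction or city, it organizes and presents information through structured data, which largely makes the information easier to obtain. Yet this structured data is still costly to use. To work out an itinerary, the user must summarize and sort through hundreds or thousands of guides and travelogues returned by the search engine, which does not serve the user well.
Therefore, there remains a need for a method and apparatus that can recommend high-quality travel routes to users.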
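Summary of the Invention
A technical problem to be solved by the present invention is to provide a method and apparatus for mining travel routes based on a travel target area that can mine and analyze high-quality travel routes automatically. In the context of this disclosure, a travel route is represented as an attraction sequence.
According to one aspect of the present invention, a method for mining travel routes based on a travel target area is provided, where a travel route is represented as an attraction sequence. The method includes: retrieving travel articles relating to the travel target area; obtaining, for each travel article, an attraction sequence formed by the attractions the article contains; and screening the attraction sequences that include one or more specific attractions as candidate attraction sequences for the travel target area. The candidate attraction sequences can serve as travel routes that can be recommended to users.
With the method of the present invention, travel routes (candidate attraction sequences) worth recommending because many users have followed them can be obtained from the vast number of travel articles on the Internet.
Preferably, the method may further include: setting an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule, and sorting the candidate attraction sequences in descending order of attraction sequence score, so that they are provided to a client in that order in response to a request from the client.
Setting an attraction sequence score for each candidate attraction sequence can further improve the efficiency of recommending travel routes to users, according to actual needs.
Preferably, the predetermined sequence scoring rule may be based on at least one of the following features: time reasonableness; whether any attraction is repeated; the proportion of popular attractions; the proportion of little-known attractions; travel intensity; and route length.
Preferably, the method may further include: determining the travel target area in response to a search request from a client that contains a travel feature word indicating travel intent and a travel target area word indicating the travel target area; and providing at least one of the candidate attraction sequences to the client based on the determined travel target area.
In this way, travel routes can be recommended to users through communication with the client.
Preferably, the method may further include: calculating a relevance score for each candidate attraction sequence based on travel condition information from the client, the number of specific attractions included, and the attraction sequence score, where at least one of the candidate attraction sequences is provided to the client based on the relevance score.
In this way, travel routes can be recommended in a more targeted manner.
Preferably, the method may further include: filtering out attraction sequences that include an attraction on an attraction blacklist.
Filtering out attraction sequences that involve blacklisted attractions can further improve the efficiency of recommending travel routes to users.
Preferably, the step of obtaining an attraction sequence may include: searching the travel article for tour time information relating to the visit times of attractions; if no tour time information is found, estimating the tour time information for each attraction from the distance between two attractions that appear adjacently in the travel article and/or from visit time suggestions obtained from a third party; if tour time information is found, extracting the tour time information for each attraction from the travel article; and forming the attraction sequence by associating each attraction with its tour time information.
Including tour time information in the attraction sequence makes it easier for users to plan a trip.
Preferably, the method may further include: setting an attraction score for each attraction according to a predetermined attraction scoring rule and setting the specific attractions based on the attraction scores; and/or providing the client with a list of at least some of the attractions contained in the travel articles and setting the attractions selected by the user from the list as specific attractions.
Setting specific attractions based on attraction scores and/or the user's selection ensures that the recommended attraction sequences (travel routes) include the attractions most worth recommending and/or the attractions the user particularly wants to visit.
Preferably, the predetermined attraction scoring rule may be based on at least one of the following features: the search page views of the attraction; the search volume for the attraction; the number of attraction sequences that include the attraction; and third-party ratings of the attraction.
Setting specific attractions by attraction score allows attraction sequences (travel routes) that include recommended attractions to be prepared automatically. Setting specific attractions by user selection allows attraction sequences (travel routes) of interest to the user to be prepared in a more targeted way.
According to another aspect of the present invention, an apparatus for mining travel routes based on a travel target area is provided, where a travel route is represented as an attraction sequence. The apparatus includes: a travel article retrieval device for retrieving travel articles relating to the travel target area; an attraction sequence obtaining device for obtaining, for each travel article, an attraction sequence formed by the attractions the article contains; and an attraction sequence screening device for screening the attraction sequences that include one or more specific attractions as candidate attraction sequences for the travel target area.
Preferably, the apparatus may further include: an attraction sequence scoring device for setting an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule, and sorting the candidate attraction sequences in descending order of attraction sequence score, so that they are provided to a client in that order in response to a request from the client.
Preferably, the apparatus may further include: a target area determining device for determining the travel target area in response to a search request from a client that contains a travel feature word indicating travel intent and a travel target area word indicating the travel target area; and an attraction sequence providing device for providing at least one of the candidate attraction sequences to the client based on the travel target area determined by the target area determining device.
Preferably, the apparatus may further include: a relevance score calculation device for calculating a relevance score for each candidate attraction sequence based on travel condition information from the client, the number of specific attractions included, and the attraction sequence score, where the attraction sequence providing device provides at least one of the candidate attraction sequences to the client based on the relevance score.
Preferably, the apparatus may further include: an attraction sequence filtering device for filtering out attraction sequences that include an attraction on an attraction blacklist.
Preferably, the attraction sequence obtaining device may include: a search device for searching the travel article for tour time information relating to the visit times of attractions; a tour time information estimation device for estimating, if no tour time information is found, the tour time information for each attraction from the distance between two attractions that appear adjacently in the travel article and/or from visit time suggestions obtained from a third party; a tour time information extraction device for extracting, if tour time information is found, the tour time information for each attraction from the travel article; and an attraction sequence generating device for forming the attraction sequence by associating each attraction with its tour time information.
Preferably, the apparatus may further include: an attraction scoring device for setting an attraction score for each attraction according to a predetermined attraction scoring rule; and/or an attraction list providing device for providing the client with a list of at least some of the attractions contained in the travel articles; and a specific attraction setting device for setting the specific attractions based on the attraction scores and/or setting the attractions selected by the user from the list as specific attractions.
With the apparatus of the present invention, travel routes (candidate attraction sequences) worth recommending because many users have followed them can be obtained from the vast number of travel articles on the Internet.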
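Brief Description of the Drawings
The above and other objects, features, and advantages of the present disclosure will become more apparent from the following more detailed description of exemplary embodiments with reference to the accompanying drawings, in which the same reference numerals generally denote the same components.
Figure 1 is a schematic flowchart of a method for mining travel routes according to one embodiment of the present invention.
Figure 2 is a schematic flowchart of steps that step S200 of Figure 1 may further include.
Figures 3A and 3B are schematic flowcharts of two ways of setting the specific attractions used in step S300 of Figure 1.
Figure 4 is a schematic flowchart of a method for mining travel routes according to a modified embodiment of the present invention.
Figure 5 is a schematic flowchart of a method for recommending travel routes to a user through communication with a client.
Figure 6 is a schematic block diagram of an apparatus for mining travel routes according to one embodiment of the present invention.
Figure 7 is a schematic block diagram of devices that the attraction sequence obtaining device 200 of Figure 6 may further include.
Figure 8 is a schematic block diagram of devices that can be used to set specific attractions.
Figure 9 is a schematic block diagram of an apparatus for mining travel routes according to a modified embodiment of the present invention.
Figure 10 is a schematic block diagram of an apparatus for recommending travel routes to a user through communication with a client.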
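Modes for Carrying Out the Invention
Preferred embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments, it should be understood that the disclosure can be implemented in various forms and is not limited to the embodiments set forth here. Rather, these embodiments are provided so that the disclosure is thorough and complete and fully conveys its scope to those skilled in the art.
Figure 1 is a schematic flowchart of a method for mining travel routes according to one embodiment of the present invention.
With the method shown in Figure 1, travel routes (candidate attraction sequences) worth recommending because many users have followed them can be obtained from the vast number of travel articles on the Internet.
First, in step S100, travel articles relating to the travel target area are retrieved.
The travel target area may be a geographic region of some spatial extent that contains a number of attractions. People need to pick the attractions most worth visiting and travel along the best route.
For example, the travel target area may be a large scenic area, such as the Old Summer Palace (Yuanmingyuan), Jiuzhaigou, or the Bashang Grassland.
Alternatively, it may be a city or province, or part of one, such as Hangzhou, Hainan, or northern Xinjiang.
Alternatively, it may be a large region spanning several provinces, such as South China or Northeast China.
For travel abroad, it may also be one or several countries, such as Japan or the United States, or even a continent or part of a continent, such as Africa or Eastern Europe.
The Internet holds a vast number of travel articles, that is, articles related to travel, such as the travelogues and travel guides written by travelers and the itinerary descriptions provided on travel websites.
These travel articles, especially travelogues and guides, describe people's actual itineraries. Analyzing a large number of them reveals the travel experience of many users, from which the attractions and routes most worth recommending can be determined.
Then, in step S200, for each retrieved travel article, an attraction sequence formed by the attractions contained in that article is obtained.
Such an attraction sequence can represent the corresponding travel route.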
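The text does not say how attractions are located in an article. As a minimal sketch of step S200, assume a list of known attraction names (a hypothetical gazetteer) and order the matches by their first mention in the article; the function name and sample data are illustrative only.

```python
def extract_attraction_sequence(article_text, known_attractions):
    """Return the known attractions mentioned in the article, ordered by first mention."""
    hits = []
    for name in known_attractions:
        position = article_text.find(name)  # -1 when the name is absent
        if position != -1:
            hits.append((position, name))
    return [name for _, name in sorted(hits)]


# Invented sample article and gazetteer.
article = "Day 1: West Lake in the morning, then Lingyin Temple. Day 2: Leifeng Pagoda."
gazetteer = ["Leifeng Pagoda", "West Lake", "Lingyin Temple", "Forbidden City"]
sequence = extract_attraction_sequence(article, gazetteer)
```

Plain substring matching is the simplest possible choice; a production system would likely need entity recognition and alias handling, which the source does not describe.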
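In obtaining the attraction sequence in step S200, besides finding the attractions in the travel article, an attempt can also be made to derive tour time information for each attraction from the article, for example how long each attraction takes to visit, whether the attractions are spread over several days, and which day each attraction falls on. This information can also help users plan their routes.
Figure 2 is a flowchart schematically showing steps that can be used to obtain tour time information.
First, in step S210, the retrieved travel article is searched for tour time information relating to the visit times of attractions. Tour time information includes, for example, how the attractions are divided across days and how long each attraction takes to visit.
In step S215, it is determined whether tour time information was found in the travel article.
If it was found, then in step S220 the tour time information for each attraction is extracted directly from the travel article.
If it was not found, then in step S230 the tour time information for each attraction is estimated from the distance between two attractions that appear adjacently in the travel article and/or from visit time suggestions obtained from a third party.
When the distance between two attractions is used for the estimate, a formula mapping distance to time can be fitted by training, and the travel time between the two attractions can be computed with it. Where necessary, third-party visit time suggestions can then be drawn on to compute the duration of the whole route, so that the route can be divided into days and the visit duration of each attraction determined.
Then, in step S240, the attraction sequence is formed by associating each attraction with its tour time information.
An attraction sequence formed this way not only lists the attractions on the route in order but can also annotate each with its tour time information, for example whether two attractions fall on different days and how long one attraction takes to visit. This gives users a more reasonable travel plan and makes it easier to choose among the recommended travel routes (candidate attraction sequences).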
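The estimation branch (steps S230 and S240) can be sketched as follows. The fitted distance-to-time formula is not given in the text, so a linear stand-in with made-up parameters is assumed here, along with a hypothetical daily time budget for the day split; all names and numbers are illustrative.

```python
def travel_hours(distance_km, km_per_hour=30.0, overhead_hours=0.25):
    """Stand-in for the fitted distance-to-time formula (linear; parameters are invented)."""
    return distance_km / km_per_hour + overhead_hours


def assign_days(stops, leg_km, suggested_hours, hours_per_day=8.0):
    """Attach (day, visit hours) to each stop.

    stops: attraction names in article order.
    leg_km[i]: distance from stops[i] to stops[i + 1].
    suggested_hours: name -> visit duration, e.g. from a third-party source.
    """
    plan, day, used = [], 1, 0.0
    for i, name in enumerate(stops):
        visit = suggested_hours[name]
        transfer = travel_hours(leg_km[i - 1]) if i > 0 else 0.0
        if used > 0 and used + transfer + visit > hours_per_day:
            # Day budget exceeded: start a new day. For simplicity the transfer
            # is not charged to the new day's budget.
            day, used, transfer = day + 1, 0.0, 0.0
        used += transfer + visit
        plan.append((name, day, visit))
    return plan


# Distances and durations below are invented for illustration.
plan = assign_days(
    ["West Lake", "Lingyin Temple", "Leifeng Pagoda"],
    leg_km=[30.0, 60.0],
    suggested_hours={"West Lake": 3.0, "Lingyin Temple": 3.0, "Leifeng Pagoda": 2.0},
)
```

Each tuple in `plan` pairs an attraction with its day and visit duration, which is one way to realize the association described in step S240.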
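Returning to Figure 1, after step S200 is completed, a number of travel routes represented as attraction sequences are available.
To recommend routes more effectively, these attraction sequences can be screened in step S300 against one or more specific attractions: the attraction sequences that include one or more of the specific attractions are selected as candidate attraction sequences for the travel target area.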
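A minimal sketch of the screening in step S300, reading "one or more specific attractions" as "at least one of them"; the function name and sample routes are hypothetical.

```python
def screen_sequences(sequences, specific_attractions):
    """Keep the sequences that contain at least one of the specific attractions."""
    wanted = set(specific_attractions)
    return [seq for seq in sequences if wanted.intersection(seq)]


# Invented sample routes.
sequences = [
    ["West Lake", "Lingyin Temple"],
    ["Xixi Wetland", "Song Dynasty Town"],
    ["Leifeng Pagoda", "West Lake", "Hefang Street"],
]
candidates = screen_sequences(sequences, ["West Lake"])
```

If every specific attraction must appear in a route, the condition could instead be `wanted.issubset(seq)`; the text leaves this choice open.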
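How the specific attractions are set is described below with reference to Figures 3A and 3B, which are schematic flowcharts of two ways of setting the specific attractions used in step S300 of Figure 1.
Figure 3A shows setting specific attractions by attraction score.
In step S310, an attraction score is set for each attraction according to a predetermined attraction scoring rule.
Here, the predetermined attraction scoring rule may be based on at least one of the following features:
the search page views (PV) of the attraction;
the search volume for the attraction (reflecting its popularity);
the number of attraction sequences that include the attraction; and
third-party ratings of the attraction.
The search page views (PV) and the search volume (popularity) of an attraction can be obtained by analyzing a search site's search logs.
For example, if an attraction's search page views (PV) are x1, its rating (the third-party rating of the attraction) is x2, its popularity (the search volume for the attraction) is x3, and the number of routes covering it (the number of attraction sequences that include it) is x4, then its final score Vspot is:
Vspot=a1*x1+a2*x2+a3*x3+a4*x4
where a1, a2, a3, and a4 are the weights of the features, which can be obtained by training on training data.
In step S320, the specific attractions are set based on the attraction scores. For example, the one or several attractions with the highest scores can be set as specific attractions.
Setting the specific attractions to be included in the candidate attraction sequences by attraction score means that, when the user has not named any attraction they particularly want to visit, or has named too few (for example fewer than a predetermined number such as 2 or 3), the highest-scoring attractions, that is, the most popular ones that users generally consider worth visiting, are included in the candidate attraction sequences, so that the user does not miss them.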
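Steps S310 and S320 can be sketched as a weighted sum followed by a top-k pick. The weights would come from training in the described method; the weights, feature values, and helper names below are invented for illustration, and the features are assumed to be pre-scaled to comparable ranges.

```python
def attraction_score(features, weights):
    """Vspot = a1*x1 + a2*x2 + a3*x3 + a4*x4."""
    return sum(a * x for a, x in zip(weights, features))


def top_attractions(feature_table, weights, k):
    """Return the k highest-scoring attraction names, best first."""
    ranked = sorted(
        feature_table,
        key=lambda name: attraction_score(feature_table[name], weights),
        reverse=True,
    )
    return ranked[:k]


# (x1 page views, x2 third-party rating, x3 search volume, x4 covering sequences);
# all numbers are invented.
features = {
    "West Lake": (8.0, 4.0, 8.0, 8.0),
    "Lingyin Temple": (4.0, 4.0, 4.0, 4.0),
    "Hefang Street": (2.0, 2.0, 2.0, 4.0),
}
weights = (0.25, 0.25, 0.25, 0.25)
specific = top_attractions(features, weights, k=2)
```

With these numbers the two highest-scoring attractions become the specific attractions used for screening.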
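Figure 3B shows setting specific attractions by user selection.
In step S330, a list of at least some of the attractions contained in the travel articles is provided to the client.
Then, in step S340, in response to the user's selection from the client, the attractions the user selected from the list are set as specific attractions.
Setting user-selected attractions as specific attractions lets the user take part in planning the travel route (attraction sequence) and makes the whole planning scheme more flexible. For example, when the user particularly wants to visit a little-known attraction that few travel articles cover, this spares the user the considerable effort of finding a route (attraction sequence) that includes it.
In fact, the approaches of Figures 3A and 3B can be combined. For example, one or several attractions selected by the user, together with one or several high-scoring attractions, can jointly serve as the specific attractions considered in step S300.
As described above, with the method shown in Figure 1, travel routes (candidate attraction sequences) worth recommending because many users have followed them can be obtained from the vast number of travel articles on the Internet.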
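Figure 4 is a schematic flowchart of a method for mining travel routes according to a modified embodiment of the present invention.
Steps S100, S200, and S300 can be the same as described above with reference to Figure 1.
As shown in Figure 4, in step S500 an attraction sequence score can additionally be set for each candidate attraction sequence according to a predetermined sequence scoring rule. The candidate attraction sequences obtained can then be sorted in descending order of attraction sequence score, so that they are provided to the client in that order in response to a request from the client.
Here, the predetermined sequence scoring rule may be based on at least one of the following features:
time reasonableness;
whether any attraction is repeated;
the proportion of popular attractions;
the proportion of little-known attractions;
travel intensity; and
route length.
For example, if an attraction sequence (travel route) has time reasonableness y1, a repeated-attraction indicator y2 (0 or 1), proportion of popular attractions covered y3, proportion of little-known attractions covered y4, travel intensity y5, and route length y6, then its attraction sequence score spotValue is:
spotValue=b1*y1+b2*y2+b3*y3+b4*y4+b5*y5+b6*y6
where b1, b2, b3, b4, b5, and b6 are the weights of the features, which can be obtained by training on training data.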
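A sketch of step S500 under stated assumptions: the weights below are invented (including the choice of negative weights for repeats and travel intensity, which the text does not specify), and the feature values are made up. Only the weighted-sum form and the descending sort come from the description.

```python
def sequence_score(y, b):
    """spotValue = b1*y1 + b2*y2 + b3*y3 + b4*y4 + b5*y5 + b6*y6."""
    if len(y) != 6 or len(b) != 6:
        raise ValueError("expected six features and six weights")
    return sum(w * v for w, v in zip(b, y))


def rank_sequences(feature_rows, b):
    """Return sequence ids sorted by spotValue, highest first."""
    return sorted(
        feature_rows,
        key=lambda sid: sequence_score(feature_rows[sid], b),
        reverse=True,
    )


# y = (time reasonableness, has repeat 0/1, popular share, little-known share,
#      travel intensity, route length); values and weights are invented.
b = (1.0, -2.0, 1.0, 0.5, -0.5, 0.0)
rows = {
    "route_a": (1.0, 0, 0.5, 0.0, 1.0, 3),
    "route_b": (1.0, 1, 1.0, 0.0, 1.0, 4),
    "route_c": (0.5, 0, 0.5, 0.5, 1.0, 2),
}
order = rank_sequences(rows, b)
```

Here `route_b` drops to last place because its repeated attraction is penalized by the assumed negative weight.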
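Setting an attraction sequence score for each candidate attraction sequence can further improve the efficiency of recommending travel routes to users, according to actual needs.
In addition, as shown in Figure 4, attraction sequences that include an attraction on an attraction blacklist can be filtered out in step S400.
The attraction blacklist can be set in various ways and by various criteria. For example, it can be set by the user and uploaded from the client to the server, set from the feedback of many users, set from the experience of staff, or set by analyzing users' reviews of attractions on the Internet.
Filtering out attraction sequences that involve blacklisted attractions can further improve the efficiency of recommending travel routes to users.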
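The blacklist filter of step S400 amounts to dropping any sequence that shares an element with the blacklist. A minimal sketch, with a hypothetical function name and invented data:

```python
def drop_blacklisted(sequences, blacklist):
    """Remove every sequence that contains at least one blacklisted attraction."""
    banned = set(blacklist)
    return [seq for seq in sequences if banned.isdisjoint(seq)]


# Invented sample routes and blacklist.
routes = [
    ["West Lake", "Lingyin Temple"],
    ["West Lake", "Overpriced Souvenir Mall"],
    ["Leifeng Pagoda"],
]
kept = drop_blacklisted(routes, ["Overpriced Souvenir Mall"])
```

An empty blacklist leaves the list of routes unchanged.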
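Figure 4 shows step S400 performed after step S300, and step S500 performed after step S400. In fact, step S500 can be performed without step S400. The order of steps S300, S400, and S500 can also be changed (which is why Figure 4 shows steps S400 and S500 in dashed boxes).
For example, step S500 can be performed first, scoring and sorting the attraction sequences obtained in step S200, followed by the screening and/or filtering of steps S300 and/or S400.
Alternatively, step S400 can be performed first to filter out attraction sequences by the blacklist, followed by step S300 to screen the attraction sequences that include specific attractions as candidate attraction sequences.
Of the operations described above, only the steps of Figure 3B involve interaction with the client. In practice, when specific attractions need to be set, the approach of Figure 3A can be used first. Once the client is interacting, further specific attractions can be set in the manner of Figure 3B and the candidate attraction sequences screened further.
In other words, to speed up user searches, the above method can be run in advance on the server to prepare candidate attraction sequences for a number of travel target areas. When a user searches from the client, the prepared candidate attraction sequences can be provided directly, without retrieving travel articles, obtaining attraction sequences, and screening them from scratch. Only when the user, starting from the candidate attraction sequences already received, sets further specific attractions do the candidate attraction sequences need to be screened again.
Only when the server has no candidate attraction sequences prepared in advance for the travel target area the user searched for are the operations of Figure 1 or Figure 4 performed from scratch in response to the search request.
This reduces both the user's waiting time and the server's computational load.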
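The precompute-then-serve behavior described above can be sketched as a small store that mines on demand only when nothing was prepared. The class name, the `mine_candidates` callback, and the fake miner are hypothetical; the real mining would be the Figure 1 or Figure 4 pipeline.

```python
class CandidateStore:
    """Serve prepared candidate sequences; mine from scratch only on a miss."""

    def __init__(self, mine_candidates):
        self._mine = mine_candidates  # callable: area -> list of attraction sequences
        self._prepared = {}

    def prepare(self, areas):
        """Run the mining pipeline ahead of time for the given areas."""
        for area in areas:
            self._prepared[area] = self._mine(area)

    def get(self, area):
        if area not in self._prepared:
            self._prepared[area] = self._mine(area)  # fall back to mining on demand
        return self._prepared[area]


calls = []


def fake_miner(area):
    """Stand-in for the mining pipeline; records each invocation."""
    calls.append(area)
    return [[f"{area} attraction 1", f"{area} attraction 2"]]


store = CandidateStore(fake_miner)
store.prepare(["Hangzhou"])
first = store.get("Hangzhou")   # served from the prepared results
second = store.get("Hainan")    # not prepared: mined on demand
third = store.get("Hainan")     # now served without mining again
```

The `calls` list shows that each area is mined once, however many times it is requested.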
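A method of recommending travel routes to a user through communication with a client is described below with reference to Figure 5, which is a schematic flowchart of the method.
After the user enters a search query at the client, the query keywords can be segmented (that is, split into individual words), and it is determined whether a travel feature word indicating travel intent is matched in the segmentation result.
If none is matched, the method ends.
If one is matched, it is further determined whether a travel target area word, such as a city or attraction name, is matched. If not, the method also ends. If so, the user's request can be taken as very likely a request to plan a travel route for a travel target area, and the method described below can be started.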
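The two-stage check can be sketched as follows. The text does not name a segmentation technique, so a simple dictionary-based forward maximum matching is assumed here; the word lists and function names are invented for illustration.

```python
def segment(query, vocabulary, max_len=4):
    """Forward maximum matching: take the longest vocabulary word at each position,
    falling back to a single character."""
    tokens, i = [], 0
    while i < len(query):
        for size in range(min(max_len, len(query) - i), 0, -1):
            piece = query[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


def detect_target_area(query, feature_words, area_words):
    """Return the matched travel target area word, or None if the query
    lacks a travel feature word or a travel target area word."""
    tokens = segment(query, set(feature_words) | set(area_words))
    if not any(token in feature_words for token in tokens):
        return None  # no travel intent
    for token in tokens:
        if token in area_words:
            return token
    return None  # travel intent but no target area


# Invented word lists.
FEATURE_WORDS = {"旅游", "攻略", "游记"}
AREA_WORDS = {"杭州", "西湖"}
area = detect_target_area("杭州旅游攻略", FEATURE_WORDS, AREA_WORDS)
```

A query such as "杭州天气" names an area but carries no travel feature word, so it is rejected at the first stage.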
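In response to a search request from the client that contains a travel feature word indicating travel intent and a travel target area word indicating the travel target area, the travel target area is determined in step S600.
If candidate attraction sequences for that travel target area have already been prepared on the server by the method of Figure 1 or Figure 4, the server can fetch them directly and, in step S800, provide at least one of them to the client.
If attraction sequence scores have been set as shown in Figure 4, candidate attraction sequences can be provided to the client according to those scores. For example, the several candidate attraction sequences with the highest scores can be provided, or those whose scores exceed a predetermined threshold.
Preferably, before step S800, in step S700 (an optional step, shown in a dashed box in Figure 5), a relevance score is calculated for each candidate attraction sequence based on travel condition information from the client (such as the length of the trip), the number of specific attractions included, and the attraction sequence score.
For example, the relevance score Score of a candidate attraction sequence can be calculated with the following formula.
[Formula image PCTCN2015097599-appb-000001; not reproduced in this text]
where Score is the relevance score of the attraction sequence,
hit_spot is the number of specific attractions included in the attraction sequence,
spot_of_line is the number of attractions in the travel route (attraction sequence),
expday is the number of days the user expects to travel (travel condition information),
lineday is the number of days the travel route (attraction sequence) covers,
abs(…) denotes the absolute value,
a1 is a parameter, which can also be obtained by training, and
spotValue is the attraction sequence score set in step S500.
Then, in step S800, at least one of the candidate attraction sequences can be provided to the client based on the relevance score. For example, the several candidate attraction sequences with the highest relevance scores can be provided, or those whose relevance scores exceed a predetermined threshold.
The relevance score is calculated from the match between the user's travel condition information (such as the expected number of days) and the route's own conditions (such as the number of days the route covers), the proportion of specific attractions in the attraction sequence, and the score of the attraction sequence itself. Providing candidate attraction sequences by relevance score makes it possible to recommend travel routes (attraction sequences) that both fit the user's travel conditions and are worth recommending.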
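The published formula is an image that is not reproduced in this text, so the following is only one plausible way to combine the listed quantities, not the patent's formula. It assumes three additive terms: the share of specific attractions in the route, a day-match term that shrinks as abs(expday - lineday) grows, and spotValue. The function name and default for a1 are hypothetical.

```python
def relevance_score(hit_spot, spot_of_line, expday, lineday, spot_value, a1=1.0):
    """One possible combination of the listed quantities (assumed form,
    not the published formula)."""
    if spot_of_line <= 0:
        raise ValueError("a route must contain at least one attraction")
    coverage = hit_spot / spot_of_line            # share of specific attractions in the route
    day_match = a1 / (1 + abs(expday - lineday))  # equals a1 when the day counts match
    return coverage + day_match + spot_value


# Invented inputs: the user wants a 3-day trip.
exact = relevance_score(hit_spot=2, spot_of_line=4, expday=3, lineday=3, spot_value=0.5)
longer = relevance_score(hit_spot=2, spot_of_line=4, expday=3, lineday=5, spot_value=0.5)
```

Under this assumed form, a route whose length matches the user's expected days scores higher than an otherwise identical route that runs two days longer.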
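On the other hand, if candidate attraction sequences for the travel target area have not yet been prepared on the server by the method of Figure 1 or Figure 4, then after step S600 the method can proceed to step S100 of Figure 1 or Figure 4 and start obtaining candidate attraction sequences. After the steps of Figure 1 or Figure 4 are complete, the method proceeds to step S700 or S800 to provide candidate attraction sequences to the client.
In this way, travel routes (attraction sequences) can be recommended to the user more efficiently in response to the user's search request. The user can conveniently select and adjust them (for example by changing the specific attractions or the travel condition information).
The method for mining travel routes based on a travel target area according to the present invention has been described in detail above. The apparatus for mining travel routes based on a travel target area according to the present invention is described below with reference to Figures 6 to 10.
Many devices of the apparatus described below have the same functions as the corresponding steps described above with reference to Figures 1 to 5. To avoid repetition, the description here focuses on the device structure the apparatus may have; details already covered are not repeated, and the corresponding description above can be consulted.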
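Figure 6 is a schematic block diagram of an apparatus for mining travel routes according to one embodiment of the present invention.
As shown in Figure 6, the apparatus includes a travel article retrieval device 100, an attraction sequence obtaining device 200, and an attraction sequence screening device 300.
The travel article retrieval device 100 retrieves travel articles relating to the travel target area.
The attraction sequence obtaining device 200 obtains, for each travel article, an attraction sequence formed by the attractions the article contains.
The attraction sequence screening device 300 screens the attraction sequences that include one or more specific attractions as candidate attraction sequences for the travel target area.
Figure 7 is a schematic block diagram of devices that the attraction sequence obtaining device 200 of Figure 6 may further include. With these sub-devices, the attraction sequence obtaining device 200 can associate tour time information with attractions.
As shown in Figure 7, the attraction sequence obtaining device 200 may include a search device 210, a tour time information extraction device 220, a tour time information estimation device 230, and an attraction sequence generating device 240.
The search device 210 searches the travel article for tour time information relating to the visit times of attractions.
The tour time information extraction device 220 extracts the tour time information for each attraction from the travel article when tour time information is found.
The tour time information estimation device 230 estimates, when no tour time information is found, the tour time information for each attraction from the distance between two attractions that appear adjacently in the travel article and/or from visit time suggestions obtained from a third party.
The attraction sequence generating device 240 forms the attraction sequence by associating each attraction with its tour time information.
Figure 8 is a schematic block diagram of devices that can be used to set specific attractions. These devices may be included in the attraction sequence screening device 300 or located outside it.
As shown in Figure 8, the devices for setting specific attractions may include an attraction scoring device 310 and/or an attraction list providing device 320, and a specific attraction setting device 330.
The attraction scoring device 310 sets an attraction score for each attraction according to a predetermined attraction scoring rule.
The attraction list providing device 320 provides the client with a list of at least some of the attractions contained in the travel articles.
The specific attraction setting device 330 sets the specific attractions based on the attraction scores and/or sets the attractions selected by the user from the list as specific attractions.
Figure 9 is a schematic block diagram of an apparatus for mining travel routes according to a modified embodiment of the present invention.
As shown in Figure 9, in addition to the travel article retrieval device 100, the attraction sequence obtaining device 200, and the attraction sequence screening device 300 shown in Figure 6, the apparatus may further include an attraction sequence filtering device 400 and an attraction sequence scoring device 500.
The attraction sequence filtering device 400 filters out attraction sequences that include an attraction on the attraction blacklist.
The attraction sequence scoring device 500 sets an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule. It may further sort the candidate attraction sequences obtained in descending order of attraction sequence score, so that they are provided to the client in that order in response to a request from the client.
Figure 10 is a schematic block diagram of an apparatus for recommending travel routes to a user through communication with a client.
As shown in Figure 10, the apparatus may further include a target area determining device 600 and an attraction sequence providing device 800.
The target area determining device 600 determines the travel target area in response to a search request from the client that contains a travel feature word indicating travel intent and a travel target area word indicating the travel target area.
The attraction sequence providing device 800 provides at least one of the candidate attraction sequences to the client based on the travel target area determined by the target area determining device 600.
When candidate attraction sequences for the travel target area have already been prepared on the server, the attraction sequence providing device 800 can provide them to the client directly.
When candidate attraction sequences for the travel target area have not yet been prepared on the server, the devices shown in Figure 6 or 9 can be invoked to obtain them.
Optionally, the apparatus may further include a relevance score calculation device 700 (shown in a dashed box in Figure 10).
The relevance score calculation device 700 calculates a relevance score for each candidate attraction sequence based on travel condition information from the client, the number of specific attractions included, and the attraction sequence score.
The attraction sequence providing device 800 provides at least one of the candidate attraction sequences to the client based on the relevance score.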
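The method and apparatus for mining travel routes based on a travel target area according to the present invention have been described in detail above with reference to the drawings.
In addition, the method according to the invention can also be implemented as a computer program product that includes a computer-readable medium carrying processor-executable non-volatile program code, on which is stored a computer program for performing the above functions defined in the method of the invention. Those skilled in the art will also appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with this disclosure can be implemented as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the drawings show the possible architecture, functionality, and operation of systems and methods according to various embodiments of the invention. In this regard, each block in a flowchart or block diagram can represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Embodiments of the invention have been described above. The description is illustrative, not exhaustive, and is not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used here was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed.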

Claims (18)

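  1. A method for mining travel routes based on a travel target area, characterized in that the travel route is represented as an attraction sequence, the method comprising:
    retrieving travel articles relating to the travel target area;
    for each retrieved travel article, obtaining an attraction sequence formed by the attractions contained in that travel article;
    screening the attraction sequences that include one or more specific attractions as candidate attraction sequences for the travel target area.
  2. The method according to claim 1, characterized in that the method further comprises:
    setting an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule, and sorting the plurality of candidate attraction sequences in descending order of attraction sequence score, so that they are provided to a client in that order in response to a request from the client.
  3. The method according to claim 2, characterized in that the predetermined sequence scoring rule is based on at least one of the following features:
    time reasonableness;
    whether any attraction is repeated;
    the proportion of popular attractions;
    the proportion of little-known attractions;
    travel intensity; and
    route length.
  4. The method according to claim 1 or 2, characterized in that the method further comprises:
    determining the travel target area in response to a search request from a client that contains a travel feature word indicating travel intent and a travel target area word indicating the travel target area; and
    providing at least one of the candidate attraction sequences to the client based on the determined travel target area.
  5. The method according to claim 4, characterized in that providing at least one of the candidate attraction sequences to the client comprises:
    calculating a relevance score of the candidate attraction sequence based on travel condition information from the client, the number of the specific attractions included, and the attraction sequence score,
    providing at least one of the candidate attraction sequences to the client based on the relevance score.
  6. The method according to claim 1, characterized in that, before providing at least one of the candidate attraction sequences to the client, the method further comprises:
    filtering out attraction sequences that include an attraction on an attraction blacklist.
  7. The method according to claim 1, characterized in that the step of obtaining an attraction sequence comprises:
    searching the travel article for tour time information relating to the visit times of attractions;
    if no tour time information is found, estimating the tour time information for each attraction from the distance between two attractions that appear adjacently in the travel article and/or from visit time suggestions obtained from a third party;
    if tour time information is found, extracting the tour time information for each attraction from the travel article; and
    forming the attraction sequence by associating each attraction with the tour time information corresponding to it.
  8. The method according to claim 1, characterized in that the method further comprises:
    setting an attraction score for each of the attractions according to a predetermined attraction scoring rule, and setting the specific attractions based on the attraction scores; and/or
    providing a client with a list of at least some of the attractions contained in the travel articles, and setting the attractions selected by the user from the list as the specific attractions.
  9. The method according to claim 8, characterized in that the predetermined attraction scoring rule is based on at least one of the following features:
    the search page views of the attraction;
    the search volume for the attraction;
    the number of attraction sequences that include the attraction; and
    third-party ratings of the attraction.
  10. An apparatus for mining travel routes based on a travel target area, characterized in that the travel route is represented as an attraction sequence, the apparatus comprising:
    a travel article retrieval device for retrieving travel articles relating to the travel target area;
    an attraction sequence obtaining device for obtaining, for each retrieved travel article, an attraction sequence formed by the attractions contained in that travel article;
    an attraction sequence screening device for screening the attraction sequences that include one or more specific attractions as candidate attraction sequences for the travel target area.
  11. The apparatus according to claim 10, characterized in that the apparatus further comprises:
    an attraction sequence scoring device for setting an attraction sequence score for each candidate attraction sequence according to a predetermined sequence scoring rule, and sorting the plurality of candidate attraction sequences in descending order of attraction sequence score, so that they are provided to a client in that order in response to a request from the client.
  12. The apparatus according to claim 10 or 11, characterized in that the apparatus further comprises:
    a target area determining device for determining the travel target area in response to a search request from a client that contains a travel feature word indicating travel intent and a travel target area word indicating the travel target area; and
    an attraction sequence providing device for providing at least one of the candidate attraction sequences to the client based on the travel target area determined by the target area determining device.
  13. The apparatus according to claim 12, characterized in that the apparatus further comprises:
    a relevance score calculation device for calculating a relevance score of the candidate attraction sequence based on travel condition information from the client, the number of the specific attractions included, and the attraction sequence score,
    wherein the attraction sequence providing device provides at least one of the candidate attraction sequences to the client based on the relevance score.
  14. The apparatus according to claim 10, characterized in that the apparatus further comprises:
    an attraction sequence filtering device for filtering out attraction sequences that include an attraction on an attraction blacklist.
  15. The apparatus according to claim 10, characterized in that the attraction sequence obtaining device comprises:
    a search device for searching the travel article for tour time information relating to the visit times of attractions;
    a tour time information estimation device for estimating, if no tour time information is found, the tour time information for each attraction from the distance between two attractions that appear adjacently in the travel article and/or from visit time suggestions obtained from a third party;
    a tour time information extraction device for extracting, if tour time information is found, the tour time information for each attraction from the travel article; and
    an attraction sequence generating device for forming the attraction sequence by associating each attraction with the tour time information corresponding to it.
  16. The apparatus according to claim 10, characterized in that the apparatus further comprises:
    an attraction scoring device for setting an attraction score for each of the attractions according to a predetermined attraction scoring rule; and/or
    an attraction list providing device for providing a client with a list of at least some of the attractions contained in the travel articles; and
    a specific attraction setting device for setting the specific attractions based on the attraction scores, and/or setting the attractions selected by the user from the list as the specific attractions.
  17. A server, characterized in that the server comprises a processor, a memory, a bus, and a communication interface, the processor, the communication interface, and the memory being connected through the bus;
    the memory is configured to store a program;
    the processor is configured to invoke, through the bus, the program stored in the memory to perform the method according to any one of claims 1-9.
  18. A computer-readable medium having processor-executable non-volatile program code, characterized in that the program code causes the processor to perform the method according to any one of claims 1-9.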
PCT/CN2015/097599 2014-12-29 2015-12-16 基于旅游目标地域来挖掘旅游路线的方法和设备 WO2016107417A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410848598.XA CN104463730A (zh) 2014-12-29 2014-12-29 基于旅游目标地域来挖掘旅游路线的方法和设备
CN201410848598.X 2014-12-29

Publications (1)

Publication Number Publication Date
WO2016107417A1 true WO2016107417A1 (zh) 2016-07-07

Family

ID=52909719

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097599 WO2016107417A1 (zh) 2014-12-29 2015-12-16 基于旅游目标地域来挖掘旅游路线的方法和设备

Country Status (2)

Country Link
CN (1) CN104463730A (zh)
WO (1) WO2016107417A1 (zh)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019201A (zh) * 2017-10-09 2019-07-16 阿里巴巴集团控股有限公司 一种生成结构化数据的方法、装置及系统
CN110717668A (zh) * 2019-09-30 2020-01-21 上饶市中科院云计算中心大数据研究院 一种旅游景区互联网影响力评估及景区自动管理调度方法
CN111695920A (zh) * 2019-03-11 2020-09-22 新疆丝路大道信息科技有限责任公司 汽车租赁平台的旅游景区推荐系统、方法及电子设备
CN112149010A (zh) * 2020-11-01 2020-12-29 云境商务智能研究院南京有限公司 基于注意力机制的群体旅游路线推荐方法
CN113112058A (zh) * 2021-03-30 2021-07-13 西安理工大学 一种基于知识图谱与蚁群算法的旅游路线推荐方法
CN113705847A (zh) * 2021-10-29 2021-11-26 环球数科集团有限公司 导游预约及服务联动信息处理方法、系统及存储介质
CN114510651A (zh) * 2022-04-19 2022-05-17 深圳本地宝新媒体技术有限公司 基于本地区域特性的旅游攻略推送方法及系统
CN114896523A (zh) * 2022-04-13 2022-08-12 广州市白云区城市规划设计研究所 一种基于乡村旅游线路的道路规划方法及装置
CN117131268A (zh) * 2023-08-28 2023-11-28 浪潮智慧科技有限公司 一种基于街景地图的旅程推荐方法、设备及介质

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104463730A (zh) * 2014-12-29 2015-03-25 广州神马移动信息科技有限公司 基于旅游目标地域来挖掘旅游路线的方法和设备
CN104794663A (zh) * 2015-05-15 2015-07-22 北京景行技术有限公司 一种旅行路线的自动生成系统及方法
CN104881472B (zh) * 2015-05-28 2018-09-14 华南理工大学 一种基于网络数据收集的旅游线路景点组合推荐方法
KR101627976B1 (ko) * 2015-08-20 2016-06-08 심기평 여행자를 위한 맞춤형 서비스 제공 시스템 및 방법
CN105389751A (zh) * 2015-10-27 2016-03-09 北京妙计科技有限公司 一种行程服务方法和装置
CN105468679B (zh) * 2015-11-13 2019-04-12 中国人民解放军国防科学技术大学 一种旅游信息处理与方案提供方法
CN106776659B (zh) * 2015-11-25 2021-06-11 腾讯科技(深圳)有限公司 基于景点成分识别的检索结果排序方法、装置、用户终端
WO2017127997A1 (zh) * 2016-01-25 2017-08-03 常平 生成导航路线图时的信息推送方法以及路线规划系统
CN106157192A (zh) * 2016-05-09 2016-11-23 北京妙计科技有限公司 一种行程服务方法和装置
CN106197444B (zh) * 2016-06-29 2020-01-10 厦门趣处网络科技有限公司 一种路线规划方法、系统
CN106202500A (zh) * 2016-07-20 2016-12-07 上海斐讯数据通信技术有限公司 一种旅游线路推送方法和系统
CN106408115A (zh) * 2016-08-31 2017-02-15 北京百度网讯科技有限公司 出行线路的推荐方法及装置
CN107025254A (zh) * 2016-09-26 2017-08-08 阿里巴巴集团控股有限公司 一种航线目的地搜索方法及装置
CN107045810B (zh) * 2017-03-03 2019-05-10 河南职业技术学院 一种在旅游中古诗词教学设备及方法
CN107436950B (zh) * 2017-08-07 2020-12-29 苏州大学 一种旅行路线推荐方法及系统
CN107729610B (zh) * 2017-09-15 2019-12-10 华南理工大学 一种基于网络游记的旅游推荐线路图生成方法
CN107679661B (zh) * 2017-09-30 2021-03-19 桂林电子科技大学 一种基于知识图谱的个性化旅游路线规划方法
CN108268613B (zh) * 2017-12-29 2022-07-08 广州都市圈网络科技有限公司 基于语义分析的旅游行程生成方法、电子设备及存储介质
CN110119822B (zh) * 2018-02-06 2024-03-15 阿里巴巴集团控股有限公司 景区管理、行程规划方法、客户端和服务器
CN110188339B (zh) * 2018-02-23 2020-08-11 清华大学 景点评价方法、装置、计算机设备和存储介质
CN111859194B (zh) * 2020-08-03 2024-01-30 哈尔滨文投控股集团有限公司 智慧旅游服务平台及基于平台的旅游路径自动规划方法
CN111966929A (zh) * 2020-08-17 2020-11-20 携程旅游信息技术(上海)有限公司 基于标签的旅游路线推送方法、系统、设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010110523A (ko) * 2000-06-07 2001-12-13 권태현 인터넷을 이용한 여행 상품 검색 시스템 및 이를 이용한여행 상품 추천 방법
CN103020308A (zh) * 2013-01-07 2013-04-03 北京趣拿软件科技有限公司 旅游攻略项目的推荐方法及装置
CN103064924A (zh) * 2012-12-17 2013-04-24 浙江鸿程计算机系统有限公司 一种基于地理标注照片挖掘的旅游地点情境化推荐方法
CN104463730A (zh) * 2014-12-29 2015-03-25 广州神马移动信息科技有限公司 基于旅游目标地域来挖掘旅游路线的方法和设备
CN104537070A (zh) * 2014-12-29 2015-04-22 广州神马移动信息科技有限公司 挖掘旅游目的地景点的方法和设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678429B (zh) * 2012-09-26 2018-03-20 阿里巴巴集团控股有限公司 一种旅游线路的推荐方法以及装置
CN103885983B (zh) * 2012-12-21 2017-09-01 阿里巴巴集团控股有限公司 一种旅游线路的确定方法、优化方法以及装置
CN103995840B (zh) * 2014-04-30 2017-08-08 洛阳众意信息技术服务有限公司 一种面向客制化的国内旅游线路智慧生成方法

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019201A (zh) * 2017-10-09 2019-07-16 阿里巴巴集团控股有限公司 一种生成结构化数据的方法、装置及系统
CN111695920B (zh) * 2019-03-11 2023-06-13 新疆丝路大道信息科技有限责任公司 汽车租赁平台的旅游景区推荐系统、方法及电子设备
CN111695920A (zh) * 2019-03-11 2020-09-22 新疆丝路大道信息科技有限责任公司 汽车租赁平台的旅游景区推荐系统、方法及电子设备
CN110717668A (zh) * 2019-09-30 2020-01-21 上饶市中科院云计算中心大数据研究院 一种旅游景区互联网影响力评估及景区自动管理调度方法
CN112149010A (zh) * 2020-11-01 2020-12-29 云境商务智能研究院南京有限公司 基于注意力机制的群体旅游路线推荐方法
CN113112058A (zh) * 2021-03-30 2021-07-13 西安理工大学 一种基于知识图谱与蚁群算法的旅游路线推荐方法
CN113112058B (zh) * 2021-03-30 2023-07-18 西安理工大学 一种基于知识图谱与蚁群算法的旅游路线推荐方法
CN113705847A (zh) * 2021-10-29 2021-11-26 环球数科集团有限公司 导游预约及服务联动信息处理方法、系统及存储介质
CN114896523A (zh) * 2022-04-13 2022-08-12 广州市白云区城市规划设计研究所 一种基于乡村旅游线路的道路规划方法及装置
CN114896523B (zh) * 2022-04-13 2023-02-28 广州市白云区城市规划设计研究所 一种基于乡村旅游线路的道路规划方法及装置
CN114510651A (zh) * 2022-04-19 2022-05-17 深圳本地宝新媒体技术有限公司 基于本地区域特性的旅游攻略推送方法及系统
CN117131268A (zh) * 2023-08-28 2023-11-28 浪潮智慧科技有限公司 一种基于街景地图的旅程推荐方法、设备及介质
CN117131268B (zh) * 2023-08-28 2024-03-26 浪潮智慧科技有限公司 一种基于街景地图的旅程推荐方法、设备及介质

Also Published As

Publication number Publication date
CN104463730A (zh) 2015-03-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15875095

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15875095

Country of ref document: EP

Kind code of ref document: A1