CN106681996B - Method and apparatus for determining a region of interest and points of interest in a geographic range - Google Patents

Method and apparatus for determining a region of interest and points of interest in a geographic range

Info

Publication number
CN106681996B
Authority
CN
China
Prior art keywords
information point
information
point
value
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510746851.5A
Other languages
Chinese (zh)
Other versions
CN106681996A (en)
Inventor
司向辉
孟凡超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201510746851.5A
Publication of CN106681996A
Application granted
Publication of CN106681996B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and apparatus for determining a region of interest, and points of interest, within a geographic range. The method for determining a region of interest includes: obtaining all information points in the geographic range; selecting multiple information points that contain the same keyword and whose number is greater than a first preset threshold; calculating the distances between the multiple information points to determine an information point a located at the center, and determining an information point b farthest from that central point; dividing the distance between information point a and information point b into sections, and selecting the information points in each section whose number is greater than a second preset threshold; counting, for each selected information point, the surrounding information points that contain the same keyword, and retaining the information points whose count is greater than a third preset threshold; and determining the region of interest from the retained information points.

Description

Method and apparatus for determining a region of interest and points of interest in a geographic range
Technical field
The present invention relates to the field of electronic map data processing, and in particular to a method and apparatus for determining regions of interest and points of interest within a geographic range.
Background
Electronic map data usually marks geographic ranges, so that a user can identify ranges such as provinces, cities, and districts on an electronic map. These existing geographic ranges are large, however, and usually contain several more specific regions. For example, the extent of Xicheng District in Beijing can be identified from map data, but Xicheng District also contains more specific regions such as the Xidan area, and a user cannot determine the extent of such a region.
Current methods for delimiting regions of interest either take an object in the map data (such as a road or an information point) as a base and extend a certain distance around that object to form the region of interest, or divide the map into multiple regions according to a preset range (such as a preset grid), each of which may be a region of interest. Both approaches are suitable for remote areas or areas with few information points. For areas such as cities, where information points are numerous and their arrangement is complex, the existing methods are not accurate enough.
Moreover, existing information point screening methods lack regional analysis: they usually screen directly by the popularity of information points within the geographic range. As a result, information points in non-hotspot areas are missed, and screening efficiency is low.
Summary of the invention
In view of this, the present invention provides a method for determining a region of interest in a geographic range. The method comprises: obtaining the information points in the geographic range; selecting from those information points multiple information points that contain the same keyword and whose number is greater than a first preset threshold; calculating the distances between the multiple information points, determining the information point located at the center, denoted a, and determining from the distance results the information point b whose distance from information point a is the largest; dividing the distance between information point a and information point b into sections, and selecting the information points c in each section whose number is greater than a second preset threshold; counting, for each selected information point c, the surrounding information points that contain the same keyword, and retaining the information points c whose count is greater than or equal to a third preset threshold; and determining the region of interest from the retained information points c.
Correspondingly, the present invention provides an apparatus for determining a region of interest in a geographic range, comprising: an acquiring unit, for obtaining the information points in the geographic range; a selection unit, for selecting from those information points multiple information points that contain the same keyword and whose number is greater than a first preset threshold; an information point determination unit, for calculating the distances between the multiple information points, determining the information point located at the center, denoted a, and determining from the distance results the information point b whose distance from information point a is the largest; a first screening unit, for dividing the distance between information point a and information point b into sections and selecting the information points c in each section whose number is greater than a second preset threshold; a second screening unit, for counting, for each selected information point c, the surrounding information points that contain the same keyword, and retaining the information points c whose count is greater than or equal to a third preset threshold; and a region determination unit, for determining the region of interest from the retained information points c.
In addition, the present invention provides a point of interest screening method, comprising: determining a region of interest using the above method for determining a region of interest in a geographic range; and screening out at least one information point within the region of interest according to the information the information points contain, the at least one screened information point being a point of interest.
Correspondingly, the present invention also provides a point of interest screening apparatus, comprising: a region of interest determination unit, for determining a region of interest using the above method for determining a region of interest in a geographic range; and a screening unit, for screening out at least one information point within the region of interest according to the information the information points contain, the at least one screened information point being a point of interest.
The method and apparatus provided by the embodiments of the present invention can divide map data into multiple regions of interest and then screen out, within a region of interest, the information points whose features are relatively distinctive and whose popularity is relatively high. Such an information point can serve as a landmark of the region of interest. The screened information points can be applied in a variety of scenarios: for example, they can be used as standard information points in certain service applications, or highlighted in map data so that users can find information points more easily.
Brief description of the drawings
To illustrate the specific embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the method for determining a region of interest in a geographic range according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the arrangement of the multiple selected information points;
Fig. 3 is a schematic diagram of processing the information points shown in Fig. 2;
Fig. 4 is a schematic diagram of further processing the result shown in Fig. 3;
Fig. 5 is a schematic diagram of the region of interest determined after processing the information points shown in Fig. 4;
Fig. 6 is a flow chart of the point of interest screening method according to an embodiment of the present invention;
Fig. 7 is a structural diagram of the apparatus for determining a region of interest in a geographic range according to an embodiment of the present invention;
Fig. 8 is a structural diagram of the point of interest screening apparatus according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
For brevity and clarity, the solution of the present invention is explained below through several representative embodiments. The many details in the embodiments serve only to aid understanding, and the technical solution need not be limited to these details when implemented. To avoid obscuring the solution unnecessarily, some embodiments are not described in detail and only an outline is given. Hereinafter, "comprising" means "including but not limited to", and "according to ..." means "according at least to ..., but not limited to only ...". Where the quantity of a component is not specifically stated, the component may be one or more, that is, at least one.
The method for determining a region of interest in a geographic range, the information point screening method, and the apparatus in the embodiments of the present invention can be implemented by any computing device capable of carrying out the methods and software systems of the embodiments. The computing device may be a personal computer or a portable device, such as a laptop, tablet computer, mobile phone, or smartphone. The computing device may also be a server connected to such devices over a network.
Computing devices can differ in capability and features, and all possible implementations fall within the protection scope of this document. For example, a computing device may include a keypad or keyboard, and may also include a display, such as a liquid crystal display (LCD) or a display with advanced features such as a touch-sensitive 2D or 3D display. In one example, a web-enabled computing device may include one or more physical or virtual keyboards and a mass storage device.
The computing device may also include or run various operating systems, and include or run various application programs, such as encoding/decoding applications. An application program can communicate in encrypted form with other devices over a network.
In addition, the computing device may include one or more processor-readable non-volatile storage media and one or more processors that communicate with the storage media. For example, the processor-readable storage medium may be RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of non-volatile storage medium. The storage medium can store a series of instructions, or units and/or modules containing instructions, for carrying out the operations of the various embodiments of the present invention. The processor can execute these instructions to complete the operations of the various embodiments.
An embodiment of the present invention provides a method for determining a region of interest in a geographic range. As shown in Fig. 1, the method includes the following steps:
S1: obtain the information points in the geographic range. The geographic range may be an administrative division, such as Chaoyang District or Haidian District in Beijing, or another region with known boundaries, such as a prefecture or a town. Information point data is existing, publicly available data. Each information point record includes at least basic information (reference information) such as name, category, latitude, longitude, and other nearby information points. Some information points also carry rich information: a hotel information point, for example, includes the grade of the hotel (such as 0 to 5 stars), and a residential-community information point includes information such as the number of residents and housing prices.
S2: select from the above information points multiple information points that contain the same keyword and whose number is greater than a first preset threshold. Specifically, word recognition technology can be used to identify keywords in the name and address of each information point, and then determine which information points contain the same keyword. A keyword may be the name of a region that has no specific boundary, such as "Xidan" in Xicheng District or "Dongdan" in Dongcheng District of Beijing. For a given geographic range, several groups of information points are usually found, each group sharing one keyword. Some groups may contain few information points, which indicates that the keyword of that group is not widely recognized, so only groups with a sufficiently large number of information points are selected for subsequent processing. Take the geographic range "Xicheng District" of Beijing as an example, and suppose that the number of information points containing the keyword "Xidan" is greater than the preset threshold. From the arrangement on the map of the information points containing "Xidan", some information points are relatively concentrated and others may be relatively scattered. Suppose that step S2 selects the multiple information points shown in Fig. 2. The arrangement in Fig. 2 shows intuitively that information points P1 and P2 are the more scattered ones.
Scattered information points affect the final region division result, so the information points whose positions are scattered need to be found among the above multiple information points and removed. Those skilled in the art will understand that there are many ways to decide whether points are concentrated or scattered relative to each other; for example, the points with larger distance values can be discarded based on the distance between every two points. The process of removing scattered information points is described in detail below with reference to Fig. 3 and Fig. 4.
S3: calculate the distances between the multiple information points, determine the information point located at the center, denoted a, and determine from the distance results the information point b whose distance from information point a is the largest. Information point a is the central point of all the information points. Those skilled in the art will understand that there are many ways to find the central point among multiple points with fixed positions; it can be found by calculating the distance between each pair of points. In the situation shown in Fig. 3, information point P0 is located at the center and is denoted a. Information point P1 is farthest from P0, so P1 is the information point with the largest distance from information point a and is denoted b.
S4: divide the distance between information point a and information point b into sections, and select the information points c in each section whose number is greater than a second preset threshold. As shown in Fig. 3, this embodiment divides the distance Rmax between information point a and information point b into 5 sections, which yields 5 distance ranges. The number of information points in each range is then counted and checked: the points in a range are retained if the count is greater than the second preset threshold, and removed otherwise. Suppose the second preset threshold is 2. The range containing information point b holds only that 1 information point, while every other range holds 2 or more. Information point b, that is, P1 in Fig. 3, is therefore removed. The other information points P0, P2, P3, P4, P5, P6, P7, P8, P9, and P10 are retained for now and denoted c.
S5: for each information point c selected above, count the surrounding information points that contain the same keyword. A small count means that few information points containing the keyword lie around that point and the arrangement there is scattered; a large count means the arrangement there is concentrated. The information points c whose count is greater than or equal to a third preset threshold are retained. Here "surrounding" denotes a range, such as a radius, which can be set in practice; the information points containing the keyword are then counted within that range. Suppose the range is N*Rd (where Rd is the average distance between the selected information points and N is a coefficient) and the third preset threshold is 1. Only information point P2 has 0 information points around it, so P2 is also removed, and the other qualifying information points P0, P3, P4, P5, P6, P7, P8, P9, and P10 are retained.
S6: determine the region of interest from the retained information points c whose count is greater than or equal to the third preset threshold. After steps S2 to S5, information points P1 and P2 have been removed. A convex hull algorithm, for example, can then be used to construct the minimum enclosing convex polygon from the retained information points P0, P3, P4, P5, P6, P7, P8, P9, and P10. Fig. 5 shows the region of interest determined in this way, which contains multiple densely arranged information points.
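Step S6 names a convex hull algorithm without fixing a particular one. As a minimal sketch, the following uses Andrew's monotone chain, which is one common choice and not necessarily the one intended by the patent. The coordinates are treated as planar (x, y) pairs, which is an assumption; real latitude and longitude values would normally be projected first. The function names and sample points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last vertex of each chain is the first vertex of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical retained information points: four corners and one interior point.
retained = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(retained)
```

The interior point (1, 1) does not appear in `hull`; only the four corners bound the region.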
In the method for determining a region of interest in a geographic range according to this embodiment, all information points in the geographic range are obtained and the information points containing the same keyword are selected from them. Groups whose number of information points is greater than the preset threshold are then processed to remove the information points whose positions are scattered, and the region of interest is finally determined from the retained information points. A more specific region of interest can thus be determined within the geographic range, and the information points inside it all contain the same keyword, which improves the accuracy of the extent of the region of interest.
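The keyword-based selection that starts this procedure (step S2) can be sketched as follows. The patent relies on word recognition to extract keywords; this sketch instead assumes that a list of candidate keywords is already given and uses plain substring matching on the name and address. The record layout (`name`, `address`), the function name, and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_keyword(points: List[dict], keywords: List[str],
                     first_threshold: int) -> Dict[str, List[dict]]:
    """Group points by the keyword found in their name or address and keep
    only the groups whose size is greater than first_threshold."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for p in points:
        text = p["name"] + " " + p.get("address", "")
        for kw in keywords:
            if kw in text:  # assumption: substring match stands in for word recognition
                groups[kw].append(p)
    return {kw: pts for kw, pts in groups.items() if len(pts) > first_threshold}


# Hypothetical sample records.
sample = [
    {"name": "Xidan Mall", "address": "Xicheng District"},
    {"name": "Xidan Bookstore", "address": "Xicheng District"},
    {"name": "Coffee House", "address": "Xidan North Street"},
    {"name": "Dongdan Park", "address": "Dongcheng District"},
]
groups = group_by_keyword(sample, ["Xidan", "Dongdan"], first_threshold=2)
```

With a threshold of 2, only the "Xidan" group (three points) survives; the single "Dongdan" point is dropped as not widely recognized.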
In a preferred implementation, step S3 may include the following sub-steps:
S31: for each information point, calculate the sum of its distances to all other information points (preferably straight-line distances, although route distances on the map may also be used; straight-line distance is used as the example below), and determine the information point a with the smallest sum of distances.
S32: calculate the distance from each information point other than a to information point a, and determine the information point b with the largest distance. As shown in Fig. 3, the maximum distance is Rmax (the distance between information point P0 and information point P1).
By calculating, for each information point, the sum of its distances to all other information points, this preferred scheme finds the central point among the multiple information points more accurately, and then finds the information point farthest from that central point.
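Sub-steps S31 and S32 can be sketched as below, assuming planar coordinates and straight-line (Euclidean) distance. The function name and the sample coordinates are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def center_and_farthest(points: List[Point]) -> Tuple[Point, Point, float]:
    """Return (a, b, r_max): a minimizes the sum of distances to all points,
    b is the point farthest from a, and r_max is the distance between them."""
    # S31: the distance from a point to itself is 0, so including it is harmless.
    a = min(points, key=lambda p: sum(math.dist(p, q) for q in points))
    # S32: farthest point from the center a.
    b = max(points, key=lambda q: math.dist(a, q))
    return a, b, math.dist(a, b)


# Hypothetical points: a small cluster around the origin and one outlier.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (10.0, 0.0)]
a, b, r_max = center_and_farthest(pts)
```

Here the origin has the smallest distance sum, so it plays the role of P0, and the outlier at (10, 0) plays the role of P1. The computation is quadratic in the number of points, which is usually acceptable for a single keyword group.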
In a preferred implementation, step S4 may include the following sub-steps:
S41: divide the maximum distance Rmax into sections. Equal division is preferred: choose the number of sections N and divide Rmax into N sections, with Ri = (i/N)*Rmax. Those skilled in the art will understand that other division methods, such as the golden section, are also feasible.
S42: count the information points in each section. That is, count the number x1 of information points whose distance from a lies in [0, R1], the number x2 in [R1, R2], the number x3 in [R2, R3], and so on, up to the number xN in [R(N-1), Rmax]. The n-th section can be written as [Rmax/N*(n-1), Rmax/N*n], where n is a positive integer in the range [1, N] and N is a positive integer greater than 1. For example, with N = 5 this gives the count X1 for [0, Rmax/5], X2 for [Rmax/5, Rmax/5*2], X3 for [Rmax/5*2, Rmax/5*3], X4 for [Rmax/5*3, Rmax/5*4], and X5 for [Rmax/5*4, Rmax].
S43: select the information points in the sections whose count is greater than the second preset threshold. For example, check each of X1 to X5 against the preset threshold. Suppose the second preset threshold is 2. According to the result, information point P1 is removed and the other information points are retained; that is, the information points whose distance lies in [0, Rmax/5*4] are retained, and the information points whose distance lies in [Rmax/5*4, Rmax] are removed.
This preferred scheme divides the maximum distance into sections, checks the number of information points in each section in turn, and retains the information points in sections whose count is greater than the preset threshold, thereby discarding the information points that are relatively scattered.
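The sectioning in S41 to S43 can be sketched as follows, assuming equal division and planar coordinates. Two choices here are assumptions rather than statements from the patent: a point lying exactly on a boundary Ri is assigned to the outer section (except at Rmax, which goes to the last section), and a section is kept when its count is at least the threshold. The text says "greater than", but the worked example with threshold 2 keeps sections holding exactly 2 points, so the sketch follows the example. The function name and sample data are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def filter_by_distance_sections(points: List[Point], a: Point, r_max: float,
                                n_sections: int, second_threshold: int) -> List[Point]:
    """Split [0, r_max] into n_sections equal ranges by distance from a and
    keep the points of every range holding at least second_threshold points."""
    sections: List[List[Point]] = [[] for _ in range(n_sections)]
    for p in points:
        d = math.dist(a, p)
        # Index of the range containing d; the point at r_max goes to the last range.
        i = min(int(d / r_max * n_sections), n_sections - 1)
        sections[i].append(p)
    kept: List[Point] = []
    for members in sections:
        if len(members) >= second_threshold:  # assumption: ">=" to match the worked example
            kept.extend(members)
    return kept


# Hypothetical data: center at the origin, r_max = 10, five sections of width 2.
center = (0.0, 0.0)
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (-1.0, 0.0),
       (3.0, 0.0), (0.0, 3.0), (10.0, 0.0)]
kept = filter_by_distance_sections(pts, center, 10.0, 5, 2)
```

The outlier at distance 10 sits alone in the last section and is removed, which mirrors the removal of P1 in Fig. 3. The sketch assumes `r_max` is positive.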
In a preferred implementation, step S5 may include the following sub-steps:
S51: calculate the average distance Rd between the selected information points;
S52: for each selected information point, count the information points containing the same keyword within the range N*Rd around it. A small count means that few information points containing the keyword lie around that point and the arrangement there is scattered; a large count means the arrangement there is concentrated;
S53: remove the information points for which the number of information points containing the same keyword within the surrounding range N*Rd is less than the third preset threshold. As shown in Fig. 4, if the preset threshold is 1, only information point P2 has 0 information points within the range N*Rd around it, so P2 is removed and the other qualifying information points are retained.
This preferred scheme first calculates the average distance Rd between the information points, and then uses Rd and the coefficient N to check whether enough information points lie within the range N*Rd of each information point. Scattered information points are thereby further discarded, so that the arrangement of the finally retained information points is sufficiently dense.
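Sub-steps S51 to S53 can be sketched as below. The patent does not spell out how Rd is averaged; this sketch assumes the mean over all pairwise distances and treats N*Rd as a radius. The coefficient is named `n_coeff` here to keep it apart from the number of sections used in step S4. All names and sample values are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, float]


def filter_by_neighbor_count(points: List[Point], n_coeff: float,
                             third_threshold: int) -> List[Point]:
    """Keep the points that have at least third_threshold other points
    within the radius n_coeff * Rd, where Rd is the mean pairwise distance."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return list(points)
    # S51: average distance between the points (assumed: mean over all pairs).
    rd = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    radius = n_coeff * rd
    kept: List[Point] = []
    for p in points:
        # S52: count the other points inside the radius around p.
        neighbors = sum(1 for q in points if q is not p and math.dist(p, q) <= radius)
        # S53: retain p only if enough neighbors were found.
        if neighbors >= third_threshold:
            kept.append(p)
    return kept


# Hypothetical data: a unit square of points plus one isolated point.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (8.0, 8.0)]
kept = filter_by_neighbor_count(pts, n_coeff=0.5, third_threshold=1)
```

With these values Rd is roughly 4.9, so the radius is roughly 2.5. Every corner of the square has three neighbors, while the isolated point has none and is removed, which mirrors the removal of P2 in Fig. 4.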
Another embodiment of the present invention provides a point of interest screening method. As shown in Fig. 6, the method includes the following steps:
S'1: determine a region of interest using the method for determining a region of interest in a geographic range described in the previous embodiment. A region of interest usually contains many information points;
S'2: screen out at least one information point within the region of interest according to the information the information points contain. The at least one screened information point is a point of interest (Point of Interest, POI), which has higher popularity or more distinctive features than the other information points in the region of interest. Those skilled in the art will understand that there are many ways to screen information points; for example, the information points can be sorted by popularity (the number of times users query, select, or submit them in various application systems), and the more popular information points selected.
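The popularity-based screening mentioned in S'2 can be sketched as a simple ranking. The `heat` field, standing for the number of user queries, selections, and submissions, is a hypothetical name, as are the function and the sample records; the patent does not prescribe how popularity is stored or how many points are kept.

```python
from typing import List


def top_by_heat(points: List[dict], k: int) -> List[dict]:
    """Return the k information points with the highest heat value."""
    return sorted(points, key=lambda p: p["heat"], reverse=True)[:k]


# Hypothetical information points inside one region of interest.
region_points = [
    {"name": "Bookstore", "heat": 120},
    {"name": "Shopping Mall", "heat": 950},
    {"name": "Coffee House", "heat": 40},
]
pois = top_by_heat(region_points, k=1)
```

Here the shopping mall is the most queried point and would be taken as the point of interest for the region.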
The point of interest screening method according to this embodiment can divide map data into multiple regions of interest and then screen out, within a region of interest, the information points whose features are relatively distinctive and whose popularity is relatively high. A screened information point can serve as a landmark of the region of interest, that is, as a point of interest. The screened information points can be applied in a variety of scenarios: for example, they can be used as standard information points in certain service applications, or highlighted in map data so that users can find information points more easily.
In a preferred implementation, step S'2 in this embodiment may specifically include:
S'21: determine the feature values of all information points in the region of interest. Those skilled in the art will understand that there are many ways to determine the feature value of an information point; for example, it can be determined from factors such as how much attention the information point receives, and existing feature value calculation methods are all feasible;
S'22: screen the information points with a support vector machine classification model, using the feature values of the information points as input. Support vector machines (SVM, Support Vector Machine) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, and are used for classification and regression analysis. A support vector machine model can be obtained by training on given sample data. The sample data in the present invention can be divided into two classes: qualifying target samples and non-qualifying non-target samples. The trained support vector machine model can then be used to judge whether a given information point is a target information point.
In this embodiment, multiple feature values are preferably used to screen information points; that is, the feature values of an information point include a first feature value, a second feature value, and a third feature value. The first feature value is calculated as follows:
The types and number of times an information point is cited are determined from the address information of information points, a weight is assigned to each citation type, and the first feature value of the information point is then calculated from the citation-type weights and the citation counts. Preferably, the citation types include at least the following three:
The first type is citation by another information point at a different geographic location. For example, given two information points, information point 1 (Haidian Bridge) and information point 2 (China Technology Business Mansion), if the address information of information point 2 is "China Technology Business Mansion, east of Haidian Bridge", then information point 1 (Haidian Bridge) is cited once by information point 2 (China Technology Business Mansion). This type of citation may be called an external citation;
The second type is citation by a public-transport information point. For example, given information point 1 (Haidian Bridge) and information point 2 (Haidian Bridge East bus stop), information point 1 (Haidian Bridge) is cited once by information point 2 (Haidian Bridge East bus stop). This type of citation may be called a public-transport citation;
The third type is citation by another information point at the same geographic location. For example, given two information points, information point 1 (China Technology Business Mansion) and information point 2 (Beta Coffee shop), if the address information of information point 2 is "Beta Coffee, floor B1, China Technology Business Mansion", then information point 1 (China Technology Business Mansion) is cited once by information point 2 (Beta Coffee shop). This type of citation may be called an internal citation.
The types and number of citations may differ for each information point, and some information points may receive several types of citation at once, so the first feature value can be calculated with the following formula:
Y1 = A·X1 + B·X2 + C·X3,
where Y1 is the first feature value, A, B, and C are the weights of the three citation types, and X1, X2, and X3 are the citation counts of the corresponding types. A, B, and C may take any values; they express the importance of each citation type, for example A > B > C or A = B = C. Those skilled in the art will understand that the above formula is only a specific embodiment given to illustrate the physical meaning of the first feature value; in practice, simpler or more complex algorithms may also be used to calculate the first feature value of an information point from the citation-type weights and citation counts.
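The weighted sum for the first feature value can be illustrated with a minimal Python sketch. The class name, field names, and default weights below are assumptions chosen for illustration only; the text states that the weights may take any values.

```python
from dataclasses import dataclass


@dataclass
class CitationCounts:
    """Citation counts of one information point (field names are hypothetical)."""
    external: int  # X1: cited by an information point at a different location
    transit: int   # X2: cited by a public-transport information point
    internal: int  # X3: cited by an information point at the same location


def first_feature_value(counts, a=3.0, b=2.0, c=1.0):
    """Return Y1 = A*X1 + B*X2 + C*X3. The default weights are illustrative."""
    return a * counts.external + b * counts.transit + c * counts.internal


# Hypothetical counts for an information point such as Haidian Bridge.
haidian_bridge = CitationCounts(external=4, transit=2, internal=0)
print(first_feature_value(haidian_bridge))
```

Passing a=b=c=1 corresponds to the A = B = C case mentioned in the text, where every citation type counts equally.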
The second feature value is calculated as follows:
The second feature value of an information point is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value. Those skilled in the art will understand that, in addition to name information, address information, and citation information, an information point also carries rich information, which differs between kinds of information point: hotels, hospitals, scenic spots, and government bodies have corresponding grade information; residences have information such as occupancy and price; and restaurants have information such as the number of reviews or a rating score. This rich information is numerical. In practice, the numerical information can be extracted from the information point, and the maximum and minimum of that numerical information for information points of that type are then determined from the type of the information point. For a hotel information point, for example, the maximum can be taken as 5 and the minimum as 0, and the second feature value can then be determined from the ratio of the hotel's actual grade to the maximum and minimum. Other types of information point can be handled in the same way. Those skilled in the art will understand that this calculation is only a specific embodiment given to illustrate the physical meaning of the second feature value; in practice, a simpler or more complex calculation may also be used.
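One possible reading of the "ratio" described above is min-max scaling of the numerical value into the interval [0, 1]. The sketch below assumes that reading; the function name, the clamping of out-of-range values, and the per-type ranges other than the hotel range of 0 to 5 are assumptions, not details taken from the text.

```python
def second_feature_value(value, min_value, max_value):
    """Scale a numerical attribute into [0, 1] using preset bounds.

    Assumes min-max scaling; values outside the bounds are clamped.
    """
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    clamped = min(max(value, min_value), max_value)
    return (clamped - min_value) / (max_value - min_value)


# Preset (minimum, maximum) per information point type. Only the hotel
# range comes from the text; the restaurant range is hypothetical.
RANGES = {"hotel": (0.0, 5.0), "restaurant": (0.0, 10.0)}

low, high = RANGES["hotel"]
print(second_feature_value(4.0, low, high))
```

Scaling every type into the same interval keeps hotel grades, prices, and review counts comparable when they are later fed to one classifier.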
The third feature value is calculated as follows:
From users' selection operations on information points in different applications, the applications used and the corresponding selection counts are determined; a weight is assigned to each application, and the third feature value of the information point is then calculated from the weights and the selection counts. For example, for a given information point, users select it X1 times using application 1, X2 times using application 2, …, and Xn times using application n. The third feature value can then be calculated with the following formula:
Y3 = A·X1 + B·X2 + … + N·Xn,
where Y3 is the third feature value and A, B, …, N are the weights of the n applications. The weights may take any values, and a higher weight may, for example, be assigned to specific applications. Those skilled in the art will understand that the above formula is only a specific embodiment given to illustrate the physical meaning of the third feature value; in practice, simpler or more complex algorithms may also be used to calculate the third feature value of an information point from the application weights and selection counts.
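Because the number of applications is open-ended, the third feature value is conveniently sketched with dictionaries keyed by application. The application names, the weights, and the default weight for unlisted applications are all hypothetical.

```python
def third_feature_value(selections, weights, default_weight=1.0):
    """Return Y3 as the weighted sum of per-application selection counts.

    selections: application name -> number of times users selected the point
    weights: application name -> weight; unlisted applications fall back to
             default_weight (an assumption made for this sketch)
    """
    return sum(
        weights.get(app, default_weight) * count
        for app, count in selections.items()
    )


# Hypothetical selection counts for one information point.
selections = {"map_app": 120, "ride_hailing_app": 30, "food_delivery_app": 10}
weights = {"map_app": 1.0, "ride_hailing_app": 2.0}
print(third_feature_value(selections, weights))
```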
The above preferred scheme is based on how information points are cited, how users select them, and their rich information, and screens information points with a machine learning model, which makes the screening more targeted and improves screening efficiency.
Those skilled in the art will understand that a support vector machine classification model is obtained by continued training on a large amount of sample data. To improve the classification performance of the model, the feature values (the first, second, and third feature values) of the training samples that represent target information points are all greater than a preset feature threshold. The preset feature threshold includes the average feature threshold of all information points in the interest region and/or the average feature threshold of information points of the same category in the interest region.
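The selection of target training samples can be sketched as follows, under the assumption that the preset feature threshold is the per-feature mean over all information points in the interest region. The dictionary layout and the sample values are hypothetical, and the per-category variant mentioned in the text is not shown.

```python
from statistics import mean


def select_target_samples(points):
    """Keep the points whose every feature value exceeds the regional mean.

    points: list of dicts with "name" and "features" = (Y1, Y2, Y3).
    """
    n_features = len(points[0]["features"])
    # One threshold per feature: the average over all points in the region.
    thresholds = [mean(p["features"][i] for p in points) for i in range(n_features)]
    return [
        p for p in points
        if all(value > t for value, t in zip(p["features"], thresholds))
    ]


region = [
    {"name": "A", "features": (10.0, 0.9, 200.0)},
    {"name": "B", "features": (2.0, 0.4, 20.0)},
    {"name": "C", "features": (3.0, 0.5, 50.0)},
]
print([p["name"] for p in select_target_samples(region)])
```

The samples selected this way would serve as positive examples when training the classifier; how negative examples are chosen is not specified in the text.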
An interest region may contain many well-known information points. The Xidan area, for example, has many shopping-mall information points, and when the above model is used, many of these malls may all be judged to be target information points, which does not match users' intuitive sense of popularity. Within a given area, users can usually remember only the first few names among the information points of a given category, so having too many target information points of one category in a local area is unreasonable. To screen the target information points further, the method may also include the following steps:
S'3: select multiple information points of the same type;
S'4: sort the multiple information points of the same type by the above feature values from high to low, and retain the top N target information points of that type. This makes the retained information points more distinctive and improves the user experience.
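Steps S'3 and S'4 amount to a per-type top-N selection. The sketch below assumes that each information point already carries a single ranking score; how the three feature values are combined into that score is not specified in the text, so the "score" field is hypothetical.

```python
from collections import defaultdict


def keep_top_n_per_type(points, n):
    """Group points by type, sort each group by score, keep the top n."""
    by_type = defaultdict(list)
    for point in points:
        by_type[point["type"]].append(point)

    kept = []
    for group in by_type.values():
        ranked = sorted(group, key=lambda p: p["score"], reverse=True)
        kept.extend(ranked[:n])
    return kept


# Hypothetical target information points in one interest region.
targets = [
    {"name": "Mall A", "type": "mall", "score": 9.0},
    {"name": "Mall C", "type": "mall", "score": 5.0},
    {"name": "Mall B", "type": "mall", "score": 7.0},
    {"name": "Cafe", "type": "cafe", "score": 4.0},
]
print([p["name"] for p in keep_top_n_per_type(targets, n=2)])
```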
Another embodiment of the invention further provides a device for determining an interest region in a geographic range. As shown in Fig. 7, the device includes:
an acquiring unit 71, configured to obtain the information points in a geographic range;
a selection unit 72, configured to select, from the above information points, multiple information points that contain the same keyword and whose number is greater than a first preset threshold;
an information point determination unit 73, configured to perform distance calculations on the multiple information points, determine an information point located at the center, denoted a, and determine from the results of the distance calculations the information point b with the largest distance from information point a;
a first screening unit 74, configured to segment the distance between information point a and information point b and screen out the information points c in each segment whose number is greater than a second preset threshold;
a second screening unit 75, configured to count the information points containing the same keyword around each screened information point c, and retain the information points c for which that count is greater than or equal to a third preset threshold;
a region determination unit 76, configured to determine the interest region from the retained information points c whose count is greater than or equal to the third preset threshold.
The above device obtains all information points in the geographic range and selects multiple information points containing the same keyword, so that multiple information points sharing a keyword can be determined. It then processes those information points when their number is greater than the preset threshold, removing those that are spatially scattered, and finally determines the interest region from the retained information points. A more specific interest region can thus be determined within the geographic range, and the information points in that interest region all contain the same keyword, which improves the accuracy of the extent of the geographic interest region.
Preferably, the information point determination unit 73 includes:
a center point determination unit, configured to calculate, for each information point, the sum of its distances to all other information points, and determine the information point a with the smallest sum of distances;
a farthest point determination unit, configured to calculate the distance between information point a and each other information point, and determine the information point b with the largest distance value.
By calculating, for each information point, the sum of its distances to all other information points, the above preferred scheme can find the center point among the multiple information points more accurately, and can find the information point farthest from that center point.
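The center point and farthest point computation can be sketched as below. The sketch assumes planar (x, y) coordinates and Euclidean distance for brevity; real geographic coordinates would need a geodesic distance instead. The function names and sample points are hypothetical.

```python
from math import hypot


def distance(p, q):
    """Euclidean distance between two (x, y) points (a planar assumption)."""
    return hypot(p[0] - q[0], p[1] - q[1])


def find_center_and_farthest(points):
    """Return (a, b, max_distance).

    a is the point with the smallest sum of distances to all other points;
    b is the point farthest from a.
    """
    # The distance of a point to itself is 0, so including it is harmless.
    a = min(points, key=lambda p: sum(distance(p, q) for q in points))
    b = max(points, key=lambda p: distance(a, p))
    return a, b, distance(a, b)


pois = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (5, 5)]
center, farthest, max_distance = find_center_and_farthest(pois)
print(center, farthest, round(max_distance, 3))
```

The pairwise sums make this quadratic in the number of information points, which is usually acceptable for the set of points sharing one keyword.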
Preferably, first screening unit 74 includes:
a segmenting unit, configured to segment the maximum distance value;
a first counting unit, configured to count the information points in each segment;
a sub-screening unit, configured to screen out the information points in segments whose count is greater than the second preset threshold, denoted information points c.
Preferably, the segmenting unit includes:
a segment number determination unit, configured to determine the number of segments;
an equal division unit, configured to divide the maximum distance value equally according to the number of segments.
The above preferred scheme segments the maximum distance value, checks the number of information points in each segment in turn, and retains the information points in segments whose count is greater than the preset threshold, thereby removing information points that are relatively scattered.
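The segmentation step can be sketched as follows, assuming that each segment is an equal-width band of distance from the center point a and that each information point is assigned to a band by its distance from a. That assignment, the function name, and the sample values are interpretations made for this sketch rather than details stated in the text.

```python
def filter_by_segment(distances, max_distance, num_segments, min_count):
    """Return the indices of points lying in sufficiently populated segments.

    distances: distance of each point from the center point a
    max_distance: distance between a and the farthest point b (must be > 0)
    min_count: second preset threshold; a segment is kept if count > min_count
    """
    width = max_distance / num_segments

    def segment_of(d):
        # The point at exactly max_distance belongs to the last segment.
        return min(int(d / width), num_segments - 1)

    counts = [0] * num_segments
    for d in distances:
        counts[segment_of(d)] += 1

    return [i for i, d in enumerate(distances) if counts[segment_of(d)] > min_count]


dists = [0.0, 0.2, 0.5, 0.9, 1.1, 1.4, 9.0, 10.0]
print(filter_by_segment(dists, max_distance=10.0, num_segments=5, min_count=2))
```

In this example the two distant points fall in a segment with only two members and are dropped, while the six points near the center are kept.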
Preferably, second screening unit 75 includes:
an average distance calculation unit, configured to calculate the average distance Rd between the screened information points c;
a second counting unit, configured to count, for each screened information point, the information points containing the same keyword within a range of N*Rd around it;
a removal unit, configured to remove the information points c for which the number of information points containing the same keyword within the surrounding range of N*Rd is less than the third preset threshold.
The above preferred scheme first calculates the average distance Rd between information points, and then uses the average distance Rd and a coefficient N as the basis for judging whether there are enough information points within the range N*Rd of each information point. Scattered information points can thus be further excluded, so that the information points finally retained are sufficiently dense.
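The density check can be sketched as below. The sketch assumes that Rd is the mean pairwise distance between the screened points, that neighbors are counted among those same points, and that coordinates are planar; the function name and sample values are hypothetical.

```python
from itertools import combinations
from math import hypot


def density_filter(points, n_coeff, min_neighbors):
    """Keep points with at least min_neighbors other points within N*Rd."""
    if len(points) < 2:
        return []

    def dist(p, q):
        return hypot(p[0] - q[0], p[1] - q[1])

    # Rd: average distance over all pairs of screened points (an assumption).
    pair_distances = [dist(p, q) for p, q in combinations(points, 2)]
    rd = sum(pair_distances) / len(pair_distances)
    radius = n_coeff * rd

    kept = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius
        )
        if neighbors >= min_neighbors:  # third preset threshold
            kept.append(p)
    return kept


candidates = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
print(density_filter(candidates, n_coeff=0.5, min_neighbors=2))
```

A smaller coefficient N tightens the neighborhood and removes more outlying points, while a larger one makes the filter more permissive.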
Another embodiment of the invention further provides a point of interest screening device. As shown in Fig. 8, the device includes:
an interest region determination unit 81, configured to determine an interest region using the method for determining an interest region in a geographic range provided by the above embodiment;
a screening unit 82, configured to screen out at least one information point in the interest region according to the information contained in the information points, the at least one screened information point being a point of interest.
The above point of interest screening device can divide map data into multiple interest regions and then screen out, within each interest region, the information points whose features are relatively distinct and whose popularity is relatively high. Such an information point can serve as a landmark of the interest region, and the screened information points can be used in a variety of scenarios: for example, as standard information points in certain service applications, or highlighted in map data so that users can find them more easily.
Preferably, above-mentioned screening unit 82 may include:
a feature value determination subunit, configured to determine the feature values of all information points in the interest region;
a classification subunit, configured to screen the information points using a support vector machine classification model with the feature values of the information points as input.
Preferably, the feature values include a first feature value, a second feature value, and a third feature value, wherein
the first feature value is calculated from the citation weights corresponding to the types of citation the information point receives and the citation counts;
the second feature value is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value;
the third feature value is calculated from the application weights corresponding to the applications users use to submit the information point and the submission counts.
Preferably, the types of citation include citation by another information point at a different geographic location, citation by a public-transport information point, and citation by another information point at the same geographic location.
The above preferred scheme is based on how information points are cited, the rich information of information points, and how users select information points in different scenarios, and screens information points with a machine learning model, which makes the screening more targeted and improves screening efficiency.
Preferably, the feature values of the training samples of the support vector machine classification model that satisfy the screening condition are greater than a preset feature threshold, where the preset feature threshold includes the average feature threshold of all information points to be screened in the interest region and/or the average feature threshold of information points to be screened of the same category in the interest region. Such training samples can improve the classification performance of the support vector machine classification model.
Preferably, the above information point screening device may further include:
a selection unit 83, configured to select multiple information points of the same type;
a removal unit 84, configured to sort the multiple information points of the same type by feature value from high to low and then retain at least one top-ranked information point. This preferred scheme makes the screened information points more distinctive and improves the user experience.
Obviously, the above embodiments are merely examples given for clarity of description and do not limit the implementations. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious changes or variations derived from this description remain within the protection scope of the invention.

Claims (22)

1. A method for determining an interest region in a geographic range, characterized by comprising:
obtaining information points in a geographic range;
selecting, from the information points, multiple information points that contain the same keyword and whose number is greater than a first preset threshold;
performing distance calculations on the multiple information points, determining an information point located at the center, denoted a, and determining from the results of the distance calculations the information point b with the largest distance from information point a;
segmenting the distance between information point a and information point b, and screening out the information points c in each segment whose number is greater than a second preset threshold;
counting the information points containing the same keyword around each screened information point c, and retaining the information points c for which that count is greater than or equal to a third preset threshold;
determining the interest region from the retained information points c whose count is greater than or equal to the third preset threshold.
2. The method according to claim 1, characterized in that performing distance calculations on the multiple information points, determining an information point located at the center, denoted a, and determining from the results of the distance calculations the information point b with the largest distance from information point a comprises:
calculating, for each information point, the sum of its distances to all other information points, and determining the information point a with the smallest sum of distances;
calculating the distance between information point a and each other information point, and determining the information point b with the largest distance value.
3. The method according to claim 2, characterized in that segmenting the distance between information point a and information point b and screening out the information points c in each segment whose number is greater than the second preset threshold comprises:
segmenting the maximum distance value;
counting the information points in each segment;
screening out the information points in segments whose count is greater than the second preset threshold, denoted information points c.
4. The method according to claim 3, characterized in that segmenting the maximum distance value comprises:
determining the number of segments;
dividing the maximum distance value equally according to the number of segments.
5. The method according to claim 1, characterized in that counting the information points containing the same keyword around each screened information point c and retaining the information points c for which that count is greater than or equal to the third preset threshold comprises:
calculating the average distance Rd between the screened information points c;
counting, for each screened information point, the information points containing the same keyword within a range of N*Rd around it;
removing the information points c for which the number of information points containing the same keyword within the surrounding range of N*Rd is less than the third preset threshold.
6. A point of interest screening method, characterized by comprising:
determining an interest region using the method for determining an interest region in a geographic range according to any one of claims 1-5;
screening out at least one information point in the interest region according to the information contained in the information points, the at least one screened information point being a point of interest.
7. The point of interest screening method according to claim 6, characterized in that screening out at least one information point in the interest region according to the information contained in the information points comprises:
determining the feature values of all information points in the interest region;
screening the information points using a support vector machine classification model with the feature values of the information points as input.
8. The point of interest screening method according to claim 7, characterized in that the feature values include a first feature value, a second feature value, and a third feature value,
wherein the first feature value is calculated from the citation weights corresponding to the types of citation the information point receives and the citation counts;
the second feature value is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value;
the third feature value is calculated from the application weights corresponding to the applications users use to submit the information point and the submission counts.
9. The point of interest screening method according to claim 8, characterized in that the types of citation include citation by another information point at a different geographic location, citation by a public-transport information point, and citation by another information point at the same geographic location.
10. The point of interest screening method according to any one of claims 7-9, characterized in that the feature values of the sample data of the support vector machine classification model are all greater than a preset feature threshold, the preset feature threshold including the average feature threshold of all information points to be screened in the interest region and/or the average feature threshold of information points to be screened of the same category in the interest region.
11. The point of interest screening method according to any one of claims 7-9, characterized in that, after screening out at least one information point in the interest region according to the information contained in the information points, the method further comprises:
selecting multiple information points of the same type;
sorting the multiple information points of the same type by the feature values from high to low, and then retaining at least one top-ranked information point.
12. A device for determining an interest region in a geographic range, characterized by comprising:
an acquiring unit, configured to obtain information points in a geographic range;
a selection unit, configured to select, from the information points, multiple information points that contain the same keyword and whose number is greater than a first preset threshold;
an information point determination unit, configured to perform distance calculations on the multiple information points, determine an information point located at the center, denoted a, and determine from the results of the distance calculations the information point b with the largest distance from information point a;
a first screening unit, configured to segment the distance between information point a and information point b and screen out the information points c in each segment whose number is greater than a second preset threshold;
a second screening unit, configured to count the information points containing the same keyword around each screened information point c, and retain the information points c for which that count is greater than or equal to a third preset threshold;
a region determination unit, configured to determine the interest region from the retained information points c whose count is greater than or equal to the third preset threshold.
13. The device according to claim 12, characterized in that the information point determination unit includes:
a center point determination unit, configured to calculate, for each information point, the sum of its distances to all other information points, and determine the information point a with the smallest sum of distances;
a farthest point determination unit, configured to calculate the distance between information point a and each other information point, and determine the information point b with the largest distance value.
14. The device according to claim 13, characterized in that the first screening unit includes:
a segmenting unit, configured to segment the maximum distance value;
a first counting unit, configured to count the information points in each segment;
a sub-screening unit, configured to screen out the information points in segments whose count is greater than the second preset threshold, denoted information points c.
15. The device according to claim 14, characterized in that the segmenting unit includes:
a segment number determination unit, configured to determine the number of segments;
an equal division unit, configured to divide the maximum distance value equally according to the number of segments.
16. The device according to claim 12, characterized in that the second screening unit includes:
an average distance calculation unit, configured to calculate the average distance Rd between the screened information points c;
a second counting unit, configured to count, for each screened information point, the information points containing the same keyword within a range of N*Rd around it;
a removal unit, configured to remove the information points c for which the number of information points containing the same keyword within the surrounding range of N*Rd is less than the third preset threshold.
17. A point of interest screening device, characterized by comprising:
an interest region determination unit, configured to determine an interest region using the method for determining an interest region in a geographic range according to any one of claims 1-5;
a screening unit, configured to screen out at least one information point in the interest region according to the information contained in the information points, the at least one screened information point being a point of interest.
18. The device according to claim 17, characterized in that the screening unit includes:
a feature value determination subunit, configured to determine the feature values of all information points in the interest region;
a classification subunit, configured to screen the information points using a support vector machine classification model with the feature values of the information points as input.
19. The device according to claim 18, characterized in that the feature values include a first feature value, a second feature value, and a third feature value, wherein
the first feature value is calculated from the citation weights corresponding to the types of citation the information point receives and the citation counts;
the second feature value is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value;
the third feature value is calculated from the application weights corresponding to the applications users use to submit the information point and the submission counts.
20. The device according to claim 19, characterized in that the types of citation include citation by another information point at a different geographic location, citation by a public-transport information point, and citation by another information point at the same geographic location.
21. The device according to any one of claims 18-20, characterized in that the feature values of the sample data of the support vector machine classification model are all greater than a preset feature threshold, the preset feature threshold including the average feature threshold of all information points to be screened in the interest region and/or the average feature threshold of information points to be screened of the same category in the interest region.
22. The device according to any one of claims 18-20, characterized by further comprising:
a selection unit, configured to select multiple information points of the same type;
a removal unit, configured to sort the multiple information points of the same type by the feature values from high to low, and then retain at least one top-ranked information point.
CN201510746851.5A 2015-11-05 2015-11-05 The method and apparatus for determining interest region in geographic range, point of interest Active CN106681996B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510746851.5A CN106681996B (en) 2015-11-05 2015-11-05 The method and apparatus for determining interest region in geographic range, point of interest

Publications (2)

Publication Number Publication Date
CN106681996A CN106681996A (en) 2017-05-17
CN106681996B true CN106681996B (en) 2019-03-26

Family

ID=58857525

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510746851.5A Active CN106681996B (en) 2015-11-05 2015-11-05 The method and apparatus for determining interest region in geographic range, point of interest

Country Status (1)

Country Link
CN (1) CN106681996B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109960778B (en) * 2017-12-26 2023-06-27 北京金风慧能技术有限公司 Method and device for calculating theoretical power of wind power plant
CN108520070B (en) * 2018-04-13 2022-08-02 百度在线网络技术(北京)有限公司 Method and device for screening interest points of electronic map
CN108875013B (en) * 2018-06-19 2022-05-27 百度在线网络技术(北京)有限公司 Method and device for processing map data
CN108875032B (en) * 2018-06-25 2022-02-25 讯飞智元信息科技有限公司 Region type determination method and device
CN110889048B (en) * 2018-08-20 2022-09-09 阿里巴巴(中国)有限公司 Map data query method, system, terminal and server
CN109710709A (en) * 2018-12-13 2019-05-03 北京百度网讯科技有限公司 Interest point data processing method, device, electronic equipment and storage medium
CN110348525B (en) * 2019-07-15 2022-02-22 北京百度网讯科技有限公司 Map interest point acquisition method, device, equipment and storage medium
CN112711719B (en) * 2019-10-25 2024-06-04 北京搜狗科技发展有限公司 Point-of-interest searching method and device and readable storage medium
CN111552750B (en) * 2020-04-13 2023-05-30 深圳震有科技股份有限公司 Dynamic tracking display method, terminal and storage medium
CN111523061B (en) * 2020-04-23 2023-03-21 北京百度网讯科技有限公司 Method and apparatus for generating interest plane
CN111797184A (en) * 2020-05-29 2020-10-20 北京百度网讯科技有限公司 Information display method, device, equipment and medium
CN111797183A (en) * 2020-05-29 2020-10-20 汉海信息技术(上海)有限公司 Method and device for mining road attribute of information point and electronic equipment
CN114077979B (en) * 2020-08-18 2024-05-28 北京三快在线科技有限公司 Method and device for determining distribution service range
CN112132460A (en) * 2020-09-22 2020-12-25 京东城市(北京)数字科技有限公司 Method, device and system for identifying potential danger area and storage medium
CN114363796A (en) * 2020-09-30 2022-04-15 中移(成都)信息通信科技有限公司 Livestock outlier judgment method and device, electronic equipment and computer storage medium
CN113626729B (en) * 2021-07-30 2024-04-16 高德软件有限公司 Method and equipment for determining interest point information

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010040400A1 (en) * 2008-10-08 2010-04-15 Tomtom International B.V. Navigation apparatus and method of providing points of interest
CN102176206A (en) * 2011-01-18 2011-09-07 宇龙计算机通信科技(深圳)有限公司 Periphery searching method and device of points of interest
CN103167404A (en) * 2011-12-14 2013-06-19 北京千橡网景科技发展有限公司 Method and device used for confirming interest points
CN103533501A (en) * 2013-10-15 2014-01-22 厦门雅迅网络股份有限公司 Geofence generating method
US20140071170A1 (en) * 2011-05-11 2014-03-13 Nokia Corporation Non-uniformly scaling a map for emphasizing areas of interest
CN104102637A (en) * 2013-04-02 2014-10-15 高德软件有限公司 Method and device for generating hot spot region

Also Published As

Publication number Publication date
CN106681996A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
CN106681996B (en) The method and apparatus for determining interest region in geographic range, point of interest
Miah et al. A big data analytics method for tourist behaviour analysis
Liu et al. Classifying urban land use by integrating remote sensing and social media data
CN107291888B (en) Machine learning statistical model-based living recommendation system method near living hotel
Zheng et al. Mining interesting locations and travel sequences from GPS trajectories
Huang et al. Road centreline extraction from high‐resolution imagery based on multiscale structural features and support vector machines
CN103795613B (en) Method for predicting friend relationships in online social network
US11966424B2 (en) Method and apparatus for dividing region, storage medium, and electronic device
CN102163214B (en) Numerical map generation device and method thereof
CN112861972B (en) Site selection method and device for exhibition area, computer equipment and medium
CN104182517A (en) Data processing method and data processing device
CN111401692A (en) Method for measuring urban space function compactness
Deng et al. A density-based approach for detecting network-constrained clusters in spatial point events
CN110647607A (en) POI data verification method and device based on picture identification
CN104516980B (en) The output method and server system of search result
Abbruzzo et al. A pre-processing and network analysis of GPS tracking data
CN117217872A (en) Method for intelligently generating scenic spot playing scheme based on tourist portrait
CN113590940A (en) Article generation method and device based on knowledge graph
Ganguly et al. Optimization of spatial statistical approaches to identify land use/land cover change hot spots of Pune region of Maharashtra using remote sensing and GIS techniques
JP4906705B2 (en) Method and apparatus for automatically identifying a region of interest in a digital map
CN107437367A (en) One kind mark system of selection and device
CN113704373B (en) User identification method, device and storage medium based on movement track data
Ensari et al. Web scraping and mapping urban data to support urban design decisions
CN110263250A (en) A kind of generation method and device of recommended models
Shende et al. Analyzing changes in travel patterns due to Covid-19 using Twitter data in India

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210930

Address after: 35th Floor, Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 518057

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: 518000 Room 403, East Building 2, SEG Science Park, Zhenxing Road, Shenzhen, Guangdong

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.