CN106681996B - The method and apparatus for determining interest region in geographic range, point of interest - Google Patents
- Publication number
- CN106681996B (CN201510746851.5A)
- Authority
- CN
- China
- Prior art keywords
- information point
- information
- point
- value
- interest
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Remote Sensing (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides methods and apparatuses for determining an interest region and points of interest within a geographic range. The method for determining an interest region includes: obtaining all information points within the geographic range; selecting multiple information points that contain the same keyword and whose quantity is greater than a first preset threshold; calculating the distances between the multiple information points to determine the information point a located at the center and the information point b farthest from that central point; dividing the distance between information point a and information point b into sections and selecting the information points in each section whose point count is greater than a second preset threshold; counting, for each selected information point, the surrounding information points that contain the same keyword, and retaining the information points for which this count is greater than a third preset threshold; and determining the interest region from the retained information points.
Description
Technical field
The present invention relates to the field of data processing for electronic maps, and in particular to methods and apparatuses for determining interest regions and points of interest within a geographic range.
Background
Electronic map data usually marks geographic ranges, so a user can identify ranges such as a province, city, or district from the map. These existing geographic ranges are large, however, and usually contain multiple more specific areas. For example, the extent of Xicheng District in Beijing can be identified from map data, but Xicheng District also contains more specific areas such as the Xidan area, and the user cannot determine the extent of such an area.
Current interest region division methods either take some object in the map data (such as a road or an information point) as a basis and extend a certain distance around that object to form the interest region, or divide the map into multiple regions according to a preset range (such as a preset grid), where each preset region may be an interest region. Both approaches suit remote areas or areas with few information points. For areas such as cities, where information points are numerous and their arrangement is complex, existing interest region division methods have poor accuracy.
In addition, existing information point screening methods lack regional analysis. They usually screen directly by the popularity of information points within the geographic range, which leaves information points in non-hotspot areas underrepresented and lowers screening efficiency.
Summary of the invention
In view of this, the present invention provides a method for determining an interest region within a geographic range. The method includes: obtaining the information points within the geographic range; selecting, from these information points, multiple information points that contain the same keyword and whose quantity is greater than a first preset threshold; calculating the distances between the multiple information points, determining the information point located at the center, denoted a, and determining from the distance results the information point b with the largest distance from information point a; dividing the distance between information point a and information point b into sections and selecting the information points c in each section whose point count is greater than a second preset threshold; counting, for each selected information point c, the surrounding information points that contain the same keyword, and retaining each information point c whose count is greater than or equal to a third preset threshold; and determining the interest region from the retained information points c.
Correspondingly, the present invention provides an apparatus for determining an interest region within a geographic range, comprising: an acquiring unit for obtaining the information points within the geographic range; a selection unit for selecting, from these information points, multiple information points that contain the same keyword and whose quantity is greater than a first preset threshold; an information point determination unit for calculating the distances between the multiple information points, determining the information point located at the center, denoted a, and determining from the distance results the information point b with the largest distance from information point a; a first screening unit for dividing the distance between information point a and information point b into sections and selecting the information points c in each section whose point count is greater than a second preset threshold; a second screening unit for counting, for each selected information point c, the surrounding information points that contain the same keyword, and retaining each information point c whose count is greater than or equal to a third preset threshold; and a region determination unit for determining the interest region from the retained information points c.
In addition, the present invention provides a point of interest screening method. The method includes: determining an interest region using the above method for determining an interest region within a geographic range; and screening out at least one information point in the interest region according to the information that the information points contain, the at least one information point screened out being a point of interest.
Correspondingly, the present invention also provides a point of interest screening apparatus, comprising: an interest region determination unit for determining an interest region using the above method for determining an interest region within a geographic range; and a screening unit for screening out at least one information point in the interest region according to the information that the information points contain, the screened-out information point being a point of interest.
The methods and apparatuses provided by the embodiments of the present invention for determining interest regions and points of interest within a geographic range can divide map data into multiple interest regions and then screen out, within each interest region, the information points with relatively distinctive features and relatively high popularity. Such an information point can serve as a landmark of the interest region. The screened-out information points can be applied in many scenarios: for example, they can be used as standard information points in certain service applications, or highlighted in map data so that users can find information points more easily.
Brief description of the drawings
To explain the specific embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the method for determining an interest region within a geographic range according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the arrangement of the multiple selected information points;
Fig. 3 is a schematic diagram of processing the information points shown in Fig. 2;
Fig. 4 is a schematic diagram of further processing the result shown in Fig. 3;
Fig. 5 is a schematic diagram of the interest region determined after processing the information points shown in Fig. 4;
Fig. 6 is a flow chart of the point of interest screening method according to an embodiment of the present invention;
Fig. 7 is a structural diagram of the apparatus for determining an interest region within a geographic range according to an embodiment of the present invention;
Fig. 8 is a structural diagram of the point of interest screening apparatus according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
For brevity and clarity, the solution of the present invention is explained below through several representative embodiments. The many details in the embodiments serve only to aid understanding; the technical solution of the present invention is not limited to these details when implemented. To avoid unnecessarily obscuring the solution, some embodiments are not described in detail and only a framework is given. Hereinafter, "including" means "including but not limited to", and "according to ..." means "according to at least ..., but not limited to only ...". Where the quantity of an element is not specifically stated, the element may be one or more, or may be understood as at least one.
The method for determining an interest region within a geographic range, the information point screening method, and the corresponding apparatuses in the embodiments of the present invention may be implemented by a computing device capable of running the methods and software systems of the embodiments. The computing device may be a PC or a portable device such as a laptop, tablet computer, mobile phone, or smartphone. It may also be a server connected to such devices over a network.
The computing device may have different capabilities and features, and all possible implementations fall within the protection scope of this document. For example, the computing device may include a keypad or keyboard, and may also include a display, such as a liquid crystal display (LCD) or a display with advanced functions such as a touch-sensitive 2D or 3D display. In one example, a web-capable computing device may include one or more physical or virtual keyboards and a mass storage device.
The computing device may also include or run various operating systems, and may include or run various application programs, such as an encoding/decoding application. An application program may communicate with other devices over a network using encryption.
In addition, the computing device may include one or more processor-readable non-volatile storage media and one or more processors in communication with the storage media. For example, the processor-readable non-volatile storage medium may be RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of non-volatile storage medium. The storage medium may store a series of instructions, or units and/or modules containing instructions, for carrying out the operations of the various embodiments of the present invention. The processor may execute these instructions to complete the operations in the various embodiments.
An embodiment of the present invention provides a method for determining an interest region within a geographic range. As shown in Fig. 1, the method includes the following steps:
S1: obtain the information points within the geographic range. The geographic range may be an administrative division, such as Chaoyang District or Haidian District in Beijing, or another area with known boundaries, such as a prefecture or town. Information point data is existing, openly available data. Each information point record includes at least a name, a category, a latitude, a longitude, and nearby information points (reference information). Some information points also carry rich information: a hotel information point may include the hotel's rating (for example 0 to 5 stars), and a residential community information point may include the number of residents, housing prices, and similar data.
S2: select, from these information points, multiple information points that contain the same keyword and whose quantity is greater than a first preset threshold. Specifically, vocabulary recognition techniques can be used to identify keywords in the name and address of each information point and then determine which information points contain the same keyword. A keyword may be the name of an area that has no defined boundary, such as "Xidan" in Xicheng District or "Dongdan" in Dongcheng District, Beijing. For a given geographic range this usually produces multiple groups of information points, each group sharing one keyword. Some groups may contain few information points, which indicates that the keyword shared by that group is not widely recognized, so only groups with a large number of information points should be selected for subsequent processing. Take the geographic range "Xicheng District" in Beijing as an example, and assume that the number of information points containing the keyword "Xidan" is greater than the preset threshold. The arrangement of these information points on the map shows that some are relatively concentrated while others may be relatively scattered. Assume that step S2 selects the multiple information points shown in Fig. 2. From the arrangement in Fig. 2 it is visually apparent that information points P1 and P2 are the more scattered ones.
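The selection in step S2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout (dictionaries with `name` and `address` keys), the function name `group_by_keyword`, and the use of plain substring matching in place of the unspecified vocabulary recognition technique are all assumptions made for illustration.

```python
from collections import defaultdict


def group_by_keyword(points, keywords, first_threshold):
    """Group information points by keyword and keep only the large groups.

    points: list of dicts with "name" and "address" keys (hypothetical schema).
    keywords: candidate area names; substring matching stands in for the
              vocabulary recognition step, which the text does not specify.
    """
    groups = defaultdict(list)
    for point in points:
        text = point["name"] + " " + point["address"]
        for keyword in keywords:
            if keyword in text:
                groups[keyword].append(point)
    # Keep a group only if its size is greater than the first preset threshold.
    return {kw: pts for kw, pts in groups.items() if len(pts) > first_threshold}


points = [
    {"name": "Xidan Mall", "address": "Xicheng District"},
    {"name": "Bookstore", "address": "Xidan North Street"},
    {"name": "Xidan Station", "address": "Xicheng District"},
    {"name": "Dongdan Park", "address": "Dongcheng District"},
]
selected = group_by_keyword(points, ["Xidan", "Dongdan"], first_threshold=2)
```

With a threshold of 2, only the "Xidan" group (three points) survives; the single "Dongdan" point is treated as a keyword that is not recognized widely enough.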
Scattered information points would distort the final region division, so they must be identified among the selected information points and removed. A person skilled in the art will understand that there are many ways to decide whether multiple points are concentrated or scattered relative to each other; for example, points with large distance values can be discarded based on the distance between every pair of points. The process of removing scattered information points is described in detail below with reference to Fig. 3 and Fig. 4.
S3: calculate the distances between the multiple information points, determine the information point located at the center, denoted a, and determine from the distance results the information point b with the largest distance from information point a. Information point a is the central point of all the information points. A person skilled in the art will understand that there are many ways to find the central point among multiple points with fixed positions; it can be found by calculating the distance between each pair of points. In the situation shown in Fig. 3, information point P0 is located at the center and is denoted a. Information point P1 is farthest from P0, so P1 is the information point with the largest distance from information point a and is denoted b.
S4: divide the distance between information point a and information point b into sections, and select the information points c in each section whose point count is greater than a second preset threshold. As shown in Fig. 3, this embodiment divides the distance Rmax between information point a and information point b into 5 sections, which defines 5 distance intervals. The number of information points in each interval is then counted and checked: if the count is greater than the second preset threshold the points are retained, otherwise they are removed. Assuming the second preset threshold is 2, the interval containing information point b holds only that one information point, while every other interval holds 2 or more. Information point b, that is, P1 in Fig. 3, is therefore removed, and the other information points P0, P2, P3, P4, P5, P6, P7, P8, P9, and P10 are retained for now. The retained information points are denoted c.
S5: for each selected information point c, count the surrounding information points that contain the same keyword. A small count means that few nearby information points share the keyword and the arrangement around that point is scattered; a large count means the arrangement is concentrated. Each information point c whose count is greater than or equal to a third preset threshold is retained. "Surrounding" here refers to a range, for example a radius; in practice a range is set and the information points containing the keyword are counted within it. Assume the range is N*Rd (where Rd is the average distance between the information points and N is a coefficient) and the third preset threshold is 1. Only information point P2 has 0 information points around it, so P2 is also removed, and the other qualifying information points P0, P3, P4, P5, P6, P7, P8, P9, and P10 are retained.
S6: determine the interest region from the retained information points c whose count is greater than or equal to the third preset threshold. After steps S2 to S5, information points P1 and P2 have been removed. The retained information points P0, P3, P4, P5, P6, P7, P8, P9, and P10 can then be used, for example with a convex hull algorithm, to construct the minimum circumscribed convex polygon region. Fig. 5 shows the interest region determined in this way, which contains multiple densely arranged information points.
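As an illustration of step S6, the sketch below builds a convex polygon around the retained points. The text names a convex hull algorithm without specifying one; Andrew's monotone chain is used here as one common choice. Planar (x, y) coordinates and the function name `convex_hull` are assumptions for illustration.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical retained information points in planar coordinates.
retained = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
region = convex_hull(retained)
```

The interior point (1, 1) and the edge point (1, 0) are dropped, leaving the four corners as the boundary of the interest region.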
The method for determining an interest region within a geographic range according to this embodiment obtains all information points within the geographic range and selects those that contain the same keyword, which yields groups of information points sharing a keyword. Groups whose quantity is greater than the preset threshold are then processed to remove the scattered information points, and the interest region is finally determined from the retained information points. In this way a more specific interest region, whose information points all contain the same keyword, can be determined within the geographic range, which improves the accuracy of the interest region's extent.
In a preferred implementation, step S3 may include the following sub-steps:
S31: for each information point, calculate the sum of its distances to all other information points (preferably the straight-line distance, although the route distance on the map may also be used; straight-line distance is used as the example below), and determine the information point a with the smallest sum of distances.
S32: calculate the distance from each information point other than a to information point a, and determine the information point b with the largest distance. As shown in Fig. 3, the maximum distance is Rmax (the distance between information points P0 and P1).
By calculating, for each information point, the sum of its distances to all other information points, this preferred implementation finds the central point among the multiple information points more accurately, and also finds the information point farthest from the central point.
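Sub-steps S31 and S32 can be sketched as follows. This is a minimal illustration that assumes planar (x, y) coordinates and straight-line distance; real latitude and longitude data would need a geodesic distance instead. The function name `find_center_and_farthest` is hypothetical.

```python
import math


def find_center_and_farthest(coords):
    """Return (index of a, index of b, Rmax) for a list of (x, y) coordinates."""

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # S31: point a minimises the sum of distances to all other points.
    sums = [sum(dist(p, q) for q in coords) for p in coords]
    a = min(range(len(coords)), key=sums.__getitem__)

    # S32: point b is the point farthest from a.
    b = max(range(len(coords)), key=lambda i: dist(coords[a], coords[i]))
    return a, b, dist(coords[a], coords[b])


# Five clustered points and one outlier (hypothetical data).
coords = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (10, 0)]
a, b, r_max = find_center_and_farthest(coords)
```

Here the origin has the smallest distance sum and becomes a, and the outlier at (10, 0) becomes b. The double loop is quadratic in the number of points, which matches the pairwise calculation described in the text.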
In a preferred implementation, step S4 may include the following sub-steps:
S41: segment the maximum distance Rmax. Preferably, equal division is used: a number of sections N is chosen and Rmax is divided into N sections with boundaries Ri = (i/N)*Rmax. A person skilled in the art will understand that other segmentation methods, such as golden-section division, are also feasible.
S42: count the information points in each section, that is, the number x1 of information points whose distance lies in [0, R1], the number x2 in [R1, R2], the number x3 in [R2, R3], and so on up to the number xN in [R(N-1), Rmax]. The n-th section can be expressed as [Rmax/N*(n-1), Rmax/N*n], where n is a positive integer with 1 ≤ n ≤ N and N is a positive integer greater than 1. For example, with N = 5 this gives the number X1 of information points in [0, Rmax/5], X2 in [Rmax/5, Rmax/5*2], X3 in [Rmax/5*2, Rmax/5*3], X4 in [Rmax/5*3, Rmax/5*4], and X5 in [Rmax/5*4, Rmax];
S43: select the information points in the sections whose count is greater than the second preset threshold. For example, check whether each of X1 to X5 is greater than the preset threshold. Assuming the second preset threshold is 2, the result shows that information point P1 is removed and the other information points are retained; that is, information points whose distance lies in [0, Rmax/5*4] are retained and those in [Rmax/5*4, Rmax] are removed.
This preferred implementation segments the maximum distance, checks the number of information points in each section in turn, and retains the information points in sections whose count is greater than the preset threshold, which removes the relatively scattered information points.
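Sub-steps S41 to S43 can be sketched as follows, assuming equal division. Two choices here are assumptions rather than statements from the text: sections are treated as half-open intervals so that a point on a boundary is counted once, and the comparison is strictly "greater than", following the wording of the claim (the worked example above appears to keep sections whose count equals the threshold). The function name `filter_by_segment` is hypothetical.

```python
def filter_by_segment(distances, r_max, n_segments, second_threshold):
    """Keep the indices of points whose distance section is populated enough.

    distances[i] is the distance from point i to the central point a.
    """

    def segment_of(d):
        # Section k covers [k*r_max/N, (k+1)*r_max/N); d == r_max falls in the last one.
        return min(int(d / r_max * n_segments), n_segments - 1)

    # S42: count the points in each section.
    counts = [0] * n_segments
    for d in distances:
        counts[segment_of(d)] += 1

    # S43: keep points in sections holding more than second_threshold points.
    kept = [i for i, d in enumerate(distances)
            if counts[segment_of(d)] > second_threshold]
    return kept, counts


# Hypothetical distances to point a; the last point is b at distance Rmax.
distances = [0.0, 1.0, 1.5, 2.5, 3.0, 4.5, 5.0, 10.0]
kept, counts = filter_by_segment(distances, r_max=10.0, n_segments=5,
                                 second_threshold=1)
```

In this example the outermost section contains only point b, so b (index 7) is dropped and the remaining seven points are kept.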
In a preferred implementation, step S5 may include the following sub-steps:
S51: calculate the average distance Rd between the selected information points;
S52: for each selected information point, count the information points within a range of N*Rd around it that contain the same keyword. A small count means that few nearby information points share the keyword and the arrangement around that point is scattered; a large count means the arrangement is concentrated;
S53: remove the information points for which the number of information points containing the same keyword within the surrounding N*Rd range is less than the third preset threshold. As shown in Fig. 4, if the preset threshold is 1, only information point P2 has 0 information points within its N*Rd range, so P2 is removed and the other qualifying information points are retained.
This preferred implementation first calculates the average distance Rd between the information points, and then uses Rd and the coefficient N to judge whether enough information points exist within the N*Rd range of each information point. This further removes scattered information points, so that the information points finally retained are arranged densely enough.
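Sub-steps S51 to S53 can be sketched as follows. The sketch assumes planar coordinates, assumes all input points already share the same keyword, and takes Rd to be the mean of all pairwise distances, which is one possible reading of "average distance". The coefficient value 0.5 and the function name `filter_by_density` are hypothetical; the text does not give a value for N.

```python
import math
from itertools import combinations


def filter_by_density(coords, n_coeff, third_threshold):
    """Drop points with fewer than third_threshold neighbours within n_coeff * Rd."""

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # S51: Rd is taken here as the mean pairwise distance (an assumption).
    pairs = list(combinations(coords, 2))
    rd = sum(dist(p, q) for p, q in pairs) / len(pairs)
    radius = n_coeff * rd

    kept = []
    for i, p in enumerate(coords):
        # S52: count the other points inside the search radius.
        neighbours = sum(1 for j, q in enumerate(coords)
                         if j != i and dist(p, q) <= radius)
        # S53: keep the point only if it has enough neighbours.
        if neighbours >= third_threshold:
            kept.append(p)
    return kept, rd


# Four clustered points and one isolated point (hypothetical data).
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (8, 8)]
kept, rd = filter_by_density(coords, n_coeff=0.5, third_threshold=1)
```

The isolated point at (8, 8) has no neighbour within the radius and is removed, mirroring the removal of P2 in Fig. 4.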
Another embodiment of the present invention provides a point of interest screening method. As shown in Fig. 6, the method includes the following steps:
S'1: determine an interest region using the method for determining an interest region within a geographic range described in the previous embodiment. An interest region usually contains many information points;
S'2: screen out at least one information point in the interest region according to the information that the information points contain. The at least one information point screened out is a point of interest (POI), which has higher popularity or more distinctive features than the other information points in the interest region. A person skilled in the art will understand that there are many ways to screen information points; for example, they can be ranked by popularity (the number of times users query, select, or submit them in various application systems) and the more popular ones selected.
The point of interest screening method according to this embodiment can divide map data into multiple interest regions and then screen out, within each interest region, the information points with relatively distinctive features and relatively high popularity. A screened-out information point can serve as a landmark of the interest region, that is, as a point of interest. The screened-out information points can be applied in many scenarios: for example, they can be used as standard information points in certain service applications, or highlighted in map data so that users can find information points more easily.
In a preferred implementation, step S'2 of this embodiment may specifically include:
S'21: determine the feature values of all information points in the interest region. A person skilled in the art will understand that there are many ways to determine the feature value of an information point, for example from factors such as how much attention the information point receives; existing feature value calculation methods are all feasible;
S'22: screen the information points with a support vector machine classification model, using the feature values of the information points as input. A support vector machine (SVM) is a supervised learning model with associated learning algorithms that analyzes data and recognizes patterns, and is used for classification and regression analysis. The support vector machine model can be obtained by training on given sample data. The sample data in the present invention can be divided into two classes: qualifying target samples and non-qualifying non-target samples. The trained support vector machine model can then judge whether a given information point is a target information point.
This embodiment preferably screens information points using multiple feature values; that is, the feature values of an information point include a first feature value, a second feature value, and a third feature value. The first feature value is calculated as follows:
The types and numbers of references to the information point are determined from the address information of information points, a weight is assigned to each reference type, and the first feature value of the information point is then calculated from the reference type weights and the reference counts. Preferably, the reference types include at least three classes:
The first type is a reference by another information point at a different geographic location. For example, take two information points: information point 1, Haidian Bridge, and information point 2, China Technology Exchange Building. The address information of information point 2 contains "China Technology Exchange Building, east of Haidian Bridge", so information point 1 (Haidian Bridge) is referenced once by information point 2 (China Technology Exchange Building). This type can be called an external reference;
The second type is a reference by a public transport information point. For example, take information point 1, Haidian Bridge, and information point 2, Haidian Bridge East bus stop. Information point 1 (Haidian Bridge) is referenced once by information point 2 (Haidian Bridge East bus stop). This type can be called a public transport reference;
The third type is a reference by another information point at the same geographic location. For example, take two information points: information point 1, China Technology Exchange Building, and information point 2, Beta Coffee. The address information of information point 2 contains "Beta Coffee, floor B1, China Technology Exchange Building", so information point 1 (China Technology Exchange Building) is referenced once by information point 2 (Beta Coffee). This type can be called an internal reference.
The types and numbers of references may differ for each information point, and some information points may receive several types of reference at once, so the first feature value can be calculated with the following formula:
Y1 = A*X1 + B*X2 + C*X3,
where Y1 is the first feature value, A, B, and C are the weights of the three reference types, and X1, X2, and X3 are the reference counts of the corresponding types. A, B, and C may take any values; the weights reflect the importance of each reference type, for example A > B > C or A = B = C. A person skilled in the art will understand that this formula is only a specific example given to illustrate the meaning of the first feature value. In practice, a simpler or more complex algorithm may be used to calculate the first feature value of an information point from the reference type weights and reference counts.
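The weighted sum Y1 = A*X1 + B*X2 + C*X3 can be written directly in Python. The default weights (3, 2, 1) are hypothetical values chosen only to satisfy A > B > C; the text leaves the weights open. The function name and argument names are likewise illustrative.

```python
def first_feature_value(external_refs, transit_refs, internal_refs,
                        weights=(3.0, 2.0, 1.0)):
    """Y1 = A*X1 + B*X2 + C*X3 for one information point.

    external_refs, transit_refs, internal_refs: reference counts X1, X2, X3.
    weights: (A, B, C); the defaults are hypothetical, not from the text.
    """
    a, b, c = weights
    return a * external_refs + b * transit_refs + c * internal_refs


# An information point referenced 4 times externally, 2 times by public
# transport points, and 5 times internally (hypothetical counts).
y1 = first_feature_value(external_refs=4, transit_refs=2, internal_refs=5)
```

With these counts and weights the result is 3*4 + 2*2 + 1*5 = 21.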
The method of calculating the above second feature value includes:
Calculating the second feature value of an information point from the numerical information the information point contains together with a preset maximum value and a preset minimum value. Those skilled in the art will understand that, besides name information, address information and reference information, an information point also carries rich information, which differs between types of information point: hotels, hospitals, scenic spots and government agencies have corresponding grade information; residential points have information such as occupancy and price; and food-and-beverage points have information such as the number of reviews or a review score. This rich information is numerical. In practical applications, the numerical information can be extracted from the information point, and the maximum and minimum values of that numerical information are then determined according to the type of the information point. For a hotel information point, for example, the maximum value may be 5 and the minimum value 0, and the second feature value can be determined from the ratio of the hotel's actual grade to the maximum and minimum values. Other types of information point can be handled in the same way. Those skilled in the art will understand that this way of calculating the second feature value is merely a specific embodiment given to illustrate its physical meaning; in practical applications a simpler or more complex calculation may also be used.
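The text does not spell out how the ratio is formed. One plausible reading, shown here purely as an assumption, is min-max normalization of the numerical value onto the range 0 to 1. The function name and the clamping of out-of-range values are hypothetical choices.

```python
def second_feature_value(value, min_value, max_value):
    """Min-max normalization of a numeric attribute (an assumed reading
    of the 'ratio' described in the text)."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    # Clamp so that out-of-range inputs still map into [0, 1].
    clamped = min(max(value, min_value), max_value)
    return (clamped - min_value) / (max_value - min_value)


# Hotel example from the text: grades range from 0 to 5; a 4-star hotel.
hotel_score = second_feature_value(4, 0, 5)
```

Normalizing per category keeps grades, prices and review scores on a comparable scale before they are fed to a classifier.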
The method of calculating the above third feature value includes:
According to the operations by which users select an information point in different application programs, determining the application programs used and the corresponding number of selections, assigning a weight to each application program, and then calculating the third feature value of the information point from the weights and the numbers of selections. For example, for a given information point, users select it X1 times using application program 1, X2 times using application program 2, ..., and Xn times using application program n. The third feature value can then be calculated with the following formula:
Y3 = A*X1 + B*X2 + ... + N*Xn,
where Y3 is the third feature value and A, B, ..., N are the weights of the n application programs. The weights may take arbitrary values, and a higher weight may be assigned to a specific application program. Those skilled in the art will understand that the above formula is merely a specific embodiment given to illustrate the physical meaning of the third feature value; in practical applications, the third feature value of an information point may also be calculated from the application-type weights and selection counts by a simpler or more complex algorithm.
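The per-application weighted sum can be sketched with dictionaries keyed by application name. The application names, the weights and the default weight for unlisted applications are all hypothetical; the text only states that each application program is assigned a weight.

```python
def third_feature_value(selections, app_weights, default_weight=1.0):
    """Y3 = sum over applications of (application weight * selection count).

    selections:  mapping of application name -> number of times users
                 selected the information point in that application.
    app_weights: mapping of application name -> weight (hypothetical).
    """
    return sum(
        app_weights.get(app, default_weight) * count
        for app, count in selections.items()
    )


# Hypothetical data: the ride-hailing app is given a higher weight.
selections = {"maps_app": 10, "ride_app": 3}
app_weights = {"maps_app": 1.0, "ride_app": 2.0}
y3 = third_feature_value(selections, app_weights)
```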
The above preferred scheme is based on how information points are referenced, how users submit them, and the rich information they carry, and screens the information points with a machine learning model. This makes the screening more targeted and improves screening efficiency.
Those skilled in the art will understand that a support vector machine classification model is obtained by repeated training on a large amount of sample data. To improve the classification performance of the model, the feature values (the first, second and third feature values) of the training samples that serve as target information points are all greater than a preset feature threshold. The preset feature threshold includes the average feature threshold of all information points in the interest region and/or the average feature threshold of information points of the same category in the interest region.
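The selection of positive training samples described above can be sketched as follows. This assumes that the "average feature threshold" is the per-feature mean over all information points in the interest region, which the text does not state explicitly. The function name and sample data are hypothetical, and the SVM training itself is outside this sketch.

```python
from statistics import mean


def select_positive_samples(points):
    """Return names of points whose every feature value exceeds the
    regional mean of that feature (assumed meaning of the threshold).

    points: list of (name, (y1, y2, y3)) tuples for one interest region.
    """
    if not points:
        return []
    n_features = len(points[0][1])
    thresholds = [mean(f[i] for _, f in points) for i in range(n_features)]
    return [
        name
        for name, features in points
        if all(v > t for v, t in zip(features, thresholds))
    ]


# Hypothetical region with three information points.
region = [
    ("mall_a", (20.0, 0.9, 30.0)),
    ("kiosk", (1.0, 0.2, 2.0)),
    ("cafe", (3.0, 0.4, 4.0)),
]
positives = select_positive_samples(region)
```

To use the same-category variant mentioned in the text, the same function could be called once per category instead of once per region.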
An interest region may contain many well-known information points. The Xidan area, for example, has many shopping-mall information points, and when the above model is used for discrimination, many of these malls may all be judged to be target information points, which does not match users' intuitive perception of popularity. Within a given area, users can usually remember only the first few names among the information points of a given category, so an excessive number of target information points of one category within a local area is unreasonable. To screen the target information points further, the method may also include the following steps:
S'3: select multiple information points of the same type;
S'4: sort the information points of the same type by the above feature values from high to low, and then retain the top N target information points of that type. This makes the retained information points more recognizable and improves the user experience.
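Steps S'3 and S'4 amount to a per-category top-N selection. The sketch below assumes each target information point has already been reduced to a single ranking score; the text does not say how the three feature values are combined for sorting. Names and data are hypothetical.

```python
from collections import defaultdict


def top_n_per_type(points, n):
    """Keep the n highest-scoring points of each type.

    points: iterable of (name, type, score) tuples.
    Returns a dict mapping type -> list of names, best first.
    """
    by_type = defaultdict(list)
    for name, kind, score in points:
        by_type[kind].append((score, name))
    kept = {}
    for kind, scored in by_type.items():
        # Sort by score only, from high to low (step S'4).
        scored.sort(key=lambda item: item[0], reverse=True)
        kept[kind] = [name for _, name in scored[:n]]
    return kept


targets = [
    ("mall_a", "mall", 0.9),
    ("mall_b", "mall", 0.7),
    ("mall_c", "mall", 0.8),
    ("hotel_a", "hotel", 0.6),
]
kept = top_n_per_type(targets, 2)
```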
Another embodiment of the invention provides a device for determining an interest region in a geographic range. As shown in Fig. 7, the device includes:
Acquiring unit 71, for obtaining the information points in the geographic range;
Selection unit 72, for selecting from the above information points multiple information points that contain the same keyword and whose number is greater than a first preset threshold;
Information point determination unit 73, for performing distance calculation among the multiple information points to determine an information point located at the center, denoted a, and for determining from the results of the distance calculation the information point b with the largest distance value from information point a;
First screening unit 74, for segmenting the distance value between information point a and information point b and screening out the information points c in each segment whose number is greater than a second preset threshold;
Second screening unit 75, for calculating the number of information points containing the same keyword around each screened-out information point c, and retaining the information points c for which that number is greater than or equal to a third preset threshold;
Area determination unit 76, for determining the interest region according to the retained information points c for which the number is greater than or equal to the third preset threshold.
By obtaining all information points in a geographic range and screening out those that contain the same keyword, the above device identifies multiple information points that share a keyword. It then processes these information points according to whether their number exceeds the preset thresholds, removes the information points whose positions are scattered, and finally determines the interest region from the retained information points. A more specific interest region can thus be determined within the geographic range, the information points in the interest region all contain the same keyword, and the accuracy of the extent of the interest region is improved.
Preferably, the information point determination unit 73 includes:
Center point determination unit, for calculating, for each information point, the sum of the distances between it and all other information points, and determining the information point a with the smallest sum of distances;
Farthest point determination unit, for calculating the distance between information point a and each information point other than a, and determining the information point b with the largest distance value.
By calculating the sum of the distances between each information point and all other information points, the above preferred scheme can find the center point among the multiple information points more accurately, and can find the information point farthest from that center point.
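The center point and farthest point can be sketched as below. The sketch assumes information points are given as planar (x, y) coordinates and uses Euclidean distance; real longitude/latitude data would need a geodesic distance instead. The function name and sample points are hypothetical.

```python
import math


def center_and_farthest(points):
    """Return indices (a, b): a minimizes the sum of distances to all
    other points; b is the point farthest from a.

    points: list of (x, y) tuples with at least two entries.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")

    def total_distance(i):
        return sum(
            math.dist(points[i], q) for j, q in enumerate(points) if j != i
        )

    a = min(range(len(points)), key=total_distance)
    b = max(
        (j for j in range(len(points)) if j != a),
        key=lambda j: math.dist(points[a], points[j]),
    )
    return a, b


pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
a, b = center_and_farthest(pts)
```

The pairwise sums make this quadratic in the number of points, which is usually acceptable for the keyword-filtered subsets described in the text.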
Preferably, the first screening unit 74 includes:
Segmenting unit, for segmenting the maximum distance value;
First quantity calculation unit, for calculating the number of information points in each segment;
Sub-screening unit, for screening out the information points in segments whose number is greater than the second preset threshold, denoted information points c.
Preferably, the segmenting unit includes:
Segment number determination unit, for determining the number of segments;
Equal-division unit, for dividing the maximum distance value equally according to the number of segments.
The above preferred scheme segments the maximum distance value, checks the number of information points in each segment in turn, and retains the information points in segments whose number is greater than the preset threshold, thereby removing information points that are relatively scattered.
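One way to read the segmentation step, offered here as an assumption, is to bin each information point by its distance from center point a into equal-width segments of the maximum distance and keep only the points in well-populated segments. The function and parameter names are hypothetical, and planar Euclidean distance is assumed.

```python
import math


def filter_by_segment(points, center, max_distance, num_segments, min_count):
    """Keep points lying in distance segments that hold more than
    min_count points (min_count plays the role of the second threshold).

    points: list of (x, y) tuples, excluding the center itself.
    """
    if max_distance <= 0 or num_segments <= 0:
        raise ValueError("max_distance and num_segments must be positive")
    width = max_distance / num_segments

    def segment_of(p):
        # The farthest point sits exactly on the outer edge, so it is
        # assigned to the last segment.
        return min(int(math.dist(center, p) / width), num_segments - 1)

    counts = [0] * num_segments
    for p in points:
        counts[segment_of(p)] += 1
    return [p for p in points if counts[segment_of(p)] > min_count]


others = [(0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0), (10.0, 0.0)]
kept_c = filter_by_segment(others, (0.0, 0.0), 10.0, 5, 1)
```

With five segments of width 2, the first segment holds three points and every other segment holds at most one, so only the three nearest points survive.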
Preferably, the second screening unit 75 includes:
Average distance calculation unit, for calculating the average distance Rd between the screened-out information points c;
Second quantity calculation unit, for calculating, for each screened-out information point, the number of information points containing the same keyword within a range of N*Rd around it;
Removal unit, for removing the information points c for which the number of information points containing the same keyword within the surrounding range of N*Rd is less than the third preset threshold.
The above preferred scheme first calculates the average distance Rd between the information points and then, based on Rd and the coefficient N, judges whether enough information points exist within the range N*Rd of each information point. Sparsely distributed information points can thus be further removed, so that the information points finally retained are arranged densely enough.
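The density check can be sketched as follows, assuming Rd is the mean of all pairwise distances between the screened-out points (the text does not define the average more precisely) and that every point passed in already contains the same keyword. Names and data are hypothetical; planar Euclidean distance is assumed.

```python
import math
from itertools import combinations


def density_filter(points, n_factor, min_neighbors):
    """Keep points that have at least min_neighbors other points within
    a radius of n_factor * Rd, where Rd is the mean pairwise distance.

    points: list of (x, y) tuples sharing the same keyword.
    """
    if len(points) < 2:
        return []
    pairs = list(combinations(points, 2))
    rd = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    radius = n_factor * rd
    kept = []
    for i, p in enumerate(points):
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if neighbors >= min_neighbors:
            kept.append(p)
    return kept


cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (20.0, 20.0)]
dense = density_filter(cluster, 0.5, 2)
```

The isolated point at (20, 20) inflates Rd but still has no neighbor within the radius, so it is dropped while the four clustered points are retained.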
Another embodiment of the invention provides a point-of-interest screening device. As shown in Fig. 8, the device includes:
Interest region determination unit 81, for determining an interest region using the method for determining an interest region in a geographic range provided by the above embodiment;
Screening unit 82, for screening out at least one information point in the interest region according to the information contained in the information points, the screened-out information point being a point of interest.
The above point-of-interest screening device can divide map data into multiple interest regions and then screen out, within each interest region, the information points whose features are relatively distinctive and whose popularity is relatively high. Such information points can serve as landmark buildings of the interest region. The screened-out information points can be applied in many scenarios: for example, they can be used as standard information points in certain service applications, or highlighted in map data so that users can find them more easily.
Preferably, the above screening unit 82 may include:
Feature value determination subunit, for determining the feature values of all information points in the interest region;
Classification subunit, for screening the information points with a support vector machine classification model, using the feature values of the information points as input values.
Preferably, the feature values include a first feature value, a second feature value and a third feature value, wherein
the first feature value is calculated from the reference weight corresponding to each type of reference the information point receives and the number of such references;
the second feature value is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value;
the third feature value is calculated from the application weight corresponding to each application program used by users to submit the information point and the number of submissions.
Preferably, the types of reference include being referenced by another information point at a different geographic location, being referenced by a public-transport information point, and being referenced by another information point at the same geographic location.
The above preferred scheme is based on how information points are referenced, the rich information they carry, and how users select them in different scenarios, and screens the information points with a machine learning model. This makes the screening more targeted and improves screening efficiency.
Preferably, the feature values of the information points used as training samples that satisfy the screening condition of the support vector machine classification model are greater than a preset feature threshold. The preset feature threshold includes the average feature threshold of all information points to be screened in the interest region and/or the average feature threshold of the information points to be screened of the same category in the interest region. Such training samples can improve the classification performance of the support vector machine classification model.
Preferably, the above point-of-interest screening device may also include:
Selection unit 83, for selecting multiple information points of the same type;
Removal unit 84, for sorting the information points of the same type by feature value from high to low and then retaining at least one top-ranked information point. This preferred scheme makes the screened-out information points more recognizable and improves the user experience.
Obviously, the above embodiments are merely examples given for clarity of description and do not limit the implementations. Those of ordinary skill in the art can make other changes or variations of different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious changes or variations derived therefrom remain within the protection scope of the invention.
Claims (22)
1. A method for determining an interest region in a geographic range, characterized by comprising:
obtaining the information points in the geographic range;
selecting from the information points multiple information points that contain the same keyword and whose number is greater than a first preset threshold;
performing distance calculation among the multiple information points to determine an information point located at the center, denoted a, and determining from the results of the distance calculation the information point b with the largest distance value from information point a;
segmenting the distance value between information point a and information point b, and screening out the information points c in each segment whose number is greater than a second preset threshold;
calculating the number of information points containing the same keyword around each screened-out information point c, and retaining the information points c for which that number is greater than or equal to a third preset threshold;
determining the interest region according to the retained information points c for which the number is greater than or equal to the third preset threshold.
2. The method according to claim 1, characterized in that performing distance calculation among the multiple information points to determine an information point located at the center, denoted a, and determining from the results of the distance calculation the information point b with the largest distance value from information point a, comprises:
calculating, for each information point, the sum of the distances between it and all other information points, and determining the information point a with the smallest sum of distances;
calculating the distance between information point a and each information point other than a, and determining the information point b with the largest distance value.
3. The method according to claim 2, characterized in that segmenting the distance value between information point a and information point b and screening out the information points c in each segment whose number is greater than the second preset threshold comprises:
segmenting the maximum distance value;
calculating the number of information points in each segment;
screening out the information points in segments whose number is greater than the second preset threshold, denoted information points c.
4. The method according to claim 3, characterized in that segmenting the maximum distance value comprises:
determining the number of segments;
dividing the maximum distance value equally according to the number of segments.
5. The method according to claim 1, characterized in that calculating the number of information points containing the same keyword around each screened-out information point c, and retaining the information points c for which that number is greater than or equal to the third preset threshold, comprises:
calculating the average distance Rd between the screened-out information points c;
calculating, for each screened-out information point, the number of information points containing the same keyword within a range of N*Rd around it;
removing the information points c for which the number of information points containing the same keyword within the surrounding range of N*Rd is less than the third preset threshold.
6. A point-of-interest screening method, characterized by comprising:
determining an interest region using the method for determining an interest region in a geographic range according to any one of claims 1-5;
screening out at least one information point in the interest region according to the information contained in the information points, the at least one screened-out information point being a point of interest.
7. The point-of-interest screening method according to claim 6, characterized in that screening out at least one information point in the interest region according to the information contained in the information points comprises:
determining the feature values of all information points in the interest region;
screening the information points with a support vector machine classification model, using the feature values of the information points as input values.
8. The point-of-interest screening method according to claim 7, characterized in that the feature values include a first feature value, a second feature value and a third feature value,
wherein the first feature value is calculated from the reference weight corresponding to each type of reference the information point receives and the number of such references;
the second feature value is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value;
the third feature value is calculated from the application weight corresponding to each application program used by users to submit the information point and the number of submissions.
9. The point-of-interest screening method according to claim 8, characterized in that the types of reference include being referenced by another information point at a different geographic location, being referenced by a public-transport information point, and being referenced by another information point at the same geographic location.
10. The point-of-interest screening method according to any one of claims 7-9, characterized in that the feature values of the sample data of the support vector machine classification model are all greater than a preset feature threshold, the preset feature threshold including the average feature threshold of all information points to be screened in the interest region and/or the average feature threshold of the information points to be screened of the same category in the interest region.
11. The point-of-interest screening method according to any one of claims 7-9, characterized in that, after screening out at least one information point in the interest region according to the information contained in the information points, the method further comprises:
selecting multiple information points of the same type;
sorting the information points of the same type by the feature value from high to low, and then retaining at least one top-ranked information point.
12. A device for determining an interest region in a geographic range, characterized by comprising:
an acquiring unit, for obtaining the information points in the geographic range;
a selection unit, for selecting from the information points multiple information points that contain the same keyword and whose number is greater than a first preset threshold;
an information point determination unit, for performing distance calculation among the multiple information points to determine an information point located at the center, denoted a, and for determining from the results of the distance calculation the information point b with the largest distance value from information point a;
a first screening unit, for segmenting the distance value between information point a and information point b and screening out the information points c in each segment whose number is greater than a second preset threshold;
a second screening unit, for calculating the number of information points containing the same keyword around each screened-out information point c, and retaining the information points c for which that number is greater than or equal to a third preset threshold;
an area determination unit, for determining the interest region according to the retained information points c for which the number is greater than or equal to the third preset threshold.
13. The device according to claim 12, characterized in that the information point determination unit includes:
a center point determination unit, for calculating, for each information point, the sum of the distances between it and all other information points, and determining the information point a with the smallest sum of distances;
a farthest point determination unit, for calculating the distance between information point a and each information point other than a, and determining the information point b with the largest distance value.
14. The device according to claim 13, characterized in that the first screening unit includes:
a segmenting unit, for segmenting the maximum distance value;
a first quantity calculation unit, for calculating the number of information points in each segment;
a sub-screening unit, for screening out the information points in segments whose number is greater than the second preset threshold, denoted information points c.
15. The device according to claim 14, characterized in that the segmenting unit includes:
a segment number determination unit, for determining the number of segments;
an equal-division unit, for dividing the maximum distance value equally according to the number of segments.
16. The device according to claim 12, characterized in that the second screening unit includes:
an average distance calculation unit, for calculating the average distance Rd between the screened-out information points c;
a second quantity calculation unit, for calculating, for each screened-out information point, the number of information points containing the same keyword within a range of N*Rd around it;
a removal unit, for removing the information points c for which the number of information points containing the same keyword within the surrounding range of N*Rd is less than the third preset threshold.
17. A point-of-interest screening device, characterized by comprising:
an interest region determination unit, for determining an interest region using the method for determining an interest region in a geographic range according to any one of claims 1-5;
a screening unit, for screening out at least one information point in the interest region according to the information contained in the information points, the at least one screened-out information point being a point of interest.
18. The device according to claim 17, characterized in that the screening unit includes:
a feature value determination subunit, for determining the feature values of all information points in the interest region;
a classification subunit, for screening the information points with a support vector machine classification model, using the feature values of the information points as input values.
19. The device according to claim 18, characterized in that the feature values include a first feature value, a second feature value and a third feature value, wherein
the first feature value is calculated from the reference weight corresponding to each type of reference the information point receives and the number of such references;
the second feature value is calculated from the numerical information contained in the information point together with a preset maximum value and a preset minimum value;
the third feature value is calculated from the application weight corresponding to each application program used by users to submit the information point and the number of submissions.
20. The device according to claim 19, characterized in that the types of reference include being referenced by another information point at a different geographic location, being referenced by a public-transport information point, and being referenced by another information point at the same geographic location.
21. The device according to any one of claims 18-20, characterized in that the feature values of the sample data of the support vector machine classification model are all greater than a preset feature threshold, the preset feature threshold including the average feature threshold of all information points to be screened in the interest region and/or the average feature threshold of the information points to be screened of the same category in the interest region.
22. The device according to any one of claims 18-20, characterized by further comprising:
a selection unit, for selecting multiple information points of the same type;
a removal unit, for sorting the information points of the same type by the feature value from high to low, and then retaining at least one top-ranked information point.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510746851.5A CN106681996B (en) | 2015-11-05 | 2015-11-05 | The method and apparatus for determining interest region in geographic range, point of interest |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106681996A CN106681996A (en) | 2017-05-17 |
CN106681996B CN106681996B (en) | 2019-03-26 |
Family
ID=58857525
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510746851.5A Active CN106681996B (en) | 2015-11-05 | 2015-11-05 | The method and apparatus for determining interest region in geographic range, point of interest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106681996B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN106681996A (en) | 2017-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106681996B (en) | The method and apparatus for determining interest region in geographic range, point of interest | |
Miah et al. | A big data analytics method for tourist behaviour analysis | |
Liu et al. | Classifying urban land use by integrating remote sensing and social media data | |
CN107291888B (en) | Machine learning statistical model-based living recommendation system method near living hotel | |
Zheng et al. | Mining interesting locations and travel sequences from GPS trajectories | |
Huang et al. | Road centreline extraction from high‐resolution imagery based on multiscale structural features and support vector machines | |
CN103795613B (en) | Method for predicting friend relationships in online social network | |
US11966424B2 (en) | Method and apparatus for dividing region, storage medium, and electronic device | |
CN102163214B (en) | Numerical map generation device and method thereof | |
CN112861972B (en) | Site selection method and device for exhibition area, computer equipment and medium | |
CN104182517A (en) | Data processing method and data processing device | |
CN111401692A (en) | Method for measuring urban space function compactness | |
Deng et al. | A density-based approach for detecting network-constrained clusters in spatial point events | |
CN110647607A (en) | POI data verification method and device based on picture identification | |
CN104516980B (en) | The output method and server system of search result | |
Abbruzzo et al. | A pre-processing and network analysis of GPS tracking data | |
CN117217872A (en) | Method for intelligently generating scenic spot playing scheme based on tourist portrait | |
CN113590940A (en) | Article generation method and device based on knowledge graph | |
Ganguly et al. | Optimization of spatial statistical approaches to identify land use/land cover change hot spots of Pune region of Maharashtra using remote sensing and GIS techniques | |
JP4906705B2 (en) | Method and apparatus for automatically identifying a region of interest in a digital map | |
CN107437367A (en) | One kind mark system of selection and device | |
CN113704373B (en) | User identification method, device and storage medium based on movement track data | |
Ensari et al. | Web scraping and mapping urban data to support urban design decisions | |
CN110263250A (en) | A kind of generation method and device of recommended models | |
Shende et al. | Analyzing changes in travel patterns due to Covid-19 using Twitter data in India | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20210930; Address after: 518057 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35th Floor; Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.; Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.; Address before: 518000 Room 403, East Building 2, SEG Science Park, Zhenxing Road, Shenzhen, Guangdong; Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd. |