CN108011987B - IP address positioning method and device, electronic equipment and storage medium - Google Patents

IP address positioning method and device, electronic equipment and storage medium

Info

Publication number
CN108011987B
Authority
CN
China
Prior art keywords
geographic
area
geographical
sample
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710942850.7A
Other languages
Chinese (zh)
Other versions
CN108011987A (en
Inventor
胡潇
王程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN201710942850.7A priority Critical patent/CN108011987B/en
Publication of CN108011987A publication Critical patent/CN108011987A/en
Priority to PCT/CN2018/108010 priority patent/WO2019072092A1/en
Application granted granted Critical
Publication of CN108011987B publication Critical patent/CN108011987B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/60 Types of network addresses
    • H04L2101/69 Types of network addresses using geographic information, e.g. room number

Abstract

The application provides an IP address positioning method, which belongs to the field of IP address positioning and addresses the low positioning accuracy of prior-art IP address positioning methods. The method comprises: acquiring geographic position samples of an IP address to be positioned; aggregating the geographic positions in the geographic position samples to obtain at least one geographic area; aggregating the at least one geographic area again to determine an optimal geographic area; and determining the geographic position of the IP address to be positioned according to the geographic position samples in the optimal geographic area. By aggregating the geographic positions carried in a large number of geographic position samples, the method divides the samples into corresponding geographic areas, and then determines the geographic position of the IP address to be positioned from the coordinates of the samples in the geographic area with the largest distribution density, which effectively improves the accuracy of IP address positioning.

Description

IP address positioning method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of IP address location, and in particular, to a method and an apparatus for IP address location, an electronic device, and a storage medium.
Background
With the rapid development of the internet and mobile communication technology, the network application and the network service bring great convenience to the daily life of people, and the network positioning technology can be used for further providing high-quality service for users and expanding the service field. IP (Internet Protocol) address location is one of the important methods in network location technology. In the prior art, the IP address location technology mainly determines the location of a user through a GPS (global positioning system) location signal received by a mobile terminal or peripheral WIFI (a wireless network communication technology) and base station information. If the acquired GPS positioning signal or WIFI signal has large deviation or cannot be acquired, the geographic position corresponding to the determined IP address has large deviation.
Therefore, the IP address positioning method in the prior art at least has the problem of low IP address positioning accuracy.
Disclosure of Invention
The embodiment of the application provides an IP address positioning method, and solves the problem that the IP address positioning method in the prior art is low in positioning accuracy.
In order to solve the foregoing problem, in a first aspect, an embodiment of the present application provides an IP address positioning method, including:
obtaining a geographical location sample of an IP address to be located, wherein the geographical location sample at least comprises: geographic location and sample weight information;
aggregating the geographic positions in the geographic position sample to obtain at least one geographic area;
aggregating the at least one geographical area again to determine an optimal geographical area;
and determining the geographic position of the IP address to be positioned according to the geographic position sample in the optimal geographic area.
In a second aspect, an embodiment of the present application provides an IP address positioning apparatus, including:
a geographic location sample obtaining module, configured to obtain a geographic location sample of the IP address to be located, where the geographic location sample at least includes: geographic location and sample weight information;
the sample aggregation module is used for aggregating the geographic positions in the geographic position samples acquired by the geographic position sample acquisition module to obtain at least one geographic area;
the optimal geographic area determining module is used for aggregating at least one geographic area obtained by the sample aggregation module again to determine an optimal geographic area;
and the IP address positioning module is used for determining the geographic position of the IP address to be positioned according to the geographic position sample in the optimal geographic area.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the IP address location method according to the embodiment of the present application.
In a fourth aspect, the present application provides a storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the steps of the IP address location method according to the present application.
According to the IP address positioning method disclosed by the embodiment of the application, at least one geographical area is obtained by obtaining a geographical position sample of an IP address to be positioned and then aggregating geographical positions in the geographical position sample; aggregating the at least one geographical area again to determine an optimal geographical area; and finally, determining the geographic position of the IP address to be positioned according to the geographic position sample in the optimal geographic area, thereby solving the problem of low positioning accuracy of the IP address positioning method in the prior art. By acquiring a large number of geographical position samples and aggregating the geographical positions carried in the geographical position samples to divide the samples into corresponding geographical areas, and further determining the geographical position of the IP address to be positioned by combining the coordinates of the geographical position samples in the geographical area with the maximum distribution density, the processing efficiency of the massive samples is improved, and meanwhile, due to the adoption of the large number of samples and the selection of a proper sample as a reference, the accuracy of the positioning of the IP address is effectively improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of an IP address location method according to a first embodiment of the present application;
Fig. 2 is a schematic diagram of geographical area distribution in an IP address location method according to a second embodiment of the present application;
Fig. 3 is a schematic structural diagram of an IP address location apparatus according to a third embodiment of the present application;
Fig. 4 is a second schematic structural diagram of an IP address location device according to a third embodiment of the present application;
Fig. 5 is a third schematic structural diagram of an IP address location apparatus according to a third embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Example one
As shown in fig. 1, an IP address location method disclosed in the present application includes: step 100 to step 130.
Step 100, obtaining a geographical position sample of the IP address to be positioned.
The geographic position samples comprise at least: a geographic position and sample weight information. When a user logs in to an application program through a mobile terminal and searches for nearby take-out, nearby shopping malls, nearby restaurants and the like, the application program acquires the position information of the mobile terminal by calling a system interface of the mobile terminal, and the service request that the user initiates to the server through the application program includes the user's IP address and position information. The server of the application program stores a behavior log for each service request of the user, recording in it the time at which the service was accessed, the user's IP address, and geographic position information such as the longitude and latitude coordinates of the geographic position. Generally, the position information that the application program acquires through the system interface is obtained by positioning the mobile terminal according to its GPS positioning data and/or the nearby WIFI information it has scanned.
The server records the request of each user, and obtains a mass user historical behavior log containing the access time, the IP address and the geographic position corresponding to the IP address.
When a user accesses the application program through a computer or a mobile terminal, or accesses a website page, the website page or application program can obtain the IP address corresponding to the user's access request, that is, the user's IP address, together with the access time. Geographic position data is then extracted from the user historical behavior logs stored on the server, and geographic position samples are generated. In specific implementation, logs that record at least the access time, the IP address and the geographic position are selected, and the access time and geographic position are extracted from each log to generate a piece of geographic position data comprising the access time and the geographic position. Finally, a geographic position sample is generated from each piece of geographic position data. The geographic position sample comprises at least a geographic position and a sample weight, where the sample weight is determined from the relation between the access time in the geographic position data and the current time: the closer the access time is to the current time, the larger the sample weight.
Step 110, aggregating the geographic positions in the geographic position samples to obtain at least one geographic area.
For the geographic position samples of the IP address currently to be positioned, the samples are aggregated according to the geographic position coordinates they carry to obtain at least one geographic area. In specific implementation, the samples can be aggregated by performing spatial index coding on their coordinates and then assigning the samples that share the same spatial index code to the same geographic area. Under spatial index coding, samples whose coordinates are close together obtain the same spatial index code, so after coding the geographic position coordinates corresponding to each spatial index code form one geographic area.
In specific implementation, the geographic area can be divided according to administrative or commercial districts or other area division strategies, and then the geographic position sample is divided into different geographic areas according to the coordinate values.
Step 120, aggregating the at least one geographic area again to determine an optimal geographic area.
Each of the obtained geographic areas contains a large number of geographic position samples, and the area weight of each geographic area is determined from the samples it contains. For example, the sum of the weights of the geographic position samples in a geographic area is taken as its area weight; alternatively, the quotient of that sum and the sum of the weights of all geographic position samples of the IP address is taken as its area weight.
Preferably, when determining the area weight of a geographic area, the area weights of the surrounding geographic areas whose centers are less than a preset distance from the center of that geographic area are also taken into account. For example, the sum of the area weight of the geographic area and the area weights of the geographic areas adjacent to it is used as the final area weight of the geographic area.
In specific implementation, the factors for determining the region weight are determined according to specific requirements, and the action size of each factor in determining the region weight is also determined according to the specific requirements.
After determining the regional weight of each geographic region, the geographic region with the largest regional weight is selected as the optimal geographic region.
Step 130, determining the geographical position of the IP address to be positioned according to the geographical position sample in the optimal geographical area.
After the foregoing steps, the optimal geographic area obtained is usually an area with a high distribution density of geographic position samples. The samples in this area are more reliable and can be used to position the IP address to be positioned. For example, the geographic position coordinates with the largest distribution area weight in the geographic area may be used as the coordinates of the IP address to be positioned. Alternatively, when more than one set of geographic position coordinates shares the largest distribution area weight, the centroid of all such coordinates can be used as the coordinates of the IP address to be positioned. Here the distribution area weight of a set of coordinates is proportional to the number of geographic position samples carrying those coordinates.
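Step 130 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name locate_from_area is hypothetical, the distribution area weight is taken to be the plain count of samples carrying each coordinate pair, and the centroid is assumed to be the arithmetic mean of latitudes and longitudes, which is only a reasonable approximation inside a small area.

```python
from collections import Counter

def locate_from_area(coords):
    """Pick a position from the (lat, lon) tuples of the optimal area.

    Assumption: the weight of a coordinate pair is simply how many
    samples carry it. Ties are resolved by averaging the tied pairs.
    """
    counts = Counter(coords)
    top = max(counts.values())
    best = [c for c, n in counts.items() if n == top]
    lat = sum(c[0] for c in best) / len(best)
    lon = sum(c[1] for c in best) / len(best)
    return (lat, lon)

# One coordinate pair dominates, so it is returned unchanged.
single = locate_from_area([(1.0, 2.0), (1.0, 2.0), (3.0, 4.0)])
# Two pairs tie, so their centroid is returned.
tied = locate_from_area([(0.0, 0.0), (2.0, 4.0)])
```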
According to the IP address positioning method disclosed by the embodiment of the application, at least one geographical area is obtained by obtaining a geographical position sample of an IP address to be positioned and then aggregating geographical positions in the geographical position sample; aggregating the at least one geographical area again to determine an optimal geographical area; and finally, determining the geographic position of the IP address to be positioned according to the geographic position sample in the optimal geographic area, thereby solving the problem of low positioning accuracy of the IP address positioning method in the prior art. By acquiring a large number of geographical position samples and aggregating the geographical positions carried in the geographical position samples to divide the samples into corresponding geographical areas, and further determining the geographical position of the IP address to be positioned by combining the coordinates of the geographical position samples in the geographical area with the maximum distribution density, the processing efficiency of the massive samples is improved, and meanwhile, due to the adoption of the large number of samples and the selection of a proper sample as a reference, the accuracy of the positioning of the IP address is effectively improved.
Example two
Based on the first embodiment, in another specific embodiment of the present application, a technical solution for obtaining a geographic area is described in detail by taking an example of dividing the geographic area by performing spatial index coding on a geographic position sample.
When obtaining a geographical location sample of an IP address to be located, the geographical location sample at least comprises: geographic location and sample weight information. Wherein the sample weight of the geographic location sample is related to the visit time of the corresponding geographic location, and the more recent the visit time is, the higher the sample weight is. In a specific implementation, the sample weight may be represented by the following formula:
[Formula image GDA0002498949950000061 not reproduced: sample weight V expressed as a function of Δt]
wherein V represents the sample weight value, and Δt represents the difference between the visit time and the current time, that is, the difference between the time at which the user visited the geographic position, as recorded in the user historical behavior log corresponding to the geographic position sample, and the algorithm execution time. For example, if a user historical behavior log records a visit to the geographic location (38.597011, 116.437109) on a date in January 2017 and the algorithm is run on January 5, 2017, the value of Δt is 5; the value of Δt is always a positive integer. As can be seen from the formula, the sample weight of a geographic position sample is related to the visit time of the corresponding geographic position: the more recent the visit time, the higher the sample weight. In specific implementation, a position sample of the IP address to be positioned may be represented as a Key_Value pair, where Key represents the geographic position and Value represents the sample weight of the sample, for example: ((38.597011, 116.437109), 1). In the subsequent steps, the geographic positions are aggregated to divide the samples into geographic areas, and the sample weights are used as the basis for determining the weights of the geographic areas.
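The Key_Value representation can be sketched as below. The weight formula itself is not reproduced in this text, so the decay V = 1/Δt used here is purely a placeholder assumption chosen because it decreases as Δt grows; counting Δt in whole days with a floor of 1 is likewise an assumption, and all function names are hypothetical.

```python
from datetime import date

def delta_t(visit_day, run_day):
    # Assumption: Δt is counted in whole days and kept a positive integer.
    return max(1, (run_day - visit_day).days)

def sample_weight(dt):
    # Placeholder decay, NOT the formula of the application:
    # any function that shrinks as Δt grows fits the description.
    return 1.0 / dt

def make_sample(lat, lon, visit_day, run_day):
    # Key is the geographic position, Value is the sample weight.
    return ((lat, lon), sample_weight(delta_t(visit_day, run_day)))

run = date(2017, 1, 5)
recent = make_sample(38.597011, 116.437109, date(2017, 1, 4), run)
older = make_sample(38.597011, 116.437109, date(2016, 12, 26), run)
```

Under these assumptions a visit one day before the run gets weight 1.0 and a visit ten days before gets 0.1, so newer visits carry more weight.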
In specific implementation, the step of aggregating the geographic positions in the geographic position sample to obtain at least one geographic area includes: carrying out spatial index coding on the geographic position coordinates of the geographic position samples; and aggregating the geographic positions corresponding to the same spatial index code to the same geographic area.
Spatial index coding (GeoHash) represents a rectangular geographic area with a character string. The size of the area is determined by the length of the GeoHash string: the longer the string, the smaller the area it represents. For example, a 4-character GeoHash string represents an area of about 40 km by 20 km, a 6-character GeoHash an area of about 1.2 km by 0.6 km, and an 8-character GeoHash an area of about 40 m by 20 m. After acquiring a large number of geographic position samples corresponding to an IP address to be positioned, the method first performs spatial index coding on the geographic position coordinate values (such as longitude and latitude values) carried by the samples.
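The area sizes quoted above follow from how GeoHash splits its bits. The sketch below works the arithmetic, assuming the usual GeoHash layout (5 bits per character, alternating between longitude and latitude starting with longitude) and a rough conversion of 111.32 km per degree, which holds for longitude only near the equator. The function and constant names are hypothetical.

```python
KM_PER_DEGREE = 111.32  # rough figure, valid for longitude near the equator

def geohash_cell_degrees(length):
    """Width and height of a GeoHash cell in degrees (lon span, lat span)."""
    bits = 5 * length
    lon_bits = (bits + 1) // 2  # longitude takes the extra bit when odd
    lat_bits = bits // 2
    return 360.0 / (1 << lon_bits), 180.0 / (1 << lat_bits)

def geohash_cell_km(length):
    width_deg, height_deg = geohash_cell_degrees(length)
    return width_deg * KM_PER_DEGREE, height_deg * KM_PER_DEGREE

w4, h4 = geohash_cell_km(4)  # roughly 39 km by 20 km
w6, h6 = geohash_cell_km(6)  # roughly 1.2 km by 0.6 km
w8, h8 = geohash_cell_km(8)  # roughly 38 m by 19 m
```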
Taking the IP address to be positioned "192.168.0.1" as an example, assume that 10,000 geographic position samples have been acquired for it, and that the coordinates of each sample are expressed in the format (Lat, Lon). For example, one of the samples is (38.597011, 116.437109), where 38.597011 is the latitude and 116.437109 is the longitude of the geographic position.
Spatial index coding is performed on the coordinate values of each geographic position sample, yielding 10,000 spatial index codes for the 10,000 samples. For example, the geographic position sample (38.597011, 116.437109) is encoded as the 6-character GeoHash wwfg9d, and another sample (38.597100, 116.437100) is also encoded as wwfg9d. Because wwfg9d represents a rectangular area of about 1.2 km by 0.6 km, every geographic position coordinate within that area has the 6-character GeoHash code wwfg9d, so a 1.2 km by 0.6 km rectangular area can be identified by a 6-character GeoHash code.
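The encoding step can be illustrated with a small encoder. This is a sketch of the standard public GeoHash scheme (repeated halving of the longitude and latitude ranges, bits interleaved starting with longitude, 5 bits per base-32 character); the application does not prescribe a particular implementation, and the function name is hypothetical.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, length=6):
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0
    use_lon = True  # bits alternate, longitude first
    for i in range(5 * length):
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = value * 2 + bit
        use_lon = not use_lon
        if i % 5 == 4:  # every 5 bits become one character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)

code_a = geohash_encode(38.597011, 116.437109)
code_b = geohash_encode(38.597100, 116.437100)
```

Both sample coordinates from the text fall in the same cell, so code_a and code_b are the same 6-character string.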
After the coordinate values of each geographic position sample have been spatially index coded, all spatial index codes are traversed and the samples corresponding to the same code are assigned to the same geographic area. This yields a number of geographic areas, each identified by its spatial index code. For example, the samples (38.597011, 116.437109) and (38.597100, 116.437100) are aggregated into the geographic area identified by the GeoHash value wwfg9d. Continuing the example of 10,000 samples and their 10,000 spatial index code values, traversing the code values might finally yield 300 distinct values, each corresponding to at least one geographic position sample. For the code value wwfg9d, the corresponding samples include at least (38.597100, 116.437100) and (38.597011, 116.437109). If 300 distinct spatial index code values are obtained from the 10,000 samples, the 10,000 samples are divided into 300 geographic areas.
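The traversal described above amounts to grouping samples by their code. A minimal sketch, assuming each sample has already been encoded and is supplied as a (code, (lat, lon), weight) triple; the function name, the weights and the second code "wwfg9e" are made up for illustration.

```python
from collections import defaultdict

def group_by_code(encoded_samples):
    """Map each spatial index code to the samples that share it."""
    areas = defaultdict(list)
    for code, coord, weight in encoded_samples:
        areas[code].append((coord, weight))
    return dict(areas)

encoded = [
    ("wwfg9d", (38.597011, 116.437109), 1.0),
    ("wwfg9d", (38.597100, 116.437100), 0.5),
    ("wwfg9e", (38.597500, 116.445000), 0.2),  # hypothetical third sample
]
areas = group_by_code(encoded)
```

Each key of areas identifies one geographic area, so the number of keys is the number of areas obtained.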
The purpose of GeoHash coding the geographic position coordinates is data reduction. One IP address to be positioned may correspond to tens of millions of geographic position coordinates, and computing directly on those coordinates would consume a large amount of computing resources. Converting tens of millions of coordinates into GeoHash codes of a reasonable area yields only a few thousand GeoHash codes, which greatly reduces the order of magnitude of the computation. In specific implementation, the spatial index code length may take other values; with a length of 4, for example, the geographic areas obtained by aggregation are larger.
Based on the first embodiment, in another specific embodiment of the present application, the at least one geographic area is aggregated again to determine an optimal geographic area, specifically: and aggregating the at least one geographic area again based on the area weight of the geographic area to determine an optimal geographic area, wherein the area weight is positively correlated with the sample distribution density and the sample weight of the geographic area and the adjacent geographic areas. The determining the optimal geographic area by aggregating the at least one geographic area again based on the regional weights for the geographic areas comprises sub-steps S1 through S3.
Substep S1, determining, for each geographic area, an initial area weight of the geographic area based on the geographic position samples corresponding to the spatial index code identifying the geographic area and the corresponding sample weights.
The area weight of a geographic area is positively correlated with the number of geographic location samples corresponding to the spatial index code identifying the geographic area and the sample weight of each sample.
In a specific implementation, for each geographic area, a sum of sample weights of all geographic position samples included in the geographic area, that is, a sum of sample weights corresponding to all geographic position samples corresponding to the spatial index code identifying the geographic area, may be used as an initial weight of the geographic area.
For example, for the geographic area identified by the spatial index code wwfg9d, the geographic position samples corresponding to the area are collected, that is, the samples whose geographic position coordinates are encoded as wwfg9d; the sample weights of all these samples are then summed, and the sum is used as the area weight of the geographic area identified by wwfg9d.
Alternatively, for the geographic area identified by the spatial index code wwfg9d, the geographic position samples corresponding to the area are collected as above; all geographic position samples acquired for the IP address to be positioned, such as 192.168.0.1, are also collected; finally, the sample weights of the samples in the area are summed, and the ratio of this sum to the sum of the sample weights of all samples of the IP address is used as the area weight of the geographic area identified by wwfg9d.
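The two alternatives for the initial area weight (plain sum, or sum divided by the total weight of all samples of the IP address) can be sketched together. The function name, the normalize flag and the sample data are illustrative assumptions; areas maps each spatial index code to a list of ((lat, lon), weight) pairs.

```python
def initial_area_weights(areas, normalize=False):
    """Sum the sample weights per area; optionally divide by the grand total."""
    sums = {code: sum(w for _, w in samples) for code, samples in areas.items()}
    if not normalize:
        return sums
    total = sum(sums.values())
    return {code: s / total for code, s in sums.items()}

areas = {
    "wwfg9d": [((38.597011, 116.437109), 1.0), ((38.597100, 116.437100), 0.5)],
    "wwfg9e": [((38.597500, 116.445000), 0.5)],  # hypothetical second area
}
summed = initial_area_weights(areas)
ratios = initial_area_weights(areas, normalize=True)
```

The ranking of areas is the same under both variants; the normalized form only rescales the weights so that they sum to 1.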
In specific implementation, the initial area weight of the geographic area may also be determined by other methods according to the weight of the geographic position sample corresponding to the geographic area, which is not illustrated in this embodiment.
In specific implementation, to further reduce the amount of computation, after the initial area weight of each geographic area is determined, the method further includes: selecting a preset number of geographic areas with the highest area weights for IP address positioning. The area weights of the geographic areas identified by the GeoHash codes can be sorted in descending order and the preset number of top-ranked areas selected. For example, after the 300 geographic areas obtained in the previous step are sorted by area weight from largest to smallest, the first 100 may be used for positioning the IP address. A geographic area that contains few geographic position samples, or few samples with early visit times, is generally regarded as unrepresentative; its samples are treated as dirty data and disregarded, which can further improve positioning accuracy.
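The pruning step is a descending sort followed by a cut-off. A minimal sketch with hypothetical names and toy weights; keep plays the role of the preset number (100 in the example above).

```python
def top_areas(weights, keep):
    """Keep the `keep` areas with the largest initial area weight."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:keep])

weights = {"a": 3.0, "b": 1.0, "c": 2.0}  # toy area weights keyed by code
kept = top_areas(weights, keep=2)
```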
Substep S2, regarding each geographic area, using the sum of its initial area weight and the initial area weight of the geographic area adjacent to it as the area weight of the geographic area, wherein the adjacent geographic area is the geographic area whose distance between the central points is smaller than the preset threshold distance.
Determining the area weight of each geographic area only from the geographic position samples it contains, and selecting the optimal geographic area on that basis, sometimes leads to errors. For example, suppose five geographic areas are determined for one IP address from the acquired samples, four in Beijing and one in Chongqing. Even if the Chongqing area has the largest individual weight, it is not representative. In real life, when a user from Beijing travels to Chongqing on business, the user historical behavior logs generated by requests made in Chongqing become a data source for geographic position samples, but for this user the geographic areas in Beijing are the main basis for positioning. Therefore, to improve positioning accuracy, the final area weight of a geographic area needs to be determined by combining the area weights of the geographic areas within a certain range. That is, the final area weight of a geographic area is determined from the distribution of geographic position samples within a certain range around it.
In specific implementation, the geographic region identified by the GeoHash code corresponds to a rectangular block, and the rectangular block has a central point. The specific method for calculating the center point of the rectangular block corresponding to the geographic area identified by the GeoHash code is referred to in the prior art, and is not described herein again. As shown in fig. 2, it is assumed that the geographic location samples corresponding to the IP address to be located are aggregated into 4 geographic areas, which are: 210 to 240, each geographic area including a plurality of geographic locations, such as 211, 212. As shown in FIG. 2, the center point of the geographic area 210 is O1The center point of the geographic area 220 is O2The center point of the geographic area 230 is O3The center point of the geographic area 240 is O4And respectively calculating the distance between each geographic area and the central point of the other 3 geographic areas. Taking the geographic region 210 as an example, O is calculated separately1And O2Distance L of12、O1And O3Distance L of13And O1And O4Distance L of14. Then, L is judged separately12、L13、L14Whether greater than a predetermined threshold distance, such as 1 km, and appending an area weight of a geographic area having a distance from the geographic area 210 that is less than the predetermined threshold distance to the area weight of the geographic area 210. 
Suppose the area weights determined in sub-step S1 are W1 for geographic area 210, W2 for geographic area 220, W3 for geographic area 230 and W4 for geographic area 240. If the distances L12 and L13 between the center point of geographic area 210 and the center points of geographic areas 220 and 230 are both less than the preset threshold distance, the area weight of geographic area 210 is updated to the sum of the area weights of geographic areas 210, 220 and 230, i.e., W1' = W1 + W2 + W3. If the distances between the center point of geographic area 240 and the center points of geographic areas 210, 220 and 230 are all greater than the preset threshold distance, the area weight of geographic area 240 remains unchanged. In this way, the updated area weights W1', W2', W3' and W4' of the geographic areas are obtained. The geographic area corresponding to the largest updated area weight is selected as the optimal geographic area.
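The weight update described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `merge_neighbor_weights`, the toy planar coordinates, and the numeric weights are all assumptions made for the example, and the distance function is passed in so that any center-point distance measure can be used.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]


def merge_neighbor_weights(
    centers: Dict[str, Point],
    initial_weights: Dict[str, float],
    distance: Callable[[Point, Point], float],
    threshold: float,
) -> Dict[str, float]:
    """For each area, add the initial weights of all areas whose center
    point lies closer than `threshold` to this area's center point."""
    updated = {}
    for area, center in centers.items():
        total = initial_weights[area]
        for other, other_center in centers.items():
            if other != area and distance(center, other_center) < threshold:
                total += initial_weights[other]
        updated[area] = total
    return updated


def planar_distance(a: Point, b: Point) -> float:
    # Toy Euclidean distance (hypothetical units of km), used only for this example.
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


# Hypothetical layout: 220 and 230 are each near 210 but not near each other;
# 240 is far from everything.
centers = {"210": (0.0, 0.0), "220": (0.8, 0.0), "230": (-0.8, 0.0), "240": (50.0, 50.0)}
weights = {"210": 3.0, "220": 2.0, "230": 1.0, "240": 4.0}
updated = merge_neighbor_weights(centers, weights, planar_distance, threshold=1.0)
# Area 210 becomes W1 + W2 + W3 = 6.0, while area 240 keeps its weight of 4.0.
```

With these assumed numbers, area 240 has the largest initial weight but area 210 has the largest updated weight, which mirrors the Beijing and Chongqing situation described earlier.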
In specific implementation, when the distance between the center points of the two geographic areas is calculated, the following formula can be used:
C=sin(LatA)*sin(LatB)*cos(LonA-LonB)+cos(LatA)*cos(LatB);
Distance=R*Arccos(C)*Pi/180;
where A and B represent the center points of the two geographic areas; the latitude and longitude of point A are LatA and LonA, and the latitude and longitude of point B are LatB and LonB; the latitude range is (-90, 90) and the longitude range is (-180, 180); R represents the radius of the earth; Pi represents the ratio of a circle's circumference to its diameter; and Distance represents the distance between the two map points A and B.
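As an illustration of the center-point distance calculation, the sketch below uses the standard spherical law of cosines with latitudes measured from the equator, in which the cos(LonA-LonB) factor multiplies the cosine product. The formula as printed above attaches that factor to the sine product, which corresponds to the same law when LatA and LatB are taken as 90 degrees minus the latitude; treating the two as equivalent in intent is an assumption of this example. The function name and the mean Earth radius of 6371 km are also assumptions. Because `math.acos` returns radians, no Pi/180 conversion is applied to its result.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the source only names it R


def great_circle_km(lat_a: float, lon_a: float, lat_b: float, lon_b: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi_a = math.radians(lat_a)
    phi_b = math.radians(lat_b)
    d_lambda = math.radians(lon_a - lon_b)
    c = (math.sin(phi_a) * math.sin(phi_b)
         + math.cos(phi_a) * math.cos(phi_b) * math.cos(d_lambda))
    # Clamp to [-1, 1] so floating-point rounding cannot push acos out of its domain.
    c = max(-1.0, min(1.0, c))
    return EARTH_RADIUS_KM * math.acos(c)


# One degree of latitude along a meridian is roughly 111 km.
one_degree = great_circle_km(0.0, 0.0, 1.0, 0.0)
```

The result can be compared directly with a threshold such as the 1 km example given in the description.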
And a substep S3 of determining the geographical area with the highest regional weight as the optimal geographical area.
In a specific implementation, if more than one geographic area has the largest updated area weight, one of them may be selected at random as the optimal geographic area. Alternatively, additional factors may be used to further distinguish among the geographic areas having the largest area weight. Such additional factors may include the initial area weight of the geographic area determined in sub-step S1. For example, when the updated area weights W1' and W2' of geographic areas 210 and 220 are the same, if the area weight W1 of geographic area 210 determined in sub-step S1 is greater than the area weight W2 of geographic area 220 determined in sub-step S1, geographic area 210 is determined to be the optimal geographic area.
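The selection with a tie-break on the initial weight might look like the sketch below. The function name `pick_optimal_area` and the sample weights are hypothetical; the random-selection alternative mentioned above is not shown.

```python
from typing import Dict


def pick_optimal_area(updated: Dict[str, float], initial: Dict[str, float]) -> str:
    """Return the area with the largest updated weight; among ties,
    prefer the area with the larger initial (sub-step S1) weight."""
    return max(updated, key=lambda area: (updated[area], initial[area]))


# Hypothetical weights: areas 210 and 220 tie after the update.
updated_weights = {"210": 6.0, "220": 6.0, "240": 4.0}
initial_weights = {"210": 3.0, "220": 2.0, "240": 4.0}
best = pick_optimal_area(updated_weights, initial_weights)
```

Comparing `(updated, initial)` tuples keeps the tie-break in a single pass; the initial weight is only consulted when the updated weights are equal.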
After the optimal geographic area is determined, determining the geographic location of the IP address to be located according to the geographic location samples in the optimal geographic area includes: taking the coordinate value of the geographic location sample that occurs most often in the optimal geographic area, or that occurs most often within a preset time, as the geographic location of the IP address to be located. For example, assume that 200 geographic location samples correspond to the GeoHash code wwfg9d identifying geographic area 210, and that the coordinates (38.597011, 116.437109) are carried by 60 of them, more than any other coordinates; the coordinates (38.597011, 116.437109) are then taken as the geographic location of the IP address to be located.
In a specific implementation, before the step of taking the coordinate value of the geographic location sample that occurs most often in the optimal geographic area, or that occurs most often within a preset time, as the geographic location of the IP address to be located, the method further includes: if more than one geographic location sample has the largest number of occurrences in the optimal geographic area (or the largest number of occurrences within the preset time), taking the centroid of those geographic location samples as the geographic location of the IP address to be located. Similarly, assume that 200 geographic location samples correspond to the GeoHash code wwfg9d identifying geographic area 210, and that the coordinates (38.597011, 116.437109) and (38.597001, 116.437110) are each carried by 60 samples, more than any other coordinates; the centroid of (38.597011, 116.437109) and (38.597001, 116.437110) is then taken as the geographic location of the IP address to be located.
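The two cases above (a single most frequent coordinate, or several tied ones) can be sketched together. The function name `locate_from_samples` is hypothetical, and computing the centroid as the arithmetic mean of latitudes and longitudes is an assumption that is reasonable only for points as close together as those inside one GeoHash block. The filler coordinates in the example are invented to bring the total to 200 samples.

```python
from collections import Counter
from typing import List, Tuple

Coord = Tuple[float, float]


def locate_from_samples(samples: List[Coord]) -> Coord:
    """Return the most frequent coordinate, or the centroid of all
    coordinates tied for the highest count."""
    counts = Counter(samples)
    top = max(counts.values())
    modes = [coord for coord, n in counts.items() if n == top]
    if len(modes) == 1:
        return modes[0]
    lat = sum(c[0] for c in modes) / len(modes)
    lon = sum(c[1] for c in modes) / len(modes)
    return (lat, lon)


# 200 samples in the optimal area; two coordinates tie at 60 occurrences each.
samples = (
    [(38.597011, 116.437109)] * 60
    + [(38.597001, 116.437110)] * 60
    + [(38.597100, 116.437200)] * 40  # hypothetical filler coordinates
    + [(38.597200, 116.437300)] * 40
)
location = locate_from_samples(samples)
```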
According to the IP address positioning method disclosed in this embodiment of the application, geographic location samples of the IP address to be located are obtained; the geographic locations in the samples are aggregated to obtain at least one geographic area; the at least one geographic area is aggregated again to determine an optimal geographic area; and finally the geographic location of the IP address to be located is determined from the geographic location samples in the optimal geographic area. This solves the problem of low positioning accuracy in prior-art IP address positioning methods. A large number of geographic location samples are acquired and divided into geographic areas by aggregating the geographic locations they carry, and the geographic location of the IP address is then determined from the coordinates of the samples in the geographic area with the highest distribution density. This improves the efficiency of processing massive numbers of samples and, because a large number of samples is used and a suitable sample is selected as the reference, effectively improves the accuracy of IP address positioning.
According to the IP address positioning method disclosed in the application, the reference geographic locations of each IP address are converted into GeoHash block codes, and the areas represented by the GeoHash codes are used as the units of calculation, so that the system can flexibly process sampled data of any magnitude. Aggregating the map points of the geographic location samples effectively delimits the approximate range of the IP address, and avoids the excessive weight given to an abnormal data area when the latitude and longitude with the most positioning occurrences are used directly as the coordinates of the IP address. Determining the optimal geographic area within the approximate range first and then determining the optimal coordinate point is also consistent with the way Internet IP address allocation is delimited, so that the most accurate geographic location of the IP address can be found.
With the IP address positioning method disclosed in the application, when an application cannot obtain the geographic location corresponding to an IP address by prior-art positioning methods, for example when no GPS positioning signal or WIFI signal is available, the IP address can still be accurately located, ensuring the normal startup and accurate execution of location-based services.
EXAMPLE III
Correspondingly, as shown in fig. 3, the IP address positioning apparatus disclosed in the embodiment of the present application includes:
a geographic location sample obtaining module 300, configured to obtain a geographic location sample of an IP address to be located, where the geographic location sample at least includes: geographic location and sample weight information;
a sample aggregation module 310, configured to aggregate geographic locations in the geographic location samples acquired by the geographic location sample acquisition module 300 to obtain at least one geographic area;
an optimal geographic area determining module 320, configured to aggregate at least one geographic area obtained by the sample aggregation module 310 again to determine an optimal geographic area;
an IP address positioning module 330, configured to determine the geographic location of the IP address to be located according to the geographic location samples in the optimal geographic area.
Preferably, the sample weight of the geographic position sample is related to the visit time of the corresponding geographic position, and the more recent the visit time is, the higher the sample weight is.
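The text here states only that more recent visits receive higher sample weights; it does not fix a formula. As one possible illustration, the sketch below uses an exponential decay with a half-life. The decay form, the 30-day half-life, and the function name `recency_weight` are all assumptions introduced for the example.

```python
from datetime import datetime, timedelta


def recency_weight(visit_time: datetime, now: datetime, half_life_days: float = 30.0) -> float:
    """Weight in (0, 1] that halves every `half_life_days` of sample age
    (hypothetical choice; any function decreasing with age would satisfy the text)."""
    age_days = max((now - visit_time).total_seconds(), 0.0) / 86400.0
    return 0.5 ** (age_days / half_life_days)


now = datetime(2017, 10, 11)
w_today = recency_weight(now, now)
w_month_ago = recency_weight(now - timedelta(days=30), now)
w_year_ago = recency_weight(now - timedelta(days=365), now)
```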
Optionally, as shown in fig. 4, the sample aggregation module 310 includes:
a geographic position encoding unit 3101, configured to perform spatial index encoding on geographic position coordinates of the geographic position samples;
a geographic location aggregation unit 3102, configured to aggregate geographic locations corresponding to the same spatial index code into the same geographic area.
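The two units above can be illustrated with a small GeoHash encoder and a grouping step. This is a sketch of the publicly known GeoHash scheme (alternating longitude and latitude bisection, five bits per base-32 character), not code taken from the application; the function names and the default precision of 6 characters are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair (decimal degrees) as a GeoHash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars: List[str] = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def group_by_geohash(points: Iterable[Tuple[float, float]], precision: int = 6) -> Dict[str, list]:
    """Aggregate points that share the same GeoHash code into one area."""
    areas = defaultdict(list)
    for lat, lon in points:
        areas[geohash_encode(lat, lon, precision)].append((lat, lon))
    return dict(areas)


areas = group_by_geohash([(57.64911, 10.40744), (57.64912, 10.40745), (0.0, 0.0)])
```

A shorter precision yields larger blocks and therefore coarser areas; the right value depends on how finely the samples should be separated.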
Optionally, the optimal geographic area determining module 320 is configured to determine an optimal geographic area by aggregating the at least one geographic area again based on an area weight of the geographic area, where the area weight is positively correlated to the sample distribution density and the sample weight of the geographic area and its neighboring geographic areas.
Optionally, as shown in fig. 4, the optimal geographic area determining module 320 includes:
a geographic area weight determining unit 3201, configured to determine, for each geographic area, an initial area weight of the geographic area according to the geographic position sample corresponding to the spatial index code that identifies the geographic area and the corresponding sample weight;
a geographic area weight adjusting unit 3202, configured to, for each geographic area, use the sum of the initial area weight of the geographic area and the initial area weights of its adjacent geographic areas as the area weight of the geographic area, where an adjacent geographic area is a geographic area whose center point is less than a preset threshold distance from the center point of the geographic area;
an optimal aggregated geographic determination unit 3203, configured to determine a geographic area with the highest area weight as the optimal geographic area.
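For the first unit in the list above, one simple reading is that the initial area weight is the sum of the sample weights of all samples sharing the area's spatial index code. The text says only that the weight is determined from the samples and their weights, so summation is an assumption, as are the function name and the second GeoHash code used in the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def initial_area_weights(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum sample weights per spatial index code.

    `samples` yields (code, sample_weight) pairs."""
    weights: Dict[str, float] = defaultdict(float)
    for code, sample_weight in samples:
        weights[code] += sample_weight
    return dict(weights)


# "wwfg9d" appears in the description; "wm7b0x" is a made-up code for illustration.
area_weights = initial_area_weights(
    [("wwfg9d", 1.0), ("wwfg9d", 0.5), ("wm7b0x", 0.25)]
)
```

Summation makes the weight grow with both the number of samples and their individual weights, in line with the positive correlation described for the optimal geographic area determining module.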
Optionally, as shown in fig. 4, the optimal geographic area determining module 320 includes:
a geographic area selection unit 3204, configured to select a preset number of geographic areas with the highest area weights for use in determining the optimal geographic area.
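Selecting a preset number of the highest-weighted areas can be done with a partial sort. The sketch below uses `heapq.nlargest` from the Python standard library; the function name `top_areas` and the example weights are hypothetical.

```python
import heapq
from typing import Dict


def top_areas(weights: Dict[str, float], n: int) -> Dict[str, float]:
    """Keep only the n areas with the highest weights."""
    return dict(heapq.nlargest(n, weights.items(), key=lambda kv: kv[1]))


candidates = top_areas({"a": 1.0, "b": 5.0, "c": 3.0, "d": 0.5}, n=2)
```

Pruning to a small candidate set before the pairwise center-point distance step keeps that step cheap when an IP address has samples spread over many areas.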
Optionally, as shown in fig. 4, the IP address positioning module 330 includes:
a first IP address locating unit 3301, configured to take the coordinate value of the geographic location sample that occurs most often in the optimal geographic area, or that occurs most often within a preset time, as the geographic location of the IP address to be located.
Optionally, as shown in fig. 5, the IP address positioning module 330 further includes:
a second IP address locating unit 3302, configured to, if the number of the geographic location samples with the largest number of occurrences in the optimal geographic area or the largest number of occurrences within the preset time is greater than 1, use the centroid of the geographic location sample with the largest number of occurrences in the optimal geographic area or the largest number of occurrences within the preset time as the geographic location of the IP address to be located.
The IP address positioning device disclosed in this embodiment of the application obtains geographic location samples of the IP address to be located; aggregates the geographic locations in the samples to obtain at least one geographic area; aggregates the at least one geographic area again to determine an optimal geographic area; and finally determines the geographic location of the IP address to be located from the geographic location samples in the optimal geographic area. This solves the problem of low positioning accuracy in prior-art IP address positioning methods. A large number of geographic location samples are acquired and divided into geographic areas by aggregating the geographic locations they carry, and the geographic location of the IP address is then determined from the coordinates of the samples in the geographic area with the highest distribution density. This improves the efficiency of processing massive numbers of samples and, because a large number of samples is used and a suitable sample is selected as the reference, effectively improves the accuracy of IP address positioning.
With the IP address positioning device disclosed in the application, the reference geographic locations of each IP address are converted into GeoHash block codes, and the areas represented by the GeoHash codes are used as the units of calculation, so that the system can flexibly process sampled data of any magnitude. Aggregating the map points of the geographic location samples effectively delimits the approximate range of the IP address, and avoids the excessive weight given to an abnormal data area when the latitude and longitude with the most positioning occurrences are used directly as the coordinates of the IP address. Determining the optimal geographic area within the approximate range first and then determining the optimal coordinate point is also consistent with the way Internet IP address allocation is delimited, so that the most accurate geographic location of the IP address can be found.
With the IP address positioning device disclosed in the application, when an application cannot obtain the geographic location corresponding to an IP address by prior-art positioning methods, for example when no GPS positioning signal or WIFI signal is available, the IP address can still be accurately located, ensuring the normal startup and accurate execution of location-based services.
Correspondingly, the application also discloses an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the IP address positioning method according to the first embodiment and the second embodiment of the application. The electronic device can be a PC, a mobile terminal, a personal digital assistant, a tablet computer and the like.
The present application also discloses a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the IP address location method according to the first and second embodiments of the present application.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another. Because the device embodiment is basically similar to the method embodiments, it is described relatively briefly; for the relevant points, refer to the corresponding description of the method embodiments.
The IP address positioning method and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is only intended to help in understanding the method and core idea of the application. Meanwhile, a person skilled in the art may, according to the idea of the application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the application.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding, the above technical solutions may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM (read only memory)/RAM (random access memory), a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to each embodiment or some parts of the embodiments.

Claims (12)

1. An IP address location method, comprising:
acquiring a geographic position sample of an IP address to be positioned, wherein the geographic position sample at least comprises a geographic position and sample weight information, the sample weight of the geographic position sample is related to the access time of the corresponding geographic position, and the more recent the access time is, the higher the sample weight is;
aggregating the geographic positions in the geographic position sample to obtain at least one geographic area;
aggregating the at least one geographical area again to determine an optimal geographical area;
determining the geographic position of the IP address to be positioned according to the geographic position sample in the optimal geographic area;
wherein the step of aggregating the at least one geographical area again to determine an optimal geographical area comprises:
for each geographic area, determining an initial area weight of the geographic area according to a geographic position sample corresponding to a spatial index code identifying the geographic area and a corresponding sample weight;
for each geographic area, taking the sum of the initial area weight of the geographic area and the initial area weights of the geographic areas adjacent to it as the area weight of the geographic area, wherein an adjacent geographic area is a geographic area whose center point is less than a preset threshold distance from the center point of the geographic area;
and determining the geographical area with the highest area weight as the optimal geographical area.
2. The method of claim 1, wherein the step of aggregating the geographic locations in the sample of geographic locations to obtain at least one geographic region comprises:
carrying out spatial index coding on the geographic position coordinates of the geographic position samples;
and aggregating the geographic positions corresponding to the same spatial index code to the same geographic area.
3. The method of claim 1, wherein after the step of determining the initial regional weight for the geographic region, the method further comprises:
and selecting a preset number of geographic areas with the highest area weights for use in determining the optimal geographic area.
4. The method according to any one of claims 1 to 3, wherein the step of determining the geographical location of the IP address to be located according to the geographical location samples in the optimal geographical area comprises:
and taking the coordinate value of the geographical position sample with the maximum occurrence frequency in the optimal geographical area or the maximum occurrence frequency within preset time as the geographical position of the IP address to be positioned.
5. The method of claim 4,
and if the number of the geographical position samples with the largest occurrence times in the optimal geographical area or the largest occurrence times in the preset time is more than 1, taking the centroid of the geographical position sample with the largest occurrence times in the optimal geographical area or the largest occurrence times in the preset time as the geographical position of the IP address to be positioned.
6. An IP address locating apparatus, comprising:
a geographic location sample obtaining module, configured to obtain a geographic location sample of the IP address to be located, where the geographic location sample at least includes: geographic position and sample weight information, the sample weight of the geographic position sample is related to the visit time of the corresponding geographic position, and the more recent the visit time is, the higher the sample weight is;
the sample aggregation module is used for aggregating the geographic positions in the geographic position samples acquired by the geographic position sample acquisition module to obtain at least one geographic area;
the optimal geographic area determining module is used for aggregating at least one geographic area obtained by the sample aggregation module again to determine an optimal geographic area;
the IP address positioning module is used for determining the geographic position of the IP address to be positioned according to the geographic position sample in the optimal geographic area;
the optimal geographical area determination module comprises:
a geographical area weight determining unit, configured to determine, for each geographical area, an initial area weight of the geographical area according to a geographical location sample corresponding to a spatial index code that identifies the geographical area and a corresponding sample weight;
a geographical region weight adjusting unit, configured to, for each geographical region, use a sum of an initial region weight of the geographical region and an initial region weight of a geographical region adjacent to the geographical region as a region weight of the geographical region, where the distance between center points of the adjacent geographical regions is smaller than a preset threshold distance;
and the optimal aggregation geographic determination unit is used for determining the geographic area with the highest regional weight as the optimal geographic area.
7. The apparatus of claim 6, wherein the sample aggregation module comprises:
the geographic position coding unit is used for carrying out spatial index coding on the geographic position coordinates of the geographic position samples;
and the geographic position aggregation unit is used for aggregating the geographic positions corresponding to the same spatial index code to the same geographic area.
8. The apparatus of claim 6, wherein the optimal geographic area determination module comprises:
and the geographic area selection unit is used for selecting a preset number of geographic areas with the highest area weights for use in determining the optimal geographic area.
9. The apparatus according to any one of claims 6 to 8, wherein the IP address location module comprises:
and the first IP address positioning unit is used for taking the coordinate value of the geographical position sample with the maximum occurrence frequency in the optimal geographical area or the maximum occurrence frequency within preset time as the geographical position of the IP address to be positioned.
10. The apparatus of claim 9, wherein the IP address location module further comprises:
and the second IP address positioning unit is used for taking the centroid of the geographical position sample which meets the preset condition and has the maximum occurrence frequency or the maximum occurrence frequency within the preset time in the optimal geographical area as the geographical position of the IP address to be positioned if the number of the geographical position samples which have the maximum occurrence frequency in the optimal geographical area or have the maximum occurrence frequency within the preset time is greater than 1.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the IP address location method of any one of claims 1 to 5 when executing the computer program.
12. A storage medium having stored thereon a computer program, characterized in that the program, when being executed by a processor, carries out the steps of the IP address localization method of any one of claims 1 to 5.
CN201710942850.7A 2017-10-11 2017-10-11 IP address positioning method and device, electronic equipment and storage medium Active CN108011987B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710942850.7A CN108011987B (en) 2017-10-11 2017-10-11 IP address positioning method and device, electronic equipment and storage medium
PCT/CN2018/108010 WO2019072092A1 (en) 2017-10-11 2018-09-27 Ip address positioning method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710942850.7A CN108011987B (en) 2017-10-11 2017-10-11 IP address positioning method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108011987A CN108011987A (en) 2018-05-08
CN108011987B true CN108011987B (en) 2020-09-04

Family

ID=62051396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710942850.7A Active CN108011987B (en) 2017-10-11 2017-10-11 IP address positioning method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN108011987B (en)
WO (1) WO2019072092A1 (en)

Also Published As

Publication number Publication date
WO2019072092A1 (en) 2019-04-18
CN108011987A (en) 2018-05-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant