CN111026829A - Street-level landmark obtaining method based on service identification and domain name association - Google Patents


Publication number
CN111026829A
CN111026829A CN201911264591.2A CN201911264591A CN111026829A CN 111026829 A CN111026829 A CN 111026829A CN 201911264591 A CN201911264591 A CN 201911264591A CN 111026829 A CN111026829 A CN 111026829A
Authority
CN
China
Prior art keywords
domain name
street
level
feature
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911264591.2A
Other languages
Chinese (zh)
Other versions
CN111026829B (en
Inventor
罗向阳
李瑞祥
尹美娟
徐锐
杨文�
郭鑫淼
杨春芳
朱玛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201911264591.2A
Publication of CN111026829A
Application granted
Publication of CN111026829B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a street-level landmark obtaining method based on service identification and domain name association. First, features are extracted and reduced from the scan results of IPs of known service types to obtain training features; a classifier is trained with these features to obtain an IP classifier, which is then used to identify the services carried by IPs of unknown service types, yielding server IPs. Next, based on the statistically observed relationship between organization information and domain names, the domain name keywords of an organization are estimated from its name, and an organization information base is built for the target area to map organization geographic locations to domain names. Finally, each identified server IP is converted into a domain name, and the geographic location of the domain name is obtained through strategies such as database query, online map search and organization information base matching, yielding street-level landmarks; the reliability of these landmarks is then evaluated to obtain reliable street-level landmarks. The invention increases the number of street-level landmarks obtained.

Description

Street-level landmark obtaining method based on service identification and domain name association
Technical Field
The invention relates to the technical field of landmark acquisition, in particular to a street-level landmark acquisition method based on service identification and domain name association.
Background
At present, IP positioning has broad application prospects in determining the boundaries of cyberspace, tracking the objects of network attacks, locating the parties to covert communication, and similar tasks. Landmark-based IP positioning is a common positioning method with comparatively accurate results, and the number, precision and reliability of the landmarks directly affect the reliability of the positioning result. How to obtain abundant, reliable landmarks is therefore an urgent problem in IP positioning. According to the source of the landmarks, existing landmark obtaining methods fall mainly into methods based on IP location database queries, methods based on Web pages, and social-based methods.
Currently, some data service providers maintain IP location databases (e.g., Baidu, IPIP, IP.cn, MaxMind) that map IPs to geographic locations. The location database query method looks up the geographic location corresponding to an IP in an existing location database, thereby obtaining a landmark. This method can acquire a large number of landmarks in a short time, but the highest precision of the landmarks provided by existing databases is only city level, and the reliability of the databases is not high. It is therefore difficult to obtain a large number of reliable street-level landmarks with this method.
Web pages contain rich geographic location information; by associating the geographic location found in a Web page with the IP address corresponding to the Web domain name, landmarks can be obtained. Based on this idea, Guo C et al. proposed the Structon method, which was the first to use Web pages, with their wide range of sources and huge number, as a landmark source, and which expands the number of landmarks with a location inference algorithm. The Structon method achieves street-level landmark acquisition, but because of limited network bandwidth, the difficulty of obtaining URL sources, the diversity of Web page structures and similar constraints, it has difficulty obtaining a large number of street-level landmarks.
Wang Y et al. query the organization data included in online maps (which usually comprises the organization name, address, domain name, etc.) to obtain a directory of organizations within a specific area, and associate each organization's location with the IP of its domain name to obtain landmarks. Jiang H et al. use the university information included in Wikipedia to associate the IP addresses of university Web servers with the geographic locations of the universities and build a landmark library.
The online map-based landmark acquisition method and the navigation page-based landmark acquisition method can acquire a plurality of landmarks in one visit or query, and the efficiency of acquiring street-level landmarks is higher, but the number of the acquired landmarks is limited by the amount of the included data.
After analyzing the characteristics of different types of Internet forums, zhugue et al proposed an Internet-forum-based urban landmark mining method, which infers the geographic location of a forum's user community from semantic information in the forum name and associates it with the access IPs of the forum users, thereby obtaining landmarks. Another method extracts the place names searched for by users from search engine logs and, according to the social relationship between the searched place names and the users, associates the place names with the IPs used for the searches, thereby obtaining landmarks. Compared with the database query and online map search methods, these two methods can obtain more landmarks, but the location granularity is coarse, reaching only city level, so a large number of street-level landmarks is difficult to obtain.
Other landmark obtaining methods also exist. For example, in methods based on target cooperation, the latitude and longitude of a device are obtained through GPS and associated with the device's IP address to obtain a landmark. Such methods yield high-precision, reliable landmarks, but they need hardware support, and obtaining landmarks in large batches is costly.
Therefore, a method for rapidly acquiring a large number of street-level landmarks is needed.
Disclosure of Invention
The invention aims to provide a street-level landmark obtaining method based on service identification and domain name association, which can firstly classify a server IP and obtain a corresponding domain name, then obtain an organization domain name keyword according to organization information, and finally match the server domain name and the organization domain name to realize mapping between the server IP and the organization geographic position, thereby obtaining a street-level landmark.
The technical scheme adopted by the invention is as follows: a street-level landmark obtaining method based on service identification and domain name association comprises the following steps:
Step 1: acquiring a plurality of IPs, including IPs of known service types and a plurality of IPs of unknown service types;
Step 2: scanning the ports of all the IPs with a port scanning tool to obtain the open ports of each IP;
Step 3: extracting training features for classification from the scan results of the IPs of known service types: reducing the open ports of the IPs of known service types with a feature reduction algorithm to obtain a minimum feature set, which is used as the training features;
Step 4: training an IP classifier with the training features obtained in step 3, and classifying the IPs of unknown service types with the trained IP classifier to obtain server IPs;
Step 5: acquiring the domain names corresponding to the server IPs: resolving each server IP obtained in step 4 under each DNS to obtain the domain name information corresponding to that server IP; if one server IP resolves to multiple domain names, establishing a mapping between the IP and each domain name;
Step 6: obtaining, based on a voting strategy, the city to which each IP of unknown service type belongs, and constructing an organization information base of the city based on domain names and organization names; according to the characteristics of the domain names of the server IPs obtained in step 5, obtaining the organization information corresponding to the domain name of each server IP by one or more of an online map, an organization record base and an organization information base matching method, thereby associating the server IP with the geographic location of the organization and obtaining street-level candidate landmarks;
Step 7: evaluating the street-level candidate landmarks obtained in step 6 with a street-level landmark evaluation method to obtain reliable street-level landmarks.
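Steps 2 to 4 can be illustrated with a minimal sketch. It assumes that each IP's scan result is simply the set of its open ports, and it uses a small hand-written k-nearest-neighbor vote with Hamming distance as a stand-in for whatever classifier an implementation would actually train. The port numbers, labels and function names are hypothetical.

```python
from collections import Counter

def encode(open_ports, feature_ports):
    """Turn a set of open ports into a 0/1 vector over the training features."""
    return [1 if p in open_ports else 0 for p in feature_ports]

def hamming(a, b):
    """Number of positions where two 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training features (ports kept after reduction) and labelled scans.
feature_ports = [25, 53, 80, 110, 443]
labelled_scans = [
    ({80}, "web"), ({443}, "web"), ({80, 443}, "web"),
    ({25, 110}, "email"), ({25}, "email"),
    ({53}, "dns"), ({53, 80}, "dns"),
]
train = [(encode(ports, feature_ports), label) for ports, label in labelled_scans]

# Classify an IP of unknown service type from its open ports.
unknown = encode({25, 110, 443}, feature_ports)
predicted = knn_predict(train, unknown, k=3)
print(predicted)
```

Ports that are not in the training feature list are ignored by `encode`, which is one simple way of applying the reduced feature set to IPs of unknown type.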
Preferably, the step 3 comprises the following steps:
3.1: let the IPs of known service types provide m types of services, and let SE(IP) denote the set of IPs of known service type that provide the same type of service; the m sets are denoted in turn SE1(IP), SE2(IP), …, SEq(IP), …, SEm(IP), where 1 ≤ q ≤ m;
let Feature(IP) be the feature set of a single IP in SE(IP); sort all Feature(IP) in ascending order of their number of elements, and sort the elements of SE(IP) accordingly; the sorted feature sets are denoted Feature(IP1), Feature(IP2), …, Feature(IPi), …, Feature(IPj), …, Feature(IPn), and the elements of SE(IP) are correspondingly denoted IP1, IP2, …, IPi, …, IPj, …, IPn, where 1 ≤ i ≤ j ≤ n and n is the number of IPs of known service type in SE(IP);
the reduction algorithm for SE(IP) is as follows:
for any $IP_i, IP_j \in SE(IP)$ with $i \neq j$,
if
$\mathrm{Feature}(IP_i) \subseteq \mathrm{Feature}(IP_j)$,
then $IP_j$ is deleted from SE(IP);
this is repeated until, for all
$IP_i, IP_j \in SE(IP)$ with $i \neq j$,
it holds that
$\mathrm{Feature}(IP_i) \not\subseteq \mathrm{Feature}(IP_j)$;
the reduced feature set FeatureSet of SE(IP) is the union of the feature sets Feature(IP) of all the IPs remaining in SE(IP);
3.2: reducing SE1(IP), SE2(IP), …, SEq(IP), …, SEm(IP) respectively to obtain the reduced feature sets FeatureSet1, FeatureSet2, …, FeatureSetq, …, FeatureSetm;
3.3: the minimum feature set is the union of the feature sets FeatureSet1, FeatureSet2, …, FeatureSetq, …, FeatureSetm.
Specifically, in step 6, constructing the organization information base of the city comprises the following steps:
obtaining a POI library and an organization directory of the target city from public data sets, obtaining the names and categories of the organizations, and extracting domain name keywords from the organization names; for non-English organization names, converting the Chinese keywords in the name into letter combinations and using these as domain name keywords; and associating the domain name keywords with the organization names to construct the organization information base.
Specifically, in step 6, the organization information base matching method comprises the following steps:
extracting, as an information field, the subdomain field of the server IP's domain name that implies organization information; matching the information field against the domain name keywords in the organization information base to associate the server IP's domain name with an organization name; and finally establishing a mapping between the server IP and the organization's geographic location to obtain street-level candidate landmarks.
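The information base construction and matching steps above can be sketched as follows. The sketch assumes English organization names, keyword candidates formed from the initials and from the concatenated words of the name, and exact matching against the dot-separated labels of a domain name. The organization records, the stop-word list and all helper names are hypothetical, and converting Chinese names into letter combinations is not shown.

```python
import re

STOP_WORDS = {"of", "the", "and", "for"}  # hypothetical list of words to ignore

def domain_keywords(org_name):
    """Candidate domain keywords: the initials and the concatenated words."""
    words = [w for w in re.findall(r"[a-z0-9]+", org_name.lower())
             if w not in STOP_WORDS]
    if not words:
        return set()
    return {"".join(w[0] for w in words), "".join(words)}

def build_info_base(organizations):
    """Map each keyword to the organizations (name, address) that may use it."""
    base = {}
    for name, address in organizations:
        for keyword in domain_keywords(name):
            base.setdefault(keyword, []).append((name, address))
    return base

def match_domain(domain, base):
    """Return the organizations whose keyword equals one label of the domain."""
    matches = []
    for label in domain.lower().split("."):
        matches.extend(base.get(label, []))
    return matches

# Hypothetical organization directory for a target city.
orgs = [
    ("Example Institute of Technology", "1 Example Road"),
    ("City Central Library", "20 Sample Street"),
]
info_base = build_info_base(orgs)

# A server IP's domain name; the label "eit" carries the organization hint.
candidates = match_domain("mail.eit.example.org", info_base)
print(candidates)
```

Exact label matching keeps the sketch short; a real matcher would have to deal with ambiguous keywords shared by several organizations, which is one reason the candidate landmarks still need the reliability evaluation of step 7.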
In the invention, features are first extracted and reduced from the scan results of IPs of known service types to obtain training features; a classifier is trained with these features to obtain an IP classifier, which is used to identify the services carried by IPs of unknown service types, yielding server IPs. Next, based on the statistically observed relationship between organization information and domain names, the domain name keywords of an organization are estimated from its name, and an organization information base is built for the target area to map organization geographic locations to domain names. Finally, each identified server IP is converted into a domain name, and the geographic location of the domain name is obtained through strategies such as database query, online map search and organization information base matching, yielding street-level landmarks whose reliability is then evaluated to obtain reliable street-level landmarks. The invention can obtain landmarks not only for Web servers but also for other types of servers, which increases the number of street-level landmarks obtained and helps improve the accuracy of IP positioning.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 shows the F1 values of the SVM-IP classifier of the present invention when the kernel function Kernel is linear and the precision Tol is 0.001;
FIG. 3 shows the F1 values of the SVM-IP classifier of the present invention when the kernel function Kernel is rbf and the precision Tol is 0.001;
FIG. 4 shows the F1 values of the SVM-IP classifier of the present invention when the kernel function Kernel is sigmoid and the precision Tol is 0.001;
FIG. 5 shows the F1 values of the SVM-IP classifier of the present invention when the kernel function Kernel is linear and the precision Tol is 0.0001;
FIG. 6 shows the F1 values of the SVM-IP classifier of the present invention when the kernel function Kernel is rbf and the precision Tol is 0.0001;
FIG. 7 shows the F1 values of the SVM-IP classifier of the present invention when the kernel function Kernel is sigmoid and the precision Tol is 0.0001;
FIG. 8 is a comparison graph of the average classification F1 values of the SVM-IP classifier of the present invention under different parameters;
FIG. 9 shows the F1 values of the KNN-IP classifier of the present invention when the number of neighbor nodes is adjusted and the weighting is uniform;
FIG. 10 shows the F1 values of the KNN-IP classifier of the present invention when the number of neighbor nodes is adjusted and the weighting is distance;
FIG. 11 is a comparison graph of the average classification F1 values of the KNN-IP classifier of the present invention under different parameters;
FIG. 12 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation = identity, and precision Tol = 0.001;
FIG. 13 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation, and precision Tol = 0.001;
FIG. 14 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation = relu, and precision Tol = 0.001;
FIG. 15 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation = tanh, and precision Tol = 0.001;
FIG. 16 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation = identity, and precision Tol = 0.0001;
FIG. 17 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation, and precision Tol = 0.0001;
FIG. 18 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation = relu, and precision Tol;
FIG. 19 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 1, activation function Activation = tanh, and precision Tol;
FIG. 20 is a comparison graph of the average classification F1 values of the MLP-IP classifier of the present invention with hidden layer number 2 and different other parameters;
FIG. 21 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation = identity, and precision Tol = 0.001;
FIG. 22 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation, and precision Tol = 0.001;
FIG. 23 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation = relu, and precision Tol;
FIG. 24 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation = tanh, and precision Tol = 0.001;
FIG. 25 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation = identity, and precision Tol = 0.0001;
FIG. 26 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation, and precision Tol = 0.0001;
FIG. 27 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation = relu, and precision Tol;
FIG. 28 shows the F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2, activation function Activation = tanh, and precision Tol = 0.0001;
FIG. 29 is a comparison graph of the average classification F1 values of the MLP-IP classifier of the present invention with hidden layer number Level = 2 and different other parameters;
FIG. 30 is a graph comparing the number of reliable street level landmarks obtained by the method of the present invention, the Structon method and the Online Maps method;
FIG. 31 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Beijing obtained by the method of the present invention and the Structon method, respectively;
FIG. 32 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Shanghai obtained by the method of the present invention and the Structon method, respectively;
FIG. 33 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Guangzhou obtained by the method of the present invention and the Structon method, respectively;
FIG. 34 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Shenzhen obtained by the method of the present invention and the Structon method, respectively;
FIG. 35 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Hong Kong obtained by the method of the present invention and the Structon method, respectively;
FIG. 36 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Wuhan obtained by the method of the present invention and the Structon method, respectively;
FIG. 37 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Zhengzhou obtained by the method of the present invention and the Structon method, respectively;
FIG. 38 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks obtained by the method of the present invention and the Structon method, respectively;
FIG. 39 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Canberra obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 40 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Johannesburg obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 41 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Laves obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 42 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in London obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 43 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Los Angeles obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 44 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Mexico City obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 45 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in New York obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 46 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Ottawa obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 47 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Seoul obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively;
FIG. 48 is a graph of positioning error versus cumulative probability when 100 accurate street-level landmarks are located at street level using all the reliable landmarks in Tokyo obtained by the method of the present invention, the Structon method, and the Online Maps method, respectively.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, the method for obtaining street-level landmarks based on service identification and domain name association according to the present invention includes the following steps:
Step 1: acquiring a plurality of IPs, including IPs of known service types and a plurality of IPs of unknown service types;
Step 2: scanning the ports of all the IPs with a port scanning tool to obtain the open ports of each IP;
Step 3: extracting training features for classification from the scan results of the IPs of known service types: reducing the open ports of the IPs of known service types with a feature reduction algorithm to obtain a minimum feature set, which is used as the training features;
Step 4: training an IP classifier with the training features obtained in step 3, and classifying the IPs of unknown service types with the trained IP classifier to obtain server IPs;
Step 5: acquiring the domain names corresponding to the server IPs: resolving each server IP obtained in step 4 under each DNS to obtain the domain name information corresponding to that server IP; if one server IP resolves to multiple domain names, establishing a mapping between the IP and each domain name;
The IPs of unknown service type include host IPs, server IPs and other types. Server IPs are stable and change little, so after classification by the IP classifier in step 4 only the domain names of server IPs are resolved, which reduces the amount of DNS resolution performed in step 5.
Step 6: meanwhile, obtaining, based on a voting strategy, the city to which each IP of unknown service type belongs, and constructing an organization information base of the city based on domain names and organization names; according to the characteristics of the domain names of the server IPs obtained in step 5, obtaining the organization information corresponding to the domain name of each server IP by one or more of an online map, an organization record base and an organization information base matching method, thereby mapping the server IP to the geographic location of the organization and obtaining street-level candidate landmarks;
Step 7: evaluating the street-level candidate landmarks obtained in step 6 with a street-level landmark evaluation method to obtain reliable street-level landmarks.
The street-level landmark evaluation method is prior art; see, for example, Chinese patent application No. 201811338745.3, which describes a method and apparatus for evaluating the reliability of Web landmarks based on multi-layer decision-making.
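The voting strategy used in step 6 to decide the city of an IP is not detailed in this section. One plausible reading, assumed here purely for illustration, is a majority vote over the city answers returned by several IP location databases; the database names and answers below are made up.

```python
from collections import Counter

def vote_city(answers):
    """Return the city named by most databases, or None if none answered.

    answers maps a database name to the city it reports (None if unknown).
    Ties go to the city that appears first, which is an arbitrary choice.
    """
    cities = [city for city in answers.values() if city]
    if not cities:
        return None
    return Counter(cities).most_common(1)[0][0]

# Hypothetical answers for one IP from three location databases.
answers = {"db_a": "Zhengzhou", "db_b": "Zhengzhou", "db_c": "Beijing"}
city = vote_city(answers)
print(city)
```

City-level answers are the granularity that location databases can usually provide, so a vote of this kind only selects which city's organization information base to use; it does not by itself produce a street-level position.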
Specifically, because the features of IPs carrying the same service differ (for example, an IP carrying a Web service may open port 80 or port 443), existing feature reduction algorithms cannot be applied directly. In this embodiment, step 3 therefore comprises the following steps:
3.1: setting m types of service types provided by the IP of the known service type, setting SE (IP) to represent a set of IP structures of the known service types providing the same type of service, and sequentially representing the set of the IP structures of the m known service types providing the same type of service as SE1(IP) and SE2(IP) … SEq (IP) … SEm (IP); q is more than or equal to 1 and less than or equal to m;
setting feature (IP) as a feature set of a single IP in SE (IP); sorting all features (IP) according to the order of the number of elements in the features (IP) from small to large, and sorting the elements in SE (IP); the sorted Feature sets are respectively marked as Feature (IP1), Feature (IP2), … Feature (IPi) … Feature (IPj) … Feature (IPn); elements in se (IP) are correspondingly denoted as IP1, IP2, … IPi … IPj … IPn; i is more than or equal to 1 and less than or equal to j and less than or equal to n, wherein n is the number of the IPs of the known service types in the SE (IP);
the reduction algorithm for SE (IP) is as follows:
Figure BDA0002312487280000091
if it satisfies
Figure BDA0002312487280000092
then IPj is deleted from SE(IP);
until when
Figure BDA0002312487280000093
All satisfy
Figure BDA0002312487280000094
The reduced feature set FeatureSet of SE(IP) is the union of the feature sets Feature(IP) of all the IPs remaining in SE(IP);
3.2: SE1(IP), SE2(IP) … SEq(IP) … SEm(IP) are reduced respectively to obtain the reduced feature sets FeatureSet1, FeatureSet2 … FeatureSetq … FeatureSetm;
3.3: the minimum feature set is the union of the feature sets FeatureSet1, FeatureSet2 … FeatureSetq … FeatureSetm.
For example: SE1(IP) comprises three IPs, IP1, IP2 and IP3; Feature(IP1) includes open port 1 and open port 2; Feature(IP2) includes open port 1, open port 2 and open port 4; Feature(IP3) includes open port 1 and open port 3; then, in SE1(IP),
Figure BDA0002312487280000095
so Feature(IP2) is deleted; the union of Feature(IP1) and Feature(IP3) is open port 1, open port 2 and open port 3; the reduced feature set FeatureSet1 of SE1(IP) is therefore open port 1, open port 2 and open port 3.
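The reduction in steps 3.1 to 3.3 can be sketched as follows. The deletion condition itself is given only in the figures, so this sketch assumes, based on the worked example, that an IP is deleted when its feature set contains the feature set of an earlier (smaller) IP that is still kept. The function names and the treatment of empty feature sets are illustrative assumptions.

```python
from typing import Dict, List, Set


def reduce_service_set(features: Dict[str, Set[int]]) -> Set[int]:
    """Reduce one SE(IP) and return the union of the remaining feature sets."""
    # Step 3.1: sort the feature sets by number of elements, smallest first.
    ordered = sorted(features.values(), key=len)
    kept: List[Set[int]] = []
    for feats in ordered:
        if not feats:
            continue  # assumption: IPs with no open ports carry no information
        # Assumed condition: delete IPj if some kept Feature(IPi) is a subset of Feature(IPj).
        if any(smaller <= feats for smaller in kept):
            continue
        kept.append(feats)
    return set().union(*kept)


def minimum_feature_set(service_sets: Dict[str, Dict[str, Set[int]]]) -> Set[int]:
    """Steps 3.2 and 3.3: reduce every service type and take the union."""
    result: Set[int] = set()
    for features in service_sets.values():
        result |= reduce_service_set(features)
    return result


# The worked example: IP2 is dropped because Feature(IP1) is contained in Feature(IP2).
se1 = {"IP1": {1, 2}, "IP2": {1, 2, 4}, "IP3": {1, 3}}
feature_set_1 = reduce_service_set(se1)
```

Comparing each set only against the sets already kept is sufficient here because set containment is transitive: anything containing a deleted set also contains the kept set that caused the deletion.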
The effectiveness of the reduction algorithm is demonstrated below.
280 DNS IPs, 1000 Email server IPs, 900 Web server IPs and 1200 user host IPs are scanned, and the open ports are extracted from the scan results as features; without feature reduction, the resulting training feature set has size 317.
After feature reduction with the method provided by the invention, the training feature set has size 62. An SVM-IP classifier is trained with the training features before and after reduction respectively. With the penalty factor C set to 2.0, 1.0, 0.5 and 0.2 and the kernel function Kernel set to linear, rbf and sigmoid, the average F1 values of the SVM-IP classifier on 300 IPs (100 each of DNS, Email and Web server IPs) are shown in Table 1; the values in parentheses are the average F1 values obtained when training and classifying with the pre-reduction features.
Table 1. Average F1 values of SVM classification using features after reduction (values before reduction in parentheses)
Kernel   linear               rbf                  sigmoid
C=2.0    0.893831 (0.887214)  0.900894 (0.902594)  0.888605 (0.55851)
C=1.0    0.900737 (0.89625)   0.886828 (0.55851)   0.890583 (0.542526)
C=0.5    0.904237 (0.90355)   0.890583 (0.540644)  0.596991 (0.242038)
C=0.2    0.906152 (0.905421)  0.556114 (0.219239)  0.492444 (0.203704)
As can be seen from Table 1, the average F1 values obtained when the reduced training features are used for classifier training and classification are mostly larger than the average F1 values obtained using the pre-reduction features (the only exception is the rbf kernel at C=2.0), which illustrates the effectiveness of the feature reduction algorithm proposed by the present invention.
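The comparison in Table 1 can be tallied directly. The short sketch below copies the twelve value pairs from the table and counts the parameter settings in which the reduced features give the higher average F1; only the variable names are introduced here.

```python
# (C, kernel) -> (average F1 after reduction, average F1 before reduction), from Table 1.
table1 = {
    (2.0, "linear"): (0.893831, 0.887214),
    (2.0, "rbf"): (0.900894, 0.902594),
    (2.0, "sigmoid"): (0.888605, 0.55851),
    (1.0, "linear"): (0.900737, 0.89625),
    (1.0, "rbf"): (0.886828, 0.55851),
    (1.0, "sigmoid"): (0.890583, 0.542526),
    (0.5, "linear"): (0.904237, 0.90355),
    (0.5, "rbf"): (0.890583, 0.540644),
    (0.5, "sigmoid"): (0.596991, 0.242038),
    (0.2, "linear"): (0.906152, 0.905421),
    (0.2, "rbf"): (0.556114, 0.219239),
    (0.2, "sigmoid"): (0.492444, 0.203704),
}

# Settings where the reduced feature set gives the higher average F1.
improved = [key for key, (after, before) in table1.items() if after > before]
not_improved = [key for key in table1 if key not in improved]
```

Here `improved` holds 11 of the 12 settings and `not_improved` holds only `(2.0, "rbf")`.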
In this embodiment, constructing the organization information base of the city includes the following steps:
a POI library and an organization directory of the target city are obtained from public data sets, the organization names and organization categories are obtained, and domain name keywords are extracted from the organization names; for non-English organization names, the Chinese keywords in the organization names are converted into letter combinations, and the letter combinations are taken as domain name keywords, so that the domain name keywords of an organization are estimated from its organization name; the domain name keywords are then associated with the organization names to construct the organization information base.
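A minimal sketch of building such an information base is given below. The text does not specify how keywords are extracted or how Chinese keywords are converted to letters, so the tiny romanization table `ROMANIZED`, the stop-word list and the choice of candidate keywords (each word, the concatenation, and the initials) are all illustrative assumptions.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical romanization table standing in for a full Chinese-to-letter
# converter, which the text does not specify.
ROMANIZED = {"郑州": "zhengzhou", "大学": "daxue"}
STOPWORDS = {"of", "the", "and"}  # assumed list of words to ignore


def romanize(name: str) -> List[str]:
    """Greedily convert the table's Chinese keywords found in `name` to letters."""
    parts, i = [], 0
    while i < len(name):
        for key, letters in ROMANIZED.items():
            if name.startswith(key, i):
                parts.append(letters)
                i += len(key)
                break
        else:
            i += 1  # skip characters the table does not cover
    return parts


def domain_keywords(name: str) -> List[str]:
    """Estimate domain name keywords: each word, their concatenation, and the initials."""
    if name.isascii():
        words = [w for w in re.findall(r"[a-z]+", name.lower()) if w not in STOPWORDS]
    else:
        words = romanize(name)
    candidates = set(words)
    if words:
        candidates.add("".join(words))
        candidates.add("".join(w[0] for w in words))
    return sorted(k for k in candidates if len(k) > 1)


def build_info_base(orgs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each estimated domain name keyword to the organization names that produce it."""
    base: Dict[str, List[str]] = defaultdict(list)
    for name, _category in orgs:
        for keyword in domain_keywords(name):
            base[keyword].append(name)
    return dict(base)


info_base = build_info_base([("Zhengzhou University", "education"),
                             ("郑州大学", "education")])
```

Keeping a list of organization names per keyword reflects that one keyword may correspond to several organizations.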
The organization information base matching method comprises the following steps:
a subdomain name field that implies organization information is extracted from the domain name of the server IP as the information field, the information field is matched against the domain name keywords in the organization information base, and a mapping is established between the domain name of the server IP and the domain name keywords. If the information field matches multiple organization names, multiple street-level candidate landmarks are obtained.
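The matching step can be sketched as a dictionary lookup. The exact-match rule, the function names, the sample organizations, the coordinates and the documentation-range IP address below are hypothetical; the text states only that the information field is matched against the domain name keywords and that several matches yield several candidate landmarks.

```python
from typing import Dict, List, Tuple

Location = Tuple[float, float]  # (latitude, longitude)


def match_information_field(field: str, info_base: Dict[str, List[str]]) -> List[str]:
    """Return every organization whose domain name keyword equals the information field."""
    return list(info_base.get(field.lower(), []))


def candidate_landmarks(server_ip: str, field: str,
                        info_base: Dict[str, List[str]],
                        locations: Dict[str, Location]) -> List[Tuple[str, str, Location]]:
    """Pair the server IP with the location of every matched organization."""
    return [(server_ip, org, locations[org])
            for org in match_information_field(field, info_base)
            if org in locations]


# Hypothetical information base and organization coordinates.
info_base = {"abcbank": ["ABC Bank Branch 1", "ABC Bank Branch 2"]}
locations = {"ABC Bank Branch 1": (34.75, 113.62), "ABC Bank Branch 2": (34.80, 113.66)}
candidates = candidate_landmarks("203.0.113.10", "ABCBank", info_base, locations)
```

Both branches are returned as candidates; deciding which candidate is reliable is left to the landmark evaluation step.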
Obtaining the organization information corresponding to the domain name of a server IP by the online map matching method and the organization record base matching method is prior art and is not described here.
ICANN defines top-level domain names representing countries (country top-level domain names usually consist of two English letters), and also defines top-level category domain names such as .top, .com, .edu, .gov and .org; the second-level domain names below a top-level domain name are also generally divided by category, such as the education and scientific research second-level domain names .edu and .ca, and .com; in order to quickly obtain the organization information field in a domain name, domain names need to be classified;
the invention divides domain names into three categories: category 1 covers the top-level domain names .top, .com, .edu, .gov, .org, etc.; category 2 covers the second-level domain names .com, .edu, .ca, .gov, .org, etc.; category 3 covers other domain names;
according to the definition of ICANN, .top represents a business (individuals may also register), .com represents a business, .edu represents an educational institution, .gov represents a government institution, and .org represents a non-profit organization; the second-level category domain names under a country domain name generally have the same meaning as in ICANN, i.e., under a country domain name, .com denotes a business entity, .edu denotes an educational institution (some countries also use .ca for scientific research and education institutions), .org denotes a non-profit organization, and .gov denotes a government department.
During domain name registration, the subdomain name fields under these domain names are chosen by the registrant; they usually correlate with characteristics such as the organization name, organization function and working characteristics, and thus contain organization information, for example, the domain name of Harvard University is harvard. Therefore, a subdomain name field that implies organization information is taken as the information field.
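The three-way classification and the extraction of the information field can be sketched as below. The label sets are copied from the text; treating any two-letter final label as a country domain name, and taking the label immediately to the left of the category suffix as the information field, are assumptions of this sketch, as are the function names and sample domains.

```python
from typing import Optional

TOP_LEVEL_CATEGORIES = {"top", "com", "edu", "gov", "org"}
SECOND_LEVEL_CATEGORIES = {"com", "edu", "ca", "gov", "org"}  # as listed in the text


def classify_domain(domain: str) -> int:
    """Return 1 (category top-level), 2 (category second-level under a country), or 3 (other)."""
    labels = domain.lower().rstrip(".").split(".")
    if labels[-1] in TOP_LEVEL_CATEGORIES:
        return 1
    # Assumption: a two-letter final label is a country top-level domain name.
    if len(labels) >= 2 and len(labels[-1]) == 2 and labels[-2] in SECOND_LEVEL_CATEGORIES:
        return 2
    return 3


def information_field(domain: str) -> Optional[str]:
    """Return the label just left of the category suffix, or None if there is none."""
    labels = domain.lower().rstrip(".").split(".")
    suffix_len = 2 if classify_domain(domain) == 2 else 1
    if len(labels) <= suffix_len:
        return None
    return labels[-suffix_len - 1]


# Hypothetical domain names used only for illustration.
fields = {d: (classify_domain(d), information_field(d))
          for d in ("www.example.edu", "mail.example.com.cn", "example.net")}
```

Classifying first means the suffix length is known before the field is read, which is why the text calls classification a way to obtain the field quickly.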
Meanwhile, statistics on 1000 domain names show that the information field of more than 96% of the domain names does not exceed 10 letters. For English-speaking countries, the keywords in organization names are already letter combinations; for non-English-speaking countries, the organization name keywords obtained are often not in English and need to be converted into English-like letter combinations. When a translation tool is used directly to convert a keyword into an English letter combination, the result often exceeds 10 characters, so the converted letter combination needs to undergo deformation processing.
The relationship between domain name information fields and organization characteristics is counted below in order to verify two design choices of the invention: extracting domain name keywords from organization names when constructing the organization information base of a city, and matching the information field against the domain name keywords in the organization information base matching method. This demonstrates that the method of associating server IPs with organizations is reasonable.
The relationship between the information fields of 1000 domain names and organization characteristics is counted, and the result is shown in Table 2.
Table 2. Statistics on the relationship between domain name information fields and organization characteristics
Organization characteristic   Organization name   Organization function   Working characteristic   Other
Number of domain names        955                 24                      15                       6
Proportion                    0.955               0.024                   0.015                    0.006
As can be seen from Table 2, among the 1000 domain names in the experiment, the information field is related to the organization name for more than 95% of the domain names.
Meanwhile, an organization information base corresponding to 3000 domain names is constructed from organization names using this method; the number of domain names whose information field is successfully matched using the organization information base is 2791, a success rate of more than 93%. Therefore, it is reasonable to construct the organization information base from domain name keywords extracted from organization names.
To verify the effectiveness of obtaining reliable street-level landmarks through the present invention, a classifier selection experiment and a street-level landmark obtaining experiment are developed below, and the experimental results are analyzed.
A. Classifier selection experiment
In order to obtain a better classifier for classifying unknown-service-type IPs, SVM, KNN (K-nearest neighbor) and MLP (Multi-Layer Perceptron) classifiers are trained under multiple parameter settings to obtain trained IP classifiers, and the harmonic mean F1 is used to judge the quality of each IP classifier on the experimental data set. The parameter settings for training each classifier are as follows:
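For reference, F1 is the harmonic mean of precision and recall, which simplifies to 2TP / (2TP + FP + FN) per class. The sketch below computes per-class F1 and an average; taking the "average F1" as the unweighted mean over classes is an assumption, since the text does not say how the average is formed, and the labels and predictions are made up.

```python
from typing import Dict, List


def f1_per_class(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Compute F1 = 2TP / (2TP + FP + FN) for every class present in the data."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def average_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 values (an assumed definition)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical ground truth and predictions for six IPs.
y_true = ["DNS", "DNS", "Email", "Email", "Web", "Web"]
y_pred = ["DNS", "DNS", "Email", "Web", "Web", "Web"]
per_class = f1_per_class(y_true, y_pred)
mean_f1 = average_f1(y_true, y_pred)
```

In this made-up example one Email IP is misclassified as Web, which lowers the F1 of both classes while leaving DNS at 1.0.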
SVM-IP classifier: the penalty factor C, the kernel function Kernel and the precision Tol are adjusted. C ranges from 0.1 to 10 with a step of 0.1; Kernel is set to linear, rbf and sigmoid; Tol is set to 0.001 and 0.0001.
KNN-IP classifier: the number of neighbor nodes Neighbor and the neighbor weight calculation mode Weights are adjusted. Neighbor ranges from 4 to 22 with a step of 1; Weights is set to uniform and distance.
MLP-IP classifier: the number of hidden layers Level, the number of nodes per hidden layer Node, the activation function Activation and the precision Tol are adjusted. Level is set to 1 and 2; Node ranges from 10 to 100 with a step of 5; Activation is set to identity, logistic, tanh and relu; Tol is set to 0.001 and 0.0001.
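The three parameter sweeps can be enumerated as plain grids, as in the sketch below. The dictionary keys mirror scikit-learn parameter names (`C`, `kernel`, `tol`, `n_neighbors`, `weights`, `hidden_layer_sizes`, `activation`), which is an assumption about the library used; the activation spelled `logistic` follows that library's naming. Letting the two hidden layers vary their node counts independently is also an assumption.

```python
from itertools import product

TOLERANCES = (0.001, 0.0001)

# SVM: C from 0.1 to 10 in steps of 0.1, three kernels, two tolerances.
svm_grid = [{"C": round(0.1 * k, 1), "kernel": kernel, "tol": tol}
            for k, kernel, tol in product(range(1, 101),
                                          ("linear", "rbf", "sigmoid"),
                                          TOLERANCES)]

# KNN: 4 to 22 neighbors, two weighting modes.
knn_grid = [{"n_neighbors": n, "weights": w}
            for n, w in product(range(4, 23), ("uniform", "distance"))]

# MLP: one or two hidden layers, 10 to 100 nodes per layer in steps of 5.
node_counts = range(10, 101, 5)
mlp_grid = [{"hidden_layer_sizes": sizes, "activation": activation, "tol": tol}
            for level in (1, 2)
            for sizes in product(node_counts, repeat=level)
            for activation, tol in product(("identity", "logistic", "tanh", "relu"),
                                           TOLERANCES)]

grid_sizes = {"SVM": len(svm_grid), "KNN": len(knn_grid), "MLP": len(mlp_grid)}
```

Under these assumptions the sweeps contain 600 SVM settings, 38 KNN settings and 3040 MLP settings.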
IP classifier training data set: 1000 DNS server IPs (500 belonging to China and 500 to other countries), and 6000 IPs each of Email servers, Web servers and user hosts (in each case 2000 belonging to China and 4000 to other countries).
IP classifier validation data set: 100 IPs each of DNS servers, Email servers, Web servers and user hosts (in each case 30 belonging to China and 70 to other countries).
First, the Nmap detection tool is used to perform port detection on the IPs in the training data set, the open ports of each IP are obtained from the detection results, and the open ports are used as classification features. The classification features of each type of IP are reduced using the feature reduction algorithm, and the union of the reduced feature sets of all types of IP is taken as the feature set for training the classification models.
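One plausible way to turn open ports into classifier input is a binary vector over the reduced feature set, sketched below. The text does not state the encoding, so the 0/1 representation, the function name and the sample ports are assumptions.

```python
from typing import Iterable, List, Set


def to_feature_vector(open_ports: Set[int], feature_set: Iterable[int]) -> List[int]:
    """Encode an IP as 1/0 flags, one per port in the (sorted) reduced feature set."""
    return [1 if port in open_ports else 0 for port in sorted(feature_set)]


# Hypothetical reduced feature set and scan result for one IP.
feature_set = {25, 53, 80, 443}
vector = to_feature_vector({80, 443, 8080}, feature_set)
```

Port 8080 is open on this hypothetical IP but is not in the feature set, so it is ignored; sorting the feature set keeps the column order identical for every IP.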
The F1 values of the SVM-IP classifier under different parameters are shown in Figs. 2-7.
A comparison of the average classification F1 values of the SVM-IP classifier under different parameters is shown in Fig. 8.
As can be seen from Figs. 2-7, on the experimental data set of this example, the SVM-IP classifier achieves its highest classification F1 value for DNS-type IPs and its lowest for Email-type IPs.
Meanwhile, adjusting the precision Tol has no influence on the classifier, whereas adjusting the penalty factor C has a large influence on classifiers with nonlinear kernel functions, indicating that nonlinear kernel functions have little error tolerance on the experimental data set. The average classification F1 value obtained by the linear-kernel classifier is only slightly influenced by the penalty factor C, indicating that the data set is largely linearly separable.
As can be seen from Fig. 8, on the given data set, when C is small the linear-kernel classifier outperforms the classifiers with other kernel functions, and when C is greater than 3 the influence of the kernel function and of C on classification becomes small.
The F1 values of the KNN-IP classifier under different parameters are shown in Figs. 9-10.
A comparison of the average classification F1 values of the KNN-IP classifier under different parameters is shown in Fig. 11.
As can be seen from Figs. 9 and 10, on the experimental data set of this example, the KNN-IP classifier achieves its highest classification F1 value for DNS-type IPs and its lowest for Email-type IPs. Meanwhile, as the number of neighbor nodes increases, the classification F1 values of the KNN-IP classifier for DNS-type and Web-type IPs decrease slightly overall, while the classification F1 value for Email-type IPs increases slightly overall.
As can be seen from Fig. 11, the average F1 value is little affected by the number of neighbor nodes, and the average classification F1 value of the KNN-IP classifier is higher when weights are calculated by distance than when uniform weights are used. This is because the features of IPs of the same type are highly similar and therefore close together in the multidimensional feature space, which shows that the distance-weighted KNN-IP classifier classifies better on the experimental data set of this embodiment.
When the number of hidden layers is 1, the F1 values of the MLP-IP classifier under each parameter setting are shown in Figs. 12-19, and the average F1 values of the MLP-IP classifier are shown in Fig. 20.
When the number of hidden layers is 2, the F1 values of the MLP-IP classifier under each parameter setting are shown in Figs. 21-28, and the average F1 values of the MLP-IP classifier are shown in Fig. 29.
As can be seen from Figs. 12-29, on the experimental data set of this example, the MLP-IP classifier achieves its highest classification F1 value for DNS-type IPs and its lowest for Email-type IPs. Meanwhile, adjusting the activation function and precision of the MLP-IP classifier, and changing the number of hidden layers and the number of nodes in the hidden layers, have little influence on the classification F1 value.
The maximum average F1 value of each classifier on the data set of this example and the corresponding parameter settings are shown in Table 3.
Table 3. Parameter settings at the maximum average F1 value of each classifier
Figure BDA0002312487280000141
According to Table 3, the MLP-IP classifier obtains the maximum average F1 value when the activation function is relu, the precision is 0.0001, and the numbers of hidden-layer nodes are 25 and 20, respectively.
Therefore, this classifier is used for classification in the subsequent experiments of this embodiment.
B. Street level landmark acquisition experiment
A database query method is used to obtain the IP segments of the following cities: Beijing, Shanghai, Guangzhou, Shenzhen, Hong Kong, Wuhan, Zhengzhou, Chengdu, Canberra, Johannesburg (South Africa), Lagos (Nigeria), London (England), Los Angeles, Mexico City (Mexico), New York, Ottawa (Canada), Seoul and Tokyo (Japan).
Open port detection is performed on the obtained city IPs using the Nmap tool. Based on the port detection results, the trained MLP-IP classifier is used to identify the DNS, Email and Web server IPs among them, and 100 widely distributed DNS servers are used to obtain one or more domain names corresponding to each server IP.
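Collecting the domain names of a server IP from many DNS servers amounts to merging the answers of each resolver, as sketched below. The resolvers are passed in as plain callables so the sketch stays independent of any DNS library; the callable interface, the error handling, the documentation-range IP and the example domain names are all assumptions.

```python
from typing import Callable, Iterable, List, Set

# A resolver takes an IP string and returns the domain names it knows for it.
Resolver = Callable[[str], Iterable[str]]


def collect_domains(server_ip: str, resolvers: List[Resolver]) -> Set[str]:
    """Union of normalized domain names returned for one IP by all resolvers."""
    domains: Set[str] = set()
    for resolve in resolvers:
        try:
            names = resolve(server_ip)
        except OSError:
            continue  # an unreachable or failing DNS server contributes nothing
        domains.update(name.lower().rstrip(".") for name in names)
    return domains


# Hypothetical stand-ins for queries against three different DNS servers.
def resolver_a(ip: str) -> Iterable[str]:
    return ["mail.example.org."]


def resolver_b(ip: str) -> Iterable[str]:
    return ["MAIL.example.org", "www.example.org"]


def resolver_down(ip: str) -> Iterable[str]:
    raise OSError("timeout")


ip_to_domains = {"203.0.113.10": collect_domains(
    "203.0.113.10", [resolver_a, resolver_b, resolver_down])}
```

Each domain name in the resulting set would then be mapped to the IP separately, matching the one-IP-to-many-domains case described in step 5.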
Meanwhile, the organization information base of each city is constructed from public data such as POI databases and enterprise directories. Methods such as online map search, database query and organization information base association are used to associate the domain names of server IPs with organization information and obtain street-level candidate landmarks. Finally, the candidates are evaluated using the landmark evaluation method described in Ruixiang Li, Yuche Sun, Jianwei Hu, Ma Te, Xiangyang Luo, "Street-Level Landmark Evaluation Based on Nearest Routers," Security & Communication Networks, vol. 2018, pp. 1-12, 2018, to obtain reliable street-level landmarks.
Meanwhile, street-level candidate landmarks of these cities are also obtained using the Structon method and the Online Maps method, and the obtained landmarks are evaluated using the street-level landmark evaluation method of the above article. Owing to certain policies of the Chinese government, map services of non-Chinese companies such as Google are restricted in China, and map services of Chinese companies such as Baidu provide almost no domain name information, so the Online Maps method is of limited use in Chinese cities; the proposed method is therefore not compared with the Online Maps method in those cities.
Table 4 lists the number of Web pages crawled in each city using the Structon method.
Table 4. Number of Web pages crawled by the Structon method
City           Number of Web pages   City          Number of Web pages   City        Number of Web pages
Beijing        2648767               Shanghai      2851134               Guangzhou   2932164
Shenzhen       2564635               Hong Kong     1716904               Wuhan       2510708
Zhengzhou      1963874               Chengdu       2512227               Canberra    2164781
Johannesburg   1849305               Lagos         1766507               London      2468066
Los Angeles    3010358               Mexico City   2785581               New York    3096256
Ottawa         2902743               Seoul         2710160               Tokyo       2923927
The numbers of reliable street-level landmarks obtained using the present invention (abbreviated as "deployed" in the figure), the Structon method and the Online Maps method are shown in Fig. 30.
As can be seen from Fig. 30, in areas with well-developed networks, the Online Maps method obtains more street-level landmarks than the Structon method, because map services are more mature in such areas and this experiment could not obtain all the Web pages of these areas.
The method provided by the invention obtains more street-level landmarks than either the Structon method or the Online Maps method, because it can obtain landmarks from Web servers as well as from other types of servers, which increases the number of street-level landmarks obtained.
In each city, street-level positioning is performed on 100 accurate street-level landmarks using all the reliable landmarks obtained by each method, and the relationship between positioning error and cumulative probability is shown in Figs. 31-48.
According to Figs. 31-48, positioning accuracy in the above cities is improved by the method of the present invention compared with the Structon method and the Online Maps method. This is because, in IP positioning, positioning accuracy is positively correlated with the number of reliable landmarks: for the same area, the more reliable landmarks there are, the higher the positioning accuracy. Since the method obtains more reliable landmarks, it improves street-level positioning accuracy in these cities.
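The error-versus-cumulative-probability relationship plotted in such figures is an empirical cumulative distribution of positioning errors. The sketch below shows how one curve could be computed; the error values are invented for illustration and are not the experimental data.

```python
from typing import List, Tuple


def cumulative_probability(errors_km: List[float], threshold_km: float) -> float:
    """Fraction of targets whose positioning error is at most the threshold."""
    return sum(e <= threshold_km for e in errors_km) / len(errors_km)


def error_cdf(errors_km: List[float]) -> List[Tuple[float, float]]:
    """Points (error, cumulative probability) of the empirical distribution."""
    ordered = sorted(errors_km)
    return [(e, (rank + 1) / len(ordered)) for rank, e in enumerate(ordered)]


# Hypothetical positioning errors (km) for five targets in one city.
errors = [0.4, 1.2, 2.5, 3.1, 7.8]
within_3km = cumulative_probability(errors, 3.0)
curve = error_cdf(errors)
```

A method whose curve rises faster reaches a given cumulative probability at a smaller error, which is how higher positioning accuracy shows up in this kind of plot.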
The experiments show that, compared with existing Web-based methods for acquiring street-level landmarks, the method provided by the invention can acquire more street-level landmarks and is more applicable to regions with different levels of network development.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A street-level landmark obtaining method based on service identification and domain name association, characterized in that the method comprises the following steps:
step 1: acquiring a plurality of IPs, wherein the IPs comprise a plurality of IPs of known service type and a plurality of IPs of unknown service type;
step 2: using a port scanning tool to scan all the IPs for open ports, to obtain the open-port status of each IP;
step 3: extracting training features for classification from the scan results of the IPs of known service type: reducing the open ports of the IPs of known service type by a feature reduction algorithm to obtain a minimum feature set, wherein the minimum feature set is used as the training features;
step 4: training an IP classifier using the training features obtained in step 3, and classifying the IPs of unknown service type using the trained IP classifier to obtain server IPs;
step 5: acquiring the domain names corresponding to the server IPs: performing domain name resolution on the server IPs obtained in step 4 under each DNS respectively, to obtain the domain name information corresponding to each server IP; if a plurality of domain names are resolved for one server IP, establishing a mapping relation between the IP and each of the domain names respectively;
step 6: obtaining the city to which each IP of unknown service type belongs based on a voting strategy, and constructing an organization information base of the city based on domain names and organization names; according to the characteristics of the domain names of the server IPs obtained in step 5, obtaining the organization information corresponding to the domain name of each server IP using one or more of the online map, organization record base and organization information base matching methods, thereby obtaining the association between server IPs and organization geographical locations and obtaining street-level candidate landmarks;
step 7: evaluating the street-level candidate landmarks obtained in step 6 using a street-level landmark evaluation method, so as to obtain reliable street-level landmarks.
2. The method of claim 1, characterized in that step 3 comprises the following steps:
3.1: letting the IPs of known service type provide m types of service, and letting SE(IP) denote a set formed by the known-service-type IPs that provide the same type of service, the m such sets being denoted in turn SE1(IP), SE2(IP) … SEq(IP) … SEm(IP), where 1 ≤ q ≤ m;
letting Feature(IP) be the feature set of a single IP in SE(IP); sorting all Feature(IP) in ascending order of their number of elements, and sorting the elements of SE(IP) accordingly; the sorted feature sets being denoted Feature(IP1), Feature(IP2), … Feature(IPi) … Feature(IPj) … Feature(IPn), and the corresponding elements of SE(IP) being denoted IP1, IP2, … IPi … IPj … IPn, where 1 ≤ i ≤ j ≤ n and n is the number of known-service-type IPs in SE(IP);
the reduction algorithm for SE (IP) is as follows:
Figure FDA0002312487270000021
if it satisfies
Figure FDA0002312487270000022
then IPj is deleted from SE(IP);
until when
Figure FDA0002312487270000023
All satisfy
Figure FDA0002312487270000024
The reduced feature set FeatureSet of SE(IP) is the union of the feature sets Feature(IP) of all the IPs remaining in SE(IP);
3.2: reducing SE1(IP), SE2(IP) … SEq(IP) … SEm(IP) respectively to obtain the reduced feature sets FeatureSet1, FeatureSet2 … FeatureSetq … FeatureSetm;
3.3: the minimum feature set is the union of the feature sets FeatureSet1, FeatureSet2 … FeatureSetq … FeatureSetm.
3. The method of claim 2, characterized in that, in step 6, constructing the organization information base of the city comprises the following steps:
obtaining a POI library and an organization directory of the target city from public data sets, obtaining the organization names and organization categories, and extracting domain name keywords from the organization names; for non-English organization names, converting the Chinese keywords in the organization names into letter combinations, and taking the letter combinations as domain name keywords; and associating the domain name keywords with the organization names to construct the organization information base.
4. The method of claim 3, characterized in that, in step 6, the organization information base matching method comprises the following steps:
extracting a subdomain name field that implies organization information from the domain name of the server IP as an information field, matching the information field against the domain name keywords in the organization information base, establishing an association between the domain name of the server IP and the organization name, and finally establishing a mapping between the server IP and the organization geographical location to obtain street-level candidate landmarks.
CN201911264591.2A 2019-12-11 2019-12-11 Street-level landmark obtaining method based on service identification and domain name association Active CN111026829B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911264591.2A CN111026829B (en) 2019-12-11 2019-12-11 Street-level landmark obtaining method based on service identification and domain name association

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911264591.2A CN111026829B (en) 2019-12-11 2019-12-11 Street-level landmark obtaining method based on service identification and domain name association

Publications (2)

Publication Number Publication Date
CN111026829A true CN111026829A (en) 2020-04-17
CN111026829B CN111026829B (en) 2022-10-04

Family

ID=70205801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911264591.2A Active CN111026829B (en) 2019-12-11 2019-12-11 Street-level landmark obtaining method based on service identification and domain name association

Country Status (1)

Country Link
CN (1) CN111026829B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113242332A (en) * 2021-05-19 2021-08-10 郑州埃文计算机科技有限公司 Improved method for forming street-level positioning library
CN115987803A (en) * 2022-12-23 2023-04-18 天翼安全科技有限公司 Organization mechanism determination method of autonomous system and related device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104537105A (en) * 2015-01-14 2015-04-22 中国人民解放军信息工程大学 Automatic network physical landmark excavating method based on Web maps
CN104715012A (en) * 2015-01-15 2015-06-17 罗向阳 Network entity city-level landmark mining algorithm based on Internet forum
CN110311991A (en) * 2019-02-20 2019-10-08 罗向阳 Street-level terrestrial reference acquisition methods based on svm classifier model

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113242332A (en) * 2021-05-19 2021-08-10 郑州埃文计算机科技有限公司 Improved method for forming street-level positioning library
CN113242332B (en) * 2021-05-19 2022-10-04 郑州埃文计算机科技有限公司 Improved method for forming street-level positioning library
CN115987803A (en) * 2022-12-23 2023-04-18 天翼安全科技有限公司 Organization mechanism determination method of autonomous system and related device

Also Published As

Publication number Publication date
CN111026829B (en) 2022-10-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200727

Address after: 450001 No. 62 science Avenue, hi tech Zone, Henan, Zhengzhou

Applicant after: Information Engineering University of Strategic Support Force,PLA

Address before: 450001 Information Engineering University, 62 science Avenue, hi tech Zone, Henan, Zhengzhou

Applicant before: Luo Xiangyang

GR01 Patent grant