CN112015910B - Domain name knowledge base generation method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN112015910B
Authority
CN
China
Prior art keywords
domain name
knowledge base
alternative
access
target domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010845644.6A
Other languages
Chinese (zh)
Other versions
CN112015910A (en)
Inventor
张健
石磊
孟宝权
王杰
梁彧
杨满智
蔡琳
田野
傅强
金红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/09 Mapping addresses
    • H04L61/10 Mapping addresses of different types
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a method and a device for generating a domain name knowledge base, computer equipment, and a storage medium. The method comprises: acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name; if it is determined that the target domain name is not stored in the domain name knowledge base, storing the mapping relation between the target domain name and each alternative resolution result in the domain name knowledge base; if it is determined that the domain name knowledge base stores a reference resolution result matching the target domain name, acquiring the part of the alternative resolution results that does not overlap the reference resolution result, and appending that non-overlapping part to the domain name knowledge base. With this scheme, a knowledge base containing multiple domain names and multiple resolution results per domain name can be generated, providing a basis for subsequent statistical analysis of the domain names.

Description

Domain name knowledge base generation method and device, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method and a device for generating a domain name knowledge base, computer equipment and a storage medium.
Background
Domain name resolution refers to the process of converting a domain name into an Internet Protocol (IP) address. This process is typically completed automatically by the browser and a Domain Name System (DNS) server, without the visiting user noticing it. DNS servers are typically built by domain name registrars and basic network operators, and provide authoritative resolution and recursive resolution. In general, resolving the same domain name can yield different resolution results (different resolution results may correspond to different DNS resolution records and different access modes of the access nodes); for example, the resolution results for the same domain name may differ at different times or at different locations.
At present, practitioners focus only on how to improve the speed and efficiency of domain name resolution. They do not perform statistical analysis on domain name resolution results, and therefore cannot effectively keep track of them.
Disclosure of Invention
The embodiment of the invention provides a method, a device, computer equipment and a storage medium for generating a domain name knowledge base, which are used for generating a knowledge base comprising a plurality of domain names and a plurality of domain name resolution results, so as to provide basis for the follow-up statistical analysis of the domain names.
In a first aspect, an embodiment of the present invention provides a method for generating a domain name knowledge base, including:
Acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
If it is determined that the target domain name is not stored in the domain name knowledge base, storing the mapping relation between the target domain name and each alternative resolution result in the domain name knowledge base;
If it is determined that the domain name knowledge base stores a reference resolution result matching the target domain name, acquiring the part of the alternative resolution results that does not overlap the reference resolution result, and appending the non-overlapping part to the domain name knowledge base.
In a second aspect, an embodiment of the present invention further provides a device for generating a domain name knowledge base, including:
An alternative resolution result acquisition module, used for acquiring a target domain name and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
A first alternative resolution result storage module, used for storing the mapping relation between the target domain name and each alternative resolution result in the domain name knowledge base if it is determined that the target domain name is not stored in the domain name knowledge base;
A second alternative resolution result storage module, used for acquiring the part of the alternative resolution results that does not overlap the reference resolution result, and appending the non-overlapping part to the domain name knowledge base, if it is determined that a reference resolution result matching the target domain name is stored in the domain name knowledge base.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements a method for generating a domain name knowledge base according to any one of the embodiments of the present invention when executing the program.
In a fourth aspect, embodiments of the present invention further provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a method for generating a domain name knowledge base according to any of the embodiments of the present invention.
According to the embodiment of the invention, the target domain name is resolved through different resolution nodes to obtain multiple alternative resolution results, and it is then determined whether the domain name knowledge base stores the target domain name. If not, the mapping relation between the target domain name and each alternative resolution result is stored directly in the domain name knowledge base; otherwise, the part of the obtained alternative resolution results that does not overlap the reference resolution result already in the domain name knowledge base is appended to the domain name knowledge base. In this way, a knowledge base containing multiple domain names and multiple resolution results per domain name can be generated, providing a basis for subsequent statistical analysis of the domain names.
Drawings
FIG. 1 is a flow chart of a method for generating a domain name knowledge base according to a first embodiment of the invention;
FIG. 2 is a flowchart of a method for generating a domain name knowledge base according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a method for generating a domain name knowledge base according to a third embodiment of the present invention;
FIG. 4 is a flowchart of a method for generating a domain name knowledge base according to a fourth embodiment of the present invention;
FIG. 5 is a flowchart of a method for generating a domain name knowledge base according to a fourth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a domain name knowledge base generating device in a fifth embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a computer device in a sixth embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not limiting of embodiments of the invention. It should be further noted that, for convenience of description, only some, but not all of the structures related to the embodiments of the present invention are shown in the drawings.
Example 1
FIG. 1 is a flowchart of a method for generating a domain name knowledge base according to a first embodiment of the present invention. This embodiment is applicable to the case of building a domain name knowledge base from multiple domain name resolution results. The method may be performed by a device for generating a domain name knowledge base; the device may be implemented in software and/or hardware and integrated in a computer device that performs the method. Referring to FIG. 1, the method specifically includes the following steps:
Step 110, obtaining a target domain name, and resolving the target domain name through at least one resolving node to obtain at least one alternative resolving result corresponding to the target domain name.
The domain name (Domain Name), also called a network domain, is the name of a computer or group of computers on the Internet. It consists of a series of names separated by dots and is used to locate and identify the computer during data transmission (sometimes it also indicates a geographic location). Because IP addresses are hard to memorize and cannot show the name or nature of the organization behind an address, domain names were designed and are mapped to IP addresses through the Domain Name System (DNS), so that users can access the Internet more conveniently without memorizing machine-readable IP addresses.
In practical applications, domain names adopt a hierarchical structure: the highest level is the root domain, the next level is the top-level domain name, followed by the first-level (primary) domain name, the second-level (secondary) domain name, the third-level domain name, and so on. For example, ".com" and ".cn" are top-level domain names, "abc.com" is a primary domain name, and "www.abc.com" is a secondary domain name.
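The level naming above can be illustrated with a minimal Python sketch. The function name `domain_level` is hypothetical, and the sketch simply counts dot-separated labels; it assumes single-label top-level domains and does not handle multi-label suffixes such as "com.cn".

```python
def domain_level(name: str) -> int:
    """Return 0 for a top-level domain, 1 for a primary domain, 2 for a secondary one."""
    # Drop leading/trailing dots so ".com" and "com" are treated alike.
    labels = [label for label in name.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    # Assumption: the rightmost label alone is the top-level domain.
    return len(labels) - 1


levels = {name: domain_level(name) for name in (".com", "abc.com", "www.abc.com")}
```

Under this counting rule, the three examples from the text map to levels 0, 1, and 2 respectively.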
In the embodiment of the present invention, the target domain name may be any one or more domain names, which is not limited in the embodiment of the present invention. In an optional implementation, a crawler program may crawl active and valid domain name information from a website-ranking site; for example, valid domain names ranked in the top 100, top 200, or more on the ranking site may be crawled as target domain names. Alternatively, a large number of valid, highly active domain names may be obtained from a domain name service enterprise and used as target domain names, which is not limited in the embodiment of the present invention.
The resolution node in the embodiment of the present invention may be an electronic device such as a probe server or a computer deployed at any geographic location, which is not limited in the embodiment of the present invention. For example, the probe server may be a probe server built in Beijing, Tianjin, or New York.
In a specific implementation, after the target domain name is obtained, it may be resolved by the resolution nodes established at each geographic location, so as to obtain multiple alternative resolution results corresponding to the target domain name. For example, the target domain name may be resolved by resolution nodes established in Beijing, Tianjin, and Xinjiang respectively, or by resolution nodes established at other geographic locations, which is not limited in the embodiment of the present invention.
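As a rough sketch of step 110, the following Python models each resolution node as a plain function and gathers what every node returns. The `Resolver` type, the node names, and the stub lambdas are assumptions made for illustration; the patent does not specify how a probe server is queried. The IP addresses are documentation addresses.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a resolution node is a function that takes a domain
# name and returns the list of IP strings that node observed.
Resolver = Callable[[str], List[str]]


def collect_candidates(domain: str, nodes: Dict[str, Resolver]) -> List[Tuple[str, str]]:
    """Query every node and return (node_name, ip) pairs, skipping failed nodes."""
    results: List[Tuple[str, str]] = []
    for node_name, resolve in nodes.items():
        try:
            ips = resolve(domain)
        except OSError:
            continue  # one unreachable node should not abort the whole run
        results.extend((node_name, ip) for ip in ips)
    return results


# Stub nodes standing in for probe servers at different locations.
nodes = {
    "beijing": lambda d: ["192.0.2.10"],
    "tianjin": lambda d: ["192.0.2.10", "192.0.2.11"],
}
candidates = collect_candidates("abc.com", nodes)
```

Keeping the node name next to each result preserves where a result was observed, which matters because the same domain can resolve differently by location.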
Step 120, if it is determined that the target domain name is not stored in the domain name knowledge base, storing the mapping relation between the target domain name and each alternative resolution result in the domain name knowledge base.
The domain name knowledge base may include different resolution results of multiple domain names, for example, multiple resolution results of domain name a, or multiple resolution results of domain name B, etc. It is understood that the domain name repository may be an empty repository at an initial stage, which may not contain any resolution of domain names.
In a specific implementation, after at least one alternative resolution result corresponding to the target domain name is obtained, it may be further determined whether the domain name knowledge base includes the target domain name. If the domain name knowledge base does not store the target domain name, and accordingly contains no resolution result for it, the mapping relation between the target domain name and each of its alternative resolution results may be stored in the domain name knowledge base.
It should be noted that, in the domain name knowledge base, each domain name is mapped to its own alternative resolution results, and the knowledge base may contain multiple domain names together with these mappings. For example, it may include the mapping between domain name A and the alternative resolution results of domain name A, the mapping between domain name B and the alternative resolution results of domain name B, and the mapping between domain name C and the alternative resolution results of domain name C.
Step 130, if it is determined that the domain name knowledge base stores a reference resolution result matching the target domain name, acquiring the part of the alternative resolution results that does not overlap the reference resolution result, and appending that part to the domain name knowledge base.
In a specific implementation, after at least one alternative resolution result corresponding to the target domain name is obtained, if it is determined that the domain name knowledge base includes the target domain name and stores a reference resolution result matching it, the part of the alternative resolution results that does not overlap the reference resolution result may be obtained and stored in the domain name knowledge base.
In an optional implementation of the embodiment of the present invention, each alternative resolution result may be compared with the reference resolution result to determine the part that differs, that is, the part of the alternative resolution results that does not overlap the reference resolution result. That part is then stored in the domain name knowledge base, so that the set of resolution results recorded for the target domain name becomes more complete.
According to the technical scheme of this embodiment, the target domain name is resolved through different resolution nodes to obtain multiple alternative resolution results, and it is then determined whether the domain name knowledge base stores the target domain name. If not, the mapping relation between the target domain name and each alternative resolution result is stored directly in the domain name knowledge base; otherwise, the part of the obtained alternative resolution results that does not overlap the reference resolution result already in the domain name knowledge base is appended to the domain name knowledge base. In this way, a knowledge base containing multiple domain names and multiple resolution results per domain name can be generated, providing a basis for subsequent statistical analysis of the domain names.
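Steps 120 and 130 can be sketched together as follows. This is a minimal illustration, assuming the knowledge base is an in-memory dictionary and each resolution result is a hashable string; the function name `update_knowledge_base` is hypothetical and not taken from the patent.

```python
from typing import Dict, Iterable, Set


def update_knowledge_base(kb: Dict[str, Set[str]], domain: str,
                          candidates: Iterable[str]) -> Set[str]:
    """Store candidates for a domain and return the results that were newly added."""
    candidate_set = set(candidates)
    if domain not in kb:
        # Step 120: unknown domain, store the full mapping.
        kb[domain] = set(candidate_set)
        return candidate_set
    # Step 130: known domain, append only the non-overlapping part.
    new_items = candidate_set - kb[domain]
    kb[domain] |= new_items
    return new_items


kb: Dict[str, Set[str]] = {}
first = update_knowledge_base(kb, "abc.com", ["192.0.2.10", "192.0.2.11"])
second = update_knowledge_base(kb, "abc.com", ["192.0.2.11", "192.0.2.12"])
```

The set difference is what implements "the non-overlapping part": on the second call only `192.0.2.12` is new, so only it is appended.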
Example two
Fig. 2 is a flowchart of a method for generating a domain name knowledge base in the second embodiment of the present invention, where the technical solutions in the present embodiment are further refined, and the technical solutions in the present embodiment may be combined with each alternative solution in the one or more embodiments. As shown in fig. 2, the method for generating the domain name repository may include the following steps:
Step 210, resolving the target domain name through the resolution node to obtain a DNS resolution record and an access mode corresponding to the target domain name as an alternative resolution result.
In a specific implementation, after the target domain name is obtained, the target domain name can be further resolved by the resolution node to obtain a DNS resolution record and an access mode corresponding to the target domain name, and the DNS resolution record and the access mode corresponding to the target domain name are used as alternative resolution results.
In the embodiment of the present invention, each domain name is resolved by the resolving node, and the obtained domain name resolving result may include a DNS resolving record corresponding to the resolved domain name and an access mode of the resolving node. The DNS resolution records and the access modes in the resolution results obtained by resolving the target domain name by different nodes can be different.
Optionally, the DNS resolution record may include at least one of the following for the target domain name: resolution server IP, resolution manufacturer name, access IP, access manufacturer name, domain name access state, or information on the unit (organization) to which the domain name belongs. The resolution server IP is the IP address of the server that resolves the target domain name; the resolution manufacturer name is the name of the manufacturer that resolves the target domain name; the domain name access state indicates whether the domain name is accessible or inaccessible.
Optionally, the access mode of the resolution node may include at least one of the following: single access, multiple access, cloud access, or accelerated access. Single access and multiple access are distinguished according to whether the IP is unique. Accelerated access, that is, access through a content delivery network (Content Delivery Network, CDN), is distinguished from cloud access according to whether the domain name resolution exhibits acceleration characteristics: if acceleration characteristics are present, the access is judged to be accelerated access; if not, it is judged to be cloud access. In the embodiment of the invention, the number of IPs falls in the range [0, 1] for single access, [2, 128] for multiple access, and [2, 2048] for accelerated and cloud access.
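The IP-count ranges above can be turned into a small classifier, sketched below in Python. The text does not say how multiple access is told apart from cloud access when the IP count lies in [2, 128], so the `is_cloud_hosted` flag is purely an assumption of this sketch, as are the function name and the string labels.

```python
def classify_access_mode(ip_count: int,
                         has_acceleration_features: bool = False,
                         is_cloud_hosted: bool = False) -> str:
    """Classify an access mode from the IP count and two feature flags."""
    if not 0 <= ip_count <= 2048:
        raise ValueError("ip_count is outside the ranges given in the text")
    if ip_count <= 1:
        return "single"          # range [0, 1]: the IP is unique
    if has_acceleration_features:
        return "accelerated"     # CDN-style acceleration detected
    if is_cloud_hosted:
        return "cloud"           # assumption: an external cloud signal is available
    if ip_count <= 128:
        return "multiple"        # range [2, 128]
    # Above 128 IPs without acceleration features, only cloud access
    # remains within the stated range [2, 2048].
    return "cloud"


modes = [classify_access_mode(1), classify_access_mode(3),
         classify_access_mode(3, has_acceleration_features=True),
         classify_access_mode(500)]
```

The ordering of the checks is a design choice of this sketch: feature flags take priority over the raw IP count because the stated ranges for multiple, cloud, and accelerated access overlap.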
Step 220, adding the target domain name to the domain name knowledge base, and creating a reference resolution record set and a reference access mode set matched with the target domain name.
In a specific implementation, if it is determined that the domain name knowledge base does not store the target domain name, the target domain name may be added to the domain name knowledge base, and a reference resolution record set and a reference access mode set matched with the target domain name are created. The reference resolution record set is a set of DNS resolution records.
It can be understood that the target domain name is resolved by different resolution nodes in different geographic locations, or the same resolution node resolves the target domain name at different times, and the resolution results obtained are different, that is, the DNS resolution record and the access mode may be different.
After the target domain name is added to the domain name knowledge base, the alternative resolution results corresponding to the target domain name may be clustered to obtain a DNS resolution record set and an access mode set corresponding to the target domain name. In the embodiment of the invention, once the target domain name is added to the domain name knowledge base, the DNS resolution record set corresponding to the target domain name is taken as the reference resolution record set, and the access mode set corresponding to the target domain name is taken as the reference access mode set.
Step 230, performing de-duplication processing on the DNS resolution records included in the alternative resolution results, and adding the de-duplicated DNS resolution records to the reference resolution record set.
In a specific implementation, after the reference resolution record set and the reference access mode set matched with the target domain name are created, the DNS resolution records may be further processed; specifically, repeated DNS resolution records may be filtered out, and the de-duplicated DNS resolution records are added to the reference resolution record set.
Step 240, performing de-duplication processing on the access modes included in the alternative resolution results, and adding the de-duplicated access modes to the reference access mode set.
In a specific implementation, after the reference resolution record set and the reference access mode set matched with the target domain name are created, the access modes may be further processed; specifically, repeated access modes may be filtered out, and the de-duplicated access modes are added to the reference access mode set.
According to the technical scheme of this embodiment, the target domain name is resolved through the resolution node to obtain a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name as alternative resolution results. When it is determined that the target domain name is not stored in the domain name knowledge base, the target domain name is added to the domain name knowledge base, and a reference resolution record set and a reference access mode set matched with the target domain name are created. The DNS resolution records included in the alternative resolution results are de-duplicated and added to the reference resolution record set, and the access modes included in the alternative resolution results are de-duplicated and added to the reference access mode set. In this way, a knowledge base containing multiple domain names and multiple resolution results per domain name can be generated, providing a basis for subsequent statistical analysis of the domain names and allowing the resolution and access history of a domain name to be traced.
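Steps 220 to 240 can be sketched as creating one entry per new domain with two de-duplicated collections. The `DomainEntry` class, the `create_entry` function, and the representation of an alternative resolution result as a `(record, access_mode)` pair of strings are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class DomainEntry:
    """Hypothetical per-domain entry holding the two reference sets."""
    reference_records: List[str] = field(default_factory=list)
    reference_access_modes: List[str] = field(default_factory=list)


def dedupe(items: Iterable[str]) -> List[str]:
    """Drop repeats while keeping first-seen order."""
    return list(dict.fromkeys(items))


def create_entry(kb: Dict[str, DomainEntry], domain: str,
                 candidates: Iterable[Tuple[str, str]]) -> DomainEntry:
    """Step 220: add the domain; steps 230 and 240: fill both de-duplicated sets."""
    pairs = list(candidates)
    entry = DomainEntry(
        reference_records=dedupe(record for record, _ in pairs),
        reference_access_modes=dedupe(mode for _, mode in pairs),
    )
    kb[domain] = entry
    return entry


kb: Dict[str, DomainEntry] = {}
entry = create_entry(kb, "abc.com", [
    ("192.0.2.10", "single"),
    ("192.0.2.10", "single"),
    ("192.0.2.11", "multiple"),
])
```

Ordered lists are used here instead of sets only so that the first-seen order of results survives; a plain `set` would serve equally well if order is irrelevant.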
Example III
FIG. 3 is a flowchart of a method for generating a domain name knowledge base in the third embodiment of the present invention. This embodiment further refines the above technical solutions, and the technical solutions in this embodiment may be combined with the alternatives in one or more of the above embodiments. As shown in FIG. 3, the method for generating the domain name knowledge base may include the following steps:
Step 310, resolving the target domain name through the resolution node to obtain a DNS resolution record and an access mode corresponding to the target domain name as alternative resolution results.
Step 320, obtaining the DNS resolution records included in the alternative resolution results, and adding the DNS resolution records that do not belong to the reference resolution record set to the reference resolution record set.
In a specific implementation, if it is determined that the domain name knowledge base stores the target domain name and a reference resolution result matching it, the DNS resolution records included in the alternative resolution results obtained from the resolution nodes may be collected, and the DNS resolution records not contained in the reference resolution record set may be added to that set.
For example, if the alternative resolution results include DNS resolution record A, DNS resolution record B, and DNS resolution record C, and the reference resolution record set includes DNS resolution record A and DNS resolution record B, then DNS resolution record C is added to the reference resolution record set, thereby obtaining an updated resolution record set.
Step 330, obtaining the access modes included in the alternative resolution results, and adding the access modes that do not belong to the reference access mode set to the reference access mode set.
In a specific implementation, if it is determined that the domain name knowledge base stores the target domain name and a reference resolution result matching it, the access modes included in the alternative resolution results obtained from the resolution nodes may be collected, and the access modes not contained in the reference access mode set may be added to that set.
For example, if the alternative resolution results include access mode A, access mode B, and access mode C, and the reference access mode set includes access mode A and access mode B, then access mode C is added to the reference access mode set, thereby obtaining an updated access mode set.
According to the technical scheme of this embodiment, the target domain name is resolved through the resolution node to obtain a DNS resolution record and an access mode corresponding to the target domain name as alternative resolution results. If a reference resolution result matching the target domain name is stored in the domain name knowledge base, the DNS resolution records included in the alternative resolution results are obtained, and those that do not belong to the reference resolution record set are added to it; and/or the access modes included in the alternative resolution results are obtained, and those that do not belong to the reference access mode set are added to it. In this way, a knowledge base containing multiple domain names and multiple resolution results per domain name can be generated, providing a basis for subsequent statistical analysis of the domain names.
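The A/B/C examples of steps 320 and 330 can be reproduced with a short Python sketch. The `DnsRecord` class with two fields and the `merge_new` helper are hypothetical; a frozen dataclass is used only because multi-field records must be hashable to be held in a set.

```python
from dataclasses import dataclass
from typing import Hashable, Iterable, List, Set


@dataclass(frozen=True)
class DnsRecord:
    """Hypothetical record with two of the fields listed for a DNS resolution record."""
    resolution_server_ip: str
    access_ip: str


def merge_new(reference: Set[Hashable], observed: Iterable[Hashable]) -> List[Hashable]:
    """Add items missing from the reference set; return them in first-seen order."""
    added: List[Hashable] = []
    for item in observed:
        if item not in reference:
            reference.add(item)
            added.append(item)
    return added


record_a = DnsRecord("198.51.100.1", "192.0.2.10")
record_b = DnsRecord("198.51.100.1", "192.0.2.11")
record_c = DnsRecord("198.51.100.2", "192.0.2.12")

reference_records = {record_a, record_b}
added_records = merge_new(reference_records, [record_a, record_b, record_c])

reference_modes = {"single", "multiple"}
added_modes = merge_new(reference_modes, ["single", "accelerated"])
```

Because the same helper serves both sets, the "and/or" in the scheme maps naturally to calling it for records, for access modes, or for both.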
Example IV
FIG. 4 is a flowchart of a method for generating a domain name knowledge base in the fourth embodiment of the present invention. This embodiment further refines the above technical solutions, and the technical solutions in this embodiment may be combined with the alternatives in one or more of the above embodiments. As shown in FIG. 4, the method for generating the domain name knowledge base may include the following steps:
Step 410, resolving the target domain name through the resolution node to obtain a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name as alternative resolution results.
Step 420, splitting the domain name repository into a first domain name repository and a second domain name repository.
The first domain name knowledge base stores mapping relations between domain names and DNS analysis records, and the second domain name knowledge base stores mapping relations between domain names and access modes.
In a specific implementation, after the target domain name is resolved by the resolution nodes to obtain a DNS resolution record and an access mode corresponding to the target domain name as alternative resolution results, the domain name knowledge base may be further split into a first domain name knowledge base containing the mapping between domain names and DNS resolution records, and a second domain name knowledge base containing the mapping between domain names and access modes.
It should be noted that in the embodiment of the present invention, the domain name repository may be split into multiple repositories by other splitting methods, which is not limited in the embodiment of the present invention.
In the technical solution of this embodiment, the target domain name is resolved through the resolution node, and the Domain Name System (DNS) resolution record and access mode corresponding to the target domain name are obtained as alternative resolution results; the domain name knowledge base is then split into a first domain name knowledge base and a second domain name knowledge base. A knowledge base containing a plurality of domain names and a plurality of domain name resolution results can thus be generated, providing a basis for subsequent statistical analysis of the domain names.
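The split in step 420 can be illustrated with a minimal Python sketch. The in-memory layout (a dict mapping each domain name to a `dns_records` set and an `access_modes` set) and the function name `split_knowledge_base` are assumptions made for illustration only; the embodiment does not prescribe a storage format, and DNS resolution records are reduced to plain strings here.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: domain name -> {"dns_records": set, "access_modes": set}
KnowledgeBase = Dict[str, Dict[str, Set[str]]]


def split_knowledge_base(
    kb: KnowledgeBase,
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Split one knowledge base into two mappings.

    The first maps each domain name to its DNS resolution records,
    the second maps each domain name to its access modes.
    """
    first = {domain: set(entry["dns_records"]) for domain, entry in kb.items()}
    second = {domain: set(entry["access_modes"]) for domain, entry in kb.items()}
    return first, second


# Example data (documentation-range IP addresses, not real measurements).
kb = {
    "example.com": {
        "dns_records": {"203.0.113.10", "203.0.113.11"},
        "access_modes": {"multiple access"},
    }
}
first_kb, second_kb = split_knowledge_base(kb)
```

Copying the sets keeps the two resulting knowledge bases independent of the original, so later updates to one do not silently change the other.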
In order to enable those skilled in the art to better understand the method for generating the domain name repository according to the present embodiment, a specific example is described below, where the specific process includes:
1. Active, valid domain names are obtained by using a crawler program to crawl website-ranking sites such as www.alexa.cn; breadth crawling is then performed on these active domain names to collect as many domain names as possible.
2. The domain names are resolved nationwide through resolution nodes built across the country, so as to discover the different resolution and access conditions of each website.
Specifically, the domain names extracted in step 1 may be actively sent, through an interface, to probe servers deployed nationwide, so as to obtain in real time the DNS resolution record and access information corresponding to each domain name, which specifically includes: the domain name, the resolution IP, the resolution server IP address, the actual access IP, the actual domain name access state, the resolution time, and other information.
3. Dictionary table association, access condition clustering, and IP dispersion discrimination are performed on the domain names to form a domain name resolution and access mode list, and the corresponding similarity is calibrated.
Specifically, the domain names from step 2 may be compared and filtered against a known resolution condition dictionary table, so that each matching domain name and resolution IP is associated with the corresponding resolution manufacturer; the domain names from step 2 may then be compared and filtered against a known access mode dictionary table, so that each matching domain name and access IP is associated with the corresponding access mode and access manufacturer. At the same time, the resolution and access relationships are associated with each other, and the corresponding domain name resolution and access relationship dictionary is updated.
4. For the domain name resolution and access mode list, an association relationship between resolution manufacturers and access manufacturers is established by means such as association, record inquiry, whois inquiry, and manual discrimination.
Specifically, the domain names that could not be associated with the existing dictionary tables in step 3 may be screened out and their state supplemented. The main supplementary contents are: the domain name, the resolution server IP corresponding to the domain name, the resolution manufacturer name, the access IP, the access manufacturer name, the domain name access state, the unit to which the domain name belongs, and the like. A corresponding relationship dictionary is thereby created.
Specifically, the domain names that could not be associated with the existing dictionary tables in step 3 may also be screened out so that their access modes can be supplemented. Optionally, the existing access mode dictionary table is first searched for a corresponding access mode; if corresponding access mode information is found, it is associated with the domain name. If not, the access mode is determined from the IP dispersion: relatively low dispersion indicates single access or multiple access, and relatively high dispersion indicates accelerated access or cloud access. Single access is distinguished from multiple access according to whether the IP is unique, and accelerated access is distinguished from cloud access according to whether acceleration features are present in the domain name resolution: if acceleration features are present, the access mode is determined to be accelerated access; if not, it is determined to be cloud access. The result is associated with the corresponding access mode dictionary, so that the original dictionary is updated.
The update mainly applies to the access modes collected previously. Access modes can be distinguished according to IP dispersion: the number of IPs for single access falls in the range [0, 1], for multiple access in the range [2, 128], and for accelerated access and cloud access in the range [2, 2048]. Further, where the multiple-access range overlaps the accelerated or cloud access range, that is, within [2, 128], the IP dispersion is used to decide: if the IP addresses are distributed over a wide area and the same IP has appeared in the history, the access condition is determined to be accelerated access or cloud access. Furthermore, accelerated access and cloud access are distinguished according to whether acceleration features are present in the resolution process, and the result is updated into the corresponding dictionary.
5. Based on the above data, a knowledge base of domain name resolution and access conditions (i.e., the domain name knowledge base) is established, which specifically includes: the name of the domain name owner, the access condition of the domain name, the resolution manufacturer of the domain name, the access mode of the domain name, the IP of the domain name access node, and the country and place where the domain name is accessed. Through statistics and summarization, it can be used to derive basic data urgently needed for industry management, such as the resolution and access condition and the survival condition of each domain name.
That is, a basic knowledge base (the domain name knowledge base) of the final domain name access conditions and domain name resolution conditions is established, which may specifically include: the name of the domain name owner, the access condition of the domain name, the resolution manufacturer of the domain name, the access mode of the domain name, the IP of the domain name access node, the country and place where the domain name is accessed, and the like.
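The access mode decision described in step 4 can be sketched in Python as follows. This is only an illustrative reading of the text, not a definitive implementation: the function name and its parameters are hypothetical, the way dispersion is measured is not specified in the example and is therefore passed in as two booleans, and treating IP counts above 128 as high dispersion is an assumption.

```python
from typing import Optional, Set

SINGLE = "single access"
MULTIPLE = "multiple access"
ACCELERATED = "accelerated access"
CLOUD = "cloud access"


def classify_access_mode(
    access_ips: Set[str],
    widely_distributed: bool,         # hypothetical input: IPs spread over many regions
    seen_in_history: bool,            # hypothetical input: same IP appeared historically
    has_acceleration_features: bool,  # hypothetical input: acceleration seen in resolution
    known_mode: Optional[str] = None, # hit in the existing access mode dictionary table
) -> str:
    """Return the access mode for one domain name."""
    # A dictionary table hit takes precedence over any inference.
    if known_mode is not None:
        return known_mode

    count = len(access_ips)
    # Single access: IP count in [0, 1].
    if count <= 1:
        return SINGLE

    # In the overlapping range [2, 128] the dispersion decides. Counts above
    # 128 lie outside the multiple-access range and are treated as high
    # dispersion here (assumption).
    high_dispersion = count > 128 or (widely_distributed and seen_in_history)
    if not high_dispersion:
        return MULTIPLE

    # High dispersion: acceleration features separate accelerated from cloud access.
    return ACCELERATED if has_acceleration_features else CLOUD
```

For instance, a domain name with two access IPs that are widely distributed and have appeared in the history, and whose resolution shows acceleration features, would be classified as accelerated access under these assumptions.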
In order to enable those skilled in the art to better understand the method for generating the domain name repository according to this embodiment, the following description uses another specific example, where the specific process includes:
1. The source domain name collection operator connects to the website www.alexa.cn (or to other ranking websites) once a day (adjustable according to actual system requirements), sequentially obtains the corresponding domain names from the ranking list, and performs breadth crawling on these domain names to obtain more domain names. It then performs activity detection on the obtained domain names and forms a source domain name list L1, which includes: domain name, access time, and access status.
2. The domain name resolution information acquisition operator first uses the dictionary table association operator to filter the known inaccessible domain names out of all domain names in L1, then sends the remaining domain names to the DNS resolution points built nationwide for nationwide domain name resolution, and adds a current time tag to form a local domain name resolution knowledge base D1.
3. The domain name access condition acquisition operator uses the dictionary table association operator to filter out of D1 the domain names whose access conditions are already known, and then analyzes the access conditions of the remaining domain names. A forest (set) S1 is generated by a clustering operator over the number of domain names and access IPs. A forest S2 is then generated from the nodes with fewer correspondences between access IPs and domain names, and a domain name single-access knowledge base D2 is generated according to the actual accessibility of the domain names. A forest S3 is generated from the nodes with more correspondences between access IPs and domain names; the nodes of S3 with lower dispersion are screened out to generate a forest S4, and a domain name multi-access knowledge base D3 is generated according to the actual accessibility of the domain names. The nodes of S3 with higher dispersion are analyzed for CDN acceleration features to generate a forest S5, and a domain name accelerated-access knowledge base D4 is generated according to the actual accessibility of the domain names. Finally, the contents of S5 and S4 are removed from S3 to generate a forest S6, and a domain name cloud-access knowledge base D5 is generated according to the actual accessibility of the domain names. The specific generation process of this step may be as shown in Fig. 5.
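The derivation of S2 to S6 and of the knowledge bases D2 to D5 in step 3 can be sketched with plain set operations. In this Python sketch the clustering operator is replaced by simple thresholding, which is a simplification and not the method of the example; the `Node` fields, the dispersion score in [0, 1], the threshold value, and the choice to record accessibility as a boolean per domain name are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Node:
    ip_count: int        # number of access IPs observed for the domain name
    dispersion: float    # hypothetical dispersion score in [0, 1]
    cdn_features: bool   # whether CDN acceleration features were observed
    accessible: bool     # actual accessibility of the domain name


def partition(
    nodes: Dict[str, Node], dispersion_threshold: float = 0.5
) -> Dict[str, Dict[str, bool]]:
    """Build D2..D5 (domain name -> accessible) from the node set S1."""
    s1: Set[str] = set(nodes)
    s2 = {d for d in s1 if nodes[d].ip_count <= 1}        # few IP correspondences
    s3 = s1 - s2                                          # many IP correspondences
    s4 = {d for d in s3 if nodes[d].dispersion < dispersion_threshold}
    s5 = {d for d in s3 - s4 if nodes[d].cdn_features}    # high dispersion with CDN features
    s6 = s3 - s4 - s5                                     # what remains of S3

    def to_kb(domains: Set[str]) -> Dict[str, bool]:
        return {d: nodes[d].accessible for d in sorted(domains)}

    return {"D2": to_kb(s2), "D3": to_kb(s4), "D4": to_kb(s5), "D5": to_kb(s6)}


# Example data (hypothetical domain names and scores).
nodes = {
    "single.example": Node(1, 0.0, False, True),
    "multi.example": Node(3, 0.1, False, True),
    "cdn.example": Node(40, 0.9, True, True),
    "cloud.example": Node(40, 0.9, False, False),
}
kbs = partition(nodes)
```

Because S4, S5, and S6 are disjoint subsets of S3, every domain name lands in exactly one of D2 to D5.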
In the above examples, the actual access and resolution conditions of a domain name can be located more accurately, and the historical access and resolution are recorded (resolution manufacturer, resolution IP, access mode, access manufacturer, access IP, and access place). The activity characteristics of the domain name can thus be understood in greater depth, more comprehensive support can be provided for related industry management and industry development, and the resolution and access history of the domain name can be traced.
Example five
Fig. 6 is a schematic structural diagram of a domain name knowledge base generating apparatus according to a fifth embodiment of the present invention, which may execute the domain name knowledge base generating method according to the foregoing embodiments. Referring to Fig. 6, the apparatus includes: the alternative resolution result obtaining module 610, the first alternative resolution result storage module 620, and the second alternative resolution result storage module 630.
The alternative resolution result obtaining module 610 is configured to obtain a target domain name, and resolve the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
The first alternative resolution result storage module 620 is configured to store, if it is determined that the domain name knowledge base does not store the target domain name, a mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base;
The second alternative resolution result storage module 630 is configured to, if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, obtain the part of each alternative resolution result that does not overlap the reference resolution result, and additionally store that part in the domain name knowledge base.
In the solution of this embodiment, the alternative resolution result obtaining module acquires the target domain name and resolves it through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name; the first alternative resolution result storage module stores the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base; and the second alternative resolution result storage module obtains the part of each alternative resolution result that does not overlap the reference resolution result and additionally stores that part in the domain name knowledge base. A knowledge base containing a plurality of domain names and a plurality of domain name resolution results can thus be generated, providing a basis for subsequent statistical analysis of the domain names.
Optionally, the alternative resolution result obtaining module 610 is specifically configured to resolve the target domain name through the resolution node, so as to obtain a domain name system DNS resolution record and an access manner corresponding to the target domain name as an alternative resolution result.
Optionally, the first alternative resolution result storage module 620 is specifically configured to: add the target domain name to the domain name knowledge base, and create a reference resolution record set and a reference access mode set matched with the target domain name; perform de-duplication processing on each DNS resolution record included in each alternative resolution result, and add each alternative resolution result after the de-duplication processing to the reference resolution record set; and perform de-duplication processing on each access mode included in each alternative resolution result, and add each access mode after the de-duplication processing to the reference access mode set.
Optionally, the second alternative resolution result storage module 630 is specifically configured to: acquire each DNS resolution record included in each alternative resolution result, and add any DNS resolution record that does not belong to the reference resolution record set to the resolution record set; and/or acquire each access mode included in each alternative resolution result, and add any access mode that does not belong to the reference access mode set to the access mode set.
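The behaviour of the two storage modules can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names `Entry` and `store_candidates` are hypothetical, DNS resolution records and access modes are reduced to plain strings, and the knowledge base is an in-memory dict. With sets, both the first-time store with de-duplication and the incremental store of non-overlapping parts reduce to the same union operation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class Entry:
    dns_records: Set[str] = field(default_factory=set)   # reference resolution record set
    access_modes: Set[str] = field(default_factory=set)  # reference access mode set


# One alternative resolution result from one resolution node:
# (DNS resolution records, access modes)
Candidate = Tuple[Iterable[str], Iterable[str]]


def store_candidates(
    kb: Dict[str, Entry], domain: str, candidates: Iterable[Candidate]
) -> Tuple[Set[str], Set[str]]:
    """Store alternative resolution results; return what was newly added."""
    entry = kb.get(domain)
    if entry is None:
        # Domain name not stored yet: create empty reference sets.
        entry = kb[domain] = Entry()

    new_records: Set[str] = set()
    new_modes: Set[str] = set()
    for records, modes in candidates:
        # Keep only the parts that do not overlap the reference sets;
        # duplicates across candidates collapse because sets are used.
        new_records |= set(records) - entry.dns_records
        new_modes |= set(modes) - entry.access_modes

    entry.dns_records |= new_records
    entry.access_modes |= new_modes
    return new_records, new_modes


# Example data (documentation-range IP addresses, not real measurements).
kb: Dict[str, Entry] = {}
store_candidates(
    kb,
    "example.com",
    [(["203.0.113.10"], ["single access"]), (["203.0.113.10"], ["single access"])],
)
added = store_candidates(
    kb, "example.com", [(["203.0.113.10", "198.51.100.7"], ["multiple access"])]
)
```

The second call adds only the record and access mode that were not already present, which corresponds to the additional storage of the non-overlapping part.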
Optionally, in the embodiment of the present invention, the DNS resolution record includes at least one of the following: resolution server IP, resolution manufacturer name, access IP, access manufacturer name, domain name access state or unit information of domain name corresponding to target domain name.
Optionally, in the embodiment of the present invention, the access manner includes at least one of the following: single access, multiple access, cloud access, or accelerated access.
Optionally, the domain name knowledge base generating apparatus according to the embodiment of the present invention further includes a knowledge base splitting module, configured to split the domain name knowledge base into a first domain name knowledge base and a second domain name knowledge base; the first domain name knowledge base stores the mapping relationships between domain names and DNS resolution records, and the second domain name knowledge base stores the mapping relationships between domain names and access modes.
The domain name knowledge base generating device provided by the embodiment of the invention can execute the domain name knowledge base generating method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the executing method.
Example six
Fig. 7 is a schematic structural diagram of a computer device according to a sixth embodiment of the present invention. As shown in Fig. 7, the computer device includes a processor 70, a memory 71, an input device 72, and an output device 73. The number of processors 70 in the computer device may be one or more, with one processor 70 taken as an example in Fig. 7. The processor 70, the memory 71, the input device 72, and the output device 73 in the computer device may be connected by a bus or by other means, with connection by a bus taken as an example in Fig. 7.
The memory 71 is used as a computer readable storage medium, and may be used to store a software program, a computer executable program, and a module, such as program instructions/modules corresponding to a method for generating a domain name repository in an embodiment of the present invention (for example, the alternative resolution result obtaining module 610, the first alternative resolution result storage module 620, and the second alternative resolution result storage module 630 in the generating device of the domain name repository). The processor 70 executes various functional applications of the computer device and data processing by executing software programs, instructions and modules stored in the memory 71, i.e., implements the domain name knowledge base generation method described above.
The memory 71 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal, etc. In addition, memory 71 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 71 may further include memory remotely located relative to processor 70, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 72 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the computer device. The output means 73 may comprise a display device such as a display screen.
Example seven
A seventh embodiment of the present invention also provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a method of generating a domain name knowledge base, the method comprising:
acquiring a target domain name, and resolving the target domain name through at least one resolving node to obtain at least one alternative resolving result corresponding to the target domain name;
if it is determined that the target domain name is not stored in the domain name knowledge base, storing the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base;
if the domain name knowledge base is determined to store the reference resolution result matched with the target domain name, acquiring the non-overlapping part of the alternative resolution results and the reference resolution result, and additionally storing the non-overlapping part in the domain name knowledge base.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the method operations described above, and may also perform the related operations in the domain name knowledge base generation method provided in any embodiment of the present invention.
From the above description of embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk, or an optical disk of a computer, etc., and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present invention.
It should be noted that, in the embodiment of the domain name knowledge base generating apparatus, each unit and module included are only divided according to the functional logic, but not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (7)

1. A method for generating a domain name knowledge base, characterized by comprising:
acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
if it is determined that the target domain name is not stored in the domain name knowledge base, storing the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base;
if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, acquiring the part of each alternative resolution result that does not overlap the reference resolution result, and additionally storing that part in the domain name knowledge base;
wherein resolving the target domain name through a resolution node to obtain an alternative resolution result corresponding to the target domain name comprises:
resolving the target domain name through the resolution node to obtain a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name as the alternative resolution result;
wherein storing the mapping relationship between the target domain name and each of the alternative resolution results in the domain name knowledge base comprises:
adding the target domain name to the domain name knowledge base, and creating a reference resolution record set and a reference access mode set matched with the target domain name;
performing de-duplication processing on each DNS resolution record included in each alternative resolution result, and adding each alternative resolution result after the de-duplication processing to the reference resolution record set;
performing de-duplication processing on each access mode included in each alternative resolution result, and adding each access mode after the de-duplication processing to the reference access mode set;
wherein acquiring the part of each alternative resolution result that does not overlap the reference resolution result and additionally storing that part in the domain name knowledge base comprises:
respectively acquiring each DNS resolution record included in each alternative resolution result, and adding any DNS resolution record that does not belong to the reference resolution record set to the resolution record set; and/or
respectively acquiring each access mode included in each alternative resolution result, and adding any access mode that does not belong to the reference access mode set to the access mode set.
2. The method of claim 1, wherein the DNS resolution record includes at least one of:
a resolution server IP, a resolution manufacturer name, an access IP, an access manufacturer name, a domain name access state, or unit information of the domain name, corresponding to the target domain name.
3. The method of claim 1, wherein the access manner comprises at least one of:
single access, multiple access, cloud access, or accelerated access.
4. The method as recited in claim 1, further comprising:
splitting the domain name knowledge base into a first domain name knowledge base and a second domain name knowledge base;
the first domain name knowledge base stores mapping relations between domain names and DNS analysis records, and the second domain name knowledge base stores mapping relations between domain names and access modes.
5. A domain name knowledge base generating device, comprising:
an alternative resolution result acquisition module, configured to acquire a target domain name, and resolve the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
a first alternative resolution result storage module, configured to store the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base if it is determined that the target domain name is not stored in the domain name knowledge base;
a second alternative resolution result storage module, configured to, if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, acquire the part of each alternative resolution result that does not overlap the reference resolution result and additionally store that part in the domain name knowledge base;
wherein the alternative resolution result acquisition module is specifically configured to resolve the target domain name through the resolution node to obtain a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name as the alternative resolution result;
the first alternative resolution result storage module is specifically configured to add the target domain name to the domain name knowledge base, and create a reference resolution record set and a reference access mode set matched with the target domain name; perform de-duplication processing on each DNS resolution record included in each alternative resolution result, and add each alternative resolution result after the de-duplication processing to the reference resolution record set; and perform de-duplication processing on each access mode included in each alternative resolution result, and add each access mode after the de-duplication processing to the reference access mode set;
the second alternative resolution result storage module is specifically configured to acquire each DNS resolution record included in each alternative resolution result, and add any DNS resolution record that does not belong to the reference resolution record set to the resolution record set; and/or respectively acquire each access mode included in each alternative resolution result, and add any access mode that does not belong to the reference access mode set to the access mode set.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of generating a domain name repository according to any one of claims 1-4 when the program is executed by the processor.
7. A storage medium containing computer executable instructions which, when executed by a computer processor, are for performing the method of generating a domain name repository according to any of claims 1-4.
CN202010845644.6A 2020-08-20 2020-08-20 Domain name knowledge base generation method and device, computer equipment and storage medium Active CN112015910B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010845644.6A CN112015910B (en) 2020-08-20 2020-08-20 Domain name knowledge base generation method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112015910A CN112015910A (en) 2020-12-01
CN112015910B CN112015910B (en) 2024-05-17

Family

ID=73504390

Country Status (1)

Country Link
CN (1) CN112015910B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112685072B (en) * 2020-12-31 2023-08-01 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for generating communication address knowledge base

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102025793A (en) * 2010-01-22 2011-04-20 中国移动通信集团北京有限公司 Domain name resolution method and system and DNS in IP network
US8832283B1 (en) * 2010-09-16 2014-09-09 Google Inc. Content provided DNS resolution validation and use
CN108156274A (en) * 2017-12-18 2018-06-12 杭州迪普科技股份有限公司 Equipment is made to obtain the method and device of domain name mapping result in a kind of VPN network
CN108574744A (en) * 2017-07-28 2018-09-25 北京金山云网络技术有限公司 A kind of domain name analytic method, device, electronic equipment and readable storage medium storing program for executing
CN108900581A (en) * 2018-06-12 2018-11-27 恒安嘉新(北京)科技股份公司 A kind of method for building up of the key feature knowledge base of large-scale website
CN109165334A (en) * 2018-09-20 2019-01-08 恒安嘉新(北京)科技股份公司 A method of establishing CDN producer primary knowledge base
CN109241292A (en) * 2018-08-13 2019-01-18 恒安嘉新(北京)科技股份公司 A method of name server architectural knowledge map is established based on the passive data of master
CN110677514A (en) * 2019-10-21 2020-01-10 怀来斯达铭数据有限公司 IP filing information management method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160260112A1 (en) * 2015-03-03 2016-09-08 Go Daddy Operating Company, LLC System and method for market research within a social network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
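Title
"A Brief Discussion of the Design of the Dynamic Domain Name Server in a Network Management System (DDNS)"; Geng Liwei (耿立伟); Silicon Valley (《硅谷》); p. 50 *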

Also Published As

Publication number Publication date
CN112015910A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
US10027688B2 (en) Method and system for detecting malicious and/or botnet-related domain names
US10404731B2 (en) Method and device for detecting website attack
CN109241292B (en) Method for establishing domain name server system knowledge graph based on active and passive data
US8504673B2 (en) Traffic like NXDomains
CN102710795B (en) Hotspot collecting method and device
CN103888490A (en) Automatic WEB client man-machine identification method
RU2722693C1 (en) Method and system for detecting the infrastructure of a malicious software or a cybercriminal
CN111104579A (en) Identification method and device for public network assets and storage medium
CN106888280A (en) DNS update methods, apparatus and system
CN110727663A (en) Data cleaning method, device, equipment and medium
CN112769838B (en) Access user filtering method, device, equipment and storage medium
CN106104550A (en) Site information extraction element, system, site information extracting method and site information extraction procedure
CN109165334A (en) Method for establishing a basic knowledge base of CDN vendors
CN114328962A (en) Method for identifying abnormal behavior of web log based on knowledge graph
CN108154024B (en) Data retrieval method and device and electronic equipment
CN111488594A (en) Authority checking method and device based on cloud server, storage medium and terminal
CN114124895A (en) Domain name data processing method, domain name description method, electronic device and storage medium
Sujatha Improved user navigation pattern prediction technique from web log data
CN112015910B (en) Domain name knowledge base generation method and device, computer equipment and storage medium
US10171415B2 (en) Characterization of domain names based on changes of authoritative name servers
CN105069074A (en) Strategy configuration file processing method, device and system
CN111010456A (en) Main domain name acquisition and verification method
CN107704494B (en) User information collection method and system based on application software
CN114765599A (en) Sub-domain name acquisition method and device
US20210173729A1 (en) Systems and methods of application program interface (api) parameter monitoring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant