CN112015910A - Method and device for generating domain name knowledge base, computer equipment and storage medium - Google Patents

Publication number
CN112015910A
Authority
CN
China
Prior art keywords
domain name
knowledge base
resolution
alternative
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010845644.6A
Other languages
Chinese (zh)
Inventor
张健
石磊
孟宝权
王杰
梁彧
杨满智
蔡琳
田野
傅强
金红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/09 Mapping addresses
    • H04L61/10 Mapping addresses of different types
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Abstract

The embodiments of the invention disclose a method and a device for generating a domain name knowledge base, computer equipment, and a storage medium. The method comprises the following steps: acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name; if it is determined that the domain name knowledge base does not store the target domain name, storing the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base; and if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, acquiring the part of each alternative resolution result that does not overlap with the reference resolution result and additionally storing that part in the domain name knowledge base. With the scheme provided by the embodiments of the invention, a knowledge base containing a plurality of domain names and a plurality of domain name resolution results can be generated, providing a basis for subsequent statistical analysis of the domain names.

Description

Method and device for generating domain name knowledge base, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method and a device for generating a domain name knowledge base, computer equipment and a storage medium.
Background
Domain name resolution refers to the process of converting a domain name into an Internet Protocol (IP) address. This process is typically completed automatically by the browser interacting with a Domain Name System (DNS) server, without the visiting user noticing. DNS servers are usually set up by domain name registrars and infrastructure operators, and provide authoritative resolution and recursive resolution. In general, resolving the same domain name can yield different resolution results (different resolution results can correspond to different DNS resolution records and different access modes of the access nodes); for example, the resolution results for the same domain name may differ at different times or at different locations.
At present, practitioners focus only on improving the speed and efficiency of domain name resolution; they do not perform statistical analysis on domain name resolution results and therefore cannot effectively keep track of them.
Disclosure of Invention
The embodiment of the invention provides a method and a device for generating a domain name knowledge base, computer equipment and a storage medium, which are used for generating the knowledge base comprising a plurality of domain names and a plurality of domain name resolution results and providing a basis for the subsequent statistical analysis of the domain names.
In a first aspect, an embodiment of the present invention provides a method for generating a domain name knowledge base, including:
acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
if it is determined that the domain name knowledge base does not store the target domain name, storing the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base;
and if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, acquiring the part of each alternative resolution result that does not overlap with the reference resolution result and additionally storing that part in the domain name knowledge base.
In a second aspect, an embodiment of the present invention further provides a device for generating a domain name knowledge base, including:
the alternative resolution result acquisition module is used for acquiring a target domain name and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
the first alternative resolution result storage module is used for storing the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base if it is determined that the target domain name is not stored in the domain name knowledge base;
and the second alternative resolution result storage module is used for, if it is determined that a reference resolution result matched with the target domain name is stored in the domain name knowledge base, acquiring the part of each alternative resolution result that does not overlap with the reference resolution result and additionally storing that part in the domain name knowledge base.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the method for generating a domain name knowledge base according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the method for generating a domain name knowledge base according to any embodiment of the present invention.
According to the embodiment of the invention, the target domain name is resolved through different resolution nodes to obtain a plurality of alternative resolution results, and it is then determined whether the domain name knowledge base stores the target domain name. If not, the mapping relationship between the target domain name and each alternative resolution result is stored directly in the domain name knowledge base; otherwise, the part of each alternative resolution result of the target domain name that does not overlap with the reference resolution result in the domain name knowledge base is acquired and additionally stored in the domain name knowledge base. In this way, a knowledge base comprising a plurality of domain names and a plurality of domain name resolution results can be generated, providing a basis for subsequent statistical analysis of the domain names.
Drawings
FIG. 1 is a flowchart of a method for generating a domain name knowledge base according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a method for generating a domain name knowledge base according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a method for generating a domain name knowledge base according to a third embodiment of the present invention;
FIG. 4 is a flowchart of a method for generating a domain name knowledge base according to a fourth embodiment of the present invention;
FIG. 5 is a flowchart of a method for generating a domain name knowledge base according to a fourth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a domain name knowledge base generation apparatus in the fifth embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a computer device in a sixth embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
Example one
Fig. 1 is a flowchart of a method for generating a domain name knowledge base in the first embodiment of the present invention. This embodiment is applicable to the case where a domain name knowledge base is built from a plurality of domain name resolution results. The method may be executed by a device for generating a domain name knowledge base; the device may be implemented in software and/or hardware and integrated in the computer device that executes the method. Referring to Fig. 1, the method specifically includes the following steps:
Step 110, acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name.
A domain name, also called a network domain, is the name of a computer or a group of computers on the Internet. It consists of a series of labels separated by dots and is used to identify the computer (and sometimes its geographic location) during data transmission. Because IP addresses are hard to memorize and do not reveal the name or nature of the organization behind the address, domain names were designed and are mapped to IP addresses through the Domain Name System (DNS), so that users can access the Internet more conveniently without having to remember machine-readable IP addresses.
In practical application, domain names adopt a hierarchical structure: the highest level is the root domain name, the next level is the top-level domain name, followed by the first-level domain name, then the second-level domain name, and so on. For example, "." is the root domain name, ".com" and ".cn" are top-level domain names, "abc.com" is a first-level domain name, and "www.abc.com" is a second-level domain name.
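The hierarchy described above can be illustrated with a minimal sketch that counts the dot-separated labels of a name. The function name `classify_domain_level` and its return strings are hypothetical, and the sketch assumes every label counts as one level; it does not account for multi-label suffixes such as "com.cn".

```python
def classify_domain_level(name: str) -> str:
    """Classify a domain name by counting its dot-separated labels.

    Assumption: each label is one level, so "abc.com" is level 1
    and "www.abc.com" is level 2, matching the example in the text.
    """
    labels = [label for label in name.strip().split(".") if label]
    if not labels:
        return "root"        # "." has no labels
    if len(labels) == 1:
        return "top-level"   # e.g. ".com", ".cn"
    return f"level-{len(labels) - 1}"


for example in (".", ".com", "abc.com", "www.abc.com"):
    print(repr(example), "->", classify_domain_level(example))
```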
In this embodiment of the present invention, the target domain name may be any one or more domain names, which is not limited here. In an optional implementation, a crawler program may crawl website-ranking sites for active, valid domain names; for example, the top 100, top 200, or more valid domain names on a ranking site may be crawled as target domain names. Alternatively, a large number of valid, highly active domain names may be obtained from a domain name service enterprise to serve as target domain names, which is not limited in the embodiment of the present invention.
The resolution node in the embodiment of the present invention may be a probe server or another electronic device, such as a computer, located in any geographic location, which is not limited in the embodiment of the present invention. For example, a resolution node may be a probe server established in Beijing, Tianjin, or New York.
In a specific implementation, after the target domain name is obtained, it may be resolved by the resolution nodes established at the various geographic locations, so as to obtain a plurality of alternative resolution results corresponding to the target domain name. For example, the target domain name may be resolved by the resolution nodes established in Beijing, Tianjin, and Xinjiang, respectively, or by all resolution nodes established in the various geographic locations, which is not limited in the embodiment of the present invention.
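As a rough illustration of step 110, the sketch below collects one alternative resolution result per node. Everything here is an assumption made for illustration: the patent does not specify how nodes are called, so each node is modelled as a plain Python callable, and `resolve_via_nodes`, `fake_beijing`, and `fake_tianjin` are hypothetical names returning made-up data from documentation IP ranges.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a node takes a domain name and returns its result.
Resolver = Callable[[str], dict]


def resolve_via_nodes(domain: str, nodes: Dict[str, Resolver]) -> List[dict]:
    """Ask every resolution node to resolve `domain` and collect the results."""
    results = []
    for node_name, resolver in nodes.items():
        try:
            result = dict(resolver(domain))
        except Exception:
            # Assumption: one failing probe should not block the other nodes.
            continue
        result["node"] = node_name
        results.append(result)
    return results


# Stand-in nodes with invented answers (no network access is performed).
def fake_beijing(domain: str) -> dict:
    return {"domain": domain, "resolved_ips": ["192.0.2.10"]}


def fake_tianjin(domain: str) -> dict:
    return {"domain": domain, "resolved_ips": ["198.51.100.7"]}


candidates = resolve_via_nodes(
    "www.abc.com", {"beijing": fake_beijing, "tianjin": fake_tianjin}
)
```

Modelling nodes as callables keeps the sketch independent of any particular transport; a real deployment would replace the stand-ins with calls to the probe servers.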
Step 120, if it is determined that the target domain name is not stored in the domain name knowledge base, storing the mapping relationship between the target domain name and each alternative resolution result in the domain name knowledge base.
The domain name knowledge base may include different resolution results of a plurality of domain names, for example, a plurality of resolution results of domain name A, a plurality of resolution results of domain name B, and so on. It is understood that the domain name knowledge base may be empty at the initial stage, containing no domain name resolution results.
In a specific implementation, after at least one alternative resolution result corresponding to the target domain name is obtained, it can be determined whether the domain name knowledge base contains the target domain name. If the domain name knowledge base does not store the target domain name, and therefore contains no resolution result for it, the mapping relationship between the target domain name and each of its alternative resolution results may be stored in the domain name knowledge base.
It should be noted that, in the domain name knowledge base, each domain name is mapped to its own alternative resolution results, and the knowledge base can contain such mappings for a plurality of domain names; for example, it may include the mapping relationship between domain name A and each alternative resolution result corresponding to domain name A, the mapping relationship between domain name B and each alternative resolution result corresponding to domain name B, the mapping relationship between domain name C and each alternative resolution result corresponding to domain name C, and so on.
Step 130, if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, acquiring the part of each alternative resolution result that does not overlap with the reference resolution result, and additionally storing that part in the domain name knowledge base.
In a specific implementation, after at least one alternative resolution result corresponding to the target domain name is obtained, if it is determined that the domain name knowledge base includes the target domain name and stores a reference resolution result matched with it, the part of each alternative resolution result that does not overlap with the reference resolution result may be obtained and stored in the domain name knowledge base.
In an optional implementation manner of the embodiment of the present invention, each alternative resolution result may be compared with the reference resolution result to determine where they differ, that is, the part of each alternative resolution result that does not overlap with the reference resolution result; this part is then stored in the domain name knowledge base, so that the resolution results recorded for the target domain name in the knowledge base become more complete.
According to the technical scheme of the embodiment, the target domain name is resolved through different resolution nodes to obtain a plurality of alternative resolution results, and it is then determined whether the domain name knowledge base stores the target domain name. If not, the mapping relationship between the target domain name and each alternative resolution result is stored directly in the domain name knowledge base; otherwise, the part of each alternative resolution result of the target domain name that does not overlap with the reference resolution result in the domain name knowledge base is acquired and additionally stored in the domain name knowledge base. In this way, a knowledge base comprising a plurality of domain names and a plurality of domain name resolution results can be generated, providing a basis for subsequent statistical analysis of the domain names.
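The branching between steps 120 and 130 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the knowledge base is modelled as an in-memory dictionary from domain name to a set of hashable results, and the name `update_knowledge_base` is hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set


def update_knowledge_base(
    kb: Dict[str, Set[Hashable]],
    domain: str,
    candidates: Iterable[Hashable],
) -> Set[Hashable]:
    """Store candidate results for `domain` and return what was newly added."""
    candidate_set = set(candidates)
    if domain not in kb:
        # Step 120: unknown domain, store the full mapping.
        kb[domain] = candidate_set
        return set(candidate_set)
    # Step 130: known domain, keep only the part that does not overlap
    # with the stored reference results and append it.
    new_part = candidate_set - kb[domain]
    kb[domain] |= new_part
    return new_part


kb: Dict[str, Set[Hashable]] = {}
first_added = update_knowledge_base(kb, "abc.com", ["r1", "r2"])
second_added = update_knowledge_base(kb, "abc.com", ["r2", "r3"])
```

Returning the newly added part makes it easy to log or audit what each resolution round contributed.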
Example two
Fig. 2 is a flowchart of a method for generating a domain name knowledge base in the second embodiment of the present invention, which is a further refinement of the foregoing technical solutions, and the technical solutions in this embodiment may be combined with various alternatives in one or more of the foregoing embodiments. As shown in fig. 2, the method for generating the domain name repository may include the following steps:
Step 210, resolving the target domain name through the resolution node to obtain a DNS resolution record and an access mode corresponding to the target domain name as an alternative resolution result.
In a specific implementation, after the target domain name is obtained, it may be resolved by a resolution node to obtain the DNS resolution record and the access mode corresponding to the target domain name, which are then used as an alternative resolution result.
It should be noted that, in the embodiment of the present invention, the result obtained when a resolution node resolves a domain name may include the DNS resolution record corresponding to the resolved domain name and the access mode of the resolution node. When different nodes resolve the target domain name, the DNS resolution records and access modes in the obtained results can differ.
Optionally, the DNS resolution record may include at least one of the following: the resolution server IP corresponding to the target domain name, the resolution vendor name, the access IP, the access provider name, the domain name access state, or information on the organization to which the domain name belongs. The resolution server IP corresponding to the target domain name is the IP address of the server that resolves the target domain name; the resolution vendor name is the name of the vendor that resolves the target domain name; the domain name access state indicates whether the domain name is accessible or not.
Optionally, the access mode of the resolution node may include at least one of the following: single access, multiple access, cloud access, or accelerated access. Single access and multiple access are distinguished by whether the IP is unique. Accelerated access, that is, Content Delivery Network (CDN) accelerated access, and cloud access are distinguished by whether the domain name resolution exhibits an acceleration feature: if the feature is present, the access is judged to be accelerated access; otherwise it is judged to be cloud access. In the embodiment of the invention, the number of IPs ranges over [0, 1] for single access, [2, 128] for multiple access, and [2, 2048] for accelerated and cloud access.
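The IP-count ranges quoted above overlap, so the count alone cannot decide the access mode. The sketch below only encodes the stated ranges and the acceleration-feature rule; how multiple access is told apart from cloud or accelerated access is not specified in the text, so the sketch deliberately returns every mode whose range admits the count. The names `ACCESS_MODE_IP_RANGES`, `candidate_access_modes`, and `accelerated_or_cloud` are hypothetical.

```python
# Inclusive IP-count ranges taken from the text above.
ACCESS_MODE_IP_RANGES = {
    "single": (0, 1),
    "multiple": (2, 128),
    "accelerated": (2, 2048),
    "cloud": (2, 2048),
}


def candidate_access_modes(ip_count: int) -> list:
    """Return every access mode whose IP-count range contains `ip_count`."""
    return [
        mode
        for mode, (low, high) in ACCESS_MODE_IP_RANGES.items()
        if low <= ip_count <= high
    ]


def accelerated_or_cloud(has_acceleration_feature: bool) -> str:
    """Apply the stated rule: acceleration feature present means accelerated access."""
    return "accelerated" if has_acceleration_feature else "cloud"
```

A count of 1 leaves only single access, whereas a count of 3 is compatible with multiple, accelerated, and cloud access, so a further signal is needed to pick one.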
Step 220, adding the target domain name into the domain name knowledge base, and creating a reference resolution record set matched with the target domain name and a reference access mode set.
In a specific implementation, if it is determined that the domain name knowledge base does not store the target domain name, the target domain name may be added to the domain name knowledge base, and a reference resolution record set and a reference access mode set matched with the target domain name may be created. Here, the reference resolution record set is the set of DNS resolution records.
It can be understood that when the target domain name is resolved by different resolution nodes in different geographic locations, or by the same resolution node at different times, the obtained resolution results may differ; that is, the DNS resolution records and the access modes may be different.
After the target domain name is added to the domain name knowledge base, the alternative resolution results corresponding to the target domain name can be clustered to obtain a DNS resolution record set and an access mode set corresponding to the target domain name. In the embodiment of the invention, the target domain name is added to the domain name knowledge base, the DNS resolution record set corresponding to the target domain name is determined as the reference resolution record set, and the access mode set corresponding to the DNS resolution record set is determined as the reference access mode set.
Step 230, performing deduplication processing on the DNS resolution records included in the alternative resolution results, and adding the deduplicated alternative resolution results to the reference resolution record set.
In a specific implementation, after the reference resolution record set and the reference access mode set matched with the target domain name are created, the DNS resolution records may be further processed; specifically, repeated DNS resolution records may be filtered out, and each alternative resolution result after deduplication is added to the reference resolution record set.
Step 240, performing deduplication processing on the access modes included in the alternative resolution results, and adding each deduplicated access mode to the reference access mode set.
In a specific implementation, after the reference resolution record set and the reference access mode set matched with the target domain name are created, the access modes may be further processed; specifically, repeated access modes may be filtered out, and each access mode after deduplication is added to the reference access mode set.
According to the technical scheme of the embodiment, the target domain name is resolved through the resolution node to obtain a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name as alternative resolution results. When it is determined that the target domain name is not stored in the domain name knowledge base, the target domain name is added to the domain name knowledge base, and a reference resolution record set and a reference access mode set matched with the target domain name are created. The DNS resolution records included in the alternative resolution results are deduplicated, and the deduplicated alternative resolution results are added to the reference resolution record set; the access modes included in the alternative resolution results are deduplicated, and the deduplicated access modes are added to the reference access mode set. In this way, a knowledge base containing a plurality of domain names and a plurality of domain name resolution results is generated, which provides a basis for subsequent statistical analysis of the domain names and allows the history of domain name resolution and access conditions to be traced.
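Steps 220 to 240 can be sketched as below for a domain that is not yet in the knowledge base. The data layout is an assumption made for illustration: `DnsRecord` carries only two of the fields the text lists, `DomainEntry` holds the two reference sets as ordered lists, and each alternative resolution result is modelled as a `(records, access_modes)` pair. All names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class DnsRecord:
    # Two illustrative fields; the text lists several more optional ones.
    resolution_server_ip: str
    access_ip: str


@dataclass
class DomainEntry:
    reference_records: List[DnsRecord] = field(default_factory=list)
    reference_access_modes: List[str] = field(default_factory=list)


Candidate = Tuple[Sequence[DnsRecord], Sequence[str]]


def dedupe(items: Iterable) -> list:
    """Drop repeated items while keeping first-seen order."""
    return list(dict.fromkeys(items))


def add_new_domain(
    kb: Dict[str, DomainEntry], domain: str, candidates: Sequence[Candidate]
) -> DomainEntry:
    """Step 220: create the entry; steps 230 and 240: fill it with deduplicated data."""
    entry = DomainEntry()
    entry.reference_records = dedupe(
        record for records, _ in candidates for record in records
    )
    entry.reference_access_modes = dedupe(
        mode for _, modes in candidates for mode in modes
    )
    kb[domain] = entry
    return entry
```

Freezing `DnsRecord` makes records hashable, which is what lets `dict.fromkeys` treat two records with identical fields as duplicates.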
Example three
Fig. 3 is a flowchart of a method for generating a domain name knowledge base in the third embodiment of the present invention, which is a further refinement of the foregoing technical solutions, and the technical solutions in this embodiment may be combined with various alternatives in one or more of the foregoing embodiments. As shown in fig. 3, the method for generating the domain name knowledge base may include the following steps:
Step 310, resolving the target domain name through the resolution node to obtain a DNS resolution record and an access mode corresponding to the target domain name as an alternative resolution result.
Step 320, respectively obtaining each DNS resolution record included in each alternative resolution result, and adding the DNS resolution records that do not belong to the reference resolution record set to the reference resolution record set.
In a specific implementation, after it is determined that the domain name knowledge base stores the target domain name and a reference resolution result matched with it, each DNS resolution record included in the alternative resolution results obtained by resolving the target domain name through the resolution nodes may be obtained, and the DNS resolution records that do not belong to the reference resolution record set may be added to the reference resolution record set.
For example, if the alternative resolution results include DNS resolution record A, DNS resolution record B, and DNS resolution record C, and the reference resolution record set includes DNS resolution record A and DNS resolution record B, then DNS resolution record C may be added to the reference resolution record set, yielding a new resolution record set.
Step 330, respectively obtaining each access mode included in each alternative resolution result, and adding the access modes that do not belong to the reference access mode set to the reference access mode set.
In a specific implementation, after it is determined that the domain name knowledge base stores the target domain name and a reference resolution result matched with it, each access mode included in the alternative resolution results obtained by resolving the target domain name through the resolution nodes may be obtained, and the access modes that do not belong to the reference access mode set may be added to the reference access mode set.
For example, if the alternative resolution results include access mode A, access mode B, and access mode C, and the reference access mode set includes access mode A and access mode B, then access mode C may be added to the reference access mode set, yielding a new access mode set.
According to the technical scheme of the embodiment, the target domain name is resolved through the resolution node, and a DNS resolution record and an access mode corresponding to the target domain name are obtained as alternative resolution results. If it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, the DNS resolution records included in the alternative resolution results are obtained, and those that do not belong to the reference resolution record set are added to it; and/or the access modes included in the alternative resolution results are obtained, and those that do not belong to the reference access mode set are added to it. In this way, a knowledge base containing a plurality of domain names and a plurality of domain name resolution results is generated, providing a basis for subsequent statistical analysis of the domain names.
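The A/B/C examples of steps 320 and 330 can be reproduced with a small helper. This is a sketch under the assumption that the reference sets are ordered lists of comparable items; the name `merge_missing` is hypothetical, and plain strings stand in for real DNS resolution records and access modes.

```python
def merge_missing(reference: list, observed) -> list:
    """Append each observed item that is not yet in `reference`; return the additions."""
    added = []
    for item in observed:
        if item not in reference:
            reference.append(item)
            added.append(item)
    return added


# Step 320: the reference set holds A and B, the new results hold A, B and C.
reference_records = ["record A", "record B"]
added_records = merge_missing(
    reference_records, ["record A", "record B", "record C"]
)

# Step 330: the same procedure applied to access modes.
reference_modes = ["mode A", "mode B"]
added_modes = merge_missing(reference_modes, ["mode A", "mode B", "mode C"])
```

Because the same helper serves both sets, the "and/or" in the embodiment simply corresponds to calling it for one set, the other, or both.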
Example four
Fig. 4 is a flowchart of a method for generating a domain name knowledge base in the fourth embodiment of the present invention, which is a further refinement of the foregoing technical solutions, and the technical solutions in this embodiment may be combined with various alternatives in one or more of the foregoing embodiments. As shown in fig. 4, the method for generating the domain name knowledge base may include the following steps:
Step 410, resolving the target domain name through the resolution node to obtain a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name as an alternative resolution result.
Step 420, splitting the domain name knowledge base into a first domain name knowledge base and a second domain name knowledge base.
The first domain name knowledge base stores the mapping relation between the domain name and the DNS analysis record, and the second domain name knowledge base stores the mapping relation between the domain name and the access mode.
In a specific implementation, after the target domain name is resolved by each resolution node to obtain the DNS resolution record and the access mode corresponding to the target domain name as alternative resolution results, the domain name knowledge base may be split; for example, it may be split into a first domain name knowledge base containing the mapping relationship between domain names and DNS resolution records, and a second domain name knowledge base containing the mapping relationship between domain names and access modes.
It should be noted that, in the embodiment of the present invention, the domain name repository may also be split into multiple repositories by using other splitting methods, which are not limited in the embodiment of the present invention.
According to the technical scheme of the embodiment, the target domain name is resolved through the resolution node, and a Domain Name System (DNS) resolution record and an access mode corresponding to the target domain name are obtained as alternative resolution results; the domain name knowledge base is split into a first domain name knowledge base and a second domain name knowledge base. In this way, a knowledge base containing a plurality of domain names and domain name resolution results can be generated, providing a basis for subsequent statistical analysis of the domain names.
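The split in step 420 can be pictured with the following sketch. It assumes, purely for illustration, that the combined knowledge base is a dictionary whose values hold `"dns_records"` and `"access_modes"` lists; these key names and the function `split_knowledge_base` are hypothetical.

```python
from typing import Dict, List, Tuple


def split_knowledge_base(
    kb: Dict[str, dict],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Split one knowledge base into (domain -> DNS records, domain -> access modes)."""
    first = {domain: list(entry["dns_records"]) for domain, entry in kb.items()}
    second = {domain: list(entry["access_modes"]) for domain, entry in kb.items()}
    return first, second


combined = {
    "abc.com": {
        "dns_records": ["record A", "record B"],
        "access_modes": ["multiple"],
    },
}
first_kb, second_kb = split_knowledge_base(combined)
```

The lists are copied so that later changes to either split knowledge base do not alter the combined one.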
In order to make those skilled in the art better understand the method for generating the domain name knowledge base in this embodiment, a specific example is used for description below, and the specific process includes:
1. Actively crawl website-ranking sites such as www.alexa.cn with a crawler program to obtain active, valid domain name information, and then crawl from those active domain names to acquire as many domain names as possible.
2. Resolve the domain names nationwide through resolution nodes deployed across the country, so as to discover the different resolution and access conditions of each website.
Specifically, the domain names extracted in step 1 may be actively sent, through an interface, to probe servers deployed in 31 provinces across the country, so as to obtain in real time the DNS resolution record and the access information corresponding to each domain name. The information obtained specifically includes: the domain name, the resolution IP, the resolution server IP address, the actual access IP, the actual domain name access state, the resolution time, and other information.
3. Perform dictionary table association, access condition clustering, and IP dispersity discrimination on the domain names to form a domain name resolution and access mode list, and calibrate the corresponding similarity.
Specifically, the domain names from step 2 may be compared against and filtered by a dictionary table of known resolution conditions, associating each domain name and its resolution IP with the corresponding resolution vendor. The domain names from step 2 may then be compared against and filtered by a dictionary table of known access modes, associating each domain name and its access IP with the corresponding access mode and access vendor. The resolution and access relationships may be associated at the same time, and the corresponding domain name resolution and access relationship dictionary updated.
4. Establish the association between resolution manufacturers and access manufacturers by means of domain name resolution and access mode list association, record queries, whois queries, manual judgment, and the like.
Specifically, the domain names screened out in step 3 that have not yet been related to an existing dictionary table may be supplemented with the corresponding status information. The main supplementary contents include: the domain name, the resolution server IP corresponding to the domain name, the resolution manufacturer name, the access IP, the access provider name, the domain name access state, the unit to which the domain name belongs, and the like. A corresponding relationship dictionary is thereby established.
Specifically, the domain names screened out in step 3 that have not yet been related to an existing dictionary table may also be supplemented with an access mode. Optionally, each domain name may be matched against the existing access mode dictionary table to look up its access mode; if access mode information is found, it is associated with the domain name. If not, the access mode is judged from the dispersion of the IPs: single access and multiple access are distinguished by whether the IP is unique, and accelerated access and cloud access are distinguished by whether the domain name resolution shows an acceleration feature. A domain name with an acceleration feature is determined to be accelerated access, and one without is determined to be cloud access. The result is associated with the corresponding access mode dictionary, thereby updating the original dictionary.
The update mainly applies to the previously aggregated access modes. Access modes can be distinguished by IP dispersity: the number of IPs for single access falls in the range [0, 1], for multiple access in the range [2, 128], and for accelerated or cloud access in the range [2, 2048]. Where the ranges for multiple access and for accelerated or cloud access overlap, i.e. [2, 128], the decision is made according to IP dispersity: if the regions where the IP addresses are located are widely distributed and the same IP has appeared in history, the access is determined to be accelerated access or cloud access. Whether an acceleration feature exists in the resolution process then decides between accelerated access and cloud access, and the result is updated to the corresponding dictionary.
5. Based on the above data, establish a knowledge base of domain name resolution and access conditions (i.e., the domain name knowledge base), which specifically includes: the domain name owner name, the domain name access condition, the domain name resolution manufacturer, the domain name access mode, the domain name access node IP, and the country and place where the domain name is accessed. Statistics and summaries over this knowledge base can then yield the data urgently needed for industry management, such as the resolution and access condition and the survival status of each domain name.
That is, a final basic knowledge base (the domain name knowledge base) covering domain name access conditions and domain name resolution conditions is established, and it may specifically include: the domain name owner name, the domain name access condition, the domain name resolution manufacturer, the domain name access mode, the domain name access node IP, the country and place where the domain name is accessed, and the like.
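The access mode decision described in step 4 can be sketched as a small function. The IP-count ranges [0, 1], [2, 128], and [2, 2048] come from the description above; the function name, the boolean inputs (which stand in for the dispersity, history, and acceleration-feature checks), and the "unclassified" fallback for counts above 2048 are assumptions, since the text gives no rule for that case.

```python
def classify_access_mode(
    ip_count: int,
    widely_distributed: bool,
    seen_in_history: bool,
    has_acceleration_feature: bool,
) -> str:
    """Classify a domain's access mode from its access IP count and dispersity hints."""
    if ip_count <= 1:
        # Range [0, 1]: the IP is unique (or absent), so single access.
        return "single access"
    if ip_count > 2048:
        # Assumption: the description defines no rule above 2048 IPs.
        return "unclassified"
    if ip_count <= 128 and not (widely_distributed and seen_in_history):
        # Overlap range [2, 128] without wide distribution plus history: multiple access.
        return "multiple access"
    # Accelerated or cloud access: the acceleration feature decides between them.
    return "accelerated access" if has_acceleration_feature else "cloud access"
```

For example, a domain with 5 access IPs that are neither widely distributed nor seen in history is classified as multiple access, while one with 500 access IPs and no acceleration feature is classified as cloud access.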
In order to make those skilled in the art better understand the method for generating the domain name knowledge base in this embodiment, another specific example is used for description below, and the specific process includes:
1. The source domain name collection operator connects to www.alexa.cn (or another ranking website) once a day (adjustable according to actual system needs), obtains the domain names in order from the leaderboard, performs a breadth-first crawl from those domain names to acquire more, checks whether the acquired domain names are active, and then forms the source domain name list L1, which includes: domain name, access time, and access status.
2. The domain name resolution information acquisition operator first uses the dictionary table association operator to filter the known inaccessible domain names out of all the domain names in L1, sends the remaining domain names to the DNS resolution points deployed nationwide for nationwide domain name resolution, and adds a current-time label to form the local domain name resolution knowledge base D1.
3. The domain name access condition acquisition operator first uses the dictionary table association operator to filter out, from all the domain names in D1, those whose access conditions are already known, and then analyzes the access conditions of the remaining domain names. A clustering operator over the domain names and their access IP counts generates a forest (set) S1. The domain names with few corresponding access IPs form forest S2, from which the domain name single-access knowledge base D2 is generated according to the actual accessibility of each domain name. The domain names with many corresponding access IPs form forest S3. The nodes of S3 with lower dispersity are screened out to form forest S4, from which the domain name multi-access knowledge base D3 is generated according to actual accessibility. The nodes of S3 with higher dispersity are analyzed for CDN acceleration features to form forest S5, from which the domain name accelerated-access knowledge base D4 is generated according to actual accessibility. Finally, the contents of S5 and S4 are removed from S3 to form forest S6, from which the domain name cloud-access knowledge base D5 is generated according to actual accessibility. The specific generation process of this step may be as shown in Fig. 5.
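The partition of S1 into S2 through S6 in step 3 can be sketched with plain sets. This is only an illustration under stated assumptions: the `DomainObservation` type, the numeric `dispersity` score, and the thresholds `few_ip_limit` and `dispersity_threshold` are hypothetical, and the accessibility check that the description applies when generating D2 to D5 is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass
class DomainObservation:
    domain: str
    access_ips: Set[str]
    dispersity: float       # hypothetical score in [0, 1]; higher means more dispersed
    has_cdn_feature: bool   # hypothetical flag for a detected CDN acceleration feature


def partition(
    observations: Iterable[DomainObservation],
    few_ip_limit: int = 1,              # assumption: "few" access IPs means at most one
    dispersity_threshold: float = 0.5,  # assumption: cut-off between low and high dispersity
) -> Dict[str, Set[str]]:
    s1 = {o.domain: o for o in observations}                                  # S1: all domains
    s2 = {d for d, o in s1.items() if len(o.access_ips) <= few_ip_limit}      # S2: few IPs
    s3 = set(s1) - s2                                                         # S3: many IPs
    s4 = {d for d in s3 if s1[d].dispersity < dispersity_threshold}           # S4: low dispersity
    s5 = {d for d in s3 - s4 if s1[d].has_cdn_feature}                        # S5: CDN feature
    s6 = s3 - s4 - s5                                                         # S6: remainder
    # D2..D5 keyed by the knowledge base names used in the description.
    return {"D2": s2, "D3": s4, "D4": s5, "D5": s6}


observations = [
    DomainObservation("a.example", {"198.51.100.1"}, 0.0, False),
    DomainObservation("b.example", {"198.51.100.2", "198.51.100.3"}, 0.1, False),
    DomainObservation("c.example", {"203.0.113.1", "203.0.113.2", "203.0.113.3"}, 0.9, True),
    DomainObservation("d.example", {"192.0.2.1", "192.0.2.2"}, 0.8, False),
]
result = partition(observations)
```

Because S4, S5, and S6 are built by successive set differences over S3, every domain name lands in exactly one of the four knowledge bases.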
In each of the above examples, the actual access and resolution conditions of a domain name can be located more accurately, and the historical access and resolution are recorded (resolution manufacturer, resolution IP, access mode, access manufacturer, access IP, access place), so that the activity characteristics of the domain name can be understood in more depth. This provides more comprehensive support for related industry management and industry development, and allows the resolution and access conditions of a domain name to be traced back through history.
EXAMPLE five
Fig. 6 is a schematic structural diagram of a domain name knowledge base generation device in a fifth embodiment of the present invention, which can execute the domain name knowledge base generation methods in the foregoing embodiments. Referring to Fig. 6, the device includes: an alternative resolution result obtaining module 610, a first alternative resolution result storage module 620, and a second alternative resolution result storage module 630.
The alternative resolution result obtaining module 610 is configured to obtain a target domain name and resolve the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
the first alternative resolution result storage module 620 is configured to store, if it is determined that the domain name knowledge base does not store the target domain name, the mapping between the target domain name and each alternative resolution result in the domain name knowledge base;
the second alternative resolution result storage module 630 is configured to, if it is determined that the domain name knowledge base stores a reference resolution result matching the target domain name, obtain the part of each alternative resolution result that does not overlap with the reference resolution result and append it to the domain name knowledge base.
In the scheme of this embodiment, the alternative resolution result obtaining module obtains a target domain name and resolves it through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name; the first alternative resolution result storage module stores the mapping between the target domain name and each alternative resolution result in the domain name knowledge base; and the second alternative resolution result storage module obtains the part of each alternative resolution result that does not overlap with the reference resolution result and appends it to the domain name knowledge base. In this way, a knowledge base containing multiple domain names and multiple resolution results per domain name can be generated, providing a basis for subsequent statistical analysis of the domain names.
Optionally, the alternative resolution result obtaining module 610 is specifically configured to resolve the target domain name through the resolution node and obtain the DNS resolution record and the access mode corresponding to the target domain name as the alternative resolution result.
Optionally, the first alternative resolution result storage module 620 is specifically configured to: add the target domain name to the domain name knowledge base, and create a reference resolution record set and a reference access mode set matching the target domain name; deduplicate the DNS resolution records included in the alternative resolution results, and add the deduplicated DNS resolution records to the reference resolution record set; and deduplicate the access modes included in the alternative resolution results, and add the deduplicated access modes to the reference access mode set.
Optionally, the second alternative resolution result storage module 630 is specifically configured to: obtain the DNS resolution records included in each alternative resolution result, and add the DNS resolution records that do not belong to the reference resolution record set to the resolution record set; and/or obtain the access modes included in each alternative resolution result, and add the access modes that do not belong to the reference access mode set to the access mode set.
Optionally, in an embodiment of the present invention, the DNS resolution record includes at least one of the following items: the resolution server IP corresponding to the target domain name, the resolution manufacturer name, the access IP, the access provider name, the domain name access state, or the information of the unit to which the domain name belongs.
Optionally, the access method in the embodiment of the present invention includes at least one of the following: single access, multiple access, cloud access, or accelerated access.
Optionally, the device for generating a domain name knowledge base according to the embodiment of the present invention further includes a knowledge base splitting module, configured to split the domain name knowledge base into a first domain name knowledge base and a second domain name knowledge base; the first domain name knowledge base stores the mapping between domain names and DNS resolution records, and the second domain name knowledge base stores the mapping between domain names and access modes.
The device for generating the domain name knowledge base provided by the embodiment of the invention can execute the method for generating the domain name knowledge base provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
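The behavior of the two storage modules can be sketched as a single update routine. This is a minimal sketch, not the device's actual implementation: the class name `DomainKnowledgeBase`, its methods, and the representation of an alternative resolution result as a pair of (DNS resolution records, access modes) are assumptions. Using Python sets for the reference sets means deduplication on first storage and appending only the non-overlapping part both reduce to a set difference followed by a union.

```python
from typing import Dict, Iterable, Set, Tuple

# One alternative resolution result from one resolution node: (DNS records, access modes).
CandidateResult = Tuple[Iterable[str], Iterable[str]]


class DomainKnowledgeBase:
    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, Set[str]]] = {}

    def update(self, domain: str, candidates: Iterable[CandidateResult]) -> None:
        """Store the results for a new domain, or append only the non-overlapping part."""
        if domain not in self._entries:
            # Domain not stored yet: create empty reference sets for it.
            self._entries[domain] = {"dns_records": set(), "access_modes": set()}
        entry = self._entries[domain]
        for dns_records, access_modes in candidates:
            # Keep only the part that does not overlap with the reference sets.
            entry["dns_records"] |= set(dns_records) - entry["dns_records"]
            entry["access_modes"] |= set(access_modes) - entry["access_modes"]

    def lookup(self, domain: str) -> Tuple[Set[str], Set[str]]:
        """Return copies of the reference record set and reference access mode set."""
        entry = self._entries[domain]
        return set(entry["dns_records"]), set(entry["access_modes"])


kb = DomainKnowledgeBase()
# Two nodes report the same record; it is stored once.
kb.update("example.com", [(["203.0.113.10"], ["single access"]),
                          (["203.0.113.10"], ["single access"])])
# A later probe adds one new record and one new access mode.
kb.update("example.com", [(["203.0.113.10", "203.0.113.11"], ["multiple access"])])
```

After the two calls, `kb.lookup("example.com")` holds two distinct DNS records and two access modes, which matches the append-only, history-preserving behavior the embodiment describes.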
EXAMPLE six
Fig. 7 is a schematic structural diagram of a computer apparatus according to a sixth embodiment of the present invention, as shown in fig. 7, the computer apparatus includes a processor 70, a memory 71, an input device 72, and an output device 73; the number of the processors 70 in the computer device may be one or more, and one processor 70 is taken as an example in fig. 7; the processor 70, the memory 71, the input device 72 and the output device 73 in the computer apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 7.
The memory 71 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the domain name repository generation method in the embodiment of the present invention (for example, the alternative resolution result obtaining module 610, the first alternative resolution result storing module 620, and the second alternative resolution result storing module 630 in the domain name repository generation device). The processor 70 executes various functional applications and data processing of the computer device by executing software programs, instructions and modules stored in the memory 71, that is, implements the above-described domain name repository generation method.
The memory 71 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 71 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 71 may further include memory located remotely from the processor 70, which may be connected to a computer device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 72 may be used to receive input numeric or character information and generate key signal inputs relating to user settings and function controls of the computer apparatus. The output device 73 may include a display device such as a display screen.
EXAMPLE seven
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method for generating a domain name repository, the method including:
acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
if it is determined that the domain name knowledge base does not store the target domain name, storing the mapping relation between the target domain name and each alternative resolution result in the domain name knowledge base;
and if it is determined that the domain name knowledge base stores a reference resolution result matched with the target domain name, acquiring the part of each alternative resolution result that does not overlap with the reference resolution result and additionally storing that part in the domain name knowledge base.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the method for generating a domain name repository provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the apparatus for generating a domain name repository, each unit and each module included in the embodiment are only divided according to functional logic, but are not limited to the above division, as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for generating a domain name knowledge base is characterized by comprising the following steps:
acquiring a target domain name, and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
if the domain name knowledge base is determined not to store the target domain name, storing the mapping relation between the target domain name and each alternative resolution result in the domain name knowledge base;
and if the domain name knowledge base is determined to store the reference resolution result matched with the target domain name, acquiring the part, which is not overlapped with the reference resolution result, of each alternative resolution result and additionally storing the part into the domain name knowledge base.
2. The method according to claim 1, wherein the resolving the target domain name by the resolution node to obtain an alternative resolution result corresponding to the target domain name comprises:
resolving the target domain name through a resolution node to obtain a domain name system DNS resolution record corresponding to the target domain name and an access mode as the alternative resolution result.
3. The method of claim 2, wherein storing the mapping relationship between the target domain name and each of the candidate resolution results in the domain name repository comprises:
adding the target domain name into the domain name knowledge base, and creating a reference resolution record set matched with the target domain name and a reference access mode set;
performing deduplication processing on each DNS resolution record included in each alternative resolution result respectively, and adding each alternative resolution result after deduplication processing into the reference resolution record set;
and performing deduplication processing on each access mode included in each alternative resolution result respectively, and adding each access mode subjected to deduplication processing to the reference access mode set.
4. The method according to claim 3, wherein the step of obtaining a part of the alternative resolution result that does not overlap with the reference resolution result and additionally storing the part of the alternative resolution result in the domain name knowledge base comprises:
respectively obtaining DNS resolution records included in each alternative resolution result, and adding DNS resolution records which do not belong to the reference resolution record set into the resolution record set; and/or
respectively acquiring each access mode included in each alternative resolution result, and adding the access modes which do not belong to the reference access mode set into the access mode set.
5. The method of claim 2, wherein the DNS resolution record comprises at least one of:
the resolution server IP corresponding to the target domain name, the resolution manufacturer name, the access IP, the access provider name, the domain name access state, or the unit information of the domain name.
6. The method of claim 2, wherein the access mode comprises at least one of:
single access, multiple access, cloud access, or accelerated access.
7. The method of claim 2, further comprising:
splitting the domain name knowledge base into a first domain name knowledge base and a second domain name knowledge base;
the first domain name knowledge base stores the mapping relation between the domain name and the DNS resolution record, and the second domain name knowledge base stores the mapping relation between the domain name and the access mode.
8. An apparatus for generating a domain name knowledge base, comprising:
the alternative resolution result acquisition module is used for acquiring a target domain name and resolving the target domain name through at least one resolution node to obtain at least one alternative resolution result corresponding to the target domain name;
the first alternative resolution result storage module is used for storing the mapping relation between the target domain name and each alternative resolution result in a domain name knowledge base if the target domain name is determined not to be stored in the domain name knowledge base;
and the second alternative resolution result storage module is used for acquiring the part of each alternative resolution result which is not overlapped with the reference resolution result and additionally storing the part in the domain name knowledge base if the reference resolution result matched with the target domain name is determined to be stored in the domain name knowledge base.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of generating a domain name repository according to any of claims 1-7 when executing the program.
10. A storage medium containing computer-executable instructions for performing the method of generating a domain name repository according to any one of claims 1-7 when executed by a computer processor.
CN202010845644.6A 2020-08-20 2020-08-20 Method and device for generating domain name knowledge base, computer equipment and storage medium Pending CN112015910A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010845644.6A CN112015910A (en) 2020-08-20 2020-08-20 Method and device for generating domain name knowledge base, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112015910A true CN112015910A (en) 2020-12-01

Family

ID=73504390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010845644.6A Pending CN112015910A (en) 2020-08-20 2020-08-20 Method and device for generating domain name knowledge base, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112015910A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102025793A (en) * 2010-01-22 2011-04-20 中国移动通信集团北京有限公司 Domain name resolution method and system and DNS in IP network
US8832283B1 (en) * 2010-09-16 2014-09-09 Google Inc. Content provided DNS resolution validation and use
US20160260039A1 (en) * 2015-03-03 2016-09-08 Go Daddy Operating Company, LLC System and method for domain name community network
CN108900581A (en) * 2018-06-12 2018-11-27 恒安嘉新(北京)科技股份公司 A kind of method for building up of the key feature knowledge base of large-scale website
CN109241292A (en) * 2018-08-13 2019-01-18 恒安嘉新(北京)科技股份公司 A method of name server architectural knowledge map is established based on the passive data of master
CN109165334A (en) * 2018-09-20 2019-01-08 恒安嘉新(北京)科技股份公司 A method of establishing CDN producer primary knowledge base
CN110677514A (en) * 2019-10-21 2020-01-10 怀来斯达铭数据有限公司 IP filing information management method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
耿立伟: ""浅谈网络中管理系统(DDNS)中动态域名服务器的设计"", 《硅谷》, pages 50 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112685072A (en) * 2020-12-31 2021-04-20 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for generating communication address knowledge base
CN112685072B (en) * 2020-12-31 2023-08-01 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for generating communication address knowledge base

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination