CN107888606B - Domain name credit assessment method and system - Google Patents


Info

Publication number
CN107888606B
Authority
CN
China
Prior art keywords
domain name
target domain
reputation
information
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711206344.8A
Other languages
Chinese (zh)
Other versions
CN107888606A (en)
Inventor
张斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd
Priority to CN201711206344.8A
Publication of CN107888606A
Application granted
Publication of CN107888606B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Computer And Data Communications (AREA)

Abstract

The embodiments of the invention provide a domain name reputation evaluation method and system for improving the accuracy of network security detection. The method comprises: acquiring, from the Internet, website content information corresponding to a target domain name to be detected; inputting the website content information into a preset classifier model for domain name classification, and determining a first reputation value corresponding to the target domain name according to the classification result; performing multi-dimensional matching between the target domain name and network security threat information stored in a database, and outputting a second reputation value corresponding to the target domain name according to the matching result; and calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value.

Description

Domain name credit assessment method and system
Technical Field
The invention relates to the field of network security, and in particular to a domain name reputation evaluation method and system.
Background
With the rapid development of Internet technology, a large number of malicious attacks have appeared on the network. Attackers use physical devices and resources obtained on the network to launch such attacks, for example automatic update downloads by botnets, automatic update downloads of malicious code, phishing, automated network scanning, and automated sending of spam.
Existing schemes for evaluating domain name reputation typically either resolve the corresponding domain name from a Uniform Resource Locator (URL) or an IP address, or directly collect the domain names or IP addresses of malicious attackers, and then perform single-dimensional information matching to detect malicious attacks and thereby evaluate the reputation of the domain name.
However, an attacker can evade detection by antivirus software by continually changing URLs or updating domain names, which lowers the detection rate for malicious behavior. Moreover, the behavioral characteristics of malicious attacks are usually multidimensional, so detecting network behavior from single-dimensional domain name information carries a risk of false detection.
Disclosure of Invention
The embodiments of the invention provide a domain name reputation evaluation method and system for improving the accuracy of network security detection.
A first aspect of an embodiment of the present invention provides a domain name reputation evaluation method, which may include:
acquiring website content information corresponding to a target domain name to be detected from the Internet;
inputting the website content information into a preset classifier model for domain name classification, and determining a first reputation value corresponding to the target domain name according to a classification result;
performing multi-dimensional matching on the target domain name and network security threat information stored in a database, and outputting a second reputation value corresponding to the target domain name according to a matching result;
and calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value.
Optionally, as a possible implementation manner, the acquiring, from the internet, website content information corresponding to a target domain name to be detected includes:
searching a target domain name to be detected by adopting a search engine to obtain a search record corresponding to the target domain name;
and acquiring website content information corresponding to the target domain name from the retrieval record by adopting a web crawler technology.
Optionally, as a possible implementation, the method further includes:
collecting the Uniform Resource Locator (URL) fields of a preset number of top-ranked websites in the retrieval record corresponding to the target domain name;
judging whether the host parts in the URL fields of the preset number of top-ranked websites match the target domain name, and counting the number of successfully matched websites;
determining a third reputation value corresponding to the target domain name according to the number of successfully matched websites;
and further calculating the reputation corresponding to the target domain name according to the third reputation value.
Optionally, as a possible implementation manner, the performing multidimensional matching on the target domain name and the cyber-security threat information stored in the database includes:
analyzing domain name attribution information corresponding to the target domain name, wherein the domain name attribution information comprises one or more of an IP address, whois information and URL information corresponding to the target domain name;
matching the domain name attribution information corresponding to the target domain name with a preset malicious domain name information base;
and/or matching the IP address corresponding to the target domain name with a preset virus network behavior characteristic library.
Optionally, as a possible implementation manner, the domain name reputation evaluation method further includes:
feeding back reputation information corresponding to the target domain name to a user;
wherein the reputation information comprises the reputation, a domain name type label, and domain name attribution information; and
when the reputation corresponding to the target domain name is smaller than a first preset threshold value, the domain name type label of the target domain name is marked as a malicious domain name.
Optionally, as a possible implementation manner, the domain name reputation evaluation method further includes:
counting the number of hosts accessing the target domain name;
and when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold value, issuing early warning information aiming at the target domain name.
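The labeling and early-warning rules above can be illustrated with a minimal Python sketch. The threshold values, the `ReputationReport` type, the `build_report` function, and the `"not-malicious"` label are all hypothetical; the source names only a "first" and a "second" preset threshold and gives no concrete values.

```python
from dataclasses import dataclass

# Hypothetical values; the source only refers to a "first" and a "second"
# preset threshold without specifying them.
MALICIOUS_REPUTATION_THRESHOLD = 40.0
ALERT_HOST_COUNT_THRESHOLD = 5


@dataclass
class ReputationReport:
    domain: str
    reputation: float
    label: str
    alert: bool


def build_report(domain: str, reputation: float, accessing_hosts: set) -> ReputationReport:
    """Label the domain and decide whether early warning information is issued."""
    # A reputation below the first threshold marks the domain as malicious.
    label = "malicious" if reputation < MALICIOUS_REPUTATION_THRESHOLD else "not-malicious"
    # Warn only when the domain is malicious and the number of accessing
    # hosts is not less than the second threshold.
    alert = label == "malicious" and len(accessing_hosts) >= ALERT_HOST_COUNT_THRESHOLD
    return ReputationReport(domain, reputation, label, alert)
```

Counting distinct hosts with a set is one possible reading of "the number of hosts accessing the target domain name"; a deployment might instead count within a time window.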
A second aspect of the embodiments of the present invention provides a domain name reputation evaluation system, which may include:
the information acquisition unit is used for acquiring website content information corresponding to a target domain name to be detected from the Internet;
the domain name classification unit is used for inputting the website content information into a preset classifier model for domain name classification and determining a first reputation value corresponding to the target domain name according to a classification result;
the domain name correlation analysis unit is used for carrying out multi-dimensional matching on the target domain name and the network security threat information stored in the database, and determining a second reputation value corresponding to the target domain name according to a matching result;
and the first calculation unit is used for calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value.
Optionally, as a possible implementation manner, the information acquisition unit includes:
the search module is used for searching a target domain name to be detected by adopting a search engine to obtain a search record corresponding to the target domain name;
and the first acquisition module is used for acquiring website content information corresponding to the target domain name from the retrieval record by adopting a web crawler technology.
Optionally, as a possible implementation manner, the information acquisition unit further includes a second acquisition module, configured to collect the Uniform Resource Locator (URL) fields of a preset number of top-ranked websites in the retrieval record corresponding to the target domain name;
the domain name reputation evaluation system further comprises:
the judging unit is used for judging whether the host parts in the URL fields of the preset number of top-ranked websites match the target domain name, counting the number of successfully matched websites, and determining a third reputation value corresponding to the target domain name according to the number of successfully matched websites;
and the second calculating unit is used for further calculating the reputation corresponding to the target domain name according to the third reputation value.
Optionally, as a possible implementation manner, the domain name association analysis unit includes:
the resolution module is used for resolving domain name attribution information corresponding to the target domain name, wherein the domain name attribution information comprises one or more of an IP address, whois information and URL information corresponding to the target domain name;
the first matching module is used for matching the domain name attribution information corresponding to the target domain name with a preset malicious domain name information base;
and the second matching module is used for matching the IP address corresponding to the target domain name with a preset virus network behavior feature library.
Optionally, as a possible implementation manner, the domain name reputation evaluation system further includes:
the feedback unit is used for feeding back reputation information corresponding to the target domain name to a user;
the reputation information comprises the reputation, a domain name type label, and domain name attribution information, wherein when the reputation corresponding to the target domain name is smaller than a first preset threshold value, the domain name type label of the target domain name is marked as a malicious domain name.
Optionally, as a possible implementation manner, the domain name reputation evaluation system further includes:
the counting unit is used for counting the number of hosts accessing the target domain name;
and the early warning unit is used for issuing early warning information aiming at the target domain name when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold value.
According to the technical scheme, the embodiment of the invention has the following advantages:
In the embodiments of the invention, the domain name reputation evaluation system can, on the one hand, input website content information corresponding to the target domain name into a preset classifier model for domain name classification and determine a first reputation value corresponding to the target domain name according to the classification result; on the other hand, it can perform multi-dimensional matching between the target domain name and network security threat information stored in a database and output a second reputation value corresponding to the target domain name according to the matching result. Finally, it calculates the reputation corresponding to the target domain name according to the first reputation value and the second reputation value, and this reputation provides a reference for detecting malicious attacks. That is, the system not only performs multi-dimensional detection and analysis of the target domain name but also matches it against multi-dimensional network security threat information, thereby increasing the accuracy of network security detection.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a domain name reputation evaluation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a domain name reputation evaluation method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another embodiment of a domain name reputation evaluation method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an embodiment of a domain name reputation evaluation system according to an embodiment of the present invention;
FIG. 5 is a detailed schematic diagram of the information acquisition unit 401 in a domain name reputation evaluation system according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another embodiment of a domain name reputation evaluation system according to an embodiment of the present invention;
FIG. 7 is a detailed schematic diagram of the domain name association analysis unit 403 in a domain name reputation evaluation system according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of another embodiment of a domain name reputation evaluation system according to an embodiment of the present invention.
Detailed Description
The embodiments of the invention provide a domain name reputation evaluation method and system for improving the accuracy of network security detection.
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow in the embodiment of the present invention is described below, and referring to fig. 1, an embodiment of a domain name reputation evaluation method in the embodiment of the present invention may include:
101. acquiring website content information corresponding to a target domain name to be detected from the Internet;
In order to evaluate the reputation of the target domain name, the domain name reputation evaluation system may use web crawler technology to acquire website content information corresponding to the target domain name to be detected from the Internet. Specifically, the website content information generally includes the title, the content, and the like.
Optionally, as a possible implementation manner, the acquiring, by the domain reputation evaluation system, website content information corresponding to the target domain name to be detected from the internet includes:
searching a target domain name to be detected by adopting a search engine to obtain a search record corresponding to the target domain name;
acquiring website content information corresponding to the target domain name from the retrieval record by adopting a web crawler technology.
Specifically, for example, the domain name reputation evaluation system may search for the target domain name using a search engine and extract a preset number of top-ranked search records. The search records may include website records corresponding to the target domain name and its sub-domain names, stored in the order ranked by the search engine. The system then acquires information such as the title, content, and URL (Uniform Resource Locator) corresponding to each search result from the search engine.
It can be understood that, in this embodiment, the manner of acquiring the website content information corresponding to the target domain name is only an example, and in practical application, the website content information corresponding to the target domain name may also be acquired according to the network port of the target domain name, and the manner of acquiring the website content information is not limited here.
102. Inputting the website content into a preset classifier model for domain name classification, and determining a first reputation value corresponding to a target domain name according to a classification result;
Before classifying the target domain name, the domain name reputation evaluation system may collect website content information for a large number of white domain names and black domain names in advance to train the preset classifier model. For example, website content information for 100,000 white domain names and 300,000 black domain names may be collected to train a Bayesian classifier model. The specific model training method is prior art and is not described in detail here. It can be understood that the preset classifier model may also be a decision tree classifier model, a logistic regression classifier model, an SVM classifier model, or the like; the specific classifier model is not limited here.
After the preset classifier model is trained, the acquired website content information corresponding to the target domain name can be input into the preset classifier model for classification, the classification result can be a white domain name, a black domain name and a gray domain name, the domain names of different types correspond to different reputation values, and the domain name reputation evaluation system can determine a first reputation value corresponding to the target domain name according to the output result of the preset classifier model.
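As an illustration of how a classification result can be mapped to a first reputation value, the following is a minimal sketch of a multinomial naive Bayes text classifier using only the Python standard library. The class and function names, the whitespace tokenization, the score table in `REPUTATION_BY_LABEL`, and the `gray_margin` rule for producing a gray result are all assumptions made for this example; the source states only that white, black, and gray domain names correspond to different reputation values.

```python
import math
from collections import Counter, defaultdict

# Hypothetical label-to-score table; the source says only that each class
# corresponds to a different reputation value.
REPUTATION_BY_LABEL = {"white": 100.0, "gray": 50.0, "black": 0.0}


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with Laplace smoothing over whitespace tokens."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, samples):
        """samples: iterable of (website_text, label) pairs."""
        for text, label in samples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def log_scores(self, text):
        """Return the unnormalized log posterior for every trained label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def classify(self, text, gray_margin=0.5):
        ranked = sorted(self.log_scores(text).items(), key=lambda kv: kv[1], reverse=True)
        # Assumed rule: report "gray" when the two best classes are too close.
        if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < gray_margin:
            return "gray"
        return ranked[0][0]


def first_reputation_value(classifier, website_text):
    """Map the classification result to a first reputation value."""
    return REPUTATION_BY_LABEL[classifier.classify(website_text)]
```

A production system would more likely rely on an established machine learning library and a proper tokenizer for the site language; the sketch only shows the shape of the train, classify, and score steps.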
103. Performing multi-dimensional matching on the target domain name and the network security threat information stored in the database, and outputting a second reputation value corresponding to the target domain name according to a matching result;
The behavioral characteristics of malicious attacks on the Internet are often correlated. For example, a single attacker IP address may forge multiple domain names or multiple URLs to carry out malicious attacks. Multiple malicious attacks may also be directly or indirectly associated with one another; for example, the current-owner information in the whois records of the domain names involved in several attacks may point to the same attacker. The domain name reputation evaluation system can perform multi-dimensional matching between the target domain name and the network security threat information stored in the database, and output a second reputation value corresponding to the target domain name according to the matching result.
Specifically, the domain name reputation evaluation system may first analyze domain name attribution information corresponding to the target domain name, where the domain name attribution information may include one or more of an IP address, whois information, and URL information corresponding to the target domain name;
Optionally, as a possible implementation manner, the domain name reputation evaluation system may match domain name attribution information corresponding to the target domain name with a preset malicious domain name information base;
Specifically, the domain name reputation evaluation system may store a large amount of network security threat information obtained through big data analysis. This information may include a large number of malicious domain names, malicious IP addresses, URLs corresponding to the malicious domain names, and whois information corresponding to the malicious domain names, where the whois information includes at least the owner information of the corresponding malicious domain name. The system may match the domain name attribution information of the target domain name against the pre-stored malicious domain names, malicious IP addresses, URLs, and whois information, and determine the second reputation value of the target domain name according to the number of successfully matched malicious IP addresses, whois records, and URLs: the more successful matches, the lower the second reputation value.
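The matching described above can be sketched as set lookups across several dimensions. In this minimal example, `ThreatIntel`, `DomainAttribution`, `count_matches`, the base score, and the per-match penalty are hypothetical; the source specifies only that more successful matches yield a lower second reputation value.

```python
from dataclasses import dataclass, field


@dataclass
class ThreatIntel:
    """Hypothetical in-memory stand-in for the malicious domain name information base."""
    domains: set = field(default_factory=set)
    ips: set = field(default_factory=set)
    whois_owners: set = field(default_factory=set)
    urls: set = field(default_factory=set)


@dataclass
class DomainAttribution:
    """Attribution information resolved for the target domain name."""
    domain: str
    ips: set
    whois_owner: str
    urls: set


def count_matches(attr: DomainAttribution, intel: ThreatIntel) -> int:
    """Count hits across the domain, IP, whois-owner, and URL dimensions."""
    matches = int(attr.domain in intel.domains)
    matches += len(attr.ips & intel.ips)
    matches += int(attr.whois_owner in intel.whois_owners)
    matches += len(attr.urls & intel.urls)
    return matches


def second_reputation_value(attr, intel, base=100.0, penalty=25.0):
    """Assumed rule: subtract a fixed penalty per match, floored at zero."""
    return max(0.0, base - penalty * count_matches(attr, intel))
```

A linear penalty is only one monotone choice; any rule in which the value decreases as matches increase fits the description.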
Further, optionally, in order to expand the dimension that the target domain name can be matched with, and further increase the accuracy of network security detection, the domain name reputation evaluation system may further match the IP address corresponding to the target domain name with a preset virus network behavior feature library.
Specifically, the domain name reputation evaluation system may match the target domain name or its corresponding IP address against the preset virus network behavior feature library to obtain, for example, the number and frequency of accesses to the target domain name by known virus networks, the types of viruses accessing the target domain name, and the number of other domain names belonging to the owner of the target domain name that have been determined to be malicious. The system may then determine the second reputation value corresponding to the target domain name from the analysis result according to a predetermined rule: the higher the frequency and number of accesses by known virus networks, the more types of viruses accessing the target domain name, and the more of the owner's other domain names that have been determined to be malicious, the lower the corresponding second reputation value.
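One possible form of such a predetermined rule is a penalty that grows with each behavioral feature. The function name `behavior_score`, the weights, and the logarithmic damping of the access count below are assumptions for illustration only; the source requires just that the value decreases as each feature increases.

```python
import math


def behavior_score(access_count, virus_type_count, owner_malicious_domain_count,
                   base=100.0, weights=(10.0, 15.0, 20.0)):
    """Second reputation value from virus network behavior features.

    The weights are hypothetical. The access count is log-damped so that
    very busy domains do not dominate the other two features.
    """
    w_access, w_types, w_owner = weights
    penalty = (w_access * math.log1p(access_count)
               + w_types * virus_type_count
               + w_owner * owner_malicious_domain_count)
    return max(0.0, base - penalty)
```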
104. Calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value.
After the first reputation value and the second reputation value corresponding to the target domain name are obtained, the domain name reputation evaluation system may calculate the reputation corresponding to the target domain name based on them. Specifically, the system may directly take the sum of the first reputation value and the second reputation value as the reputation; it may assign corresponding weights to the first reputation value and the second reputation value and take their weighted sum; or it may assign weights to each contributing item in the calculation of the first reputation value and the second reputation value, compute the two weighted values separately, and take their sum as the reputation. The specific calculation method is not limited here.
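The plain-sum and weighted-sum options can be expressed in one small function. The name `combine_reputation` and the example weights are hypothetical; the source does not prescribe particular weights.

```python
def combine_reputation(values, weights=None):
    """Combine reputation values into a single reputation.

    With no weights this is the plain sum; otherwise it is the weighted sum.
    """
    if weights is None:
        return float(sum(values))
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return float(sum(w * v for w, v in zip(weights, values)))
```

For example, `combine_reputation([60.0, 50.0])` gives the plain sum, while `combine_reputation([60.0, 50.0], [0.5, 0.5])` weights both values equally. Because the function accepts any number of values, it also covers an embodiment that adds further reputation values.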
In the embodiments of the invention, the domain name reputation evaluation system can, on the one hand, input website content information corresponding to the target domain name into a preset classifier model for domain name classification and determine a first reputation value corresponding to the target domain name according to the classification result; on the other hand, it can perform multi-dimensional matching between the target domain name and network security threat information stored in a database and output a second reputation value corresponding to the target domain name according to the matching result. Finally, it calculates the reputation corresponding to the target domain name according to the first reputation value and the second reputation value. That is, the system not only performs multi-dimensional detection and analysis of the target domain name but also matches it against multi-dimensional network security threat information, and the resulting reputation provides a reference for detecting malicious attacks, thereby increasing the accuracy of network security detection.
On the basis of the foregoing embodiment, the detection dimensions of the target domain name may be further increased to improve detection accuracy. For example, the reputation of the target domain name may also be determined from how well the URLs of the websites in a preset number of top-ranked search records returned by a search engine match the target domain name. Referring to FIG. 2, another embodiment of the domain name reputation evaluation method in the embodiments of the present invention may include:
201. searching a target domain name to be detected by adopting a search engine to obtain a search record corresponding to the target domain name, and acquiring website content information corresponding to the target domain name from the search record by adopting a web crawler technology;
Specifically, for example, the domain name reputation evaluation system may search for the target domain name using a search engine and extract a preset number of top-ranked search records. The search records include website records corresponding to the target domain name and its sub-domain names, stored in the order ranked by the search engine. The system then acquires information such as the title, content, and URL (Uniform Resource Locator) corresponding to each search result from the search engine.
202. Inputting the website content into a preset classifier model for domain name classification, and determining a first reputation value corresponding to a target domain name according to a classification result;
203. Performing multi-dimensional matching on the target domain name and the network security threat information stored in the database, and outputting a second reputation value corresponding to the target domain name according to a matching result;
Steps 202 and 203 in this embodiment are similar to steps 102 and 103 in the embodiment shown in FIG. 1; refer to steps 102 and 103 for details, which are not repeated here.
204. Collecting Uniform Resource Locators (URL) fields of a preset number of websites which are ranked at the top in a retrieval record corresponding to a target domain name;
205. judging whether host parts in URL fields of websites with a preset number and a front rank are matched with the target domain name or not, counting the number of the websites successfully matched and determining a third reputation value corresponding to the target domain name according to the number of the websites successfully matched;
The domain name reputation evaluation system may judge whether the host parts of the URL fields in the preset number of top-ranked search records match the target domain name. If they match, the target domain name ranks higher and is more likely to be a normal domain name; otherwise, it ranks lower and may be malicious. Specifically, a third reputation value may be assigned to the target domain name according to whether the host part of the URL field includes the target domain name, or according to whether the host part of the URL field of the website corresponding to the second record in the search records of the target domain name includes the target domain name. For example, it may be determined whether the URL field of each of the top 10 websites in the retrieval record of the target domain name includes the target domain name: if it does, the match succeeds; otherwise, the match fails. The more of the top 10 websites that match successfully, the higher the third reputation value may be set; likewise, the higher a successfully matched website ranks in the retrieval record, the higher the third reputation value may be set. The specific details are not limited here. After the third reputation value corresponding to the target domain name is determined, the reputation corresponding to the target domain name can be further calculated according to the third reputation value.
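The host matching and counting described in step 205 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the treatment of sub-domains as matches, the linear scaling, and the `max_score` parameter are all assumptions introduced here.

```python
from urllib.parse import urlsplit


def host_matches(url, target_domain):
    """Return True if the URL's host is the target domain or one of its sub-domains."""
    # urlsplit only fills in hostname when the URL carries a scheme or starts with "//".
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    target = target_domain.lower().rstrip(".")
    return host == target or host.endswith("." + target)


def third_reputation_value(result_urls, target_domain, top_n=10, max_score=10.0):
    """Score the target domain by how many of the top-N result URLs point at it.

    The linear scaling and max_score are illustrative assumptions; the source
    only says that more successful matches may yield a higher value.
    """
    top = list(result_urls)[:top_n]
    if not top:
        return 0.0
    matched = sum(1 for url in top if host_matches(url, target_domain))
    return max_score * matched / top_n
```

Matching on the parsed host, rather than searching for the domain anywhere in the URL string, avoids counting results such as `http://evil.test/example.com`, where the target domain appears only in the path.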
206. Calculating the reputation corresponding to the target domain name according to the first reputation value, the second reputation value and the third reputation value.
After obtaining the third reputation value corresponding to the target domain name, the domain name reputation evaluation system may further calculate the reputation corresponding to the target domain name based on the third reputation value. Specifically, the system may directly take the sum of the first, second and third reputation values as the reputation corresponding to the target domain name. Alternatively, it may assign a weight to each of the three reputation values and take their weighted sum as the reputation. It may also assign weights to the individual items involved in calculating each of the first, second and third reputation values, compute each value as a weighted result of its items, and take the sum of the resulting values as the reputation. The specific calculation method is not limited here.
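The plain-sum and weighted-sum options for step 206 can be illustrated with a short sketch. The function name `combine_reputation` and the example values and weights are hypothetical; the source does not fix any particular weights.

```python
def combine_reputation(values, weights=None):
    """Combine partial reputation values into a single reputation.

    With weights=None the result is the plain sum of the values;
    otherwise it is the weighted sum of the values.
    """
    if weights is None:
        return sum(values)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values))


# Example with made-up first, second and third reputation values.
plain = combine_reputation([60, 20, 8])
weighted = combine_reputation([60, 20, 8], weights=[0.5, 0.25, 0.25])
```

The same helper could be reused one level down, to combine the individual items that make up each reputation value, which corresponds to the third option described above.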
On the basis of the above embodiment, after the reputation of the target domain name to be detected is obtained, the reputation of the target domain name has a guiding significance for the network behavior of the user, and it is necessary to feed back the reputation of the corresponding target domain name to the user. Referring to fig. 3, another embodiment of a domain name reputation evaluation method according to an embodiment of the present invention may include:
301. acquiring website content information corresponding to a target domain name to be detected from the Internet;
302. inputting the website content into a preset classifier model for domain name classification, and determining a first reputation value corresponding to a target domain name according to a classification result;
303. performing multi-dimensional matching on the target domain name and the network security threat information stored in the database, and outputting a second reputation value corresponding to the target domain name according to a matching result;
304. calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value;
Steps 301 to 304 in this embodiment are similar to steps 101 to 104 in the embodiment shown in fig. 1; refer to steps 101 to 104 for details, which are not repeated here.
305. Feeding back reputation information corresponding to the target domain name to the user;
after the reputation of the target domain name to be detected is obtained, the reputation of the target domain name has a guiding significance for a user to conduct network behaviors, and information related to the reputation of the corresponding target domain name needs to be fed back to the user. Specifically, the domain name reputation evaluation system may feed back reputation information corresponding to the target domain name to the user, so that the user may prevent possible network attacks.
Optionally, the reputation information includes a reputation degree, a domain name type label, and domain name attribution information. When the reputation degree corresponding to the target domain name is smaller than a first preset threshold, the domain name type label of the target domain name is marked as a malicious domain name; the specific first preset threshold is not limited here. The reputation information may further include the security threat type corresponding to the target domain name, the latest active time, and other information, for example, threat type: C&C server; latest active time: 2017-10-01 12:00:00. It can be understood that the reputation information can be set reasonably according to the behavior characteristics of the network attacks in which the target domain name may be involved and the needs of the user, and the specific reputation information is not limited here.
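As an illustration of the reputation information described above, the record fed back to the user might be assembled as in the sketch below. The class and field names, the "normal" label for non-malicious domains, and the default threshold of 60.0 are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReputationInfo:
    domain: str
    reputation: float
    label: str                                        # domain name type label
    attribution: dict = field(default_factory=dict)   # e.g. IP address, whois, URL
    threat_type: Optional[str] = None                 # e.g. "C&C server"
    last_active: Optional[str] = None                 # e.g. "2017-10-01 12:00:00"


def build_reputation_info(domain, reputation, attribution=None,
                          threat_type=None, last_active=None,
                          first_threshold=60.0):
    """Label the domain as malicious when its reputation is below the first threshold."""
    # first_threshold is a hypothetical value; the source leaves it unspecified.
    label = "malicious" if reputation < first_threshold else "normal"
    return ReputationInfo(domain, reputation, label, attribution or {},
                          threat_type, last_active)
```

A reputation exactly equal to the threshold is not labeled malicious, since the text requires the reputation to be smaller than the first preset threshold.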
306. And counting the number of hosts accessing the target domain name, and issuing early warning information aiming at the target domain name when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold value.
Optionally, in order to discover or prevent a large-scale network attack behavior in time, the domain reputation evaluation system may count the number of hosts accessing the target domain name, and issue the warning information for the target domain name when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold, so as to prompt the user to prevent a possible network attack.
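The counting and early-warning condition of step 306 can be sketched as follows. The access-log format (pairs of host and accessed domain) and the choice to count distinct hosts are assumptions; the source only states that the number of hosts accessing the target domain name is counted.

```python
def count_accessing_hosts(access_log, target_domain):
    """Count distinct hosts that accessed target_domain.

    access_log is assumed to be an iterable of (host, domain) pairs.
    """
    return len({host for host, domain in access_log if domain == target_domain})


def should_issue_warning(is_malicious, host_count, second_threshold):
    """Warn only for a malicious domain accessed by at least second_threshold hosts."""
    return bool(is_malicious) and host_count >= second_threshold


# Example with made-up log entries.
log = [
    ("10.0.0.1", "bad.example"),
    ("10.0.0.2", "bad.example"),
    ("10.0.0.1", "bad.example"),
    ("10.0.0.3", "ok.example"),
]
hosts = count_accessing_hosts(log, "bad.example")
```

The comparison uses `>=` because the text issues the warning when the host count is "not less than" the second preset threshold.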
It is understood that steps 305 and 306 in this embodiment can also be applied to the embodiment shown in fig. 2.
In the embodiment of the invention, the domain name reputation evaluation system can not only perform multi-dimensional detection and analysis on the target domain name, but also perform multi-dimensional matching on the target domain name based on multi-dimensional network security threat information, thereby increasing the accuracy of network security detection. In addition, the system can feed back the reputation information of the target domain name to the user, so that the user can see directly whether the target domain name poses a threat and guard against possible network attacks. Furthermore, when the target domain name is a malicious domain name and the number of affected hosts is large, the system can issue large-scale early warning information for the target domain name, thereby helping to prevent possible network attacks.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the above steps do not mean the execution sequence, and the execution sequence of each step should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
The foregoing embodiment describes a domain name reputation evaluation method in the embodiment of the present invention, and referring to fig. 4, a domain name reputation evaluation system in the embodiment of the present invention is described below, where an embodiment of a domain name reputation evaluation system in the embodiment of the present invention may include:
the information acquisition unit 401 is configured to acquire website content information corresponding to a target domain name to be detected from the internet;
a domain name classification unit 402, configured to input the website content into a preset classifier model for domain name classification, and determine a first reputation value corresponding to the target domain name according to a classification result;
and a domain name association analysis unit 403, configured to perform multidimensional matching on the target domain name and the network security threat information stored in the database, and determine a second reputation value corresponding to the target domain name according to a matching result.
A first calculating unit 404, configured to calculate a reputation degree corresponding to the target domain name according to the first reputation value and the second reputation value.
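The way units 401 to 404 cooperate can be illustrated by a small sketch in which each unit is passed in as a callable. The function name `evaluate_domain` and the stub units are hypothetical and stand in for the real crawler, classifier model and threat-information database, which the source does not specify in code.

```python
def evaluate_domain(target_domain, collect, classify, match_threats, combine):
    """Run the four units in order and return the reputation of target_domain."""
    content = collect(target_domain)             # information acquisition unit 401
    first_value = classify(content)              # domain name classification unit 402
    second_value = match_threats(target_domain)  # domain name association analysis unit 403
    return combine(first_value, second_value)    # first calculating unit 404


# Stub units with made-up return values, for illustration only.
reputation = evaluate_domain(
    "example.com",
    collect=lambda domain: "page text for " + domain,
    classify=lambda content: 40.0,
    match_threats=lambda domain: 30.0,
    combine=lambda first, second: first + second,
)
```

Passing the units as parameters mirrors the statement later in the description that the division into units is only a logical one and that units may be combined or replaced.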
Optionally, as a possible embodiment, referring to fig. 5, in an embodiment of the present invention, the information acquisition unit 401 may further include:
the search module 4011 is configured to search the target domain name to be detected by using a search engine to obtain a search record corresponding to the target domain name;
the first collecting module 4012 is configured to collect, by using a web crawler technology, website content information corresponding to the target domain name from the retrieval record.
Optionally, as a possible embodiment, please refer to fig. 6, in an embodiment of the present invention, the information collecting unit 401 further includes a second collecting module 4013, configured to collect URL fields of a preset number of websites ranked at the top in a retrieval record corresponding to the target domain name;
the domain name reputation degree evaluation system further comprises:
a determining unit 405, configured to determine whether host portions in URL fields of the websites with the preset number of top ranks are matched with the target domain name, count the number of websites successfully matched, and determine a third reputation value corresponding to the target domain name according to the number of websites successfully matched;
a second calculating unit 406, configured to further calculate a reputation degree corresponding to the target domain name according to the third reputation value.
Optionally, as a possible embodiment, referring to fig. 7, in an embodiment of the present invention, the domain name association analysis unit 403 may further include:
the resolution module 4031 is configured to resolve domain name attribution information corresponding to the target domain name, where the domain name attribution information includes one or more of an IP address, whois information, and URL information corresponding to the target domain name;
a first matching module 4032, configured to match domain name attribution information corresponding to the target domain name with a preset malicious domain name information base;
and a second matching module 4033, configured to match the IP address corresponding to the target domain name with a preset virus network behavior feature library.
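A minimal sketch of how the two matching modules might contribute to the second reputation value is given below. The data layout (a dict of sets for the malicious domain name information base, a set of IP addresses for the virus network behavior feature library) and the scoring scheme with `base_score` and `penalty` are assumptions for illustration; the source does not define how matches are scored.

```python
def second_reputation_value(attribution, malicious_info_base, virus_behavior_ips,
                            base_score=50.0, penalty=25.0):
    """Lower the score for every match against the threat-information stores.

    attribution: dict with optional keys "ip", "whois", "url".
    malicious_info_base: dict mapping the same keys to sets of known-bad values.
    virus_behavior_ips: set of IP addresses seen in virus network behavior.
    """
    hits = 0
    # First matching module: attribution information vs. malicious domain name information base.
    for key in ("ip", "whois", "url"):
        value = attribution.get(key)
        if value is not None and value in malicious_info_base.get(key, set()):
            hits += 1
    # Second matching module: IP address vs. virus network behavior feature library.
    ip = attribution.get("ip")
    if ip is not None and ip in virus_behavior_ips:
        hits += 1
    # Clamp at zero so repeated hits cannot drive the value negative.
    return max(0.0, base_score - penalty * hits)
```

Each dimension is checked independently, so a domain whose IP address appears in both stores is penalized twice, which is one possible reading of "multi-dimensional matching".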
Optionally, as a possible embodiment, please refer to fig. 8, the domain name reputation evaluation system in the embodiment of the present invention may further include:
a feedback unit 407, configured to feed back reputation information corresponding to the target domain name to the user;
the reputation information comprises reputation degree, domain name type labels and domain name attribution information, wherein when the reputation degree corresponding to the target domain name is smaller than a first preset threshold value, the domain name type label for marking the target domain name is a malicious domain name.
Optionally, as a possible embodiment, please refer to fig. 8, the domain name reputation evaluation system in the embodiment of the present invention may further include:
a counting unit 408, configured to count the number of hosts accessing the target domain name;
and the early warning unit 409 is configured to issue early warning information for the target domain name when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold.
It is understood that the statistical unit 408 and the early warning unit 409 in this embodiment may be used in the embodiment shown in fig. 6, and are not limited herein.
In the embodiment of the invention, the domain name reputation evaluation system can, in a first aspect, input website content information corresponding to a target domain name into a preset classifier model for domain name classification and determine a first reputation value corresponding to the target domain name according to the classification result; in a second aspect, it can perform multi-dimensional matching on the target domain name and the network security threat information stored in a database and output a second reputation value corresponding to the target domain name according to the matching result. Finally, it calculates the reputation corresponding to the target domain name according to the first reputation value and the second reputation value. That is, the domain name reputation evaluation system in the embodiment of the invention can not only perform multi-dimensional detection and analysis on the target domain name, but also perform multi-dimensional matching on the target domain name based on multi-dimensional network security threat information, thereby increasing the accuracy of network security detection.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the unit and the module described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (12)

1. A domain name reputation evaluation method, characterized by comprising the following steps:
acquiring website content information corresponding to a target domain name to be detected from the Internet;
inputting the website content information into a preset classifier model for domain name classification, and determining a first reputation value corresponding to the target domain name according to a classification result;
performing multi-dimensional matching on the target domain name and network security threat information stored in a database, and outputting a second reputation value corresponding to the target domain name according to a matching result;
and calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value.
2. The method according to claim 1, wherein the collecting website content information corresponding to the target domain name to be detected from the internet comprises:
searching a target domain name to be detected by adopting a search engine to obtain a search record corresponding to the target domain name;
and acquiring website content information corresponding to the target domain name from the retrieval record by adopting a web crawler technology.
3. The method of claim 2, further comprising:
collecting Uniform Resource Locators (URL) fields of a preset number of websites which are ranked at the top in a retrieval record corresponding to the target domain name;
judging whether host parts in URL fields of the websites with the preset number and the top rank are matched with the target domain name or not, and counting the number of the websites successfully matched;
determining a third reputation value corresponding to the target domain name according to the number of successfully matched websites;
and further calculating the reputation corresponding to the target domain name according to the third reputation value.
4. The method of claim 1, wherein the multidimensional matching of the target domain name with cyber-security threat information stored in a database comprises:
analyzing domain name attribution information corresponding to the target domain name, wherein the domain name attribution information comprises one or more of an IP address, whois information and URL information corresponding to the target domain name;
matching the domain name attribution information corresponding to the target domain name with a preset malicious domain name information base;
and/or matching the IP address corresponding to the target domain name with a preset virus network behavior characteristic library.
5. The method of any of claims 1 to 4, further comprising:
feeding back reputation information corresponding to the target domain name to a user;
the reputation information comprises a reputation, a domain name type label and domain name attribution information,
wherein when the reputation corresponding to the target domain name is smaller than a first preset threshold value, the domain name type label of the target domain name is marked as a malicious domain name.
6. The method of claim 5, further comprising:
counting the number of hosts accessing the target domain name;
and when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold value, issuing early warning information aiming at the target domain name.
7. A domain name reputation degree evaluation system, comprising:
the information acquisition unit is used for acquiring website content information corresponding to a target domain name to be detected from the Internet;
the domain name classification unit is used for inputting the website content into a preset classifier model for domain name classification and determining a first reputation value corresponding to the target domain name according to a classification result;
the domain name association analysis unit is used for carrying out multi-dimensional matching on the target domain name and the network security threat information stored in the database and determining a second reputation value corresponding to the target domain name according to a matching result;
and the first calculation unit is used for calculating the reputation corresponding to the target domain name according to the first reputation value and the second reputation value.
8. The system of claim 7, wherein the information collection unit comprises:
the search module is used for searching a target domain name to be detected by adopting a search engine to obtain a search record corresponding to the target domain name;
and the first acquisition module is used for acquiring website content information corresponding to the target domain name from the retrieval record by adopting a web crawler technology.
9. The system of claim 8,
the information acquisition unit also comprises a second acquisition module which is used for acquiring URL fields of the websites with the preset number and ranked at the top in the retrieval record corresponding to the target domain name;
the domain name reputation degree evaluation system further comprises:
the judging unit is used for judging whether host parts in URL fields of the websites with the preset number and the top rank are matched with the target domain name or not, counting the number of the websites successfully matched and determining a third reputation value corresponding to the target domain name according to the number of the websites successfully matched;
and the second calculating unit is used for further calculating the reputation corresponding to the target domain name according to the third reputation value.
10. The system according to claim 7, wherein the domain name association analysis unit comprises:
the resolution module is used for resolving domain name attribution information corresponding to the target domain name, wherein the domain name attribution information comprises one or more of an IP address, whois information and URL information corresponding to the target domain name;
the first matching module is used for matching the domain name attribution information corresponding to the target domain name with a preset malicious domain name information base;
and the second matching module is used for matching the IP address corresponding to the target domain name with a preset virus network behavior feature library.
11. The system of any one of claims 7 to 10, further comprising:
the feedback unit is used for feeding back reputation information corresponding to the target domain name to a user;
the reputation information comprises reputation degree, domain name type label and domain name attribution information, wherein when the reputation degree corresponding to the target domain name is smaller than a first preset threshold value, the domain name type label marking the target domain name is a malicious domain name.
12. The system of claim 11, further comprising:
the counting unit is used for counting the number of hosts accessing the target domain name;
and the early warning unit is used for issuing early warning information aiming at the target domain name when the target domain name is a malicious domain name and the number of hosts accessing the target domain name is not less than a second preset threshold value.
CN201711206344.8A 2017-11-27 2017-11-27 Domain name credit assessment method and system Active CN107888606B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711206344.8A CN107888606B (en) 2017-11-27 2017-11-27 Domain name credit assessment method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711206344.8A CN107888606B (en) 2017-11-27 2017-11-27 Domain name credit assessment method and system

Publications (2)

Publication Number Publication Date
CN107888606A CN107888606A (en) 2018-04-06
CN107888606B true CN107888606B (en) 2020-11-13

Family

ID=61775345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711206344.8A Active CN107888606B (en) 2017-11-27 2017-11-27 Domain name credit assessment method and system

Country Status (1)

Country Link
CN (1) CN107888606B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108510332A (en) * 2018-04-17 2018-09-07 中国互联网络信息中心 A kind of domain name prestige assessment method and device
CN109831545B (en) * 2019-01-31 2020-10-09 中国互联网络信息中心 Domain name abuse processing method and system based on block chain
CN110427540B (en) * 2019-07-30 2021-11-30 国家计算机网络与信息安全管理中心 Implementation method and system for determining IP address responsibility main body
CN111131175A (en) * 2019-12-04 2020-05-08 互联网域名系统北京市工程研究中心有限公司 Threat intelligence domain name protection system and method
CN112511489B (en) * 2020-10-29 2023-06-27 中国互联网络信息中心 Domain name service abuse assessment method and device
CN114640513B (en) * 2022-03-04 2023-06-23 中国互联网络信息中心 Domain name abuse governance method and system based on reputation excitation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103428187A (en) * 2012-05-25 2013-12-04 腾讯科技(深圳)有限公司 Method and system for access controlling, and equipment
CN103905372A (en) * 2012-12-24 2014-07-02 珠海市君天电子科技有限公司 Method and device for removing false alarm of phishing website
CN104615760A (en) * 2015-02-13 2015-05-13 北京瑞星信息技术有限公司 Phishing website recognizing method and phishing website recognizing system
CN106131016A (en) * 2016-07-13 2016-11-16 北京知道创宇信息技术有限公司 Maliciously URL detection interference method, system and device
CN107360185A (en) * 2017-08-18 2017-11-17 中国移动通信集团海南有限公司 A kind of assessing network method and system based on DNS behavioural characteristics

Also Published As

Publication number Publication date
CN107888606A (en) 2018-04-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant