CN109450886A - Domain name recognition method, system, electronic device, and storage medium - Google Patents

Domain name recognition method, system, electronic device, and storage medium

Info

Publication number
CN109450886A
Authority
CN
China
Prior art keywords
domain name
recognition result
identified
result
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811277414.3A
Other languages
Chinese (zh)
Inventor
高杨
范渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Hangzhou Dbappsecurity Technology Co Ltd
Original Assignee
Hangzhou Dbappsecurity Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dbappsecurity Technology Co Ltd
Priority to CN201811277414.3A
Publication of CN109450886A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Abstract

The present application discloses a domain name recognition method and system, an electronic device, and a computer-readable storage medium. The method comprises: determining a domain name to be identified and judging whether it matches a domain name blacklist, to obtain a first recognition result; performing behavior statistics on the domain name to be identified and obtaining a second recognition result from the statistical result; extracting text features of the domain name to be identified and inputting them into a trained classification model, to obtain a third recognition result; and obtaining a final domain name recognition result for the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result. The domain name detection method provided by the present application thus identifies DGA domain names along three dimensions (domain name blacklist, behavior statistics, and text features), which makes the detection result more accurate.

Description

Domain name recognition method, system, electronic device, and storage medium
Technical field
The present application relates to the field of communication technology, and more specifically to a domain name recognition method and system, an electronic device, and a computer-readable storage medium.
Background
With the continuous development of Internet technology, networks have become part of every aspect of people's lives. Hacker intrusion, a by-product of that development, has become pervasive as well and poses an increasingly serious threat to network security. In addition, more and more malicious programs have begun to generate domain names with a domain generation algorithm (DGA). Blacklist-based domain name detection methods in the prior art cannot identify domain names generated by a DGA, and a DGA generates domain names quickly: it can automatically produce more than 50,000 random domain names per day.
For DGA domain name detection, the prior art provides a blacklist-based domain name detection method in which DGA domain names are stored in a blacklist; when a domain name accessed by a user through a terminal matches a domain name in the blacklist, that domain name is a DGA domain name. This method can only identify known malicious DGA domain names and has no awareness of domain names that are absent from the blacklist, so the accuracy of DGA domain name detection is low.
Therefore, how to improve the accuracy of DGA domain name detection is a problem to be solved by those skilled in the art.
Summary of the invention
The purpose of the present application is to provide a domain name recognition method and system, an electronic device, and a computer-readable storage medium that improve the accuracy of DGA domain name detection.
To achieve the above purpose, the present application provides a domain name recognition method, comprising:
determining a domain name to be identified, and judging whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result;
performing behavior statistics on the domain name to be identified, and obtaining a second recognition result according to the statistical result;
extracting text features of the domain name to be identified, and inputting the text features into a trained classification model, to obtain a third recognition result;
obtaining a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
Wherein, judging whether the domain name to be identified matches the domain name blacklist, to obtain the first recognition result, comprises:
judging whether the domain name blacklist contains the domain name to be identified and/or the server IP and/or the DNS processing result of the domain name to be identified, to obtain the first recognition result.
Wherein, performing behavior statistics on the domain name to be identified, and obtaining the second recognition result according to the statistical result, comprises:
judging whether the number of domain names requested in a single attack or within a unit time reaches a preset value, and obtaining the second recognition result according to the judgment result.
Wherein, obtaining the final domain name recognition result of the domain name to be identified according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result comprises:
calculating a comprehensive weight according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result;
judging whether the comprehensive weight is greater than a preset value;
if so, the domain name to be identified is a DGA domain name.
Wherein, calculating the comprehensive weight according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result comprises:
calculating the comprehensive weight according to a weight calculation formula, wherein the weight calculation formula is:
$w = x \cdot p_1 + y \cdot p_2 + z \cdot p_3$
wherein $w$ is the comprehensive weight, $x$ is the first recognition result, $p_1$ is the weight value corresponding to the first recognition result in the weight distribution rule, $y$ is the second recognition result, $p_2$ is the weight value corresponding to the second recognition result in the weight distribution rule, $z$ is the third recognition result, and $p_3$ is the weight value corresponding to the third recognition result in the weight distribution rule.
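As a purely illustrative calculation, suppose the weight values are 0.5, 0.2, and 0.3 for the first, second, and third recognition results, and the preset value is 0.4. These numbers are hypothetical; the application leaves the weight values and the preset value to the implementer. For a domain name that is not on the blacklist (first result 0) but is flagged by both the behavior statistics and the classification model (second and third results 1):

```latex
w = 0 \cdot 0.5 + 1 \cdot 0.2 + 1 \cdot 0.3 = 0.5 > 0.4
```

Under these assumed values the comprehensive weight exceeds the preset value, so the domain name would be judged a DGA domain name even though the blacklist alone missed it.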
Wherein, the method further comprises:
obtaining a training domain name set, extracting the text features of each training sample in the training domain name set, and determining the domain name recognition result of each training sample;
training a classification model with the text features and the domain name recognition results, to obtain the trained classification model.
Wherein, determining the domain name recognition result of each training sample comprises:
performing behavior statistics on each training sample and/or obtaining the domain name recognition result of each training sample according to the domain name blacklist.
To achieve the above purpose, the present application provides a domain name recognition system, comprising:
a first identification module, configured to determine a domain name to be identified and judge whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result;
a second identification module, configured to perform behavior statistics on the domain name to be identified and obtain a second recognition result according to the statistical result;
a third identification module, configured to extract text features of the domain name to be identified and input the text features into a trained classification model, to obtain a third recognition result;
a weight calculation module, configured to obtain a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
To achieve the above purpose, the present application provides an electronic device, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above domain name recognition method when executing the computer program.
To achieve the above purpose, the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above domain name recognition method.
As can be seen from the above scheme, the domain name recognition method provided by the present application comprises: determining a domain name to be identified, and judging whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result; performing behavior statistics on the domain name to be identified, and obtaining a second recognition result according to the statistical result; extracting text features of the domain name to be identified, and inputting the text features into a trained classification model, to obtain a third recognition result; and obtaining a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
In the present application, known DGA domain names are stored in the domain name blacklist. The first recognition result is obtained from the result of matching the domain name to be identified against the blacklist, and the second recognition result is obtained from behavior statistics on the domain name to be identified. The classification model is trained on the text features of a training domain name set and automatically outputs the third recognition result from the text features of the domain name to be identified. The three recognition results are then combined into the final domain name recognition result. The domain name detection method provided by the present application thus identifies DGA domain names along three dimensions (domain name blacklist, behavior statistics, and text features), which makes the detection result more accurate. The present application also discloses a domain name recognition system, an electronic device, and a computer-readable storage medium, which achieve the same technical effect.
Brief description of the drawings
In order to illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a domain name recognition method disclosed in an embodiment of the present application;
Fig. 2 is a flowchart of another domain name recognition method disclosed in an embodiment of the present application;
Fig. 3 is a structural diagram of a domain name recognition system disclosed in an embodiment of the present application;
Fig. 4 is a structural diagram of an electronic device disclosed in an embodiment of the present application;
Fig. 5 is a structural diagram of another electronic device disclosed in an embodiment of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the protection scope of the present application.
An embodiment of the present application discloses a domain name recognition method that improves the accuracy of DGA domain name detection.
Referring to Fig. 1, a flowchart of a domain name recognition method disclosed in an embodiment of the present application, the method comprises:
S101: determining a domain name to be identified, and judging whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result;
In this embodiment, known DGA domain names are stored in the domain name blacklist, and the first recognition result is obtained from the result of matching the domain name to be identified against the domain name blacklist. Specifically, the domain name blacklist may store the request domain names of DGA domain names, server IPs, or the DNS processing results of request domain names. Correspondingly, during detection it may be judged whether the domain name blacklist contains the domain name to be identified, or the server IP or DNS processing result of the domain name to be identified, to obtain the first recognition result.
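The matching in S101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `DomainBlacklist` and `first_recognition_result` are hypothetical, the blacklist is assumed to hold three plain sets, and exact string matching after lowercasing is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class DomainBlacklist:
    """Hypothetical container for the three kinds of blacklist entries."""
    domains: set = field(default_factory=set)      # known DGA request domain names
    server_ips: set = field(default_factory=set)   # known malicious server IPs
    dns_results: set = field(default_factory=set)  # known malicious DNS processing results


def first_recognition_result(blacklist, domain, server_ip=None, dns_result=None):
    """Return True if the domain, its server IP, or its DNS result is blacklisted."""
    # Normalize the name: trim whitespace, drop a trailing root dot, lowercase.
    domain = domain.strip().rstrip(".").lower()
    return (
        domain in blacklist.domains
        or (server_ip is not None and server_ip in blacklist.server_ips)
        or (dns_result is not None and dns_result in blacklist.dns_results)
    )
```

A hit on any one of the three sets yields TRUE here; whether the three checks should instead be combined differently is left open by the text.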
S102: performing behavior statistics on the domain name to be identified, and obtaining a second recognition result according to the statistical result;
In a specific implementation, performing behavior statistics on the domain name to be identified means judging whether the number of domain names requested in a single attack or within a unit time reaches a preset value, and obtaining the second recognition result according to the judgment result. It can be understood that the specific length of the unit time is not limited here; for example, one day may be one unit time.
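A minimal sketch of the unit-time variant of this check is shown below. The log format (timestamp in seconds, a source identifier, the requested domain name), the function name `second_recognition_results`, and the default window and preset value are all assumptions made for illustration.

```python
from collections import defaultdict


def second_recognition_results(query_log, window_seconds=86400, preset_value=100):
    """Flag each (source, window) whose count of distinct requested domains
    reaches the preset value.

    query_log: iterable of (timestamp_seconds, source_id, domain) tuples.
    Returns a dict mapping (source_id, window_index) to True or False.
    """
    seen = defaultdict(set)
    for timestamp, source, domain in query_log:
        window = int(timestamp // window_seconds)  # index of the unit-time window
        seen[(source, window)].add(domain.lower())
    return {key: len(domains) >= preset_value for key, domains in seen.items()}
```

Counting distinct names rather than raw requests is a design choice of this sketch; a source that repeatedly resolves the same name is then not flagged.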
S103: extracting text features of the domain name to be identified, and inputting the text features into a trained classification model, to obtain a third recognition result;
In this embodiment, the domain name to be identified may first be preprocessed before its text features are extracted, so as to extract its representative main features, such as the main domain of each domain name and the TLD (top-level domain) suffix, i.e., the last part of the domain name. For example, for the domain name "www.google.com", the main domain is google and the TLD suffix is com.
It can be understood that, in this embodiment, a single text feature may be extracted, for example judging DGA domain names by the main domain only. Multiple text features may also be extracted: the main domain and TLD suffix of each domain name are extracted, and further features are derived from them to refine the judgment rule and improve the accuracy of DGA domain name discrimination. For example, the length of the main domain, the phonetic characteristics of the main domain, the character transition probability of the main domain, and the TLD suffix of the domain name may be extracted together as text features.
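The features named above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: all function names are hypothetical, the vowel ratio is only a crude stand-in for the phonetic characteristics, the bigram model is built from a caller-supplied list of benign names, and the naive split does not handle multi-label suffixes such as co.uk.

```python
import math
from collections import Counter


def split_domain(domain):
    """Naively split a name into (main domain, TLD suffix)."""
    labels = domain.strip().rstrip(".").lower().split(".")
    if len(labels) < 2:
        return labels[0], ""
    return labels[-2], labels[-1]


def bigram_counts(names):
    """Count character pairs and their first characters in reference names."""
    pairs, firsts = Counter(), Counter()
    for name in names:
        for a, b in zip(name, name[1:]):
            pairs[(a, b)] += 1
            firsts[a] += 1
    return pairs, firsts


def mean_log_transition(main, pairs, firsts, alphabet_size=37):
    """Average log P(next char | current char), with add-one smoothing.

    alphabet_size=37 assumes a-z, 0-9, and the hyphen.
    """
    steps = list(zip(main, main[1:]))
    if not steps:
        return 0.0
    total = sum(
        math.log((pairs[(a, b)] + 1) / (firsts[a] + alphabet_size))
        for a, b in steps
    )
    return total / len(steps)


def text_features(domain, pairs, firsts):
    """Collect main-domain length, vowel ratio, transition score, and TLD."""
    main, tld = split_domain(domain)
    vowel_ratio = sum(ch in "aeiou" for ch in main) / len(main) if main else 0.0
    return {
        "main_domain": main,
        "tld": tld,
        "length": len(main),
        "vowel_ratio": vowel_ratio,
        "mean_log_transition": mean_log_transition(main, pairs, firsts),
    }
```

A random-looking main domain tends to score a lower mean log transition probability than a pronounceable one under such a model, which is the intuition behind using the transition probability as a feature.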
As a preferred implementation, dimensionality reduction and normalization may be applied to the text features extracted in this step, to improve the computational efficiency of the subsequent classification model. After extraction, the text features are input into the trained classification model, which outputs the third recognition result. The training process of the classification model is described in detail in the next embodiment.
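The text does not say which normalization is used. As one common choice, the sketch below applies min-max scaling to each numeric feature column; the function name `min_max_normalize` is hypothetical, and dimensionality reduction is not shown.

```python
def min_max_normalize(rows):
    """Scale every column of a list of numeric feature rows into [0, 1].

    A constant column (max equals min) is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]
```

In practice the minimum and maximum would be taken from the training set and reused at detection time, so that a single domain name can be scaled consistently.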
It should be noted that the first, second, and third recognition results above may each be expressed as a binary result, that is, TRUE is output when the condition is met and FALSE when it is not. They may of course also be expressed as a characteristic value within a certain range, for example the range 0 to 1.
S104: obtaining a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
Under the weight distribution rule, those skilled in the art assign weight values to the first, second, and third recognition results according to the actual situation, and a comprehensive weight is calculated from them. When the comprehensive weight is greater than a preset value, the domain name to be identified is a DGA domain name; when the comprehensive weight is less than or equal to the preset value, the domain name to be identified is a normal domain name.
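A minimal sketch of this decision rule follows. The function name `final_recognition` and the default weights and threshold are hypothetical placeholders; each recognition result may be a Boolean or a value in the 0 to 1 range.

```python
def final_recognition(x, y, z, weights=(0.5, 0.2, 0.3), threshold=0.4):
    """Combine three recognition results into a comprehensive weight.

    Returns (comprehensive_weight, is_dga). is_dga is True only when the
    weight is strictly greater than the threshold, as the text specifies.
    """
    p1, p2, p3 = weights
    comprehensive_weight = float(x) * p1 + float(y) * p2 + float(z) * p3
    return comprehensive_weight, comprehensive_weight > threshold
```

Because the comparison is strict, a domain name whose comprehensive weight equals the preset value is treated as normal.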
In this embodiment of the present application, known DGA domain names are stored in the domain name blacklist. The first recognition result is obtained from the result of matching the domain name to be identified against the blacklist, and the second recognition result is obtained from behavior statistics on the domain name to be identified. The classification model is trained on the text features of a training domain name set and automatically outputs the third recognition result from the text features of the domain name to be identified. The three recognition results are then combined into the final domain name recognition result. The domain name detection method provided by this embodiment thus identifies DGA domain names along three dimensions (domain name blacklist, behavior statistics, and text features), which makes the detection result more accurate.
The training process of the classification model in the previous embodiment is described in detail below. Specifically:
Referring to Fig. 2, a flowchart of another domain name recognition method provided by an embodiment of the present application, the method comprises:
S201: obtaining a training domain name set, extracting the text features of each training sample in the training domain name set, and determining the domain name recognition result of each training sample;
In a specific implementation, the text features of each training sample in the training domain name set are extracted first, dimensionality reduction and normalization are applied to the text features, and the domain name recognition result of each training sample is determined. It can be understood that the domain name recognition result of each training sample may be obtained by the domain name blacklist and/or behavior statistics approaches introduced in the previous embodiment; since the identification steps are similar to those of the previous embodiment, they are not repeated here.
S202: training a classification model with the text features and the domain name recognition results, to obtain the trained classification model.
In a specific implementation, the classification model may be trained with the text features processed in the previous step and the domain name recognition results. The text features are trained with a machine learning algorithm to build a domain name classification model. The classification model obtained by machine learning can identify DGA domain names quickly and accurately from domain name features, and can be used to predict unknown domain names. It can be understood that this embodiment does not limit the specific form of the classification model; for example, a LibLinear classification model or a LibSVM classification model may be used.
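The embodiment names LibLinear and LibSVM as possible classifiers. To stay dependency-free, the sketch below swaps in a small logistic regression trained by stochastic gradient descent in plain Python; it is a stand-in for illustration, not either of those libraries. The function names, the two-element feature vectors (normalized main-domain length, vowel ratio), and the labels are all made up for the example; in the method, labels would come from the blacklist and/or behavior statistics.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def train_logistic(features, labels, epochs=500, learning_rate=0.5):
    """Fit weights and bias by per-sample gradient descent on the log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for row, label in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, row)) + bias
            error = sigmoid(score) - label
            weights = [w - learning_rate * error * v for w, v in zip(weights, row)]
            bias -= learning_rate * error
    return weights, bias


def predict(model, row):
    """Return True when the model scores the sample as a DGA domain name."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, row)) + bias) > 0.5


# Hypothetical normalized samples: [main-domain length, vowel ratio].
train_features = [
    [0.9, 0.10], [0.8, 0.15], [1.0, 0.05],  # labeled DGA (1)
    [0.3, 0.45], [0.2, 0.50], [0.4, 0.40],  # labeled normal (0)
]
train_labels = [1, 1, 1, 0, 0, 0]
model = train_logistic(train_features, train_labels)
```

The trained `model` can then be applied to the normalized features of an unseen domain name with `predict(model, row)` to produce the third recognition result.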
The domain name recognition method provided by the present application is introduced below by way of an application example, which may specifically comprise the following steps:
Step 1: matching the domain name to be identified against a preconfigured blacklist to obtain a Boolean binary result x (1 when the output value is TRUE, 0 when it is FALSE), based on features including but not limited to the following:
a) whether the server IP matches the blacklist; if so, TRUE is output, otherwise FALSE is output;
b) whether the request domain name matches the blacklist; if so, TRUE is output, otherwise FALSE is output;
c) whether the DNS processing result of the request domain name matches the blacklist; if so, TRUE is output, otherwise FALSE is output;
Step 2: performing statistical analysis on the related behavior of the domain name to be identified to obtain a Boolean binary result y, based on features including but not limited to the following:
a) whether the number of domain names requested in a single attack reaches a preset value; if so, TRUE is output, otherwise FALSE is output;
b) whether the number of domain names requested by attacks in a single day reaches a preset value; if so, TRUE is output, otherwise FALSE is output;
Step 3: extracting the text features of the domain name to be identified, applying dimensionality reduction and normalization to the text features, and inputting them into the trained classification model to obtain a Boolean binary result z;
Step 4: calculating the comprehensive weight w according to a weight calculation formula, wherein the weight calculation formula is:
$w = x \cdot p_1 + y \cdot p_2 + z \cdot p_3$
wherein $p_1$ is the weight value corresponding to x in the weight distribution rule, $p_2$ is the weight value corresponding to y in the weight distribution rule, and $p_3$ is the weight value corresponding to z in the weight distribution rule;
Step 5: judging whether the comprehensive weight w is greater than a threshold W; if so, the domain name to be identified is a DGA domain name.
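Steps 1 to 5 can be strung together as in the sketch below. The text does not say how the sub-checks a) to c) within a step are merged into one Boolean; taking their logical OR is an assumption of this sketch, as are the function names and the example weights and threshold.

```python
def bool_to_int(flag):
    """Map TRUE to 1 and FALSE to 0, as in Step 1."""
    return 1 if flag else 0


def recognize(blacklist_checks, behavior_checks, classifier_says_dga, weights, threshold):
    """Return True if the domain name is judged a DGA domain name.

    blacklist_checks: Booleans from Step 1 a) to c).
    behavior_checks: Booleans from Step 2 a) and b).
    classifier_says_dga: Boolean output of the trained model (Step 3).
    """
    x = bool_to_int(any(blacklist_checks))  # assumed: any blacklist hit counts
    y = bool_to_int(any(behavior_checks))   # assumed: any behavior hit counts
    z = bool_to_int(classifier_says_dga)
    p1, p2, p3 = weights
    w = x * p1 + y * p2 + z * p3            # Step 4
    return w > threshold                    # Step 5


# Example call with hypothetical weights (0.5, 0.2, 0.3) and threshold 0.4.
verdict = recognize([False, False, False], [True, False], True, (0.5, 0.2, 0.3), 0.4)
```

With these assumed values, a domain name flagged only by the classifier (w = 0.3) stays below the threshold, while one flagged by both behavior statistics and the classifier (w = 0.5) is judged a DGA domain name.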
A domain name recognition system provided by an embodiment of the present application is introduced below. The domain name recognition system described below and the domain name recognition method described above may be cross-referenced.
Referring to Fig. 3, a structural diagram of a domain name recognition system provided by an embodiment of the present application, the system comprises:
a first identification module 301, configured to determine a domain name to be identified and judge whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result;
a second identification module 302, configured to perform behavior statistics on the domain name to be identified and obtain a second recognition result according to the statistical result;
a third identification module 303, configured to extract text features of the domain name to be identified and input the text features into a trained classification model, to obtain a third recognition result;
a weight calculation module 304, configured to obtain a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
In this embodiment of the present application, known DGA domain names are stored in the domain name blacklist. The first recognition result is obtained from the result of matching the domain name to be identified against the blacklist, and the second recognition result is obtained from behavior statistics on the domain name to be identified. The classification model is trained on the text features of a training domain name set and automatically outputs the third recognition result from the text features of the domain name to be identified. The three recognition results are then combined into the final domain name recognition result. The domain name detection system provided by this embodiment thus identifies DGA domain names along three dimensions (domain name blacklist, behavior statistics, and text features), which makes the detection result more accurate.
On the basis of the above embodiment, as a preferred implementation, the first identification module 301 is specifically a module that determines the domain name to be identified and judges whether the domain name blacklist contains the domain name to be identified and/or the server IP and/or the DNS processing result of the domain name to be identified, to obtain the first recognition result.
On the basis of the above embodiment, as a preferred implementation, the second identification module 302 is specifically a module that judges whether the number of domain names requested in a single attack or within a unit time reaches a preset value, and obtains the second recognition result according to the judgment result.
On the basis of the above embodiment, as a preferred implementation, the weight calculation module 304 comprises:
a computing unit, configured to calculate a comprehensive weight according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result;
a judging unit, configured to judge whether the comprehensive weight is greater than a preset value; if so, the domain name to be identified is a DGA domain name.
On the basis of the above embodiment, as a preferred implementation, the computing unit is specifically a unit that calculates the comprehensive weight according to a weight calculation formula, wherein the weight calculation formula is:
$w = x \cdot p_1 + y \cdot p_2 + z \cdot p_3$
wherein $w$ is the comprehensive weight, $x$ is the first recognition result, $p_1$ is the weight value corresponding to the first recognition result in the weight distribution rule, $y$ is the second recognition result, $p_2$ is the weight value corresponding to the second recognition result in the weight distribution rule, $z$ is the third recognition result, and $p_3$ is the weight value corresponding to the third recognition result in the weight distribution rule.
On the basis of the above embodiment, as a preferred implementation, the system further comprises:
an obtaining module, configured to obtain a training domain name set, extract the text features of each training sample in the training domain name set, and determine the domain name recognition result of each training sample;
a training module, configured to train a classification model with the text features and the domain name recognition results, to obtain the trained classification model.
On the basis of the above embodiment, as a preferred implementation, the obtaining module is specifically a module that obtains the training domain name set, extracts the text features of each training sample in the training domain name set, and performs behavior statistics on each training sample and/or obtains the domain name recognition result of each training sample according to the domain name blacklist.
The present application also provides an electronic device. Referring to Fig. 4, a structural diagram of an electronic device provided by an embodiment of the present application, the device comprises:
a memory 100, configured to store a computer program;
a processor 200, configured to implement the steps provided by the above embodiments when executing the computer program.
Specifically, the memory 100 comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions, and the internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The processor 200 provides computing and control capabilities for the electronic device and, when executing the computer program stored in the memory 100, can implement the steps of the domain name recognition method provided by any of the above embodiments.
In this embodiment of the present application, known DGA domain names are stored in the domain name blacklist. The first recognition result is obtained from the result of matching the domain name to be identified against the blacklist, and the second recognition result is obtained from behavior statistics on the domain name to be identified. The classification model is trained on the text features of a training domain name set and automatically outputs the third recognition result from the text features of the domain name to be identified. The three recognition results are then combined into the final domain name recognition result. This embodiment thus identifies DGA domain names along three dimensions (domain name blacklist, behavior statistics, and text features), which makes the detection result more accurate.
On the basis of the above embodiments, preferably, referring to Fig. 5, the electronic equipment further include:
Input interface 300 is connected with processor 200, for obtaining computer program, parameter and the instruction of external importing, It saves through the control of processor 200 into memory 100.The input interface 300 can be connected with input unit, and it is manual to receive user The parameter or instruction of input.The input unit can be the touch layer covered on display screen, be also possible to be arranged in terminal enclosure Key, trace ball or Trackpad, be also possible to keyboard, Trackpad or mouse etc..
Display unit 400 is connected with processor 200, the data sent for video-stream processor 200.The display unit 400 It can be display screen, liquid crystal display or the electric ink display screen etc. in PC machine.It, can be with specifically, in the present embodiment Domain name recognition result etc. is shown by display unit 400.
A network port 500, connected to the processor 200, for communicating with external terminal devices. The communication technology used for the connection may be wired or wireless, such as Mobile High-Definition Link (MHL), Universal Serial Bus (USB), High-Definition Multimedia Interface (HDMI), Wireless Fidelity (WiFi), Bluetooth, Bluetooth Low Energy, or a communication technology based on IEEE 802.11s. Specifically, in this embodiment, the domain name to be identified, the trained classification model, and similar data may be imported to the processor 200 through the network port 500.
The present application also provides a computer-readable storage medium. The storage medium may include any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. A computer program is stored on the storage medium, and when executed by a processor, the computer program implements the steps of the domain name recognition method provided by any of the above embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and such improvements and modifications also fall within the protection scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include" and "comprise", and any other variants thereof, are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.

Claims (10)

1. A domain name recognition method, characterized by comprising:
determining a domain name to be identified, and judging whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result;
performing behavioral statistics on the domain name to be identified, and obtaining a second recognition result according to the statistical result;
extracting text features of the domain name to be identified, and inputting the text features into a trained classification model, to obtain a third recognition result;
obtaining a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
2. The domain name recognition method according to claim 1, characterized in that judging whether the domain name to be identified matches the domain name blacklist, to obtain the first recognition result, comprises:
judging whether the domain name blacklist contains the domain name to be identified, and/or the server IP of the domain name to be identified, and/or the DNS processing result, to obtain the first recognition result.
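The blacklist check in claim 2 can be illustrated with a minimal Python sketch. The `DomainBlacklist` class, its field names, the domain normalization, and the 0/1 encoding of the first recognition result are all assumptions made for illustration; the claim does not specify a data structure or an encoding.

```python
from dataclasses import dataclass, field


@dataclass
class DomainBlacklist:
    """Hypothetical blacklist of known DGA domains and related indicators."""
    domains: set = field(default_factory=set)
    server_ips: set = field(default_factory=set)
    dns_results: set = field(default_factory=set)


def first_recognition_result(blacklist, domain, server_ip=None, dns_result=None):
    """Return 1 if any supplied indicator is in the blacklist, else 0.

    The 0/1 encoding is an assumption; the claim only says that a first
    recognition result is obtained from the match.
    """
    # Normalize case and a trailing root dot so equivalent names compare equal.
    normalized = domain.strip().rstrip(".").lower()
    if normalized in blacklist.domains:
        return 1
    if server_ip is not None and server_ip in blacklist.server_ips:
        return 1
    if dns_result is not None and dns_result in blacklist.dns_results:
        return 1
    return 0
```

Because the claim joins the three indicators with "and/or", the sketch treats a hit on any one of them as a match; a stricter deployment could require several hits.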
3. The domain name recognition method according to claim 1, characterized in that performing behavioral statistics on the domain name to be identified, and obtaining the second recognition result according to the statistical result, comprises:
judging whether the number of domain names requested in a single attack or within a unit of time reaches a preset value, and obtaining the second recognition result according to the judgment result.
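One way to read the behavioral statistic in claim 3 is sketched below in Python. It covers only the "unit of time" case, and it assumes that requests are grouped per client into fixed time windows and that distinct domain names are counted; the log format, `window_seconds`, `threshold`, and the 0/1 encoding are hypothetical choices, not taken from the claim.

```python
from collections import defaultdict


def second_recognition_results(query_log, window_seconds=60, threshold=50):
    """Flag domains requested by a client that queries many distinct names.

    query_log: iterable of (timestamp_seconds, client_id, domain) tuples.
    Returns a dict mapping each domain to 1 (flagged) or 0 (not flagged).
    """
    # Group distinct domains by (client, fixed time window).
    buckets = defaultdict(set)
    for timestamp, client, domain in query_log:
        buckets[(client, int(timestamp // window_seconds))].add(domain)

    results = {}
    for domains in buckets.values():
        # "Reaches a preset value" is read here as greater than or equal to.
        flagged = 1 if len(domains) >= threshold else 0
        for domain in domains:
            # A domain stays flagged if any window flagged it.
            results[domain] = max(results.get(domain, 0), flagged)
    return results
```

Fixed windows keep the sketch short; a sliding window would avoid missing bursts that straddle a window boundary.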
4. The domain name recognition method according to claim 1, characterized in that obtaining the final domain name recognition result of the domain name to be identified according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result comprises:
calculating a comprehensive weight according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result;
judging whether the comprehensive weight is greater than a preset value;
if so, determining that the domain name to be identified is a DGA domain name.
5. The domain name recognition method according to claim 4, characterized in that calculating the comprehensive weight according to the preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result comprises:
calculating the comprehensive weight according to a weight calculation formula, wherein the weight calculation formula is:
w = x*p1 + y*p2 + z*p3
where w is the comprehensive weight, x is the first recognition result, p1 is the weight value corresponding to the first recognition result in the weight distribution rule, y is the second recognition result, p2 is the weight value corresponding to the second recognition result in the weight distribution rule, z is the third recognition result, and p3 is the weight value corresponding to the third recognition result in the weight distribution rule.
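The weighted combination of claims 4 and 5 can be sketched in a few lines of Python. The formula follows the claim; the example weights `(0.5, 0.2, 0.3)`, the threshold `0.5`, and the 0/1 encoding of the three recognition results are assumptions chosen only to make the example concrete.

```python
def comprehensive_weight(x, y, z, p1, p2, p3):
    """Compute w = x*p1 + y*p2 + z*p3 as given in claim 5."""
    return x * p1 + y * p2 + z * p3


def is_dga(x, y, z, weights=(0.5, 0.2, 0.3), preset=0.5):
    """Return True if the comprehensive weight is greater than the preset value.

    weights and preset are illustrative values, not taken from the patent.
    """
    p1, p2, p3 = weights
    return comprehensive_weight(x, y, z, p1, p2, p3) > preset
```

With these illustrative weights, a blacklist hit alone gives w = 0.5, which does not exceed the preset value, while a blacklist hit plus a positive classifier result gives w = 0.8 and is reported as a DGA domain name.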
6. The domain name recognition method according to any one of claims 1 to 5, characterized by further comprising:
obtaining a training domain name set, extracting the text features of each training sample in the training domain name set, and determining the domain name recognition result of each training sample;
training a classification model using the text features and the domain name recognition results, to obtain the trained classification model.
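The training step in claim 6 can be illustrated with a small, dependency-free Python sketch. The claim names neither the text features nor the model, so everything here is an assumption: the four features (label length, Shannon entropy, digit ratio, vowel ratio, all computed on the leftmost label) and the nearest-centroid classifier stand in for whatever features and classification model an implementation would actually use.

```python
import math
from collections import Counter


def text_features(domain):
    """Return [length, entropy, digit ratio, vowel ratio] of the leftmost label."""
    label = domain.strip().rstrip(".").lower().split(".")[0]
    n = len(label)
    if n == 0:
        return [0.0, 0.0, 0.0, 0.0]
    counts = Counter(label)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    digit_ratio = sum(ch.isdigit() for ch in label) / n
    vowel_ratio = sum(ch in "aeiou" for ch in label) / n
    return [float(n), entropy, digit_ratio, vowel_ratio]


def train_centroids(samples):
    """Train a nearest-centroid model.

    samples: list of (domain, label) pairs, label 1 = DGA, 0 = benign.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums, counts = {}, Counter()
    for domain, label in samples:
        feats = text_features(domain)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, domain):
    """Return the label whose centroid is closest to the domain's features."""
    feats = text_features(domain)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))
```

The features are left unscaled, so label length dominates the distance; a real implementation would likely normalize the features or use a stronger model such as a random forest.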
7. The domain name recognition method according to claim 6, characterized in that determining the domain name recognition result of each training sample comprises:
performing behavioral statistics on each training sample and/or obtaining the domain name recognition result of each training sample according to the domain name blacklist.
8. A domain name recognition system, characterized by comprising:
a first identification module, for determining a domain name to be identified, and judging whether the domain name to be identified matches a domain name blacklist, to obtain a first recognition result;
a second identification module, for performing behavioral statistics on the domain name to be identified, and obtaining a second recognition result according to the statistical result;
a third identification module, for extracting text features of the domain name to be identified, and inputting the text features into a trained classification model, to obtain a third recognition result;
a weight calculation module, for obtaining a final domain name recognition result of the domain name to be identified according to a preset weight distribution rule, the first recognition result, the second recognition result, and the third recognition result.
9. An electronic device, characterized by comprising:
a memory, for storing a computer program;
a processor, for implementing the steps of the domain name recognition method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the domain name recognition method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811277414.3A CN109450886A (en) 2018-10-30 2018-10-30 A kind of domain name recognition methods, system and electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811277414.3A CN109450886A (en) 2018-10-30 2018-10-30 A kind of domain name recognition methods, system and electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109450886A true CN109450886A (en) 2019-03-08

Family

ID=65549300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811277414.3A Pending CN109450886A (en) 2018-10-30 2018-10-30 A kind of domain name recognition methods, system and electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109450886A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120084860A1 (en) * 2010-10-01 2012-04-05 Alcatel-Lucent Usa Inc. System and method for detection of domain-flux botnets and the like
CN105577660A (en) * 2015-12-22 2016-05-11 国家电网公司 DGA domain name detection method based on random forest
CN108632227A (en) * 2017-03-23 2018-10-09 中国移动通信集团广东有限公司 A kind of malice domain name detection process method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李琳: "《云计算与大数据实验教材系列 MAHOUT实验指南》", 30 April 2017 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110674370A (en) * 2019-09-23 2020-01-10 鹏城实验室 Domain name identification method and device, storage medium and electronic equipment
CN110808987A (en) * 2019-11-07 2020-02-18 南京亚信智网科技有限公司 Method and computing device for identifying malicious domain name
CN110808987B (en) * 2019-11-07 2022-03-29 南京亚信智网科技有限公司 Method and computing device for identifying malicious domain name
CN112839012B (en) * 2019-11-22 2023-05-09 中国移动通信有限公司研究院 Bot domain name identification method, device, equipment and storage medium
CN112839012A (en) * 2019-11-22 2021-05-25 中国移动通信有限公司研究院 Zombie program domain name identification method, device, equipment and storage medium
CN114285587A (en) * 2020-09-17 2022-04-05 中国电信股份有限公司 Domain name identification method and device and domain name classification model acquisition method and device
CN114285587B (en) * 2020-09-17 2023-10-10 中国电信股份有限公司 Domain name identification method and device and domain name classification model acquisition method and device
CN112866257A (en) * 2021-01-22 2021-05-28 网宿科技股份有限公司 Domain name detection method, system and device
CN112866257B (en) * 2021-01-22 2023-09-26 网宿科技股份有限公司 Domain name detection method, system and device
CN114978558A (en) * 2021-02-20 2022-08-30 中国电信股份有限公司 Domain name recognition method and device, computer device and storage medium
CN114363290B (en) * 2021-12-31 2023-08-29 恒安嘉新(北京)科技股份公司 Domain name identification method, device, equipment and storage medium
CN114363290A (en) * 2021-12-31 2022-04-15 恒安嘉新(北京)科技股份公司 Domain name identification method, device, equipment and storage medium
CN114785601A (en) * 2022-04-25 2022-07-22 中国农业银行股份有限公司 Rule matching optimization method and device
WO2024031884A1 (en) * 2022-08-08 2024-02-15 天翼安全科技有限公司 Method and apparatus for determining domain name homology, electronic device, and storage medium

Similar Documents

Publication Publication Date Title
CN109450886A (en) A kind of domain name recognition methods, system and electronic equipment and storage medium
CN109922032B (en) Method, device, equipment and storage medium for determining risk of logging in account
CN109302410B (en) Method and system for detecting abnormal behavior of internal user and computer storage medium
CN104836781B (en) Distinguish the method and device for accessing user identity
WO2017107422A1 (en) Method and device for user gender identification
CN110851835A (en) Image model detection method and device, electronic equipment and storage medium
CN110442712B (en) Risk determination method, risk determination device, server and text examination system
CN106209862A (en) A kind of steal-number defence implementation method and device
CN112863683B (en) Medical record quality control method and device based on artificial intelligence, computer equipment and storage medium
CN111741002B (en) Method and device for training network intrusion detection model
CN108234472A (en) Detection method and device, computer equipment and the readable medium of Challenging black hole attack
CN111931809A (en) Data processing method and device, storage medium and electronic equipment
CN112395118A (en) Equipment data detection method and device
CN114143049A (en) Abnormal flow detection method, abnormal flow detection device, storage medium and electronic equipment
CN109391620A (en) Method for building up, system, server and the storage medium of abnormal behaviour decision model
CN115758282A (en) Cross-modal sensitive information identification method, system and terminal
CN106301979A (en) The method and system of the abnormal channel of detection
CN113435531B (en) Zero sample image classification method and system, electronic equipment and storage medium
CN113192639A (en) Training method, device and equipment of information prediction model and storage medium
CN110995681B (en) User identification method and device, electronic equipment and storage medium
CN108734011A (en) software link detection method and device
CN115119197B (en) Wireless network risk analysis method, device, equipment and medium based on big data
CN111191238A (en) Webshell detection method, terminal device and storage medium
CN116168403A (en) Medical data classification model training method, classification method, device and related medium
CN117176368A (en) Terminal-side privacy risk assessment method and device, medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190308

RJ01 Rejection of invention patent application after publication