CN106230867A - Method and system for predicting whether a domain name is malicious, and model training method and system therefor - Google Patents

Method and system for predicting whether a domain name is malicious, and model training method and system therefor

Info

Publication number
CN106230867A
Authority
CN
China
Prior art keywords
domain name
feature
model
malice
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610868349.6A
Other languages
Chinese (zh)
Inventor
杨旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Knownsec Information Technology Co Ltd
Original Assignee
Beijing Knownsec Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Knownsec Information Technology Co Ltd
Priority to CN201610868349.6A
Publication of CN106230867A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101 Access control lists [ACL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer And Data Communications (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a method for predicting whether a domain name is malicious, comprising: obtaining a domain name that is potentially about to be accessed from at least one of a user's instant messaging messages, short messages, e-mails, and web pages; extracting a first feature that reflects the character composition of the domain name; and predicting whether the domain name is malicious by applying a pre-established first model to the extracted first feature. The invention also discloses a corresponding system for predicting whether a domain name is malicious, as well as a training method and a training system for the model that predicts whether a domain name is malicious.

Description

Method and system for predicting whether a domain name is malicious, and model training method and system therefor
Technical field
The present invention relates to the field of information security technology, and in particular to a method and a system for predicting whether a domain name is malicious, and to a model training method and system therefor.
Background art
With the rapid development of network communication technology, the continued deepening of Internet applications, and the growing richness of the information carried, the Internet has become an important infrastructure of human society; at the same time, network security problems are becoming ever more serious. In particular, the Domain Name System (DNS) and its supporting technologies serve as the Internet entry point for the vast majority of users and application systems. DNS decouples IP addresses from domain names and can be configured flexibly, and as it has become widely deployed, these same properties have made it very easy for hackers to exploit. Typically, a hacker uses a domain name generation tool to generate malicious domain names; when an intranet host accesses a botnet through such a malicious domain name, the host becomes a compromised bot, and the hacker thereby controls the whole botnet. Because a command-and-control (C&C) server with a fixed domain name is easily taken over by security personnel, hackers often use a domain generation algorithm (DGA) to generate random, dynamic malicious domain names. Given the harm that malicious domain names do to the Internet, security personnel need to predict in advance whether the domain names that are potentially about to be accessed are malicious.
At present, malicious domain name identification methods based on static features offer high performance and real-time response; static features here generally refer to the lexical features of a domain name (such as the proportion of special characters, the domain name length, and so on). However, neither the accuracy nor the recall of these methods is satisfactory. Malicious domain name identification methods based on dynamic features are generally built on raw data obtained by means such as active probing and DNS records; although their recognition performance is better, they cannot respond in real time and their application conditions are demanding.
Therefore, in an environment where DNS technology has become a protective barrier for hackers, and where using domain names as the communication basis of a botnet makes the botnet markedly more robust and the physical location of the C&C server harder to pin down, a more effective scheme for predicting whether a domain name is malicious is needed.
Summary of the invention
To this end, the present invention provides a method and a system for predicting whether a domain name is malicious, and a model training method and system therefor, in an effort to solve or at least alleviate at least one of the problems above.
According to one aspect of the invention, there is provided a training method for a model that predicts whether a domain name is malicious, comprising: obtaining a domain name that is potentially about to be accessed from at least one of a user's instant messaging messages, short messages, e-mails, and web pages; querying whether the domain name exists in a preset domain name blacklist/whitelist library; if it does not, extracting features that reflect whether the domain name is malicious; predicting whether the domain name is malicious by applying the model to the extracted features, and judging whether the domain name is malicious according to the model's prediction result; if the domain name is judged malicious, training the model with the domain name; and if the domain name is judged normal, training the model with the domain name with a predetermined probability. Training the model with the domain name comprises: performing malicious-traffic detection by accessing the link address corresponding to the domain name, and taking the detection result as the actual determination of whether the domain name is malicious; and training the model with the extracted features of the domain name and the actual determination as one training sample.
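The feedback rule in the training method above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `should_train`, `online_training_step`, `detect_traffic`, and `train`, and the default probability of 0.1, are hypothetical and do not appear in the source.

```python
import random


def should_train(judged_malicious, normal_sample_prob, rng):
    """Malicious verdicts always feed training; normal ones only with a fixed probability."""
    if judged_malicious:
        return True
    return rng.random() < normal_sample_prob


def online_training_step(domain, features, judged_malicious,
                         detect_traffic, train,
                         normal_sample_prob=0.1, rng=None):
    """Run one feedback step; return the actual label used, or None if skipped.

    detect_traffic(domain) -> bool and train(features, label) are hypothetical
    callbacks standing in for malicious-traffic detection and the model update.
    """
    rng = rng or random.Random()
    if not should_train(judged_malicious, normal_sample_prob, rng):
        return None
    # The actual determination comes from visiting the link and inspecting its traffic.
    actual_label = detect_traffic(domain)
    train(features, actual_label)
    return actual_label
```

Sampling only a fraction of the domains judged normal keeps the cost of traffic detection bounded while still letting the model see corrections for false negatives.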
Optionally, in the training method according to the invention, the model that predicts whether a domain name is malicious includes a first model. The step of extracting features that reflect whether the domain name is malicious comprises: extracting a first feature that reflects the character composition of the domain name. The step of training the model with the extracted features and the actual determination as one training sample comprises: training the first model with the extracted first feature of the domain name and the actual determination as one training sample.
According to another aspect of the invention, there is provided a training system for a model that predicts whether a domain name is malicious, the system comprising: a potential domain name discovery module, adapted to obtain a domain name that is potentially about to be accessed from at least one of a user's instant messaging messages, short messages, e-mails, and web pages; a blacklist/whitelist query module, adapted to query whether the domain name exists in a preset domain name blacklist/whitelist library; a domain name feature extraction module, adapted to extract features that reflect whether the domain name is malicious; a malicious domain name prediction module, adapted to, if the blacklist/whitelist query module finds that the domain name does not exist in the preset library, have the domain name feature extraction module extract the features that reflect whether the domain name is malicious, predict whether the domain name is malicious by applying the model to the extracted features, and judge whether the domain name is malicious according to the model's prediction result; a model online training module, adapted to train the model with the domain name if the domain name is judged malicious, and further adapted to train the model with the domain name with a predetermined probability if the domain name is judged normal; and a malicious traffic detection module, adapted to perform malicious-traffic detection by accessing the link address corresponding to the domain name. The model online training module is further adapted to have the malicious traffic detection module perform malicious-traffic detection by accessing the link address corresponding to the domain name, to take the detection result as the actual determination of whether the domain name is malicious, and to train the model with the features extracted by the domain name feature extraction module and the actual determination as one training sample.
Optionally, in the training system according to the invention, the model that predicts whether a domain name is malicious includes a first model; the domain name feature extraction module is further adapted to extract a first feature that reflects the character composition of the domain name; and the model online training module is further adapted to train the first model with the extracted first feature of the domain name and the actual determination as one training sample.
According to another aspect of the invention, there is provided a method for predicting whether a domain name is malicious, comprising: obtaining a domain name that is potentially about to be accessed from at least one of a user's instant messaging messages, short messages, e-mails, and web pages; extracting a first feature that reflects the character composition of the domain name, the first feature including at least one of the following feature items: the similarity to whitelisted domain names; the total number of fields in the domain name as separated by dots; the total number of characters in the domain name; and, each computed on the domain name after segments similar to whitelisted domain names have been removed: the number of groups, the total number of characters, the ratio of digit characters to the total number of characters, the ratio of vowel characters to the total number of characters, the ratio of consonant characters to the total number of characters, the ratio of characters other than letters and digits to the total number of characters, the count of the most frequently repeated character, the maximum length of a purely numeric substring, and the maximum length of a substring of consecutive non-vowel characters; and predicting whether the domain name is malicious by applying a pre-established first model to the extracted first feature.
Optionally, the method according to the invention further comprises: extracting a second feature that reflects the registration information of the domain name and a third feature that reflects the host information of the domain name. The second feature includes at least one of the following feature items: the domain name service provider, and whether the domain name has applied for privacy protection. The third feature includes at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate subject, SSL certificate subject public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version. Whether the domain name is malicious is then predicted by applying a pre-established second model to the extracted first, second, and third features.
According to yet another aspect of the invention, there is provided a system for predicting whether a domain name is malicious, the system comprising: a potential domain name discovery module, adapted to obtain a domain name that is potentially about to be accessed from at least one of a user's instant messaging messages, short messages, e-mails, and web pages; a domain name feature extraction module, adapted to extract a first feature that reflects the character composition of the domain name, the first feature including at least one of the following feature items: the similarity to whitelisted domain names; the total number of fields in the domain name as separated by dots; the total number of characters in the domain name; and, each computed on the domain name after segments similar to whitelisted domain names have been removed: the number of groups, the total number of characters, the ratio of digit characters to the total number of characters, the ratio of vowel characters to the total number of characters, the ratio of consonant characters to the total number of characters, the ratio of characters other than letters and digits to the total number of characters, the count of the most frequently repeated character, the maximum length of a purely numeric substring, and the maximum length of a substring of consecutive non-vowel characters; and a malicious domain name prediction module, adapted to predict whether the domain name is malicious by applying a pre-established first model to the extracted first feature.
Optionally, in the system according to the invention, the domain name feature extraction module is further adapted to extract a second feature that reflects the registration information of the domain name and a third feature that reflects the host information of the domain name. The second feature includes at least one of the following feature items: the domain name service provider, and whether the domain name has applied for privacy protection. The third feature includes at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate subject, SSL certificate subject public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version. The malicious domain name prediction module is further adapted to predict whether the domain name is malicious by applying a pre-established second model to the extracted first, second, and third features.
The invention targets the characteristics of malicious domain names: it extracts features that reflect whether a domain name is malicious and applies a pre-established model to those features to predict whether the domain name is malicious, which improves prediction accuracy. At the same time, the model is trained with both offline samples and online samples, so that it adapts to newly generated malicious domain names and the accuracy of malicious domain name prediction improves further. There is no need to collect training sample data manually in order to retrain the model, which greatly reduces labor cost.
Brief description of the drawings
To achieve the above and related purposes, certain illustrative aspects are described herein in conjunction with the following description and drawings. These aspects indicate various ways in which the principles disclosed herein may be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features, and advantages of the disclosure will become more apparent from the following detailed description read in conjunction with the drawings. Throughout the disclosure, the same reference numerals generally refer to the same parts or elements.
Fig. 1 shows a structural block diagram of a system 100 for predicting whether a domain name is malicious according to an exemplary embodiment of the invention;
Fig. 2 shows a flow chart of a method 200 for predicting whether a domain name is malicious according to an exemplary embodiment of the invention;
Fig. 3 shows a structural block diagram of a training system 300 for a model that predicts whether a domain name is malicious according to an exemplary embodiment of the invention; and
Fig. 4 shows a flow chart of a training method 400 for a model that predicts whether a domain name is malicious according to an exemplary embodiment of the invention.
Detailed description
Exemplary embodiments of the disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Domain names are a prerequisite for addressed access and lie at the core of the Internet's various applications. By building a model, targeted at domain generation algorithms (DGAs), that predicts whether a domain name is malicious, the invention predicts whether a domain name the user may access is malicious and can then warn the user of the malicious behavior they may encounter. Meanwhile, it obtains the actual determination of whether the domain name is malicious by simulating the user's actual access to the domain name, and uses this to adjust the model's parameters online, thereby adapting to newly emerging DGAs.
Fig. 1 shows a system 100 for predicting whether a domain name is malicious according to an exemplary embodiment of the invention. System 100 includes a potential domain name discovery module 110 and a malicious domain name prediction module 120.
The potential domain name discovery module 110 is adapted to obtain a domain name that is potentially about to be accessed from at least one of the following sources: the user's instant messaging messages, short messages, e-mails, and web pages. It will be appreciated that instant messaging messages, short messages, e-mails, and web pages visited in a browser can all spread link addresses. These link addresses contain domain names, which are regarded as domain names the user is potentially about to access; for security reasons, it is necessary to predict whether they are malicious.
The potential domain name discovery module 110 can be deployed on the server side of instant messaging software such as QQ and WeChat, or on an operator's SMS gateway or customer service gateway, where it extracts domain names from the link addresses carried in the user's real-time web requests, short messages, e-mails, or instant messaging messages.
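The extraction step performed by module 110 can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how link addresses are recognized, so the regular expression `URL_PATTERN` and the function name `extract_domains` are hypothetical, and only `http`/`https` links are handled.

```python
import re
from typing import List
from urllib.parse import urlsplit

# Assumption: link addresses are plain http(s) URLs embedded in message text.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_domains(message: str) -> List[str]:
    """Return the distinct host names of the links in a message, in order of appearance."""
    domains = []
    for url in URL_PATTERN.findall(message):
        host = urlsplit(url).hostname  # lower-cased, with any port stripped
        if host and host not in domains:
            domains.append(host)
    return domains
```

For example, `extract_domains("see http://Login.Example.com/a?b=1 now")` yields `["login.example.com"]`.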
According to one embodiment of the invention, before a domain name is predicted, it may first be filtered against blacklists and whitelists. System 100 may include a blacklist/whitelist query module 130, which is connected to the potential domain name discovery module 110, holds a preset domain name blacklist/whitelist library, and can query whether the extracted domain name exists in that library. If the domain name is found in the library, the blacklist/whitelist query module 130 can determine from the query result whether the domain name is malicious and return the determination to the user: a domain name in the whitelist is normal, and a domain name in the blacklist is malicious.
If the blacklist/whitelist query module 130 finds that the domain name is not in the domain name blacklist/whitelist library, it passes the domain name to the malicious domain name prediction module 120 for prediction.
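The filtering step above can be sketched as a simple set lookup. This is a minimal sketch: `Verdict` and `query_lists` are hypothetical names, the in-memory sets stand in for the preset library, and checking the whitelist before the blacklist is an assumption the source does not specify.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "normal"
    MALICIOUS = "malicious"
    UNKNOWN = "unknown"  # not listed: hand over to the prediction module


def query_lists(domain, whitelist, blacklist):
    """Look a domain up in the preset lists before any model is consulted."""
    name = domain.lower().rstrip(".")  # normalise case and a trailing root dot
    if name in whitelist:
        return Verdict.NORMAL
    if name in blacklist:
        return Verdict.MALICIOUS
    return Verdict.UNKNOWN
```

Only domains that come back as `Verdict.UNKNOWN` need feature extraction and model prediction, which keeps the common case cheap.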
As shown in Fig. 1, system 100 may further include a domain name feature extraction module 140, which is connected to the malicious domain name prediction module 120 and adapted to extract features that reflect whether the domain name is malicious. On receiving a domain name to be predicted, the malicious domain name prediction module 120 sends it to the domain name feature extraction module 140 for feature extraction. After the domain name feature extraction module 140 has extracted the features, the malicious domain name prediction module 120 applies a pre-established model that predicts whether a domain name is malicious to the extracted features, and judges whether the domain name is malicious according to the model's prediction result.
Specifically, the model that predicts whether a domain name is malicious includes a first model, which here can be a model built from a first feature that reflects the character composition of the domain name. It will be appreciated that the character composition of malicious domain names shows different characteristics depending on the malicious purpose. For example, phishing and fraud websites often use character compositions that look or sound similar to the domain name of the website being counterfeited. Domain names used to spread malicious programs are usually generated automatically by a DGA; to keep producing large numbers of distinct domain names, the generated names usually follow certain patterns, such as containing digits or not following normal spelling rules. The first feature can therefore include at least one of the following feature items: the similarity to whitelisted domain names, which can be computed with a widely used string similarity comparison algorithm; the total number of fields in the domain name as separated by dots; the total number of characters in the domain name; the number of groups in the domain name after segments similar to whitelisted domain names are removed; the total number of characters in the domain name after such segments are removed; the ratio of digit characters to the total number of characters after such segments are removed; the ratio of vowel characters to the total number of characters after such segments are removed; the ratio of consonant characters to the total number of characters after such segments are removed; the ratio of characters other than letters and digits to the total number of characters after such segments are removed; the count of the most frequently repeated character after such segments are removed; the maximum length of a purely numeric substring after such segments are removed; and the maximum length of a substring of consecutive non-vowel characters after such segments are removed.
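Several of the character-composition feature items listed above can be computed as in the sketch below. It is illustrative only: the function name `character_features` and the dictionary keys are hypothetical, dots are assumed not to count as characters, and the input is assumed to be the domain name after any whitelist-similar segments have already been removed.

```python
import re
from collections import Counter

VOWELS = set("aeiou")


def character_features(name: str) -> dict:
    """Compute character-composition features of a dotted name (dots excluded from counts)."""
    chars = [c for c in name.lower() if c != "."]
    total = len(chars)
    if total == 0:
        raise ValueError("name contains no characters")
    digits = sum(c.isdigit() for c in chars)
    vowels = sum(c in VOWELS for c in chars)
    consonants = sum(c.isalpha() and c not in VOWELS for c in chars)
    others = total - digits - vowels - consonants
    joined = "".join(chars)
    return {
        "field_count": len(name.split(".")),
        "char_count": total,
        "digit_ratio": digits / total,
        "vowel_ratio": vowels / total,
        "consonant_ratio": consonants / total,
        "other_ratio": others / total,
        "max_repeat": max(Counter(chars).values()),
        "longest_digit_run": max((len(m) for m in re.findall(r"[0-9]+", joined)), default=0),
        "longest_non_vowel_run": max((len(m) for m in re.findall(r"[^aeiou]+", joined)), default=0),
    }
```

A DGA-style name such as `x1y2z33k.info` scores a high digit ratio and a long run of non-vowel characters, which is the kind of signal the first model relies on.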
The reason for removing segments similar to whitelisted domain names is as follows: to evade detection systems and to trick users into believing that they are visiting a legitimate website, a DGA sometimes adds part of the character string of a common legitimate domain name alongside the randomly generated part of the domain name. These interfering elements therefore need to be cleaned out before the random core part generated by the DGA is analyzed.
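The cleaning step can be sketched as below. The source does not name a similarity algorithm or a threshold, so this sketch swaps in Python's `difflib.SequenceMatcher` and an assumed threshold of 0.8; `strip_whitelist_segments` and `whitelist_labels` are hypothetical names, and the domain is assumed to be split into segments at dots.

```python
from difflib import SequenceMatcher


def strip_whitelist_segments(domain, whitelist_labels, threshold=0.8):
    """Drop dot-separated segments that closely resemble a whitelisted label."""
    kept = []
    for label in domain.lower().split("."):
        # Best similarity of this segment against any whitelisted label.
        best = max(
            (SequenceMatcher(None, label, w).ratio() for w in whitelist_labels),
            default=0.0,
        )
        if best < threshold:
            kept.append(label)
    return kept
```

With `whitelist_labels = {"paypal"}`, both `paypal` and the look-alike `paypa1` are removed, leaving the remaining segments for character-composition analysis.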
After the domain name feature extraction module 140 has extracted the first feature described above, the malicious domain name prediction module 120 can apply the pre-established first model to the extracted first feature to predict whether the domain name is malicious, judge whether the domain name is malicious according to the first model's prediction result, and return the judgment to the user. If the first model's prediction result is malicious, the domain name is judged malicious; if the prediction result is normal, the domain name is judged normal.
According to another embodiment of the invention, the model that predicts whether a domain name is malicious can also include a second model. The second model here can be a model built from the first feature, which reflects the character composition of the domain name, a second feature, which reflects the registration information of the domain name, and a third feature, which reflects the host information of the domain name. Regarding the second feature: first, DGAs have a certain correlation with domain name service providers; second, malicious domain names usually enable privacy protection to prevent others from obtaining their registration information. The second feature can therefore include at least one of two feature items: the domain name service provider, and whether the domain name has applied for privacy protection.
Regarding the third feature, investigation of open-source information on the Internet shows that hacker groups may have certain preferences in the hosting services they use. Moreover, when a hacker group controls puppet machines to make them serve it, it usually does so by exploiting particular vulnerabilities in the puppet machines' infrastructure. The operating system, database, web components, and other infrastructure of the host where a malicious domain name resides are therefore important evidence for identifying malicious domain names. In addition, when a hacker group uses the SSL (Secure Sockets Layer) protocol, it usually deploys a large number of hosts in batches and so tends to use the same SSL certificate. Naturally, the SSL certificate is also a reliable feature for determining whether a domain name is malicious. The third feature can therefore include at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate subject, SSL certificate subject public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version.
After the domain name feature extraction module 140 has extracted the first, second, and third features described above, the malicious domain name prediction module 120 can also apply the pre-established second model to the extracted first, second, and third features to predict whether the domain name is malicious.
The malicious domain name prediction module 120 can judge whether the domain name is malicious according to the prediction results of the first model and the second model, and return the judgment to the user. If the prediction result of either the first model or the second model is malicious, the malicious domain name prediction module 120 can judge the domain name malicious; if the prediction results of both models are normal, it can judge the domain name normal.
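The combination rule just described amounts to a logical OR over the two models' outputs. The sketch below shows it in isolation; `judge_domain` is a hypothetical name, and each model is assumed to have already produced a boolean prediction.

```python
def judge_domain(first_model_malicious: bool, second_model_malicious: bool) -> str:
    """Judge a domain malicious if either model predicts malicious, otherwise normal."""
    if first_model_malicious or second_model_malicious:
        return "malicious"
    return "normal"
```

An OR rule favors recall: a domain is only reported as normal when every model consulted agrees, at the cost of inheriting each model's false positives.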
According to another embodiment of the invention, the model that predicts whether a domain name is malicious can also include a third model. The third model here can be a (logistic regression) model built from the first feature, which reflects the character composition of the domain name, the second feature, which reflects the registration information of the domain name, and a fourth feature, which can be used to detect Fast-Flux networks. A botnet is a collection of infected systems (sometimes called zombies) that an attacker can control through a command-and-control system in order to attack other users and websites on the Internet. Fast-Flux technology uses DNS to hide the source of an attack and is therefore widely used by hackers. In normal DNS operation, a domain name is passed to a DNS resolver, which returns the corresponding IP address. With Fast-Flux, an attacker can link a set of multiple IP addresses to one specific domain name and swap new addresses in and out of the DNS records, which effectively evades detection. Botnets that use Fast-Flux technology therefore pose a great threat to network security, and identifying and detecting Fast-Flux networks is of great significance to network security.
In the present invention, at least can include in following characteristics item for detecting the fourth feature of Fast-Flux network Individual: domain name life cycle, maliciously the life cycle of domain name is the shortest;Unduplicated IP number of addresses in A record, can be used for commenting Estimate the quantity of puppet's main frame in Fast-Flux network;The unduplicated autonomous system that in A record, unduplicated IP address is corresponding Number, can be used for differentiation is Fast-Flux network or CDN;The unduplicated net that in A record, unduplicated IP address is corresponding In network number, A record, in unduplicated tissue number corresponding to unduplicated IP address, A record, to carry out DNS inverse in unduplicated IP address Resolve the unduplicated FQDN number obtained to inquiry, these 3 characteristic items can distinguish Fast-Flux network or same tissue Internal network;The name server quantity used in inquiry of the domain name, Fast-Flux network generally uses relatively more names Claim server;Unduplicated IP address ratio in malicious IP addresses storehouse in DNS record ageing time (TTL), A record, i.e. A IP address sum in the counting/A record in malicious IP addresses storehouse of the IP address in record;And an inquiry of the domain name is used To name server address ratio in malice name server storehouse, the name server used in i.e. one time inquiry of the domain name The name server sum used in a quantity/time inquiry of the domain name in malice name server storehouse.These characteristic items are the most permissible By obtaining after the convergence of DNS data in a period of time and Merging.
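As an illustration only, a few of the fourth-feature items listed above could be computed from already aggregated DNS data roughly as follows. This is a minimal sketch: the function name `fast_flux_features`, its parameters, and the dictionary keys are hypothetical, and the lookup table from IP address to autonomous system number is assumed to be supplied by the caller.

```python
from typing import Dict, List, Set


def fast_flux_features(a_record_ips: List[str],
                       ip_to_asn: Dict[str, int],
                       name_servers: List[str],
                       malicious_ips: Set[str],
                       malicious_name_servers: Set[str],
                       ttl_seconds: int) -> Dict[str, float]:
    """Compute a subset of the Fast-Flux feature items (hypothetical layout)."""
    unique_ips = set(a_record_ips)      # distinct IP addresses in the A records
    unique_ns = set(name_servers)       # distinct name servers seen in one query
    # Distinct autonomous systems; IPs missing from the lookup table are skipped.
    unique_asns = {ip_to_asn[ip] for ip in unique_ips if ip in ip_to_asn}
    return {
        "unique_ip_count": len(unique_ips),
        "unique_asn_count": len(unique_asns),
        "name_server_count": len(unique_ns),
        "ttl_seconds": ttl_seconds,
        # Share of distinct A-record IPs found in the malicious IP library.
        "malicious_ip_ratio": (len(unique_ips & malicious_ips) / len(unique_ips)
                               if unique_ips else 0.0),
        # Share of name servers found in the malicious name server library.
        "malicious_ns_ratio": (len(unique_ns & malicious_name_servers) / len(unique_ns)
                               if unique_ns else 0.0),
    }
```

The remaining items (network, organization, and reverse-lookup FQDN counts, and the domain name lifetime) would need external data sources and are left out of this sketch.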
After the domain name feature extraction module 140 extracts the above first, second, and fourth features, the malicious domain name prediction module 120 can also use a pre-built third model to predict, from the extracted first, second, and fourth features, whether the domain name is malicious.
The malicious domain name prediction module 120 can determine whether the domain name is malicious according to the prediction results of the first model, the second model, and the third model, and return the determination result to the user.
According to an embodiment of the present invention, if the prediction result of any one of the first model, the second model, and the third model is malicious, the malicious domain name prediction module 120 determines that the domain name is malicious; if the prediction results of the first model, the second model, and the third model are all normal, it determines that the domain name is normal.
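The combination rule described above (malicious if any model says malicious, normal only if all say normal) amounts to a logical OR over the individual prediction results. A minimal sketch, with the hypothetical function name `combine_predictions`:

```python
from typing import Iterable


def combine_predictions(model_says_malicious: Iterable[bool]) -> bool:
    """Return True (malicious) if any model flags the domain, else False (normal)."""
    return any(model_says_malicious)
```

The same function covers the two-model and three-model cases, since it accepts any number of prediction results.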
In addition, the malicious domain name prediction module 120 can select at least one of the first model, the second model, and the third model to determine whether the domain name is malicious, and extract the corresponding features; the present invention does not limit this. Because the features required by the third model are obtained only after aggregating and merging DNS data, the third model is generally suited to scenarios in which domain names are predicted in non-real time.
According to yet another embodiment of the present invention, if the malicious domain name prediction module 120 determines that the domain name is malicious, the potential domain name discovery module 110 can also link with the user service system according to that determination result to intervene in user behavior, for example by blocking the user's Internet traffic, filtering SMS messages that contain the malicious domain name, or pushing notification messages in instant messaging software.
The first model, the second model, and the third model of the present invention are described in detail below.
The first model, the second model, and the third model can all be logistic regression models, whose loss function and mapping function are as follows:
$Z = \{(X_j, y_j) \mid j = 1, 2, \ldots, M\}$;
$y_j = h(W, X_j)$,
where $l(W, Z)$ is the loss function, $W$ is the feature weight, $Z$ is the sample set, $X_j$ is the feature vector of the $j$-th sample (composed of the features used to build the model), $y_j$ is the predicted value, and $h(W, X_j)$ is the mapping function from the feature vector $X_j$ to the predicted value $y_j$.
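For a logistic regression model, the mapping function is conventionally the sigmoid of the weighted feature sum, and the loss is the logarithmic loss over the sample set. The following is a minimal sketch under that conventional assumption; the names `h` and `log_loss` mirror the symbols above but are otherwise hypothetical, and averaging the loss over the samples is a choice made here for illustration.

```python
import math
from typing import List, Sequence, Tuple


def h(weights: Sequence[float], features: Sequence[float]) -> float:
    """Map a feature vector to a predicted probability with the sigmoid function."""
    z = sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid: avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(weights: Sequence[float],
             samples: List[Tuple[Sequence[float], int]]) -> float:
    """Average logarithmic loss of the model over (feature vector, label) samples."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for features, label in samples:
        p = min(max(h(weights, features), eps), 1.0 - eps)
        total += -label * math.log(p) - (1 - label) * math.log(1.0 - p)
    return total / len(samples)
```

With all weights at zero, every prediction is 0.5 and the average loss is the natural logarithm of 2, which is a convenient sanity check before training.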
Fig. 2 shows a method 200 for predicting whether a domain name is malicious according to an exemplary embodiment of the present invention. The method 200 begins at step S210.
In step S210, a potential domain name to be accessed is obtained from at least one of the user's instant messaging messages, SMS messages, emails, and web pages.
Then, in step S220, a first feature that reflects the character composition of the domain name is extracted, where the first feature can include at least one of the following feature items: the similarity to a whitelisted domain name; the total number of fields in the domain name separated by periods; the total number of characters in the domain name; and, each computed after removing the segments similar to a whitelisted domain name: the number of groups in the domain name, the total number of characters in the domain name, the ratio of digit characters to the total number of characters, the ratio of vowel characters to the total number of characters, the ratio of consonant characters to the total number of characters, the ratio of characters other than letters and digits to the total number of characters, the count of the most frequently repeated character, the maximum length of a purely numeric string, and the maximum length of a string of consecutive non-vowel characters.
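Several of the character-composition items in step S220 can be sketched as below. This is an illustration under simplifying assumptions: the whitelist similarity and the removal of whitelist-similar segments are omitted, periods are excluded from the character counts, and the names `lexical_features` and `longest_run` are hypothetical.

```python
from collections import Counter
from itertools import groupby
from typing import Callable, Dict

VOWELS = set("aeiou")


def longest_run(text: str, predicate: Callable[[str], bool]) -> int:
    """Length of the longest run of consecutive characters satisfying predicate."""
    best = 0
    for matches, group in groupby(text, key=predicate):
        if matches:
            best = max(best, sum(1 for _ in group))
    return best


def lexical_features(domain: str) -> Dict[str, float]:
    """Character-composition features of a domain name (no whitelist handling)."""
    name = domain.lower().strip(".")
    labels = name.split(".")            # fields separated by periods
    chars = name.replace(".", "")       # characters, periods excluded
    total = len(chars) or 1             # guard against division by zero
    counts = Counter(chars)
    return {
        "label_count": len(labels),
        "char_count": len(chars),
        "digit_ratio": sum(c.isdigit() for c in chars) / total,
        "vowel_ratio": sum(c in VOWELS for c in chars) / total,
        "consonant_ratio": sum(c.isalpha() and c not in VOWELS for c in chars) / total,
        "other_ratio": sum(not c.isalnum() for c in chars) / total,
        "max_repeat_count": max(counts.values(), default=0),
        "longest_digit_run": longest_run(chars, str.isdigit),
        "longest_non_vowel_run": longest_run(chars, lambda c: c not in VOWELS),
    }
```

Algorithmically generated names tend to show long non-vowel runs and high digit ratios, which is why such items are plausible inputs to the first model.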
Finally, in step S230, a pre-built first model is used to predict, from the extracted first feature, whether the domain name is malicious.
According to an embodiment of the present invention, the method 200 can also include the steps of: extracting a second feature that reflects the registration information of the domain name and a third feature that reflects the host information of the domain name, where the second feature can include at least one of the following feature items: the domain name service provider and whether the domain name has applied for privacy protection; and the third feature includes at least one of the following feature items: host service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate subject, SSL certificate subject public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version; and using a pre-built second model to predict, from the extracted first, second, and third features, whether the domain name is malicious.
Here, if the prediction result of either the first model or the second model is malicious, the domain name is determined to be malicious; if the prediction results of both the first model and the second model are normal, the domain name is determined to be normal.
According to an embodiment of the present invention, the method 200 can also include the steps of: extracting a fourth feature that can be used to detect Fast-Flux networks, where the fourth feature can include at least one of the following feature items: the domain name lifetime; the number of distinct IP addresses in the A records; the number of distinct autonomous systems corresponding to the distinct IP addresses in the A records; the number of distinct networks corresponding to the distinct IP addresses in the A records; the number of distinct organizations corresponding to the distinct IP addresses in the A records; the number of distinct FQDNs obtained by reverse DNS lookup of the distinct IP addresses in the A records; the number of name servers used in one domain name query; the DNS record time to live (TTL); the ratio of distinct IP addresses in the A records that are in a malicious IP address library; and the ratio of name server addresses used in one domain name query that are in a malicious name server library; and using a pre-built third model to predict, from the extracted first, second, and fourth features of the domain name, whether the domain name is malicious.
Here, if the prediction result of any one of the first model, the second model, and the third model is malicious, the domain name is determined to be malicious; if the prediction results of the first model, the second model, and the third model are all normal, the domain name is determined to be normal.
The above models can all be logistic regression models, whose loss function and mapping function are as follows:
$Z = \{(X_j, y_j) \mid j = 1, 2, \ldots, M\}$;
$y_j = h(W, X_j)$,
where $l(W, Z)$ is the loss function, $W$ is the feature weight, $Z$ is the sample set, $X_j$ is the feature vector of the $j$-th sample (composed of the features used to build the model), $y_j$ is the predicted value, and $h(W, X_j)$ is the mapping function from the feature vector $X_j$ to the predicted value $y_j$.
According to yet another embodiment of the present invention, the method 200 can also include the steps of: querying whether the domain name exists in a preset domain name blacklist/whitelist library; if it exists, determining whether the domain name is malicious according to the query result; and if it does not exist, extracting the features of the domain name and using the corresponding model to predict, from the extracted features, whether the domain name is malicious.
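The lookup-then-predict flow described above can be sketched as follows. The function name `judge_domain` is hypothetical, as is the representation of the blacklist/whitelist library as two in-memory sets; `predict_with_models` stands in for feature extraction plus model prediction.

```python
from typing import Callable, Set


def judge_domain(domain: str,
                 blacklist: Set[str],
                 whitelist: Set[str],
                 predict_with_models: Callable[[str], bool]) -> bool:
    """Return True if the domain is judged malicious, False if judged normal."""
    if domain in blacklist:
        return True                     # known malicious: no model needed
    if domain in whitelist:
        return False                    # known normal: no model needed
    # Not in the library: fall back to feature extraction and model prediction.
    return predict_with_models(domain)
```

Checking the library first avoids the cost of feature extraction for domain names whose status is already known.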
The corresponding processing in each of the above steps has been explained in detail in the description of the system 100 for predicting whether a domain name is malicious given with reference to Fig. 1, and the duplicated content is not repeated here.
In summary, the present invention takes full account of the characteristics of malicious domain names and extracts the various static and dynamic features described above, thereby ensuring the accuracy of model prediction.
In addition, the present invention also needs to train the model with samples so that the model can adapt to newly generated malicious domain names, further improving the accuracy of malicious domain name prediction.
Fig. 3 shows a training system 300 for a model that predicts whether a domain name is malicious, according to an exemplary embodiment of the present invention. The system 300 includes the potential domain name discovery module 110, the malicious domain name prediction module 120, the blacklist/whitelist query module 130, and the domain name feature extraction module 140 of the system 100, and can additionally include a model offline training module 310, a model online training module 320, and a traffic maliciousness detection module 330.
So that the model achieves a reasonably good prediction effect when it goes online, the present invention can first train the model offline using data collected in advance. Specifically, the model offline training module 310 can use a malicious domain name blacklist as positive samples and a domain name whitelist obtained from website rankings such as Alexa as negative samples, and train the above models in combination with the corresponding features of the domain names extracted by the domain name feature extraction module 140.
After offline training, the model can be put into online use to predict whether a domain name is malicious. Because the prediction effect of a model is limited by the quality of its training sample set, and the offline training sample set is limited, when the features of domain names generated by newly emerging DGAs (domain generation algorithms) change, the model usually has to be adjusted manually and then retrained.
The present invention instead trains the model online with live data via the model online training module 320, which makes it possible to build a dynamically adaptive online learning model. When DGA domain generation algorithms evolve, the model can evolve with them and discover and identify newly emerging malicious domain names in real time. At the same time, there is no need to manually collect training samples and retrain the model, which significantly reduces labor costs.
First, the training system 300 needs to determine whether a domain name is malicious through the potential domain name discovery module 110, the malicious domain name prediction module 120, the blacklist/whitelist query module 130, and the domain name feature extraction module 140. The corresponding processing in each module has been explained in detail in the description of the system 100 for predicting whether a domain name is malicious, and the duplicated content is not repeated here.
The model online training module 320 is connected to the malicious domain name prediction module 120. If the malicious domain name prediction module 120 determines that the domain name is malicious, the model online training module 320 uses the domain name to train the model. Because malicious domain names account for a very small proportion relative to normal domain names, and in order to keep the positive and negative training samples balanced, if the malicious domain name prediction module 120 determines that the domain name is normal, the model online training module 320 uses the domain name to train the model only with a predetermined probability. This predetermined probability can typically be 0.001.
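The sampling rule described above (always train on domain names predicted malicious, train on domain names predicted normal only with a small probability) can be sketched as follows. The function name `should_train_on` and the injectable random source are hypothetical; the default of 0.001 is the typical value given in the text.

```python
import random


def should_train_on(predicted_malicious: bool,
                    normal_sample_probability: float = 0.001,
                    rng: random.Random = random.Random()) -> bool:
    """Decide whether a predicted domain name is fed to online training."""
    if predicted_malicious:
        return True                     # every predicted-malicious domain is used
    # Predicted-normal domains are downsampled to keep the classes balanced.
    return rng.random() < normal_sample_probability
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible, which can help when debugging a training run.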
Specifically, the model online training module 320 is connected to the traffic maliciousness detection module 330, which is adapted to perform malicious-content detection on the traffic obtained by accessing the link address corresponding to the domain name, for example detecting viruses, Trojans, and fraudulent web pages in the traffic content.
After the traffic maliciousness detection module 330 performs malicious-content detection on the traffic obtained by accessing the link address corresponding to the domain name, the model online training module 320 takes the detection result of the traffic maliciousness detection module 330 as the actual determination result of whether the domain name is malicious, and trains the model using the corresponding features extracted by the domain name feature extraction module 140 together with this actual determination result as one training sample.
For example, the model online training module 320 can train the first model using the extracted first feature of the domain name and the actual determination result as one training sample; train the second model using the extracted first, second, and third features of the domain name and the actual determination result as one training sample; and train the third model using the extracted first, second, and fourth features of the domain name and the actual determination result as one training sample.
According to an embodiment of the present invention, the blacklist/whitelist query module 130 is further adapted to add the domain name to the domain name blacklist/whitelist library accordingly, based on the actual determination result obtained after the traffic maliciousness detection module 330 performs malicious-content detection on the traffic obtained by accessing the link address corresponding to the domain name.
The principle of model training is described in detail below, taking the FTRL online learning algorithm as an example.
The model online training module 320 can iterate over each training sample, computing in each iteration the current descent direction of the loss function and updating the feature weights W until the loss function is optimal. Here the FTRL (Follow-the-Regularized-Leader) algorithm is used to train the model.
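In the FTRL algorithm, suppose that in the $t$-th iteration the training sample is $(x_t, y_t)$, where $x_t$ is the feature vector, $y_t \in \{0, 1\}$, $w_t$ is the feature weight vector of the same dimension, and $w_{t,i}$ is the feature weight of the feature in dimension $i$. The per-dimension update formula for the feature weights is as follows:
$$w_{t,i} = \begin{cases} 0 & \text{if } |z_i| \le \lambda_1 \\ -\left(\dfrac{\beta + \sqrt{n_i}}{\alpha} + \lambda_2\right)^{-1} \left(z_i - \operatorname{sgn}(z_i)\,\lambda_1\right) & \text{otherwise} \end{cases}$$
where $\lambda_1$ is the L1 regularization parameter and $\lambda_2$ is the L2 regularization parameter; both can typically be set to 1. Let $g_{1:t} = \sum_{s=1}^{t} g_s$ and define the vector $z_{t-1} = g_{1:t-1} - \sum_{s=1}^{t-1} \sigma_s w_s$. At the $t$-th iteration, $z_t = z_{t-1} + g_t - \sigma_t w_t$, and the corresponding component in dimension $i$ is $z_{t,i} = z_{t-1,i} + g_{t,i} - \sigma_{t,i} w_{t,i}$.
Further define $p_t = \sigma(x_t \cdot w_t)$, where $\sigma(a) = 1 / (1 + e^{-a})$. The loss function of this sample can be expressed as $l_t(w_t) = -y_t \log p_t - (1 - y_t) \log(1 - p_t)$. In the $t$-th iteration, the gradient of the loss function can be expressed as $g_t = \nabla l_t(w_t) = (p_t - y_t)\,x_t$, and the gradient for the feature in dimension $i$ is $g_i = (p_t - y_t)\,x_i$.
The learning rate is defined through $\sigma_s$, namely $\sigma_{1:t} = \sum_{s=1}^{t} \sigma_s = 1 / \eta_t$. The learning rate of the feature in dimension $i$ is $\eta_{t,i} = \alpha / (\beta + \sqrt{n_i})$ with $n_i = \sum_{s=1}^{t} g_{s,i}^2$, where $\alpha$ and $\beta$ are parameters supplied by the user; both can typically be set to 1.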
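The per-dimension update can be sketched in code as below. This follows the commonly published per-coordinate FTRL-Proximal procedure for logistic regression and is an illustrative assumption about how the formulas fit together, not a definitive implementation; the class name `FTRLProximal` and the sparse dictionary layout are hypothetical. The defaults of 1 for all four parameters are the typical values given in the text.

```python
import math
from typing import Dict


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse features."""

    def __init__(self, alpha: float = 1.0, beta: float = 1.0,
                 l1: float = 1.0, l2: float = 1.0) -> None:
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z: Dict[int, float] = {}   # z_i accumulators
        self.n: Dict[int, float] = {}   # n_i: sums of squared gradients

    def weight(self, i: int) -> float:
        """Closed-form weight w_i from the current z_i and n_i."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0                  # L1 regularization keeps the weight at zero
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n.get(i, 0.0))) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x: Dict[int, float]) -> float:
        """p = sigmoid(x . w), with the dot product clipped for numerical safety."""
        a = sum(self.weight(i) * v for i, v in x.items())
        a = max(min(a, 35.0), -35.0)
        return 1.0 / (1.0 + math.exp(-a))

    def update(self, x: Dict[int, float], y: int) -> float:
        """Process one sample (x, y) with y in {0, 1}; returns the prediction made."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v             # gradient of the log loss in dimension i
            n_old = self.n.get(i, 0.0)
            sigma = (math.sqrt(n_old + g * g) - math.sqrt(n_old)) / self.alpha
            # weight(i) still uses the old z_i and n_i at this point.
            self.z[i] = self.z.get(i, 0.0) + g - sigma * self.weight(i)
            self.n[i] = n_old + g * g
        return p
```

A usage pattern under these assumptions would call `update(features, actual_result)` once per training sample, with `features` mapping feature indices to values and `actual_result` being 1 for malicious and 0 for normal.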
The FTRL online learning algorithm serves only as one framework for the automatic adjustment of the above models; the present invention does not limit the online learning algorithm used. Besides FTRL, algorithms such as FOBOS and RDA can also be used to train the models.
Fig. 4 shows a training method 400 for a model that predicts whether a domain name is malicious, according to an exemplary embodiment of the present invention. The method 400 begins at step S410.
In step S410, a potential domain name to be accessed is obtained from at least one of the user's instant messaging messages, SMS messages, emails, and web pages.
Then, in step S420, whether the domain name exists in the preset domain name blacklist/whitelist library is queried; if it does not exist, then in step S430 features that can reflect whether the domain name is malicious are extracted.
In step S440, the model for predicting whether a domain name is malicious is used to predict, from the extracted features of the domain name, whether the domain name is malicious, and whether the domain name is malicious is determined according to the prediction result of the model.
If the domain name is determined to be malicious, the domain name is used to train the model in step S450; if the domain name is determined to be normal, the domain name is used to train the model with a predetermined probability in step S460.
Specifically, the step of using the domain name to train the model can include: performing malicious-content detection on the traffic obtained by accessing the link address corresponding to the domain name, and taking the detection result as the actual determination result of whether the domain name is malicious; and training the model using the extracted features of the domain name and the actual determination result as one training sample.
According to an embodiment of the present invention, the model for predicting whether a domain name is malicious can include a first model, and the step of extracting features that can reflect whether the domain name is malicious can include: extracting a first feature that reflects the character composition of the domain name. The step of training the model using the extracted features of the domain name and the actual determination result as one training sample can include: training the first model using the extracted first feature of the domain name and the actual determination result as one training sample.
The first feature can include at least one of the following feature items: the similarity to a whitelisted domain name; the total number of fields in the domain name separated by periods; the total number of characters in the domain name; and, each computed after removing the segments similar to a whitelisted domain name: the number of groups in the domain name, the total number of characters in the domain name, the ratio of digit characters to the total number of characters, the ratio of vowel characters to the total number of characters, the ratio of consonant characters to the total number of characters, the ratio of characters other than letters and digits to the total number of characters, the count of the most frequently repeated character, the maximum length of a purely numeric string, and the maximum length of a string of consecutive non-vowel characters.
According to another embodiment of the present invention, the model for predicting whether a domain name is malicious can also include a second model, and the step of extracting features that can reflect whether the domain name is malicious can also include: extracting a second feature that reflects the registration information of the domain name and a third feature that reflects the host information of the domain name. The second feature can include at least one of the following feature items: the domain name service provider and whether the domain name has applied for privacy protection. The third feature can include at least one of the following feature items: host service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate subject, SSL certificate subject public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version.
The step of training the model using the extracted features of the domain name and the actual determination result as one training sample can also include: training the second model using the extracted first, second, and third features of the domain name and the actual determination result as one training sample. The step of using the model for predicting whether a domain name is malicious to predict, from the extracted features of the domain name, whether the domain name is malicious can include:
using the first model to predict, from the extracted first feature, whether the domain name is malicious; using the second model to predict, from the extracted first, second, and third features, whether the domain name is malicious; and, if the prediction result of either of the above models is malicious, determining that the domain name is malicious, and if the prediction results are all normal, determining that the domain name is normal.
According to yet another embodiment of the present invention, the model for predicting whether a domain name is malicious can also include a third model, and the step of extracting features that can reflect whether the domain name is malicious also includes: extracting a fourth feature that can be used to detect Fast-Flux networks. The fourth feature can include at least one of the following feature items: the domain name lifetime; the number of distinct IP addresses in the A records; the number of distinct autonomous systems corresponding to the distinct IP addresses in the A records; the number of distinct networks corresponding to the distinct IP addresses in the A records; the number of distinct organizations corresponding to the distinct IP addresses in the A records; the number of distinct FQDNs obtained by reverse DNS lookup of the distinct IP addresses in the A records; the number of name servers used in one domain name query; the DNS record time to live (TTL); the ratio of distinct IP addresses in the A records that are in a malicious IP address library; and the ratio of name server addresses used in one domain name query that are in a malicious name server library.
The step of training the model using the extracted features of the domain name and the actual determination result as one training sample can also include: training the third model using the extracted first, second, and fourth features of the domain name and the actual determination result as one training sample. The step of using the model for predicting whether a domain name is malicious to predict, from the extracted features of the domain name, whether the domain name is malicious can include: using the first model to predict, from the extracted first feature, whether the domain name is malicious; using the second model to predict, from the extracted first, second, and third features, whether the domain name is malicious; using the third model to predict, from the extracted first, second, and fourth features of the domain name, whether the domain name is malicious; and, if the prediction result of any of the above models is malicious, determining that the domain name is malicious, and if the prediction results are all normal, determining that the domain name is normal.
Here, the models can all be logistic regression models:
$Z = \{(X_j, y_j) \mid j = 1, 2, \ldots, M\}$;
$y_j = h(W, X_j)$,
where $l(W, Z)$ is the loss function, $W$ is the feature weight, $Z$ is the sample set, $X_j$ is the feature vector of the $j$-th sample, $y_j$ is the predicted value, and $h(W, X_j)$ is the mapping function from the feature vector $X_j$ to the predicted value $y_j$.
According to an embodiment of the present invention, the step of training the model using the extracted features of the domain name and the actual determination result as one training sample can include: using the FTRL algorithm to train the first model, the second model, and the third model respectively, and updating the feature weights of each model respectively.
According to another embodiment of the present invention, the method 400 can also include: training the model using a malicious domain name blacklist as positive samples, a domain name whitelist obtained from the Alexa website ranking as negative samples, and the corresponding extracted features of the domain names.
According to yet another embodiment of the present invention, the method 400 can also include: adding the domain name to the domain name blacklist/whitelist library accordingly, based on the detection result obtained after performing malicious-content detection on the traffic obtained by accessing the link address corresponding to the domain name.
The corresponding processing in each step has been explained in detail in the description of the training system 300 given above with reference to Fig. 3, and the duplicated content is not repeated here.
It should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the present invention, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Present invention additionally comprises: A4, training method as described in A3, wherein, according to the characteristic use of the domain name extracted This domain name step whether malice is predicted is included by prediction domain name model the most maliciously: according to the fisrt feature extracted Whether malice is predicted to this domain name to utilize described first model;According to the fisrt feature, second feature and the 3rd that extract To this domain name, whether malice is predicted second model described in characteristic use;And if any of the above-described model predict the outcome for Maliciously, then judge that this domain name, as malice, is normally if predicting the outcome, it is determined that this domain name is normal.A5, training as described in A3 Method, wherein, described prediction domain name model the most maliciously also includes the 3rd model, and extraction can embody this domain name the most maliciously The step of feature also include: extract the fourth feature that can be used in detecting Fast-Flux network;Feature by the domain name of extraction The step being trained model as a training sample with actual prediction result also includes: special by the first of the domain name of extraction Levy, the 3rd model is trained by second feature, fourth feature and actual prediction result as one article of training sample.A6, such as A5 Described training method, wherein, predicts that according to the characteristic use of the domain name extracted domain name model the most maliciously comes this The domain name step that malice is predicted includes: utilize described first model to this domain name to be according to the fisrt feature extracted No malice is predicted;Fisrt feature, second feature and third feature according to extracting utilize described second model to come this territory Whether malice is predicted name;Described 3rd model is utilized according to the fisrt feature of domain name 
extracted, second feature, fourth feature To this domain name, whether malice is predicted;And if any of the above-described model predict the outcome for malice, then judge this domain name as Maliciously, if predicting the outcome and being normally, it is determined that this domain name is normal.A7, training method as described in arbitrary in A1-6, wherein, Described model is Logic Regression Models:
Z = {(X_j, y_j) | j = 1, 2, ..., M}; y_j = h(W, X_j), where l(W, Z) is the loss function, W is the feature weight, Z is the sample set, X_j is the feature vector of the j-th sample, y_j is the predicted value, and h(W, X_j) is the mapping function from the feature vector X_j to the predicted value y_j.
A8. The training method of A7, wherein the step of training the model using the extracted features of the domain name and the actual prediction result as one training sample further includes: training the first model, the second model, and the third model separately using the FTRL algorithm, and updating the feature weights of each model separately.
A9. The training method of any one of A2-8, wherein the first feature includes at least one of the following feature items: similarity to a whitelisted domain name; the total number of fields in the domain name as separated by periods; the total number of characters in the domain name; and, after removing the segments similar to a whitelisted domain name: the number of groups in the domain name, the total number of characters in the domain name, the proportion of numeric characters among all characters, the proportion of vowel characters among all characters, the proportion of consonant characters among all characters, the proportion of characters other than letters and digits among all characters, the count of the most frequently repeated character, the maximum length of a purely numeric string, and the maximum length of a string of consecutive non-vowel characters.
A10. The training method of any one of A3-9, wherein the second feature includes at least one of the following feature items: the domain name service provider, and whether the domain name has applied for privacy protection; and the third feature includes at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate user, SSL certificate user public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version.
A11. The training method of any one of A4-10, wherein the fourth feature includes at least one of the following feature items: the lifetime of the domain name; the number of distinct IP addresses in its A records; the number of distinct autonomous systems corresponding to the distinct IP addresses in the A records; the number of distinct networks corresponding to those IP addresses; the number of distinct organizations corresponding to those IP addresses; the number of distinct FQDNs obtained by reverse DNS lookup of those IP addresses; the number of name servers used in a single domain name query; the DNS record time to live (TTL); the proportion of the distinct IP addresses in the A records that appear in a malicious IP address database; and the proportion of the name server addresses used in a single domain name query that appear in a malicious name server database.
A12. The training method of any one of A1-11, further including: training the model using a malicious domain name blacklist as positive samples, a domain name whitelist obtained from the Alexa website ranking as negative samples, and the corresponding extracted features of the domain names.
A13. The training method of any one of A1-12, further including: adding the domain name to the domain name blacklist/whitelist database according to the detection result obtained by accessing the link address corresponding to the domain name and performing malice detection on its traffic.
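The FTRL training of a logistic regression model named in A7 and A8 can be illustrated with a minimal sketch. The patent text only names the FTRL algorithm; the per-coordinate FTRL-Proximal update below is the commonly published form and is an assumption here, as are the class name, the hyperparameters `alpha`, `beta`, `l1`, `l2`, and the representation of a sample as a dict of feature name to value.

```python
import math
from collections import defaultdict


class FTRLLogisticRegression:
    """Online logistic regression with per-coordinate FTRL-Proximal updates.

    Hypothetical sketch: names and hyperparameters are not from the patent.
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = defaultdict(float)  # accumulated adjusted gradients
        self.n = defaultdict(float)  # accumulated squared gradients

    def _weight(self, i):
        # Closed-form weight for coordinate i; L1 keeps small weights at zero.
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        # h(W, X): sigmoid of the weighted feature sum, clamped for stability.
        score = sum(self._weight(i) * v for i, v in x.items())
        score = max(-35.0, min(35.0, score))
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, x, y):
        # One training sample (X_j, y_j); y is 1 for malicious, 0 for normal.
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for coordinate i
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self._weight(i)
            self.n[i] += g * g
        return p
```

Under this sketch, each of the first, second, and third models would be a separate instance fed its own feature subset, so that the feature weights of each model are updated independently.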
B17. The training system of B16, wherein the malicious domain name prediction module is further adapted to predict whether the domain name is malicious using the first model according to the extracted first feature, and to predict whether the domain name is malicious using the second model according to the extracted first, second, and third features; and is further adapted to judge the domain name malicious if the prediction result of any of the above models is malicious, and to judge the domain name normal if all prediction results are normal.
B18. The training system of B16, wherein the model for predicting whether a domain name is malicious further includes a third model; the domain name feature extraction module is further adapted to extract a fourth feature usable for detecting Fast-Flux networks; and the model online training module is further adapted to train the third model using the extracted first, second, and fourth features of the domain name and the actual prediction result as one training sample.
B19. The training system of B18, wherein the malicious domain name prediction module is further adapted to predict whether the domain name is malicious using the first model according to the extracted first feature, using the second model according to the extracted first, second, and third features, and using the third model according to the extracted first, second, and fourth features of the domain name; and is further adapted to judge the domain name malicious if the prediction result of any of the above models is malicious, and to judge the domain name normal if all prediction results are normal.
B20. The training system of any one of B14-19, wherein the model is a logistic regression model:
Z = {(X_j, y_j) | j = 1, 2, ..., M}; y_j = h(W, X_j), where l(W, Z) is the loss function, W is the feature weight, Z is the sample set, X_j is the feature vector of the j-th sample, y_j is the predicted value, and h(W, X_j) is the mapping function from the feature vector X_j to the predicted value y_j.
B21. The training system of B20, wherein the model online training module is further adapted to train the first model, the second model, and the third model separately using the FTRL algorithm, and to update the feature weights of each model separately.
B22. The training system of any one of B15-21, wherein the first feature includes at least one of the following feature items: similarity to a whitelisted domain name; the total number of fields in the domain name as separated by periods; the total number of characters in the domain name; and, after removing the segments similar to a whitelisted domain name: the number of groups in the domain name, the total number of characters in the domain name, the proportion of numeric characters among all characters, the proportion of vowel characters among all characters, the proportion of consonant characters among all characters, the proportion of characters other than letters and digits among all characters, the count of the most frequently repeated character, the maximum length of a purely numeric string, and the maximum length of a string of consecutive non-vowel characters.
B23. The training system of any one of B16-22, wherein the second feature includes at least one of the following feature items: the domain name service provider, and whether the domain name has applied for privacy protection; and the third feature includes at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate user, SSL certificate user public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version.
B23. The training system of any one of B17-23, wherein the fourth feature includes at least one of the following feature items: the lifetime of the domain name; the number of distinct IP addresses in its A records; the number of distinct autonomous systems corresponding to the distinct IP addresses in the A records; the number of distinct networks corresponding to those IP addresses; the number of distinct organizations corresponding to those IP addresses; the number of distinct FQDNs obtained by reverse DNS lookup of those IP addresses; the number of name servers used in a single domain name query; the DNS record time to live (TTL); the proportion of the distinct IP addresses in the A records that appear in a malicious IP address database; and the proportion of the name server addresses used in a single domain name query that appear in a malicious name server database.
B24. The training system of any one of B14-23, further including a model offline training module, adapted to train the model using a malicious domain name blacklist as positive samples, a domain name whitelist obtained from the Alexa website ranking as negative samples, and the corresponding features of the domain names extracted via the feature extraction module 120.
B25. The training system of any one of B14-24, wherein the blacklist/whitelist query module is further adapted to add the domain name to the domain name blacklist/whitelist database according to the actual prediction result obtained by the traffic malice detection module accessing the link address corresponding to the domain name and performing malice detection on its traffic.
C28. The method of C27, further including: extracting a fourth feature usable for detecting Fast-Flux networks, the fourth feature including at least one of the following feature items: the lifetime of the domain name; the number of distinct IP addresses in its A records; the number of distinct autonomous systems corresponding to the distinct IP addresses in the A records; the number of distinct networks corresponding to those IP addresses; the number of distinct organizations corresponding to those IP addresses; the number of distinct FQDNs obtained by reverse DNS lookup of those IP addresses; the number of name servers used in a single domain name query; the DNS record time to live (TTL); the proportion of the distinct IP addresses in the A records that appear in a malicious IP address database; and the proportion of the name server addresses used in a single domain name query that appear in a malicious name server database; and predicting whether the domain name is malicious using a pre-established third model according to the extracted first, second, and fourth features of the domain name.
C29. The method of any one of C26-28, further including: if the prediction result of any of the above models is malicious, judging the domain name malicious; if all prediction results are normal, judging the domain name normal.
C30. The method of any one of C26-29, further including: querying whether the domain name exists in a preset domain name blacklist/whitelist database; if it exists, determining whether the domain name is malicious according to the query result; if it does not exist, extracting the features of the domain name and predicting whether the domain name is malicious using the corresponding model according to the extracted features.
C31. The method of any one of C26-30, wherein the model is a logistic regression model:
Z = {(X_j, y_j) | j = 1, 2, ..., M}; y_j = h(W, X_j), where l(W, Z) is the loss function, W is the feature weight, Z is the sample set, X_j is the feature vector of the j-th sample, y_j is the predicted value, and h(W, X_j) is the mapping function from the feature vector X_j to the predicted value y_j.
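Several of the Fast-Flux feature items listed in C28 are simple counts and proportions over already-collected DNS data. The sketch below computes a subset of them. It assumes the A records, name servers, and TTL have been gathered elsewhere; the `DnsObservation` type, the `ip_to_asn` lookup table, and all field names are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DnsObservation:
    """DNS data already collected for one domain (hypothetical container)."""
    a_record_ips: list = field(default_factory=list)
    name_servers: list = field(default_factory=list)
    ttl_seconds: int = 0


def fast_flux_features(obs, ip_to_asn, malicious_ips, malicious_name_servers):
    """Compute a subset of the fourth-feature items from collected DNS data."""
    ips = set(obs.a_record_ips)          # distinct IP addresses in A records
    servers = set(obs.name_servers)      # distinct name servers used
    # Distinct autonomous systems for the IPs we can map (assumed lookup table).
    asns = {ip_to_asn[ip] for ip in ips if ip in ip_to_asn}
    bad_ips = ips & set(malicious_ips)
    bad_servers = servers & set(malicious_name_servers)
    return {
        "distinct_ip_count": len(ips),
        "distinct_asn_count": len(asns),
        "name_server_count": len(servers),
        "ttl_seconds": obs.ttl_seconds,
        "malicious_ip_ratio": len(bad_ips) / len(ips) if ips else 0.0,
        "malicious_ns_ratio": len(bad_servers) / len(servers) if servers else 0.0,
    }
```

Many distinct IP addresses spread over several autonomous systems, combined with a short TTL, is the pattern these counts are meant to expose; the remaining items (networks, organizations, reverse-lookup FQDNs, domain lifetime) would need additional data sources not modeled here.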
D34. The system of D33, wherein the domain name feature extraction module is further adapted to extract a fourth feature usable for detecting Fast-Flux networks, the fourth feature including at least one of the following feature items: the lifetime of the domain name; the number of distinct IP addresses in its A records; the number of distinct autonomous systems corresponding to the distinct IP addresses in the A records; the number of distinct networks corresponding to those IP addresses; the number of distinct organizations corresponding to those IP addresses; the number of distinct FQDNs obtained by reverse DNS lookup of those IP addresses; the number of name servers used in a single domain name query; the DNS record time to live (TTL); the proportion of the distinct IP addresses in the A records that appear in a malicious IP address database; and the proportion of the name server addresses used in a single domain name query that appear in a malicious name server database; and the domain name malice prediction module is further adapted to predict whether the domain name is malicious using a pre-established third model according to the extracted first, second, and fourth features of the domain name.
D35. The system of any one of D32-34, wherein the domain name malice prediction module is further adapted to judge the domain name malicious if the prediction result of any of the above models is malicious, and to judge the domain name normal if all prediction results are normal.
D36. The system of any one of D32-35, further including a domain name blacklist/whitelist query module, adapted to query whether the domain name exists in a preset domain name blacklist/whitelist database; if it exists, whether the domain name is malicious is determined according to the query result; if it does not exist, the domain name feature extraction module extracts the features of the domain name, and the domain name malice prediction module predicts whether the domain name is malicious using the corresponding model according to the extracted features.
D37. The system of any one of D32-36, wherein the model is a logistic regression model:
Z = {(X_j, y_j) | j = 1, 2, ..., M}; y_j = h(W, X_j), where l(W, Z) is the loss function, W is the feature weight, Z is the sample set, X_j is the feature vector of the j-th sample, y_j is the predicted value, and h(W, X_j) is the mapping function from the feature vector X_j to the predicted value y_j.
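The decision rule of D35 (malicious if any model says so, normal only if all agree) together with the mapping function h(W, X_j) can be sketched as follows. The sigmoid form of h is the usual choice for logistic regression and is assumed here, as are the 0.5 decision threshold, the group names, and the function names; none of them is stated in the text above.

```python
import math


def h(weights, features):
    """Assumed logistic mapping from a feature vector X to a value in (0, 1)."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    score = max(-35.0, min(35.0, score))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-score))


def judge_domain(features_by_group, models, threshold=0.5):
    """Return True (malicious) if any model's prediction reaches the threshold.

    `models` is a list of (weights, group_names) pairs; each model only sees
    the feature groups it was built for, e.g. ("first", "second", "fourth").
    """
    for weights, group_names in models:
        x = {}
        for group in group_names:
            x.update(features_by_group.get(group, {}))
        if h(weights, x) >= threshold:
            return True
    return False
```

Taking the logical OR of the models favors catching malicious domains over avoiding false alarms, which matches the rule that a single malicious prediction is enough.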
Those skilled in the art should understand that the modules, units, or components of the device in the examples disclosed herein may be arranged in the device as described in this embodiment, or alternatively may be located in one or more devices different from the device in this example. The modules in the foregoing examples may be combined into one module or may additionally be divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may additionally be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
In addition, some of the embodiments are described herein as methods, or combinations of method elements, that can be implemented by a processor of a computer system or by other means of performing the described function. A processor having the necessary instructions for implementing such a method or method element therefore forms a means for implementing the method or method element. Likewise, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinals "first", "second", "third", and so on to describe an ordinary object merely indicates that different instances of similar objects are being referred to, and is not intended to imply that the objects so described must be in a given order, whether temporally, spatially, in ranking, or in any other manner.
Although the invention has been described in terms of a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be envisaged within the scope of the invention thus described. It should also be noted that the language used in this specification has been selected principally for readability and instructional purposes, and not to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made herein is illustrative and not restrictive, and the scope of the invention is defined by the appended claims.

Claims (10)

1. A training method for a model for predicting whether a domain name is malicious, comprising:
obtaining a potential domain name to be accessed from at least one of a user's instant messaging messages, short messages, emails, and web pages;
querying whether the domain name exists in a preset domain name blacklist/whitelist database;
if it does not exist, extracting features capable of indicating whether the domain name is malicious;
predicting whether the domain name is malicious using the model for predicting whether a domain name is malicious according to the extracted features of the domain name, and judging whether the domain name is malicious according to the prediction result of the model;
if the domain name is judged malicious, training the model using the domain name;
if the domain name is judged normal, training the model using the domain name with a predetermined probability; wherein
training the model using the domain name comprises:
performing malice detection on the traffic of the domain name by accessing the link address corresponding to the domain name, and taking the detection result as the actual determination result of whether the domain name is malicious; and
training the model using the extracted features of the domain name and the actual determination result as one training sample.
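The control flow of claim 1 can be summarized in a short sketch: look the domain up in the list database, otherwise predict, and then train on every domain judged malicious but only on a random fraction of those judged normal. Everything named below is hypothetical: the callables `extract_features`, `predict`, `detect_traffic`, and `train` stand in for the corresponding steps, `known_lists` is assumed to map a listed domain to its verdict, and the default probability of 0.1 is an arbitrary placeholder for the "predetermined probability".

```python
import random


def handle_domain(domain, known_lists, extract_features, predict,
                  detect_traffic, train, normal_train_probability=0.1,
                  rng=random):
    """Judge one domain and, where the claim calls for it, train on it.

    known_lists: dict mapping listed domains to True (malicious) / False.
    Returns True if the domain is judged malicious.
    """
    # Domains already in the blacklist/whitelist database are not predicted.
    if domain in known_lists:
        return known_lists[domain]

    features = extract_features(domain)
    judged_malicious = predict(features)

    # Always train on domains judged malicious; sample the normal ones.
    if judged_malicious or rng.random() < normal_train_probability:
        # The traffic detection result serves as the training label.
        actual_result = detect_traffic(domain)
        train(features, actual_result)

    return judged_malicious
```

Sampling the normal domains keeps the cost of traffic detection bounded when most observed domains are benign, while still feeding the model some negative examples.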
2. The training method of claim 1, wherein the model for predicting whether a domain name is malicious includes a first model, and the step of extracting features capable of indicating whether the domain name is malicious includes: extracting a first feature capable of reflecting the character composition of the domain name;
the step of training the model using the extracted features of the domain name and the actual determination result as one training sample includes: training the first model using the extracted first feature of the domain name and the actual determination result as one training sample.
3. The training method of claim 2, wherein the model for predicting whether a domain name is malicious further includes a second model, and the step of extracting features capable of indicating whether the domain name is malicious further includes: extracting a second feature capable of reflecting the registration information of the domain name and a third feature capable of reflecting the host information of the domain name;
the step of training the model using the extracted features of the domain name and the actual determination result as one training sample further includes: training the second model using the extracted first feature, second feature, and third feature of the domain name and the actual determination result as one training sample.
4. A training system for a model for predicting whether a domain name is malicious, the system comprising:
a potential domain name discovery module, adapted to obtain a potential domain name to be accessed from at least one of a user's instant messaging messages, short messages, emails, and web pages;
a blacklist/whitelist query module, adapted to query whether the domain name exists in a preset domain name blacklist/whitelist database;
a domain name feature extraction module, adapted to extract features capable of indicating whether the domain name is malicious;
a malicious domain name prediction module, adapted to, if the blacklist/whitelist query module finds that the domain name does not exist in the preset domain name blacklist/whitelist database, extract via the domain name feature extraction module the features capable of indicating whether the domain name is malicious, predict whether the domain name is malicious using the model for predicting whether a domain name is malicious according to the extracted features, and judge whether the domain name is malicious according to the prediction result of the model;
a model online training module, adapted to train the model using the domain name if the domain name is judged malicious, and further adapted to train the model using the domain name with a predetermined probability if the domain name is judged normal; and
a traffic malice detection module, adapted to perform malice detection on the traffic of the domain name by accessing the link address corresponding to the domain name;
wherein the model online training module is further adapted to have the traffic malice detection module perform malice detection on the traffic of the domain name by accessing the link address corresponding to the domain name, to take the detection result as the actual determination result of whether the domain name is malicious, and to train the model using the features extracted by the domain name feature extraction module and the actual determination result as one training sample.
5. The training system of claim 4, wherein the model for predicting whether a domain name is malicious includes a first model, the domain name feature extraction module is further adapted to extract a first feature capable of reflecting the character composition of the domain name, and the model online training module is further adapted to train the first model using the extracted first feature of the domain name and the actual prediction result as one training sample.
6. The training system of claim 5, wherein the model for predicting whether a domain name is malicious further includes a second model, the domain name feature extraction module is further adapted to extract a second feature capable of reflecting the registration information of the domain name and a third feature capable of reflecting the host information of the domain name, and the model online training module is further adapted to train the second model using the extracted first feature, second feature, and third feature of the domain name and the actual prediction result as one training sample.
7. A method for predicting whether a domain name is malicious, comprising:
obtaining a potential domain name to be accessed from at least one of a user's instant messaging messages, short messages, emails, and web pages;
extracting a first feature capable of reflecting the character composition of the domain name, the first feature including at least one of the following feature items: similarity to a whitelisted domain name; the total number of fields in the domain name as separated by periods; the total number of characters in the domain name; and, after removing the segments similar to a whitelisted domain name: the number of groups in the domain name, the total number of characters in the domain name, the proportion of numeric characters among all characters, the proportion of vowel characters among all characters, the proportion of consonant characters among all characters, the proportion of characters other than letters and digits among all characters, the count of the most frequently repeated character, the maximum length of a purely numeric string, and the maximum length of a string of consecutive non-vowel characters; and
predicting whether the domain name is malicious using a pre-established first model according to the extracted first feature.
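Most of the first-feature items in claim 7 are character statistics that can be computed directly from the domain string. The sketch below covers them under simplifying assumptions: the whitelist-similarity score itself is omitted because the text above does not define the similarity measure, a field counts as "similar to a whitelisted domain name" only when it exactly matches an entry of the hypothetical `whitelist_labels` set, and character totals exclude the separating periods. All names are illustrative.

```python
import re
import string
from collections import Counter

VOWELS = set("aeiou")
CONSONANTS = set(string.ascii_lowercase) - VOWELS


def longest_run(text, pattern):
    """Length of the longest substring of `text` matching `pattern`."""
    return max((len(m) for m in re.findall(pattern, text)), default=0)


def lexical_features(domain, whitelist_labels=frozenset()):
    """Character-composition features of a domain name (subset of claim 7)."""
    labels = [label for label in domain.lower().strip(".").split(".") if label]
    # Assumption: a field is "similar" to the whitelist only on exact match.
    kept = [label for label in labels if label not in whitelist_labels]
    text = "".join(kept)
    total = len(text)

    def ratio(count):
        return count / total if total else 0.0

    digits = sum(c in string.digits for c in text)
    vowels = sum(c in VOWELS for c in text)
    consonants = sum(c in CONSONANTS for c in text)
    return {
        "field_count": len(labels),
        "char_count": sum(len(label) for label in labels),
        "remaining_group_count": len(kept),
        "remaining_char_count": total,
        "digit_ratio": ratio(digits),
        "vowel_ratio": ratio(vowels),
        "consonant_ratio": ratio(consonants),
        "other_ratio": ratio(total - digits - vowels - consonants),
        "max_char_repeat": max(Counter(text).values(), default=0),
        "longest_digit_run": longest_run(text, r"[0-9]+"),
        "longest_non_vowel_run": longest_run(text, r"[^aeiou]+"),
    }
```

For a name such as `x9k2.paypal.com` with `paypal` and `com` whitelisted, only `x9k2` remains after removal, so the statistics describe the part of the name that is not borrowed from a well-known domain.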
8. The method of claim 7, further comprising:
extracting a second feature capable of reflecting the registration information of the domain name and a third feature capable of reflecting the host information of the domain name, the second feature including at least one of the following feature items: the domain name service provider, and whether the domain name has applied for privacy protection; and the third feature including at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate user, SSL certificate user public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version;
predicting whether the domain name is malicious using a pre-established second model according to the extracted first feature, second feature, and third feature.
9. A system for predicting whether a domain name is malicious, the system comprising:
a potential domain name discovery module, adapted to obtain a potential domain name to be accessed from at least one of a user's instant messaging messages, short messages, emails, and web pages;
a domain name feature extraction module, adapted to extract a first feature capable of reflecting the character composition of the domain name, the first feature including at least one of the following feature items: similarity to a whitelisted domain name; the total number of fields in the domain name as separated by periods; the total number of characters in the domain name; and, after removing the segments similar to a whitelisted domain name: the number of groups in the domain name, the total number of characters in the domain name, the proportion of numeric characters among all characters, the proportion of vowel characters among all characters, the proportion of consonant characters among all characters, the proportion of characters other than letters and digits among all characters, the count of the most frequently repeated character, the maximum length of a purely numeric string, and the maximum length of a string of consecutive non-vowel characters; and
a domain name malice prediction module, adapted to predict whether the domain name is malicious using a pre-established first model according to the extracted first feature.
10. The system of claim 9, wherein the domain name feature extraction module is further adapted to extract a second feature capable of reflecting the registration information of the domain name and a third feature capable of reflecting the host information of the domain name, the second feature including at least one of the following feature items: the domain name service provider, and whether the domain name has applied for privacy protection; and the third feature including at least one of the following feature items: hosting service provider, SSL certificate version, SSL certificate serial number, SSL certificate algorithm identifier, SSL certificate issuer, SSL certificate validity period, SSL certificate validity start date, SSL certificate validity end date, SSL certificate user, SSL certificate user public key information, SSL certificate public key algorithm, SSL certificate public key, SSL certificate signature algorithm, SSL certificate signature, operating system name, operating system version, database name, database version, web component name, and web component version;
the domain name malice prediction module is further adapted to predict whether the domain name is malicious using a pre-established second model according to the extracted first feature, second feature, and third feature.
CN201610868349.6A 2016-09-29 2016-09-29 Method and system for predicting whether a domain name is malicious, and model training method and system therefor Pending CN106230867A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610868349.6A CN106230867A (en) 2016-09-29 2016-09-29 Method and system for predicting whether a domain name is malicious, and model training method and system therefor


Publications (1)

Publication Number Publication Date
CN106230867A (en) 2016-12-14

Family

ID=58076546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610868349.6A Method and system for predicting whether a domain name is malicious, and model training method and system therefor 2016-09-29 2016-09-29

Country Status (1)

Country Link
CN (1) CN106230867A (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106713303A (en) * 2016-12-19 2017-05-24 北京启明星辰信息安全技术有限公司 Malicious domain name detection method and system
CN107645503A (en) * 2017-09-20 2018-01-30 杭州安恒信息技术有限公司 A kind of detection method of the affiliated DGA families of rule-based malice domain name
CN108076041A (en) * 2017-10-23 2018-05-25 中国银联股份有限公司 A kind of DNS flow rate testing methods and DNS flow quantity detecting systems
CN108270761A (en) * 2017-01-03 2018-07-10 中国移动通信有限公司研究院 A kind of domain name legitimacy detection method and device
CN108337358A (en) * 2017-09-30 2018-07-27 广东欧珀移动通信有限公司 Using method for cleaning, device, storage medium and electronic equipment
US20180227321A1 (en) * 2017-02-05 2018-08-09 International Business Machines Corporation Reputation score for newly observed domain
CN108632227A (en) * 2017-03-23 2018-10-09 中国移动通信集团广东有限公司 Malicious domain name detection processing method and device
CN109309673A (en) * 2018-09-18 2019-02-05 南京方恒信息技术有限公司 DNS covert communication channel detection method based on a neural network
CN109474509A (en) * 2017-09-07 2019-03-15 北京二六三企业通信有限公司 Spam identification method and device
CN110177123A (en) * 2019-06-20 2019-08-27 电子科技大学 Botnet detection method based on DNS mapping association figure
CN110381089A (en) * 2019-08-23 2019-10-25 南京邮电大学 Deep-learning-based malicious domain name detection and protection method
CN110808987A (en) * 2019-11-07 2020-02-18 南京亚信智网科技有限公司 Method and computing device for identifying malicious domain name
CN111131260A (en) * 2019-12-24 2020-05-08 邑客得(上海)信息技术有限公司 Mass network malicious domain name identification and classification method and system
CN111181756A (en) * 2019-07-11 2020-05-19 腾讯科技(深圳)有限公司 Domain name security judgment method, device, equipment and medium
CN111181923A (en) * 2019-12-10 2020-05-19 中移(杭州)信息技术有限公司 Flow detection method and device, electronic equipment and storage medium
CN111200576A (en) * 2018-11-16 2020-05-26 慧盾信息安全科技(苏州)股份有限公司 Method for realizing malicious domain name recognition based on machine learning
CN111371812A (en) * 2020-05-27 2020-07-03 腾讯科技(深圳)有限公司 Virus detection method, device and medium
CN113746814A (en) * 2021-08-17 2021-12-03 上海硬通网络科技有限公司 Mail processing method and device, electronic equipment and storage medium
CN113938314A (en) * 2021-11-17 2022-01-14 北京天融信网络安全技术有限公司 Encrypted flow detection method and device and storage medium
CN114070819A (en) * 2021-10-09 2022-02-18 北京邮电大学 Malicious domain name detection method, device, electronic device and storage medium
CN116708034A (en) * 2023-08-07 2023-09-05 北京安天网络安全技术有限公司 Method, device, medium and equipment for determining security attribute of domain name
CN116760645A (en) * 2023-08-22 2023-09-15 北京长亭科技有限公司 Malicious domain name detection method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105022960A (en) * 2015-08-10 2015-11-04 济南大学 Multi-feature mobile terminal malicious software detecting method based on network flow and multi-feature mobile terminal malicious software detecting system based on network flow
CN105024969A (en) * 2014-04-17 2015-11-04 北京启明星辰信息安全技术有限公司 Method and device for realizing malicious domain name identification
CN105072214A (en) * 2015-08-28 2015-11-18 携程计算机技术(上海)有限公司 C&C domain name identification method based on domain name feature
CN105577660A (en) * 2015-12-22 2016-05-11 国家电网公司 DGA domain name detection method based on random forest
CN105939340A (en) * 2016-01-22 2016-09-14 北京匡恩网络科技有限责任公司 Method and system for discovering hidden conficker

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105024969A (en) * 2014-04-17 2015-11-04 北京启明星辰信息安全技术有限公司 Method and device for realizing malicious domain name identification
CN105022960A (en) * 2015-08-10 2015-11-04 济南大学 Multi-feature mobile terminal malicious software detecting method based on network flow and multi-feature mobile terminal malicious software detecting system based on network flow
CN105072214A (en) * 2015-08-28 2015-11-18 携程计算机技术(上海)有限公司 C&C domain name identification method based on domain name feature
CN105577660A (en) * 2015-12-22 2016-05-11 国家电网公司 DGA domain name detection method based on random forest
CN105939340A (en) * 2016-01-22 2016-09-14 北京匡恩网络科技有限责任公司 Method and system for discovering hidden conficker

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106713303A (en) * 2016-12-19 2017-05-24 北京启明星辰信息安全技术有限公司 Malicious domain name detection method and system
CN108270761A (en) * 2017-01-03 2018-07-10 中国移动通信有限公司研究院 Domain name legitimacy detection method and device
US20180227321A1 (en) * 2017-02-05 2018-08-09 International Business Machines Corporation Reputation score for newly observed domain
CN108632227A (en) * 2017-03-23 2018-10-09 中国移动通信集团广东有限公司 Malicious domain name detection processing method and device
CN108632227B (en) * 2017-03-23 2020-12-18 中国移动通信集团广东有限公司 Malicious domain name detection processing method and device
CN109474509A (en) * 2017-09-07 2019-03-15 北京二六三企业通信有限公司 Spam identification method and device
CN107645503B (en) * 2017-09-20 2020-01-24 杭州安恒信息技术股份有限公司 Rule-based method for detecting DGA family to which malicious domain name belongs
CN107645503A (en) * 2017-09-20 2018-01-30 杭州安恒信息技术有限公司 Rule-based method for detecting DGA family to which malicious domain name belongs
CN108337358A (en) * 2017-09-30 2018-07-27 广东欧珀移动通信有限公司 Application cleaning method and device, storage medium and electronic equipment
CN108337358B (en) * 2017-09-30 2020-01-14 Oppo广东移动通信有限公司 Application cleaning method and device, storage medium and electronic equipment
CN108076041A (en) * 2017-10-23 2018-05-25 中国银联股份有限公司 DNS traffic detection method and DNS traffic detection system
CN109309673A (en) * 2018-09-18 2019-02-05 南京方恒信息技术有限公司 DNS covert communication channel detection method based on a neural network
CN111200576A (en) * 2018-11-16 2020-05-26 慧盾信息安全科技(苏州)股份有限公司 Method for realizing malicious domain name recognition based on machine learning
CN110177123A (en) * 2019-06-20 2019-08-27 电子科技大学 Botnet detection method based on DNS mapping association figure
CN111181756A (en) * 2019-07-11 2020-05-19 腾讯科技(深圳)有限公司 Domain name security judgment method, device, equipment and medium
CN111181756B (en) * 2019-07-11 2021-12-14 腾讯科技(深圳)有限公司 Domain name security judgment method, device, equipment and medium
CN110381089A (en) * 2019-08-23 2019-10-25 南京邮电大学 Deep-learning-based malicious domain name detection and protection method
CN110808987A (en) * 2019-11-07 2020-02-18 南京亚信智网科技有限公司 Method and computing device for identifying malicious domain name
CN110808987B (en) * 2019-11-07 2022-03-29 南京亚信智网科技有限公司 Method and computing device for identifying malicious domain name
CN111181923A (en) * 2019-12-10 2020-05-19 中移(杭州)信息技术有限公司 Flow detection method and device, electronic equipment and storage medium
CN111131260B (en) * 2019-12-24 2020-09-15 邑客得(上海)信息技术有限公司 Mass network malicious domain name identification and classification method and system
CN111131260A (en) * 2019-12-24 2020-05-08 邑客得(上海)信息技术有限公司 Mass network malicious domain name identification and classification method and system
CN111371812A (en) * 2020-05-27 2020-07-03 腾讯科技(深圳)有限公司 Virus detection method, device and medium
CN113746814A (en) * 2021-08-17 2021-12-03 上海硬通网络科技有限公司 Mail processing method and device, electronic equipment and storage medium
CN113746814B (en) * 2021-08-17 2024-01-09 上海硬通网络科技有限公司 Mail processing method, mail processing device, electronic equipment and storage medium
CN114070819A (en) * 2021-10-09 2022-02-18 北京邮电大学 Malicious domain name detection method, device, electronic device and storage medium
CN113938314A (en) * 2021-11-17 2022-01-14 北京天融信网络安全技术有限公司 Encrypted flow detection method and device and storage medium
CN113938314B (en) * 2021-11-17 2023-11-28 北京天融信网络安全技术有限公司 Method and device for detecting encrypted traffic and storage medium
CN116708034A (en) * 2023-08-07 2023-09-05 北京安天网络安全技术有限公司 Method, device, medium and equipment for determining security attribute of domain name
CN116708034B (en) * 2023-08-07 2023-10-27 北京安天网络安全技术有限公司 Method, device, medium and equipment for determining security attribute of domain name
CN116760645A (en) * 2023-08-22 2023-09-15 北京长亭科技有限公司 Malicious domain name detection method and device
CN116760645B (en) * 2023-08-22 2023-11-14 北京长亭科技有限公司 Malicious domain name detection method and device

Similar Documents

Publication Publication Date Title
CN106230867A (en) Method and system for predicting whether a domain name is malicious, and model training method and system therefor
CN106131016B (en) Malicious URL detection and intervention method, system and device
US8438386B2 (en) System and method for developing a risk profile for an internet service
US7756987B2 (en) Cybersquatter patrol
US7937336B1 (en) Predicting geographic location associated with network address
US7984500B1 (en) Detecting fraudulent activity by analysis of information requests
CN106161342A (en) Dynamic optimization of security applications
CN104579773B (en) Domain name system analysis method and device
US20180131708A1 (en) Identifying Fraudulent and Malicious Websites, Domain and Sub-domain Names
US20090198746A1 (en) Generating anonymous log entries
CN111224941B (en) Threat type identification method and device
US20230040895A1 (en) System and method for developing a risk profile for an internet service
CN111476469B (en) Guest-rubbing method, terminal equipment and storage medium
CN106549959B (en) Method and device for identifying proxy Internet Protocol (IP) address
Chen et al. Phishing detection research based on LSTM recurrent neural network
CN110516173B (en) Illegal website identification method, device, equipment and medium
CN108353083A (en) System and method for detecting domain generation algorithm (DGA) malware
CN113572752B (en) Abnormal flow detection method and device, electronic equipment and storage medium
CN106899549A (en) Network security detection method and device
CN108234474A (en) Website identification method and apparatus
Gajera et al. A novel approach to detect phishing attack using artificial neural networks combined with pharming detection
CN110855716B (en) Self-adaptive security threat analysis method and system for counterfeit domain names
Menon et al. Preventing hijacked research papers in fake (rogue) journals through social media and databases
CN110474890B (en) Data anti-crawling method and device based on intelligent flow guide switching
CN110929185A (en) Website directory detection method and device, computer equipment and computer storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 311501, Unit 1, Building 5, Courtyard 1, Futong East Street, Chaoyang District, Beijing 100102

Applicant after: Beijing Zhichuangyu Information Technology Co., Ltd.

Address before: 100097 Jinwei Building 803, 55 Landianchang South Road, Haidian District, Beijing

Applicant before: Beijing Knows Chuangyu Information Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20161214