CN112769974A - Domain name detection method, system and storage medium - Google Patents

Publication number
CN112769974A
Authority
CN
China
Prior art keywords
domain name
feature vector
detected
vector
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011612785.XA
Other languages
Chinese (zh)
Inventor
张莹莹
史贵振
陈磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Asiainfo Technologies (chengdu) Inc
Original Assignee
Asiainfo Technologies (chengdu) Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Asiainfo Technologies (chengdu) Inc filed Critical Asiainfo Technologies (chengdu) Inc
Priority to CN202011612785.XA priority Critical patent/CN112769974A/en
Publication of CN112769974A publication Critical patent/CN112769974A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00Network arrangements, protocols or services for addressing or naming
    • H04L61/45Network directories; Name-to-address mapping
    • H04L61/4505Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Abstract

The application provides a domain name detection method, a domain name detection system and a storage medium, and relates to the technical field of network security. It can effectively solve the prior-art problems of low efficiency and difficulty in processing a large amount of domain name data. The method includes: acquiring a domain name to be detected; generating an input feature vector of the domain name to be detected; encoding and compressing the input feature vector to obtain a target feature vector; and finally, inputting the target feature vector into a classifier trained based on a semi-supervised support vector machine algorithm, and outputting the type of the domain name to be detected. The embodiment of the application is applied to a computer system.

Description

Domain name detection method, system and storage medium
Technical Field
The present application relates to the field of network security technologies, and in particular, to a domain name detection method, system, and storage medium.
Background
In today's internet environment, network security is becoming increasingly important, and network security information such as threat intelligence is critical to network security analysis, operation and maintenance. Threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications and actionable advice. Threat intelligence describes an existing or imminent threat or danger to an asset and may be used to inform a subject's response to that threat or danger. At present, threat intelligence can be obtained through domain name detection. However, current domain name detection usually depends on manual identification, which is inefficient and makes it difficult to process a large amount of domain name data. Therefore, a method for detecting domain names efficiently and accurately is required.
Disclosure of Invention
The application provides a domain name detection method, a domain name detection system and a storage medium, which can effectively solve the prior-art problems of low efficiency and difficulty in processing a large amount of domain name data.
In order to achieve the purpose, the technical scheme is as follows:
In a first aspect, the present application provides a domain name detection method, including: acquiring a domain name to be detected; generating an input feature vector of the domain name to be detected; encoding and compressing the input feature vector to obtain a target feature vector; and finally, inputting the target feature vector into a classifier trained based on a semi-supervised support vector machine algorithm, and outputting the type of the domain name to be detected.
In this implementation, a lower-dimensional target feature vector is generated by encoding and compressing the data in the input feature vector. This reduces the computational load on the classifier without affecting its accuracy in detecting domain names. In addition, because the classifier is trained based on a semi-supervised support vector machine algorithm, a classifier capable of detecting the domain name type can be trained with only a small number of labeled samples.
Optionally, encoding and compressing the input feature vector to obtain the target feature vector includes the following steps:
S1, acquiring a weight matrix, a first offset vector and a second offset vector; the weight matrix is an n × m matrix; n > m; n represents the dimension of the input feature vector.
S2, determining an intermediate feature vector according to the input feature vector, the weight matrix and the first offset vector.
S3, determining an output feature vector according to the intermediate feature vector, the inverse matrix of the weight matrix and the second offset vector.
S4, determining the difference between the output feature vector and the input feature vector.
S5, determining whether the difference is smaller than a preset threshold; if not, go to S6; if yes, go to S7.
S6, re-assigning the weight matrix, the first offset vector and the second offset vector according to a gradient descent method, and repeating S2-S5.
S7, determining the intermediate feature vector as the target feature vector and outputting the target feature vector.
Optionally, generating the input feature vector of the domain name to be detected includes: extracting a main domain name and a domain name suffix from the domain name to be detected; determining feature parameters of the main domain name, where the feature parameters include one or more of the following: the domain name length, the information entropy, the ratio of consonant letters, the ratio of two consecutive consonant letters, the ratio of two consecutive digits, and the N-Gram value of the main domain name; performing one-hot encoding on the domain name suffix to generate a target domain name suffix; and determining the input feature vector based on the feature parameters and the target domain name suffix.
Optionally, acquiring the domain name to be detected includes: acquiring a domain name to be verified; and, when the domain name to be verified does not meet a preset rule, determining the domain name to be verified as the domain name to be detected.
Optionally, the preset rule includes at least one of the following: the domain name suffix of the domain name to be detected is .onion; the domain name to be detected does not contain the "." character; the length of the main domain name of the domain name to be detected is less than 6; and the main domain name of the domain name to be detected starts with xn-.
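The pre-filter described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the suffix is taken to be the last dot-separated label and the main domain name the label before it, and the second rule is read as "the domain name contains no '.' character". Domains that meet any rule are set aside; the rest go on to detection.

```python
def split_domain(domain):
    """Split a domain into (main domain name, suffix).

    Assumption: the suffix is the last dot-separated label and the main
    domain name is the label just before it; multi-label suffixes such as
    "co.uk" are not handled in this sketch.
    """
    labels = domain.lower().strip(".").split(".")
    if len(labels) < 2:
        return labels[0], ""
    return labels[-2], labels[-1]


def matches_preset_rule(domain):
    """Return True if the domain meets at least one preset rule."""
    main, suffix = split_domain(domain)
    return (
        suffix == "onion"            # suffix is .onion
        or "." not in domain         # no "." character at all
        or len(main) < 6             # main domain name shorter than 6
        or main.startswith("xn-")    # prefix as written in the text
    )


def select_for_detection(candidates):
    """Keep only the domains to be verified that meet none of the rules."""
    return [d for d in candidates if not matches_preset_rule(d)]
```

For example, `select_for_detection(["abc.com", "kq3v9xzt1m.net"])` keeps only the second name, because the main domain name `abc` is shorter than 6 characters.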
In a second aspect, the present application provides a domain name detection system, which includes a preprocessing module, a feature extraction module and a classification module. The preprocessing module is configured to acquire the domain name to be detected, and is further configured to generate an input feature vector of the domain name to be detected. The feature extraction module is configured to encode and compress the input feature vector generated by the preprocessing module to obtain a target feature vector. The classification module is configured to input the target feature vector obtained by the feature extraction module into a classifier trained based on a semi-supervised support vector machine algorithm, and to output the type of the domain name to be detected.
Optionally, the feature extraction module is specifically configured to perform the following steps:
S1, acquiring a weight matrix, a first offset vector and a second offset vector; the weight matrix is an n × m matrix; n > m; n represents the dimension of the input feature vector.
S2, determining an intermediate feature vector according to the input feature vector, the weight matrix and the first offset vector.
S3, determining an output feature vector according to the intermediate feature vector, the inverse matrix of the weight matrix and the second offset vector.
S4, determining the difference between the output feature vector and the input feature vector.
S5, determining whether the difference is smaller than a preset threshold; if not, go to S6; if yes, go to S7.
S6, re-assigning the weight matrix, the first offset vector and the second offset vector according to a gradient descent method, and repeating S2-S5.
S7, determining the intermediate feature vector as the target feature vector and outputting the target feature vector.
Optionally, the preprocessing module is specifically configured to extract a main domain name and a domain name suffix from the domain name to be detected. The preprocessing module is further configured to determine feature parameters of the main domain name, where the feature parameters include one or more of the following: the domain name length, the information entropy, the ratio of consonant letters, the ratio of two consecutive consonant letters, the ratio of two consecutive digits, and the N-Gram value of the main domain name. The preprocessing module is further configured to perform one-hot encoding on the domain name suffix to generate a target domain name suffix, and to determine the input feature vector based on the feature parameters and the target domain name suffix.
Optionally, the preprocessing module is specifically configured to acquire a domain name to be verified, and to determine the domain name to be verified as the domain name to be detected when the domain name to be verified does not meet a preset rule.
Optionally, the preset rule includes at least one of the following: the domain name suffix of the domain name to be detected is .onion; the domain name to be detected does not contain the "." character; the length of the main domain name of the domain name to be detected is less than 6; and the main domain name of the domain name to be detected starts with xn-.
In a third aspect, the present application provides a domain name detection system that includes a memory and a processor. The memory is coupled to the processor. The memory is for storing computer program code comprising computer instructions. When the processor executes the computer instructions, the domain name detection system performs the domain name detection method as provided in the first aspect or any one of the possible designs of the first aspect.
In a fourth aspect, the present application provides a chip system, which is applied to a domain name detection system; the chip system includes one or more interface circuits, and one or more processors. The interface circuit and the processor are interconnected through a line; the interface circuit is configured to receive a signal from a memory of the domain name detection system and send the signal to the processor, the signal including computer instructions stored in the memory. When the processor executes the computer instructions, the domain name detection system performs the domain name detection method as provided by the first aspect or any one of the possible design approaches of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium includes computer instructions, and when the computer instructions are executed on a domain name detection system, the domain name detection system implements the domain name detection method provided in the first aspect or any one of the possible design manners of the first aspect.
In a sixth aspect, the present application provides a computer program product, which includes computer instructions that, when run on a domain name detection system, cause the domain name detection system to perform the domain name detection method as provided in the first aspect or any one of the possible designs of the first aspect.
It should be noted that all or part of the computer instructions may be stored on the computer readable storage medium. The computer-readable storage medium may be packaged with the processor of the domain name detection system, or may be packaged separately from the processor of the domain name detection system, which is not limited in this application.
For the description of the second, third, fourth, fifth and sixth aspects in this application, reference may be made to the detailed description of the first aspect and its various implementations; moreover, for the beneficial effects of the second aspect, the third aspect, the fourth aspect, the fifth aspect and the sixth aspect, reference may be made to beneficial effect analysis in the first aspect and various implementation manners thereof, and details are not repeated here.
In the present application, the names of the domain name detection systems described above do not limit the devices or functional modules themselves, and in practical implementations, these devices or functional modules may appear by other names. Insofar as the functions of the respective devices or functional modules are similar to those of the present application, they fall within the scope of the claims of the present application and their equivalents.
Drawings
Fig. 1 is a schematic flow chart of a domain name detection method according to an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of a classifier generation method according to an embodiment of the present application;
Fig. 3A is a second schematic flowchart of a domain name detection method according to an embodiment of the present application;
Fig. 3B is a schematic diagram of an autoencoder according to an embodiment of the present application;
Fig. 4 is a third schematic flowchart of a domain name detection method according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a domain name detection system according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of another domain name detection system provided in the embodiment of the present application;
Fig. 7 is a schematic structural diagram of a computer program product of a domain name detection method according to an embodiment of the present application.
Detailed Description
A domain name detection method, a domain name detection system, and a storage medium according to embodiments of the present application are described in detail below with reference to the accompanying drawings.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone.
The terms "first" and "second" and the like in the description and drawings of the present application are used for distinguishing different objects or for distinguishing different processes for the same object, and are not used for describing a specific order of the objects.
Furthermore, the terms "including" and "having," and any variations thereof, as referred to in the description of the present application, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to the listed steps or modules but may alternatively include other steps or modules not listed or inherent to such process, method, article, or apparatus.
It should be noted that in the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, illustration or description. Any embodiment or design described herein as "exemplary" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of the words "exemplary" or "for example" is intended to present related concepts in a concrete fashion.
In the description of the present application, the meaning of "a plurality" means two or more unless otherwise specified.
It can be understood that, in practical application, the step sequence of the specific method may be adjusted, and the embodiment of the present application does not limit this.
Before describing the embodiments of the present application, first, terms related to the embodiments of the present application are described:
DGA
DGA (domain generation algorithm) is an algorithm that generates pseudo-random domain names.
DGA domain name
DGA domain names are domain names generated based on DGA algorithms, typically hard-coded in malware.
Entropy of information
The information entropy is used to describe the uncertainty of the information, and the larger the uncertainty, the larger the entropy.
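As an illustration, a common way to compute the information entropy of a domain name string is the Shannon entropy over its characters. The source does not give a formula, so the choice of base-2 logarithm and per-character frequencies here is an assumption:

$$H(s) = -\sum_{c} p_c \log_2 p_c, \qquad p_c = \frac{\text{number of occurrences of } c \text{ in } s}{|s|}$$

For example, for s = "aabb", $p_a = p_b = 1/2$ and $H(s) = 1$ bit, whereas for s = "aaaa", $H(s) = 0$. Randomly generated names tend to spread their characters more evenly and therefore score higher.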
Autoencoder
An autoencoder is an artificial neural network that learns, through unsupervised learning, a more efficient representation of its input data. The dimension of the learned representation is typically much smaller than that of the input data.
Semi-supervised learning
Semi-supervised learning is a learning method that combines supervised learning and unsupervised learning. It performs pattern recognition using a large amount of unlabeled data together with labeled data. Semi-supervised learning reduces manual labeling effort while still achieving relatively high accuracy.
Transductive support vector machine
The transductive support vector machine (TSVM) is the best-known representative of semi-supervised support vector machines. Its core idea is to find a label assignment for the unlabeled samples such that the margin of the resulting separating hyperplane is maximized. The TSVM solves this iteratively with a local search strategy: an initial SVM is first trained on the labeled sample set; this learner is then used to label the unlabeled samples, so that all samples are labeled; the SVM is retrained on all of these samples; and the samples most likely to be mislabeled are then searched for and adjusted repeatedly.
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
As shown in fig. 1, an embodiment of the present application provides a domain name detection method, including:
S11, acquiring the domain name to be detected.
For example, the domain name to be detected may be a domain name input by a user for detection, may be any domain name in a large domain name set that needs to be detected, or may be a domain name of a web page to be opened.
S12, generating an input feature vector of the domain name to be detected.
Specifically, the input feature vector of the domain name to be detected represents the features of the domain name for use in domain name detection.
S13, encoding and compressing the input feature vector to obtain a target feature vector.
It should be noted that the input feature vector is encoded and compressed by first encoding and then decoding it; when the decoded feature vector is determined to satisfy the condition, the encoded feature vector is determined to be the target feature vector. The detailed implementation is described below in S1-S7 and is not repeated here.
S14, inputting the target feature vector into a classifier trained based on a semi-supervised support vector machine algorithm, and outputting the type of the domain name to be detected.
In one implementation, referring to fig. 2, a method for training the classifier based on a semi-supervised support vector machine algorithm includes:
S141, obtaining $D_l$, $D_u$ and the trade-off parameters $C_l$ and $C_u$.
Wherein $D_l$ represents the set of labeled samples, $D_l = \{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$; $x_q$ represents a first target feature vector, $q = 1, 2, \ldots, l$; $y_q$ represents the label corresponding to the first target feature vector; $l$ represents the number of first target feature vectors. $D_u$ represents the set of unlabeled samples, $D_u = \{x_{l+1}, x_{l+2}, \ldots, x_{l+u}\}$; $x_p$ represents a second target feature vector, $p = l+1, l+2, \ldots, l+u$; $u$ represents the number of second target feature vectors.
It should be noted that $C_l$ and $C_u$ are initially set to satisfy $C_l \gg C_u$.
S142, using $D_l$ to generate an initial classifier based on a semi-supervised support vector machine algorithm.
S143, using the classifier to predict $D_u$ and generate $\hat{y}$,
wherein $\hat{y} = (\hat{y}_{l+1}, \hat{y}_{l+2}, \ldots, \hat{y}_{l+u})$, and $\hat{y}_p$ represents the predicted label corresponding to $x_p$.
S144, when $C_u < C_l$, determining $(\omega, b)$ and $\xi_i$ based on $D_l$, $D_u$, $\hat{y}$, $C_l$ and $C_u$.
Specifically, $(\omega, b)$ and $\xi_i$ are solved according to the following formula:
$$\min_{\omega, b, \hat{y}, \xi} \; \frac{1}{2}\|\omega\|_2^2 + C_l \sum_{i=1}^{l} \xi_i + C_u \sum_{i=l+1}^{l+u} \xi_i$$
$$\text{s.t.} \quad y_i(\omega^T x_i + b) \ge 1 - \xi_i, \; i = 1, \ldots, l; \qquad \hat{y}_i(\omega^T x_i + b) \ge 1 - \xi_i, \; i = l+1, \ldots, l+u; \qquad \xi_i \ge 0, \; i = 1, \ldots, l+u$$
Wherein $\xi_i$ represents a slack variable; the larger the slack variable, the less accurate the corresponding label.
S145, judging whether $\hat{y}_i$, $\hat{y}_j$, $\xi_i$ and $\xi_j$ satisfy a first condition; if yes, go to S146; otherwise, go to S147.
Specifically, the first condition is
$$\hat{y}_i \hat{y}_j < 0, \quad \xi_i > 0, \quad \xi_j > 0, \quad \xi_i + \xi_j > 2$$
wherein $i, j = l+1, l+2, \ldots, l+u$.
S146, letting $\hat{y}_i = -\hat{y}_i$ and $\hat{y}_j = -\hat{y}_j$, and jumping to S144.
S147, updating $C_u$ according to $C_u = \min(2C_u, C_l)$.
S148, judging whether $C_u$ is greater than or equal to $C_l$. If so, obtaining the labeling result of the unlabeled samples and finishing classifier training; if not, go to S144.
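The control flow of S145-S148 can be sketched in Python as follows. This is a minimal sketch rather than a full TSVM: it leaves out the SVM solve in S144 and only illustrates the label-swap test and the $C_u$ schedule. It assumes the first condition is the usual TSVM swap test ($\hat{y}_i \hat{y}_j < 0$, $\xi_i > 0$, $\xi_j > 0$, $\xi_i + \xi_j > 2$), and all function names are hypothetical.

```python
def find_swap_pair(y_hat, xi):
    """S145: return indices (i, j) of two unlabeled samples that meet the
    first condition, or None if no such pair exists.

    y_hat: predicted labels (+1 or -1) of the unlabeled samples.
    xi:    slack variables of the same samples.
    """
    for i in range(len(y_hat)):
        for j in range(i + 1, len(y_hat)):
            if (y_hat[i] * y_hat[j] < 0 and xi[i] > 0 and xi[j] > 0
                    and xi[i] + xi[j] > 2):
                return i, j
    return None


def swap_labels(y_hat, pair):
    """S146: flip the predicted labels of both samples in the pair."""
    i, j = pair
    flipped = list(y_hat)
    flipped[i], flipped[j] = -flipped[i], -flipped[j]
    return flipped


def cu_schedule(c_u, c_l):
    """S147/S148: yield the successive C_u values used while C_u < C_l,
    doubling each round and capping at C_l."""
    while c_u < c_l:
        yield c_u
        c_u = min(2 * c_u, c_l)
```

With `c_u = 1` and `c_l = 8`, the schedule yields 1, 2 and 4, after which $C_u$ reaches $C_l$ and training ends. Starting with a small $C_u$ keeps the possibly wrong predicted labels from dominating the early solutions.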
In this application, the type of the domain name may be, for example, any one of a white domain name, a gray domain name and a DGA domain name, where a white domain name is a domain name with a low security risk, a gray domain name is a domain name whose security risk lies between that of a black domain name and that of a white domain name, and a black domain name is a domain name with a high security risk. For example, a malicious domain name used for malicious activities such as spreading malware or sending spam can be classified as a black domain name, indicating a high security risk. A domain name with high breadth, that is, a domain name whose access frequency meets a predetermined condition (for example, an access frequency greater than a certain threshold per unit time), can be classified as a white domain name, indicating a low security risk. A domain name that is suspicious but whose security risk lies between that of a black domain name and that of a white domain name can be classified as a gray domain name. For example, when the domain name to be detected is classified, the probabilities that it belongs to the black, white and gray domain name categories can be calculated separately, and the category with the highest probability is determined as the category of the domain name to be detected. It should be understood that, although the domain name detection method provided in the embodiment of the present application uses the white domain name, the gray domain name and the DGA domain name as the types of the domain name to be detected, the present application is not limited thereto, and other categories may be used.
In this implementation, a lower-dimensional target feature vector is generated by encoding and compressing the data in the input feature vector. This reduces the computational load on the classifier without affecting its accuracy in detecting domain names. In addition, because the classifier is trained based on a semi-supervised support vector machine algorithm, a classifier capable of detecting the domain name type can be trained with only a small number of labeled samples.
In one implementation, referring to fig. 3A in conjunction with fig. 1, S13 is specifically implemented by the following steps:
S1, acquiring a weight matrix, a first offset vector and a second offset vector.
Wherein the weight matrix is an n × m matrix; n > m; n represents the dimension of the input feature vector.
It should be noted that, when the weight matrix, the first offset vector and the second offset vector are obtained for the first time, each is initialized with random values.
S2, determining an intermediate feature vector according to the input feature vector, the weight matrix and the first offset vector.
Specifically, the intermediate feature vector is determined according to the following formula:
$$H = f(X) = s_f(WX + p)$$
wherein $H$ represents the intermediate feature vector, which is output as the target feature vector in S7; $X$ represents the input feature vector; $W$ represents the mapping matrix from $X$ to $H$ (i.e., the weight matrix), $W \in R^{n \times m}$; $s_f$ represents a first activation function; $p$ represents the first offset vector.
Exemplarily, when $s_f$ is the Sigmoid function, namely
$$s_f(z) = \frac{1}{1 + e^{-z}}$$
then $H = f(X) = s_f(WX + p)$ becomes
$$H = \frac{1}{1 + e^{-(WX + p)}}$$
S3, determining an output feature vector according to the intermediate feature vector, the inverse matrix of the weight matrix and the second offset vector.
Specifically, the output feature vector is determined according to the following formula:
$$Y = g(H) = s_g(W'H + q)$$
wherein $Y$ represents the output feature vector; $H$ represents the intermediate feature vector; $W'$ represents the mapping matrix from $H$ to $Y$ (i.e., the inverse matrix of the weight matrix), $W' \in R^{n \times m}$; $s_g$ represents a second activation function; $q$ represents the second offset vector.
Exemplarily, when $s_g$ is the Sigmoid function, namely
$$s_g(z) = \frac{1}{1 + e^{-z}}$$
then $Y = g(H) = s_g(W'H + q)$ becomes
$$Y = \frac{1}{1 + e^{-(W'H + q)}}$$
S4, determining the difference between the output feature vector and the input feature vector.
Optionally, when both the first activation function and the second activation function are Sigmoid functions, the cross-entropy function is selected as the error function to calculate the difference between X and Y. That is, the difference $L(X, Y)$ between X and Y is determined according to the following formula:
$$L(X, Y) = -\sum_{i=1}^{n} \left[ x_i \log y_i + (1 - x_i) \log (1 - y_i) \right]$$
wherein $x_i$ represents the value of the i-th element of X, and $y_i$ represents the value of the i-th element of Y.
S5, determining whether the difference is smaller than a preset threshold. If not, go to S6; if yes, go to S7.
S6, re-assigning the weight matrix, the first offset vector and the second offset vector according to a gradient descent method, and repeating S2-S5.
S7, determining the intermediate feature vector as the target feature vector and outputting the target feature vector.
According to the embodiment of the application, the classifier is generated by training with a semi-supervised support vector machine algorithm. This algorithm places low demands on the size of the labeled sample set: training requires only a small number of labeled samples. In addition, because the classifier is trained on a combination of labeled and unlabeled samples, the detection rate for the domain name to be detected can be greatly improved.
For better understanding, the present embodiment further explains S1-S7 above. First, referring to fig. 3B, a schematic structural diagram of an autoencoder is provided; the autoencoder used to implement S1-S7 includes an input layer, an intermediate (hidden) layer and an output layer. Here n represents the size of the input layer (i.e., the number of input nodes), which is also the size of the output layer; m represents the number of nodes of the hidden layer. $X \in R^n$, $H \in R^m$ and $Y \in R^n$ represent the feature vectors of the input layer, the hidden layer and the output layer respectively, where ideally $X = Y$, and the hidden-layer feature vector $H$ is the required compressed target feature vector. It should be noted that the process from the input layer to the hidden layer is the encoding process, and the process from the hidden layer to the output layer is the decoding process. $W$ represents the weight matrix between the input layer and the hidden layer, namely the mapping matrix from $X$ to $H$; $W'$ represents the weight matrix between the hidden layer and the output layer, namely the mapping matrix from $H$ to $Y$; $W' = W^{-1}$.
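The S1-S7 loop can be sketched in plain Python as follows. This is a minimal sketch under several assumptions that are not in the source: the decoder uses the transpose of W (tied weights) as a stand-in for the "inverse matrix", since an n × m matrix with n > m has no ordinary inverse; training runs on a single input vector whose elements lie in [0, 1]; iteration stops once the cross-entropy error falls below the threshold; and the values of `threshold`, `lr`, `max_iter` and `seed` are illustrative only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, p, q):
    """S2 and S3: encode x into H, then decode H into Y."""
    n, m = len(W), len(W[0])
    H = [sigmoid(sum(W[i][j] * x[i] for i in range(n)) + p[j]) for j in range(m)]
    Y = [sigmoid(sum(W[i][j] * H[j] for j in range(m)) + q[i]) for i in range(n)]
    return H, Y


def cross_entropy(x, Y):
    """S4: reconstruction error between input x and output Y."""
    return -sum(xi * math.log(yi) + (1 - xi) * math.log(1 - yi)
                for xi, yi in zip(x, Y))


def encode_compress(x, m, threshold=0.05, lr=0.1, max_iter=2000, seed=0):
    """Run S1-S7 on one input vector; return (H, list of errors)."""
    rng = random.Random(seed)
    n = len(x)
    # S1: random initial values for the weight matrix and both offset vectors.
    W = [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(n)]
    p = [rng.uniform(-0.1, 0.1) for _ in range(m)]
    q = [rng.uniform(-0.1, 0.1) for _ in range(n)]
    errors = []
    for _ in range(max_iter):
        H, Y = forward(x, W, p, q)
        errors.append(cross_entropy(x, Y))
        if errors[-1] < threshold:      # S5: error small enough, output H (S7)
            break
        # S6: one gradient-descent step on W, p and q.
        d_out = [Y[i] - x[i] for i in range(n)]
        d_hid = [sum(W[i][j] * d_out[i] for i in range(n)) * H[j] * (1 - H[j])
                 for j in range(m)]
        for i in range(n):
            for j in range(m):
                # Tied weights: W receives gradient from decoder and encoder.
                W[i][j] -= lr * (d_out[i] * H[j] + x[i] * d_hid[j])
            q[i] -= lr * d_out[i]
        for j in range(m):
            p[j] -= lr * d_hid[j]
    return H, errors
```

Calling `encode_compress([1.0, 0.0, 1.0, 0.0, 1.0, 0.0], m=2)` returns a 2-dimensional vector H and the error history. In practice the parameters would be fitted on a whole set of input feature vectors rather than on one vector.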
In this implementation, a lower-dimensional target feature vector is generated by encoding and compressing the data in the input feature vector. This reduces the computational load on the classifier without affecting its accuracy in detecting domain names. In addition, because the classifier is trained based on a semi-supervised support vector machine algorithm, a classifier capable of detecting the domain name type can be trained with only a small number of labeled samples.
In one implementation, referring to fig. 4 in combination with fig. 1, S12 specifically includes the following steps:
S121, extracting a main domain name and a domain name suffix from the domain name to be detected.
S122, determining the characteristic parameters of the main domain name.
The characteristic parameters include at least one or more of the following for the main domain name: domain name length, information entropy, consonant letter ratio, ratio of two consecutive consonant letters, ratio of two consecutive digits, and N-gram value.
Specifically, the N-gram value of the main domain name is determined, based on an N-gram (N-Gram) algorithm, as the difference between the scores of the domain name under the white list and under the dictionary. N-gram is an important concept in natural language processing (NLP): based on a given corpus, an N-gram model can be used to predict or evaluate whether a sentence is reasonable. Another use of N-grams is to evaluate the degree of difference between two character strings. Note that the white list refers to a list of domain names that have a good reputation or evaluation and are trustworthy. The dictionary is a system of related concepts used to describe domain names in the domain name field, and mainly assists in understanding data in that field.
S123, performing one-hot encoding on the domain name suffix to generate a target domain name suffix.
S124, determining an input feature vector based on the characteristic parameters and the target domain name suffix.
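Steps S121 to S124 can be sketched as follows. This is an illustration under assumptions, not the implementation of the application: the suffix is taken to be the last label and the main domain name the label before it (multi-label suffixes would need a suffix list), the consonant letter ratio is read as the share of consonant letters in the main domain name, and `known_suffixes` is a hypothetical vocabulary for the one-hot encoding. The N-gram value is omitted here.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

VOWELS = set("aeiou")


def split_domain(domain: str) -> Tuple[str, str]:
    """S121: return (main domain name, suffix), e.g. 'example.com' -> ('example', 'com')."""
    labels = domain.lower().rstrip(".").split(".")
    main = labels[-2] if len(labels) >= 2 else ""
    return main, labels[-1]


def shannon_entropy(text: str) -> float:
    """Information entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in Counter(text).values())


def is_consonant(ch: str) -> bool:
    return "a" <= ch <= "z" and ch not in VOWELS


def pair_ratio(text: str, predicate: Callable[[str], bool]) -> float:
    """Share of adjacent character pairs in which both characters satisfy predicate."""
    if len(text) < 2:
        return 0.0
    hits = sum(1 for a, b in zip(text, text[1:]) if predicate(a) and predicate(b))
    return hits / (len(text) - 1)


def main_domain_features(main: str) -> List[float]:
    """S122: numeric characteristic parameters of the main domain name."""
    n = len(main)
    consonants = sum(1 for ch in main if is_consonant(ch))
    return [
        float(n),                         # domain name length
        shannon_entropy(main),            # information entropy
        consonants / n if n else 0.0,     # consonant letter ratio
        pair_ratio(main, is_consonant),   # two consecutive consonants
        pair_ratio(main, str.isdigit),    # two consecutive digits
    ]


def one_hot_suffix(suffix: str, known_suffixes: Sequence[str]) -> List[float]:
    """S123: one-hot encode the suffix over a fixed suffix vocabulary."""
    return [1.0 if suffix == s else 0.0 for s in known_suffixes]


def input_feature_vector(domain: str, known_suffixes: Sequence[str]) -> List[float]:
    """S124: concatenate the characteristic parameters and the encoded suffix."""
    main, suffix = split_domain(domain)
    return main_domain_features(main) + one_hot_suffix(suffix, known_suffixes)
```

For example, `input_feature_vector("abab.com", ["com", "net", "org"])` yields five characteristic parameters followed by a three-element one-hot suffix.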
Further, the feature vector may also include feature parameters of the domain name to be detected. Specifically, feature data is extracted from the domain name to be detected to generate these feature parameters. Extracting the feature data may include extracting at least some of the domain name character features, sample association features, domain name attribute features, and network access features of the domain name to be detected. The domain name character features may be features associated with the characters contained in the domain name to be detected, such as its character entropy, domain name length, domain name level, domain name number, number of feature characters, number of character type conversions, longest non-top-level domain name, number of numeric subdomains, and number of word subdomains, or character features associated with the domain name, such as a mailbox prefix. The sample association features may be, for example, features associated with a sample that contains, accesses, or propagates (e.g., downloads) the domain name, where the sample is, for example, software or a client.
The domain name attribute features may be, for example, a uniform resource locator (URL), an internet protocol address (IP address), or a canonical name (CNAME) of the domain name to be detected, or registration information of the domain name, such as the country of registration, domain name privacy information (whois information), filing information, mailbox association, registrar association, and registration phone association. The network access features may be, for example, the maximum value, minimum value, and variance of the number of times the domain name to be detected is accessed within a fixed time. The feature data of the domain name may include some or all of the domain name character features, sample association features, domain name attribute features, and network access features. After the feature data of the domain name is acquired, the feature parameters of the domain name to be detected are generated from the feature data, for example by digitizing it, and the feature vector is generated by combining the feature parameters of the domain name to be detected with the feature parameters of the main domain name and the target domain name suffix.
In this implementation, extracting from the domain name to be detected the data that characterizes it ensures the detection rate and improves the efficiency of detecting the domain name to be detected.
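The N-gram value mentioned among the characteristic parameters, a difference between scores under the white list and under the dictionary, could be computed along the following lines. The text does not specify the scoring function, so this sketch assumes character bigrams, add-one smoothing, and an average log-probability score; the function names and the sample corpora are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, List


def ngrams(text: str, n: int = 2) -> List[str]:
    """All contiguous character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_counts(corpus: Iterable[str], n: int = 2) -> Counter:
    """Count n-grams over every entry of a corpus (white list or dictionary)."""
    counts: Counter = Counter()
    for word in corpus:
        counts.update(ngrams(word.lower(), n))
    return counts


def ngram_score(text: str, counts: Counter, n: int = 2) -> float:
    """Average add-one-smoothed log-probability of the n-grams of text."""
    grams = ngrams(text.lower(), n)
    if not grams:
        return 0.0
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen n-grams
    return sum(math.log((counts[g] + 1) / (total + vocab)) for g in grams) / len(grams)


def ngram_value(main: str, whitelist_counts: Counter,
                dictionary_counts: Counter, n: int = 2) -> float:
    """Difference between the white-list score and the dictionary score."""
    return ngram_score(main, whitelist_counts, n) - ngram_score(main, dictionary_counts, n)


# Hypothetical corpora, for illustration only.
whitelist_counts = build_counts(["google", "example"])
dictionary_counts = build_counts(["apple"])
value = ngram_value("google", whitelist_counts, dictionary_counts)
```

A name whose bigrams are common in the reference corpus scores higher than a random-looking string, which is the property such a feature relies on.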
In one implementation, referring to fig. 4 in combination with fig. 1, S11 specifically includes the following steps:
S111, acquiring the domain name to be verified.
S112, determining the domain name to be verified as the domain name to be detected when the domain name to be verified does not satisfy the preset rule.
Specifically, the preset rule includes at least one of the following: the domain name suffix of the domain name to be detected is .onion; the domain name to be detected does not contain the "." character; the length of the main domain name of the domain name to be detected is less than 6; and the main domain name of the domain name to be detected starts with xn-.
It should be noted that the preset rule mainly serves to rule out, based on one or more pieces of feature information in the domain name, the possibility that the domain name is a DGA domain name. In other words, when the domain name to be verified satisfies the preset rule, it can be determined that it is not a DGA domain name; when it does not satisfy the preset rule, it is not necessarily a DGA domain name. Therefore, in the embodiment of the present application, a domain name to be verified that does not satisfy the preset rule is taken as the domain name to be detected, and its type is then detected.
In this implementation, the domain name to be verified is preprocessed so that domain names that are obviously not DGA domain names are decided directly, which can reduce the false alarm rate.
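The pre-filter of S111 and S112 can be sketched as a simple predicate. The sketch assumes that the main domain name is the label immediately before the last label, and it uses the prefix "xn-" as stated in the text; the function names are hypothetical.

```python
from typing import Iterable, List


def meets_preset_rule(domain: str) -> bool:
    """Return True if the domain can be ruled out as a DGA domain name up front."""
    name = domain.lower().rstrip(".")
    if "." not in name:              # no "." character at all
        return True
    labels = name.split(".")
    suffix, main = labels[-1], labels[-2]
    return (
        suffix == "onion"            # .onion suffix
        or len(main) < 6             # main domain name shorter than 6 characters
        or main.startswith("xn-")    # main domain name starts with xn-
    )


def select_for_detection(candidates: Iterable[str]) -> List[str]:
    """S112: keep only the names that do NOT meet the preset rule."""
    return [d for d in candidates if not meets_preset_rule(d)]


to_detect = select_for_detection(
    ["abcdefgh.onion", "localhost", "abc.com", "xn--fiqs8s.cn", "kq3v9z7x1p.com"]
)
```

Only names that pass through this filter are handed to feature extraction and classification.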
The scheme provided by the embodiment of the application has been introduced mainly from the perspective of a method. To implement the above functions, the system includes corresponding hardware structures and/or software modules for performing the respective functions. Those of skill in the art would readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 5 is a schematic structural diagram of a domain name detection system 10 according to an embodiment of the present disclosure. The domain name detection system 10 is used to perform the domain name detection method shown in fig. 1. The domain name detection system 10 includes: a preprocessing module 51, a feature extraction module 52, and a classification module 53.
Specifically, the preprocessing module 51 is configured to acquire a domain name to be detected. For example, the preprocessing module 51 may be used to implement S11 as shown in fig. 1.
The preprocessing module 51 is further configured to generate an input feature vector of the domain name to be detected. For example, the preprocessing module 51 may be used to implement S12 as shown in fig. 1.
The feature extraction module 52 is configured to perform encoding compression on the input feature vector generated by the preprocessing module 51 to obtain a target feature vector. For example, the feature extraction module 52 may be used to implement S13 as shown in fig. 1.
The classification module 53 is configured to input the target feature vector obtained by the feature extraction module 52 into a classifier trained based on a semi-supervised support vector machine algorithm and to output the type of the domain name to be detected. For example, the classification module 53 may be used to implement S14 as shown in fig. 1.
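The division of work among modules 51, 52, and 53 can be pictured as a small pipeline. The class and attribute names below are hypothetical, and each stage is passed in as a plain callable so that the sketch makes no claim about how any stage is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DomainNameDetectionSystem:
    accept: Callable[[str], bool]                   # preprocessing module 51, S11
    featurize: Callable[[str], List[float]]         # preprocessing module 51, S12
    compress: Callable[[List[float]], List[float]]  # feature extraction module 52, S13
    classify: Callable[[List[float]], str]          # classification module 53, S14

    def detect(self, domain: str) -> Optional[str]:
        """Run S11-S14; return None when the name is not taken up for detection."""
        if not self.accept(domain):
            return None
        input_vector = self.featurize(domain)
        target_vector = self.compress(input_vector)
        return self.classify(target_vector)


# Stub stages, for illustration only.
system = DomainNameDetectionSystem(
    accept=lambda d: len(d) > 5,
    featurize=lambda d: [float(len(d)), 0.0],
    compress=lambda x: x[:1],
    classify=lambda h: "DGA" if h[0] > 10 else "normal",
)
result = system.detect("abcdefghijkl.com")
```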
Optionally, the feature extraction module 52 is specifically configured to perform the following steps:
S1, acquiring a weight matrix, a first offset vector and a second offset vector; the weight matrix is an n × m matrix; n > m; n represents the dimension of the input feature vector.
S2, determining an intermediate feature vector according to the input feature vector, the weight matrix and the first offset vector.
S3, determining an output feature vector according to the intermediate feature vector, the inverse matrix of the weight matrix and the second offset vector.
S4, determining the difference value of the output feature vector and the input feature vector.
S5, determining whether the difference value is smaller than a preset threshold value; if yes, go to S6; if not, go to S7.
S6, re-assigning the weight matrix, the first offset vector and the second offset vector according to a gradient descent method, and repeating S2-S5.
S7, determining the intermediate feature vector as a target feature vector and outputting the target feature vector.
For example, the feature extraction module 52 may be used to implement S1-S7 as shown in FIG. 3A.
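The loop S1 to S7 can be sketched in plain Python for a single input vector. Several choices are assumptions of this sketch: the layers are linear (no activation), the decoder uses the transpose of the weight matrix as a stand-in for its inverse (an n × m matrix with n > m has no ordinary inverse), the difference value is the Euclidean norm, the start values are arbitrary, and the loop follows the usual convergence reading, updating while the difference is not yet below the threshold. In practice the parameters would more likely be fitted over a whole sample set.

```python
import math
from typing import List, Tuple


def compress(x: List[float], m: int, threshold: float = 1e-3,
             lr: float = 0.1, max_iters: int = 10_000) -> Tuple[List[float], float]:
    """Return (target feature vector H, final difference value) for one input x."""
    n = len(x)
    # S1: weight matrix (n x m, n > m) and the two offset vectors.
    w = [[0.1 * ((i + j) % 3 - 1) for j in range(m)] for i in range(n)]
    b1, b2 = [0.0] * m, [0.0] * n
    h: List[float] = []
    diff = float("inf")
    for _ in range(max_iters):
        # S2: intermediate vector H = X W + b1.
        h = [sum(x[i] * w[i][j] for i in range(n)) + b1[j] for j in range(m)]
        # S3: output vector Y = H W^T + b2 (transpose in place of the inverse).
        y = [sum(h[j] * w[i][j] for j in range(m)) + b2[i] for i in range(n)]
        # S4: difference between output and input.
        err = [y[i] - x[i] for i in range(n)]
        diff = math.sqrt(sum(e * e for e in err))
        # S5 / S7: stop once the reconstruction is close enough and output H.
        if diff < threshold:
            break
        # S6: gradient-descent step for the loss 0.5 * diff ** 2.
        g = [sum(err[i] * w[i][j] for i in range(n)) for j in range(m)]
        for i in range(n):
            for j in range(m):
                w[i][j] -= lr * (x[i] * g[j] + err[i] * h[j])
        b1 = [b1[j] - lr * g[j] for j in range(m)]
        b2 = [b2[i] - lr * err[i] for i in range(n)]
    return h, diff


target, difference = compress([1.0, 0.5, -0.5, 0.0], m=2)
```

The returned vector has m components, fewer than the n components of the input, which is the compression the classifier benefits from.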
Optionally, the preprocessing module 51 is specifically configured to extract a main domain name and a domain name suffix in the domain name to be detected. For example, the preprocessing module 51 may be used to implement S121 as shown in fig. 4.
The preprocessing module 51 is further configured to determine a characteristic parameter of the main domain name; the characteristic parameters at least comprise: one or more items of domain name length, information entropy, consonant letter number ratio, two continuous consonant letter ratio, two continuous number ratio and N-element language numerical value of the main domain name. For example, the preprocessing module 51 may be used to implement S122 shown in fig. 4.
The preprocessing module 51 is further configured to perform one-hot encoding on the domain name suffix to generate a target domain name suffix. For example, the preprocessing module 51 may be used to implement S123 as shown in fig. 4.
The preprocessing module 51 is further configured to determine an input feature vector based on the feature parameter and the target domain name suffix. For example, the preprocessing module 51 may be used to implement S124 as shown in fig. 4.
Optionally, the preprocessing module 51 is specifically configured to obtain a domain name to be verified. For example, the preprocessing module 51 may be used to implement S111 as shown in fig. 4.
The preprocessing module 51 is configured to determine that the domain name to be verified is the domain name to be detected when it is determined that the domain name to be verified does not satisfy the preset rule. For example, the preprocessing module 51 may be used to implement S112 as shown in fig. 4.
Optionally, the preset rule includes at least one of the following: the domain name suffix of the domain name to be detected is .onion; the domain name to be detected does not contain the "." character; the length of the main domain name of the domain name to be detected is less than 6; and the main domain name of the domain name to be detected starts with xn-.
Of course, the domain name detection system 10 provided in the embodiment of the present application includes, but is not limited to, the above modules, for example, the domain name detection system 10 may further include a data storage module 54. The data storage module 54 may be used for storing program codes of the domain name detection system 10, and may also be used for storing data generated during the operation of the domain name detection system 10, such as data in a write request.
Here, the system architecture and the service scenario described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not constitute a limitation to the technical solution provided in the embodiment of the present application, and it is known by a person of ordinary skill in the art that the technical solution provided in the embodiment of the present application is also applicable to similar technical problems along with the evolution of the network architecture and the appearance of a new service scenario.
In some embodiments, the disclosed methods may be implemented as computer program instructions encoded on a computer-readable storage medium in a machine-readable format or encoded on other non-transitory media or articles of manufacture.
Fig. 6 shows a hardware structure diagram of a domain name detection system provided in an embodiment of the present application. The domain name detection system includes a processor 61, a communication line 64, and at least one transceiver (illustrated in fig. 6 as including transceiver 63 for example only).
Processor 61 may include one or more processing units. For example, the processor 61 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a video processing unit (VPU), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.
The controller may be the nerve center and command center of the domain name detection system. The controller can generate an operation control signal according to the instruction operation code and the timing signal to control instruction fetching and instruction execution.
A memory may also be provided in the processor 61 for storing instructions and data. In some embodiments, the memory in the processor 61 is a cache memory. The memory may hold instructions or data that the processor 61 has just used or uses repeatedly. If the processor 61 needs to use the instruction or data again, it can be called directly from the memory. This avoids repeated accesses and reduces the waiting time of the processor 61, thereby improving the efficiency of the system.
In some embodiments, processor 61 may include one or more interfaces. The interface may include an inter-integrated circuit (I2C) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, a universal serial bus (USB) interface, and/or a serial peripheral interface (SPI), etc.
Communication line 64 may include a path for transmitting information between the aforementioned components.
The transceiver 63 may be any device, such as a transceiver, for communicating with other devices or communication networks, such as an ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), etc.
Optionally, the domain name detection system may further include a memory 62.
The memory 62 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that may store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disk read-only memory (CD-ROM) or other optical disk storage, optical disk storage (including compact disk, laser disk, optical disk, digital versatile disk, blu-ray disk, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be separate and coupled to the processor via a communication line 64. The memory may also be integral to the processor.
The memory 62 is used for storing computer-executable instructions for executing the scheme of the application, and execution is controlled by the processor 61. The processor 61 is configured to execute the computer-executable instructions stored in the memory 62, so as to implement the domain name detection method provided in the embodiments of the present application.
Optionally, the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
In particular implementations, as an example, processor 61 may include one or more CPUs, such as CPU0 and CPU1 in fig. 6.
In particular implementations, as an example, the domain name detection system may include a plurality of processors, such as processor 61 and processor 65 in fig. 6. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
Fig. 7 schematically illustrates a conceptual partial view of a computer program product comprising a computer program for executing a computer process on a computing device provided by an embodiment of the application.
In one embodiment, the computer program product is provided using a signal bearing medium 410. The signal bearing medium 410 may include one or more program instructions that, when executed by one or more processors, may provide the functions or portions of the functions described above with respect to fig. 1. Thus, for example, referring to the embodiment shown in FIG. 1, one or more features of S11-S14 may be undertaken by one or more instructions associated with the signal bearing medium 410. Further, the program instructions in FIG. 7 also describe example instructions.
In some examples, signal bearing medium 410 may include a computer readable medium 411, such as, but not limited to, a hard disk drive, a Compact Disc (CD), a Digital Video Disc (DVD), a digital tape, a memory, a read-only memory (ROM), a Random Access Memory (RAM), or the like.
In some implementations, the signal bearing medium 410 may comprise a computer recordable medium 412 such as, but not limited to, a memory, a read/write (R/W) CD, a R/W DVD, and the like.
In some implementations, the signal bearing medium 410 may include a communication medium 413, such as, but not limited to, a digital and/or analog communication medium (e.g., a fiber optic cable, a waveguide, a wired communications link, a wireless communication link, etc.).
The signal bearing medium 410 may be conveyed by a wireless form of communication medium 413, such as a wireless communication medium compliant with the IEEE 802.11 standard or other transport protocol. The one or more program instructions may be, for example, computer-executable instructions or logic-implementing instructions.
In some examples, the preprocessing module, the feature extraction module, and the classification module, such as described with respect to fig. 5, may be configured to provide various operations, functions, or actions in response to one or more program instructions via the computer-readable medium 411, the computer-recordable medium 412, and/or the communication medium 413.
In addition, the embodiment of the application also provides a chip system, and the chip system is applied to a domain name detection system; the chip system includes one or more interface circuits, and one or more processors. The interface circuit and the processor are interconnected through a line; the interface circuit is configured to receive a signal from a memory of the domain name detection system and send the signal to the processor, the signal including computer instructions stored in the memory. When the processor executes the computer instructions, the domain name detection system performs the domain name detection method as provided by the first aspect or any one of the possible design approaches of the first aspect.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program codes, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. A domain name detection method is characterized by comprising the following steps:
acquiring a domain name to be detected;
generating an input feature vector of the domain name to be detected;
encoding and compressing the input feature vector to obtain a target feature vector;
and inputting the target feature vector into a classifier trained based on a semi-supervised support vector machine algorithm, and outputting the type of the domain name to be detected.
2. The domain name detection method according to claim 1, wherein the encoding and compressing the input feature vector to obtain the target feature vector comprises the following steps:
S1, acquiring a weight matrix, a first offset vector and a second offset vector; the weight matrix is an n × m matrix; n > m; the n represents a dimension of the input feature vector;
S2, determining an intermediate feature vector according to the input feature vector, the weight matrix and the first offset vector;
S3, determining an output feature vector according to the intermediate feature vector, the inverse matrix of the weight matrix and the second offset vector;
S4, determining the difference value between the output feature vector and the input feature vector;
S5, determining whether the difference value is smaller than a preset threshold value; if yes, go to S6; if not, executing S7;
S6, re-assigning the weight matrix, the first offset vector and the second offset vector according to a gradient descent method, and repeating S2-S5;
S7, determining the intermediate feature vector as the target feature vector and outputting the target feature vector.
3. The domain name detection method according to claim 1, wherein the generating the input feature vector of the domain name to be detected comprises:
extracting a main domain name and a domain name suffix in the domain name to be detected;
determining characteristic parameters of the main domain name; the characteristic parameters at least comprise: one or more items of domain name length, information entropy, consonant letter number ratio, two continuous consonant letter ratio, two continuous number ratio and N-element language numerical value of the main domain name;
carrying out one-hot encoding on the domain name suffix to generate a target domain name suffix;
determining the input feature vector based on the feature parameters and the target domain name suffix.
4. The domain name detection method according to claim 1, wherein obtaining the domain name to be detected comprises:
acquiring a domain name to be verified;
and under the condition that the domain name to be verified does not meet the preset rule, determining the domain name to be verified as the domain name to be detected.
5. The domain name detection method according to claim 4, wherein the preset rule comprises at least one of the following: the domain name suffix of the domain name to be detected is .onion; the domain name to be detected does not contain the "." character; the length of the main domain name of the domain name to be detected is less than 6; and the main domain name of the domain name to be detected starts with xn-.
6. A domain name detection system, comprising: a preprocessing module, a feature extraction module and a classification module; wherein
the preprocessing module is used for acquiring a domain name to be detected;
the preprocessing module is further used for generating an input feature vector of the domain name to be detected;
the feature extraction module is used for encoding and compressing the input feature vector generated by the preprocessing module to obtain a target feature vector;
and the classification module is used for inputting the target feature vector obtained by the feature extraction module into a classifier trained on the basis of a semi-supervised support vector machine algorithm and outputting the type of the domain name to be detected.
7. The domain name detection system according to claim 6, wherein the feature extraction module is specifically configured to perform the following steps:
S1, acquiring a weight matrix, a first offset vector and a second offset vector; the weight matrix is an n × m matrix; n > m; the n represents a dimension of the input feature vector;
S2, determining an intermediate feature vector according to the input feature vector, the weight matrix and the first offset vector;
S3, determining an output feature vector according to the intermediate feature vector, the inverse matrix of the weight matrix and the second offset vector;
S4, determining the difference value between the output feature vector and the input feature vector;
S5, determining whether the difference value is smaller than a preset threshold value; if yes, go to S6; if not, executing S7;
S6, re-assigning the weight matrix, the first offset vector and the second offset vector according to a gradient descent method, and repeating S2-S5;
S7, determining the intermediate feature vector as the target feature vector and outputting the target feature vector.
8. The domain name detection system according to claim 6, comprising:
the preprocessing module is specifically used for extracting a main domain name and a domain name suffix in the domain name to be detected;
the preprocessing module is further used for determining characteristic parameters of the main domain name; the characteristic parameters at least comprise: one or more items of domain name length, information entropy, consonant letter number ratio, two continuous consonant letter ratio, two continuous number ratio and N-element language numerical value of the main domain name;
the preprocessing module is further configured to perform one-hot encoding on the domain name suffix to generate a target domain name suffix;
the preprocessing module is further configured to determine the input feature vector based on the feature parameter and the target domain name suffix.
9. The domain name detection system according to claim 6, comprising:
the preprocessing module is specifically used for acquiring a domain name to be verified;
the preprocessing module is further configured to determine that the domain name to be verified is the domain name to be detected when it is determined that the domain name to be verified does not meet a preset rule.
10. The domain name detection system according to claim 9, wherein the preset rule comprises at least one of the following: the domain name suffix of the domain name to be detected is .onion; the domain name to be detected does not contain the "." character; the length of the main domain name of the domain name to be detected is less than 6; and the main domain name of the domain name to be detected starts with xn-.
11. A domain name detection system, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the domain name detection method according to any one of claims 1-5.
12. A computer-readable storage medium comprising instructions that, when executed by a processor, cause the processor to perform the domain name detection method of any one of claims 1-5.
13. A computer program product comprising a computer program, characterized in that the computer program realizes the domain name detection method according to any one of claims 1-5 when the computer program is executed by a processor.
CN202011612785.XA 2020-12-30 2020-12-30 Domain name detection method, system and storage medium Pending CN112769974A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011612785.XA CN112769974A (en) 2020-12-30 2020-12-30 Domain name detection method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011612785.XA CN112769974A (en) 2020-12-30 2020-12-30 Domain name detection method, system and storage medium

Publications (1)

Publication Number Publication Date
CN112769974A (en) 2021-05-07

Family

ID=75696119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011612785.XA Pending CN112769974A (en) 2020-12-30 2020-12-30 Domain name detection method, system and storage medium

Country Status (1)

Country Link
CN (1) CN112769974A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108270761A (en) * 2017-01-03 2018-07-10 中国移动通信有限公司研究院 Domain name legitimacy detection method and device
CN109889616A (en) * 2018-05-21 2019-06-14 新华三信息安全技术有限公司 Method and device for identifying domain names
CN110266647A (en) * 2019-05-22 2019-09-20 北京金睛云华科技有限公司 Command and control communication detection method and system
CN110545284A (en) * 2019-09-17 2019-12-06 武汉思普崚技术有限公司 Domain name detection method and system for antagonistic network
US20200112574A1 (en) * 2018-10-03 2020-04-09 At&T Intellectual Property I, L.P. Unsupervised encoder-decoder neural network security event detection
US20200112571A1 (en) * 2018-10-03 2020-04-09 At&T Intellectual Property I, L.P. Network security event detection via normalized distance based clustering
CN111131260A (en) * 2019-12-24 2020-05-08 邑客得(上海)信息技术有限公司 Mass network malicious domain name identification and classification method and system
CN111628970A (en) * 2020-04-24 2020-09-04 中国科学院计算技术研究所 DGA type botnet detection method, medium and electronic equipment
CN111818198A (en) * 2020-09-10 2020-10-23 腾讯科技(深圳)有限公司 Domain name detection method, domain name detection device, equipment and medium
CN111866196A (en) * 2019-04-26 2020-10-30 深信服科技股份有限公司 Domain name traffic characteristic extraction method, device, equipment and readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
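Yin Yunhua et al.: "RGB-D object recognition based on a hybrid convolutional auto-encoder extreme learning machine", Infrared and Laser Engineering *
Wang Hui et al.: "Research on accurate DGA identification technology based on the MLP deep learning algorithm", Journal of Information Security Research *
Hu Jun et al.: "A hierarchical phishing website detection method based on deep learning", Communications Technology *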

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449782A (en) * 2021-06-18 2021-09-28 中电积至(海南)信息技术有限公司 CDN (content delivery network) hosting node detection method based on graph semi-supervised classification
CN113449782B (en) * 2021-06-18 2022-05-24 中电积至(海南)信息技术有限公司 CDN (content delivery network) hosting node detection method based on graph semi-supervised classification
CN113572770A (en) * 2021-07-26 2021-10-29 清华大学 Method and device for detecting domain name generated by domain name generation algorithm
CN113572770B (en) * 2021-07-26 2022-09-02 清华大学 Method and device for detecting domain name generated by domain name generation algorithm
CN114039756A (en) * 2021-10-29 2022-02-11 恒安嘉新(北京)科技股份公司 Detection method, device, equipment and storage medium for illegal domain name
CN114039756B (en) * 2021-10-29 2024-04-05 恒安嘉新(北京)科技股份公司 Illegal domain name detection method, device, equipment and storage medium
CN114912443A (en) * 2022-06-22 2022-08-16 曲阜师范大学 Domain name detection, classification and feature screening method, system, device and storage medium

Similar Documents

Publication Publication Date Title
CN112769974A (en) Domain name detection method, system and storage medium
CN108073677B (en) Multi-level text multi-label classification method and system based on artificial intelligence
CN110472090B (en) Image retrieval method based on semantic tags, related device and storage medium
CN110298035B (en) Word vector definition method, device, equipment and storage medium based on artificial intelligence
CN111159485B (en) Tail entity linking method, device, server and storage medium
CN113806582B (en) Image retrieval method, image retrieval device, electronic equipment and storage medium
CN112052451A (en) Webshell detection method and device
WO2023231954A1 (en) Data denoising method and related device
CN111831826A (en) Training method, classification method and device of cross-domain text classification model
US11373043B2 (en) Technique for generating and utilizing virtual fingerprint representing text data
CN115862040A (en) Text error correction method and device, computer equipment and readable storage medium
CN113609819B (en) Punctuation mark determination model and determination method
CN114817612A (en) Method and related device for calculating multi-modal data matching degree and training calculation model
CN114581702A (en) Image classification method and device, computer equipment and computer readable storage medium
CN110674370A (en) Domain name identification method and device, storage medium and electronic equipment
CN112464689A (en) Method, device and system for generating neural network and storage medium for storing instructions
CN113128526A (en) Image recognition method and device, electronic equipment and computer-readable storage medium
CN115858002B (en) Binary code similarity detection method and system based on graph comparison learning and storage medium
US20230186668A1 (en) Polar relative distance transformer
JP2024507029A (en) Web page identification methods, devices, electronic devices, media and computer programs
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium
CN111783088B (en) Malicious code family clustering method and device and computer equipment
JP5824429B2 (en) Spam account score calculation apparatus, spam account score calculation method, and program
Vrachimis et al. Resilient edge machine learning in smart city environments
CN116821408B (en) Multi-task consistency countermeasure retrieval method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination