CN110493208B - Multi-feature DNS (Domain name System) combined HTTPS (Hypertext transfer protocol secure) malicious encrypted traffic identification method - Google Patents

Info

Publication number
CN110493208B
Authority
CN
China
Prior art keywords
malicious
https
protocol
extracting
dns
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910734488.3A
Other languages
Chinese (zh)
Other versions
CN110493208A (en
Inventor
陈虎
唐开达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Juming Network Technology Co ltd
Original Assignee
Nanjing Juming Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Juming Network Technology Co ltd filed Critical Nanjing Juming Network Technology Co ltd
Priority to CN201910734488.3A priority Critical patent/CN110493208B/en
Publication of CN110493208A publication Critical patent/CN110493208A/en
Application granted granted Critical
Publication of CN110493208B publication Critical patent/CN110493208B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1466 Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a multi-feature method for identifying malicious encrypted traffic by combining DNS and HTTPS, comprising the following steps. Step one: extract all sample DNS communication in the learning network and analyze the DNS protocol. Step two: extract the handshake portions (unencrypted content) of all malicious and non-malicious HTTPS communication in the learning network and analyze them. Step three: extract session-related feature information from malicious and non-malicious HTTPS sessions. Step four: correlate the related DNS and HTTPS content. Step five: extract the features of normal encrypted communication by learning from normal encrypted traffic. Step six: classify the data using a regression method. Step seven: store the weight data produced by training on a persistent medium for subsequent use. Step eight: using the solved result, extract features from encrypted traffic in the live network and substitute them into the model to evaluate it.

Description

Multi-feature DNS (Domain name System) combined HTTPS (Hypertext transfer protocol secure) malicious encrypted traffic identification method
Technical Field
The invention relates to an identification method, in particular to a multi-feature malicious encrypted traffic identification method combining DNS and HTTPS, and belongs to the technical field of software encryption identification.
Background
With the continuing development of encryption technology and the escalation of attack and defense techniques in computer security, less and less content is transmitted over networks in plaintext and the proportion of encrypted traffic keeps rising. Statistics indicate that more than 60% of Internet traffic is currently encrypted, with HTTPS accounting for the largest share. At the same time, attackers often encrypt the control commands and data they transmit in order to evade detection by security and antivirus tools, so that malicious network traffic goes undetected and important information is missed.
As noted above, the encryption protocol attackers prefer is generally HTTPS, because HTTPS passes easily through firewalls: firewalls generally do not set policies that block network access on ports 80 or 443, so an attacker's control commands and returned data can travel across the network essentially unrestricted.
Given this situation, detecting malicious communication in encrypted traffic has become a difficult problem. Moreover, HTTPS generally uses the Diffie-Hellman algorithm for dynamic key negotiation, so attempting to break the session key is practically hopeless. A different approach is therefore needed to detect such network traffic, especially malicious traffic; the approach taken here is based mainly on machine learning.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a multi-feature method for identifying malicious encrypted traffic by combining DNS with HTTPS (Hypertext Transfer Protocol Secure). The technical solution makes full use of features of the DNS protocol to pre-process the related malicious traffic and form part of the dimensions of the feature vector, because before HTTPS communication takes place a domain name request is generally issued in order to conceal the attacker's callback address or to avoid hard-coding it in the code.
The relevant terms in this scheme are explained as follows:
DNS: the Domain Name System protocol, which uses the mapping between domain names and IP addresses to provide addressing on the Internet, so that users can easily remember websites;
HTTPS: the encrypted Hypertext Transfer Protocol, i.e. HTTP over TLS/SSL.
To achieve the above object, the technical solution of the invention is a multi-feature method for identifying malicious encrypted traffic by combining DNS with HTTPS, comprising the following steps:
step one: extracting all sample DNS communication in the learning network and analyzing the DNS protocol;
step two: extracting the handshake portions (unencrypted content) of all malicious and non-malicious HTTPS communication in the learning network and analyzing them;
step three: extracting session-related feature information from malicious and non-malicious HTTPS sessions;
step four: correlating the DNS and HTTPS content by matching the IP addresses returned by DNS queries against the destination addresses of HTTPS connections;
step five: extracting the features of normal encrypted communication by learning from normal encrypted traffic, which can come from well-known HTTPS websites such as Baidu and Sina (Xinlang), and labeling them as positive; learning from abnormal encrypted traffic generated by malware, extracting the relevant features, and labeling them as negative;
step six: classifying the data using a regression method; considering the application scenario and computation speed, Lasso regression is used;
step seven: storing the weight data produced by training on a persistent medium for subsequent use;
step eight: using the solved result, extracting features from encrypted traffic in the live network and substituting them into the model; if the result tends positive the traffic is considered normal encrypted traffic, otherwise it is considered malicious encrypted traffic, and the absolute value of the result can be given as a confidence level that serves as a reference or measure of accuracy.
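The correlation in step four can be sketched as a join on IP address. The following is a minimal illustration only, assuming DNS answers have already been parsed into (domain, addresses) pairs and HTTPS flows into dictionaries with a hypothetical `dst_ip` key; the function name and record layout are assumptions, not part of the patent.

```python
from collections import defaultdict

def correlate(dns_answers, https_flows):
    """Attach to each HTTPS flow the domain names whose DNS answers
    contained the flow's destination IP address."""
    ip_to_domains = defaultdict(set)
    for domain, ips in dns_answers:
        for ip in ips:
            ip_to_domains[ip].add(domain)
    # Flows whose destination never appeared in a DNS answer get an empty list.
    return [
        {**flow, "domains": sorted(ip_to_domains.get(flow["dst_ip"], ()))}
        for flow in https_flows
    ]

# Illustrative records using documentation-only addresses and names.
dns_answers = [
    ("www.example.com", ["192.0.2.10", "192.0.2.11"]),
    ("cdn.example.net", ["192.0.2.11"]),
]
https_flows = [
    {"dst_ip": "192.0.2.11", "dst_port": 443},
    {"dst_ip": "198.51.100.7", "dst_port": 443},
]
linked = correlate(dns_answers, https_flows)
```

A flow with no matching DNS answer (the second one here) may itself be informative, since it suggests a destination address that was not obtained through an observed DNS lookup.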
As an improvement of the invention, in step one all sample DNS communication in the learning network is extracted and the DNS protocol is analyzed, specifically as follows:
checking whether the domain name (FQDN) requested via DNS and the actual domain name returned each appear among the ten million most common domain names, i.e. whether each ranks within the top ten million; each check takes the value 1 or 0 and forms a dimension of the features;
obtaining the TTL (Time to Live, i.e. the lifetime of the domain name record) value from the domain name query response, generally expressed in seconds (e.g. 100 seconds, 200 seconds, 300 seconds), to form a dimension of the features; in DNS requests initiated by the collected malicious traffic, the TTL values are generally uncommon ones;
obtaining the average number of addresses resolved in a single domain name query response, to form a dimension of the features; this number generally differs from that of normal DNS requests;
extracting the flux ("flash") characteristics of the requested domain name, i.e. the relationship between the domain name and the number of addresses it resolves to within a unit time period (generally one hour), to form a dimension of the features;
obtaining the countries of the returned addresses from the domain name query response (looked up in a specific IP geolocation library), to form a dimension of the features;
performing a reverse lookup of domain names (i.e. counting the domain names that correspond to a given IP address), to form a dimension of the features; in malicious encrypted traffic this count is generally large;
checking the spelling of the queried domain name, including the letter-to-digit ratio, the 2-gram spelling ratio (the frequency of pairs of consecutive letters, derived from the spelling of ordinary domain names; the lower and rarer the ratio, the more likely there is a problem), the 3-gram (three consecutive letters) spelling transition probability, and the 3-gram consonant ratio; each of these checks forms a dimension of the features. The 2-gram and 3-gram above refer to the number of consecutive characters. Because the domain names requested by malware are commonly DGA (domain generation algorithm) domain names, checking the composition of the domain name is very important;
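The spelling checks above can be illustrated with a small sketch. It is an assumption-laden example rather than the patent's implementation: only the leftmost label is scored, the letter-to-digit ratio is expressed as the share of letters among letters and digits, and `bigram_freq` is a hypothetical reference table of 2-gram frequencies learned from ordinary domain names.

```python
VOWELS = set("aeiou")

def ngrams(text, n):
    """All runs of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def spelling_features(domain, bigram_freq):
    """Simple lexical features of a queried domain name."""
    label = domain.lower().rstrip(".").split(".")[0]  # assumption: leftmost label only
    letters = sum(c.isalpha() for c in label)
    digits = sum(c.isdigit() for c in label)
    total = letters + digits
    letter_share = letters / total if total else 0.0

    # Mean reference frequency of the label's 2-grams; unseen 2-grams count as 0.
    bigrams = ngrams(label, 2)
    bigram_score = (
        sum(bigram_freq.get(b, 0.0) for b in bigrams) / len(bigrams)
        if bigrams else 0.0
    )

    # Share of purely alphabetic 3-grams that contain no vowel at all.
    trigrams = [t for t in ngrams(label, 3) if t.isalpha()]
    consonant_only = sum(all(c not in VOWELS for c in t) for t in trigrams)
    consonant_ratio = consonant_only / len(trigrams) if trigrams else 0.0

    return {
        "letter_share": letter_share,
        "bigram_score": bigram_score,
        "trigram_consonant_ratio": consonant_ratio,
    }

# Hypothetical reference frequencies, for illustration only.
reference = {"ba": 0.5, "ai": 0.5, "id": 0.5, "du": 0.5}
common = spelling_features("baidu.com", reference)
random_like = spelling_features("xkq7zt.net", reference)
```

A pronounceable name scores high on the 2-gram measure and low on the consonant ratio, while a randomly generated label tends to show the opposite pattern.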
As an improvement of the invention, in step two the handshake portions (unencrypted content) of all malicious and non-malicious HTTPS communication in the learning network are extracted and analyzed, specifically as follows:
checking the protocol version, i.e. whether the encrypted communication uses TLS 1.0, TLS 1.1, TLS 1.2 or TLS 1.3; each version is mapped to a numerical value to form a dimension of the features; sample analysis shows that most malicious encrypted traffic uses a lower TLS version;
obtaining the certificate issuer information from the handshake and looking up the issuer's ranking among known certificate issuers, to form a dimension of the features;
obtaining the cipher suite from the handshake and mapping its components to different values, i.e. the public key algorithm, the key exchange algorithm, the symmetric encryption algorithm for data communication (such as AES, 3DES, RC2, RC4, RC5) and the digest algorithm (such as MD5, SHA-1, SHA-256) are each mapped to a value and form different dimensions of the features; sample observation shows that, for technical reasons, malicious encrypted traffic often uses weaker cipher suites: for example, the symmetric encryption algorithm is often a largely abandoned one such as RC4 or RC2, and stronger algorithms such as AES are used less often;
obtaining the address information (Server Name) of the communication server from the handshake and checking its ranking among common domain names, to form a dimension of the feature vector.
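As a rough sketch of how the handshake fields might be mapped to numbers, the example below splits an IANA-style cipher suite name into its parts and looks each part up in a small table. The numeric values in the tables are hypothetical choices for illustration; the patent does not state which values it uses.

```python
# Hypothetical numeric encodings; higher means newer or stronger.
TLS_VERSION_VALUE = {"TLS1.0": 0, "TLS1.1": 1, "TLS1.2": 2, "TLS1.3": 3}
CIPHER_STRENGTH = {"RC2": 0, "RC4": 0, "3DES": 1, "AES": 2}
DIGEST_STRENGTH = {"MD5": 0, "SHA": 1, "SHA256": 2, "SHA384": 2}

def parse_cipher_suite(name):
    """Split a name such as TLS_RSA_WITH_RC4_128_MD5 into its components."""
    body = name[4:] if name.startswith("TLS_") else name
    kx_auth, sep, rest = body.partition("_WITH_")
    if not sep:  # TLS 1.3 style names carry no key-exchange part
        kx_auth, rest = "", body
    parts = rest.split("_")
    return {
        "key_exchange": kx_auth,
        "cipher": "_".join(parts[:-1]),
        "digest": parts[-1],
    }

def handshake_features(version, suite_name):
    """Map the TLS version and cipher suite to a small numeric vector.
    Unknown values map to -1 so they remain distinguishable."""
    suite = parse_cipher_suite(suite_name)
    cipher_family = suite["cipher"].split("_")[0]
    return [
        TLS_VERSION_VALUE.get(version, -1),
        CIPHER_STRENGTH.get(cipher_family, -1),
        DIGEST_STRENGTH.get(suite["digest"], -1),
    ]

weak = handshake_features("TLS1.0", "TLS_RSA_WITH_RC4_128_MD5")
strong = handshake_features("TLS1.2", "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256")
```

The key exchange and public key components could be encoded the same way; they are left as strings here to keep the sketch short.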
As an improvement of the invention, in step three session-related feature information is extracted from malicious and non-malicious HTTPS sessions, mainly covering the following aspects:
calculating the average payload length of the session, i.e. the average data packet length in the HTTPS session; only the layer-7 (application-layer) payload length is counted, and layer 4 and below are ignored (i.e. some handshake, acknowledgement and termination packets are ignored); the result is one dimension of the feature vector;
calculating the average payload length separately for the client and the server of the session; as above, only the layer-7 payload length is counted and layer 4 and below are ignored (i.e. some handshake, acknowledgement and termination packets are ignored); the average payload lengths of the client flow and the server flow are each one dimension of the feature vector;
obtaining the average number of packets per session, i.e. the average number of data packets of the malicious encrypted traffic sessions across all learning samples, as one dimension of the feature vector;
obtaining the ratio of outgoing to incoming packets of the session, i.e. the number of outgoing packets divided by the number of incoming packets, as one dimension of the feature vector; obtaining the ratio of outgoing to incoming bytes of the session, i.e. the number of outgoing bytes divided by the number of incoming bytes, as one dimension of the feature vector;
obtaining the average number of packets separately for the client and the server of the session, i.e. the average number of data packets of the malicious encrypted traffic sessions across all learning samples, each as one dimension of the feature vector;
calculating the average entropy of all encrypted data packets, where the entropy is computed from the distribution of byte values 0-255, as one dimension of the feature vector;
calculating the average entropy of the client and the server encrypted data packets separately, where the entropy is computed from the distribution of byte values 0-255, and taking each of the two entropy values as one dimension of the feature vector.
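A few of the session features above can be sketched as follows. This is an illustrative example that assumes each packet has already been reduced to a (direction, application payload) pair, with the hypothetical direction labels "out" for client-to-server and "in" for server-to-client; packets with an empty payload stand in for the handshake, acknowledgement and termination packets that are ignored.

```python
import math
from collections import Counter

def byte_entropy(payload):
    """Shannon entropy in bits of the byte-value (0-255) distribution."""
    if not payload:
        return 0.0
    n = len(payload)
    return -sum((c / n) * math.log2(c / n) for c in Counter(payload).values())

def ratio(a, b):
    """a / b, or 0.0 when the denominator is zero."""
    return a / b if b else 0.0

def session_features(packets):
    """Compute a few per-session features from (direction, payload) pairs."""
    data = [(d, p) for d, p in packets if p]  # keep only packets with payload
    out = [p for d, p in data if d == "out"]
    inc = [p for d, p in data if d == "in"]
    return {
        "avg_payload_len": ratio(sum(len(p) for _, p in data), len(data)),
        "out_in_packet_ratio": ratio(len(out), len(inc)),
        "out_in_byte_ratio": ratio(sum(map(len, out)), sum(map(len, inc))),
        "avg_entropy": ratio(sum(byte_entropy(p) for _, p in data), len(data)),
    }

features = session_features([("out", b"ab"), ("in", b"abcd"), ("out", b"")])
```

Well-encrypted payloads tend toward the maximum of 8 bits per byte, so entropy that sits noticeably lower can hint at weak or custom encoding.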
As an improvement of the invention, in step six the data are classified using a regression method; considering the application scenario and computation speed, this patent uses Lasso regression, specifically as follows:
Here X is the set of sample vectors, one vector per sample; the dimension of each vector depends on the number of extracted features, which may be a subset or the full set of those mentioned above. Y (or f(x_k)) is a scalar that takes only two values in this patent, {1, -1}: a positive sample is labeled 1 and a negative sample is labeled -1. The following is the main part of the regression algorithm; the central task is to learn the weights that minimize the objective function:
objective function (with regularization term):
$$\min_{w} \sum_{k=1}^{N} \left( f(x_k) - w^{T} x_k \right)^{2} + \lambda \sum_{j=1}^{m} |w_j|$$
In the above formula, f(x_k) is the function value, taking the values 1 and -1; x_k is a sample; w^T is the transpose of the coefficient vector of the linear equation, which is the object to be solved; N is the number of samples and m is the dimension of the vector; λ in the latter part of the formula is the regularization coefficient. Because the formula contains the absolute value function, it is not differentiable everywhere, and auxiliary methods are needed to find the extremum;
matrix form of the objective function (the above formula converted to matrix form):
$$\min_{w} \| w^{T} X - Y \|_2^2 + \lambda \| w \|_1$$
The difficulty in solving the above formula is that the 1-norm is not differentiable at zero (because of the absolute value), so unlike ordinary regression it has no closed-form solution; instead the FISTA (Fast Iterative Shrinkage-Thresholding Algorithm) method is used, which can solve objective functions of the following form;
FISTA method:
$$\min F(x) = \min \left( f(x) + g(x) \right)$$
where g(x) is a continuous convex function that may be non-smooth, and f(x) is a smooth function whose derivative satisfies the Lipschitz continuity requirement, which is stronger than ordinary uniform continuity: there is a constant L (greater than zero) such that, for any two different points x and z in the domain D (real numbers, or points of another suitable space), the following holds:
$$\| \nabla f(x) - \nabla f(z) \| \le L \, \| x - z \|$$
The smallest L that satisfies the condition is called the Lipschitz constant; if L < 1, the mapping is called a contraction mapping.
Here $\nabla f(x)$ is the gradient of f(x). The following inequality can be obtained:
$$f(x) \le f(z) + \langle \nabla f(z), x - z \rangle + \frac{L}{2} \| x - z \|^2$$
In the above formula, ⟨·,·⟩ is the inner product; the right-hand side is a Taylor-like expansion of the function f(x) at the point z.
Let
$$f(w) = \| w^{T} X - Y \|_2^2$$
and let g(w) = λ‖w‖₁. Expanding f(w) at w^(t) (where t denotes the t-th iteration of w) and applying the inequality above gives:
$$F(w) \le f(w^{(t)}) + \langle \nabla f(w^{(t)}), w - w^{(t)} \rangle + \frac{L}{2} \| w - w^{(t)} \|^2 + \lambda \| w \|_1$$
However, the above formula is still not differentiable. Using the upper-bound function of FISTA, it is transformed into the following formula:
$$w^{(t+1)} = \arg\min_{w} \; \frac{L}{2} \left\| w - \left( w^{(t)} - \frac{1}{L} \nabla f(w^{(t)}) \right) \right\|^2 + \lambda \| w \|_1$$
Finally, this formula is solved with the soft-thresholding (shrinkage) operator to obtain the weights.
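The iteration described above can be sketched in pure Python. This is a minimal illustration rather than the patent's implementation: it uses the standard FISTA momentum sequence, takes 2‖X‖_F² as an upper bound on the Lipschitz constant of the gradient (an assumption that is safe but can be loose), and assumes X contains at least one non-zero entry. The function names and the toy data are hypothetical.

```python
import math

def soft_threshold(v, t):
    """Soft-thresholding (shrinkage) operator for a scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def gradient(X, y, w):
    """Gradient of f(w) = sum_k (w . x_k - y_k)^2."""
    g = [0.0] * len(w)
    for xk, yk in zip(X, y):
        residual = sum(wj * xj for wj, xj in zip(w, xk)) - yk
        for j, xj in enumerate(xk):
            g[j] += 2.0 * residual * xj
    return g

def fista_lasso(X, y, lam, iters=500):
    """Minimise sum_k (w . x_k - y_k)^2 + lam * ||w||_1 with FISTA."""
    L = 2.0 * sum(xj * xj for xk in X for xj in xk)  # bound on the Lipschitz constant
    w = [0.0] * len(X[0])
    z = list(w)  # extrapolated point used by the momentum step
    t = 1.0
    for _ in range(iters):
        g = gradient(X, y, z)
        # Gradient step followed by shrinkage with threshold lam / L.
        w_next = [soft_threshold(zj - gj / L, lam / L) for zj, gj in zip(z, g)]
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [wn + ((t - 1.0) / t_next) * (wn - wo) for wn, wo in zip(w_next, w)]
        w, t = w_next, t_next
    return w

# Toy data: the first feature separates the labels, the second carries no signal.
X = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
y = [1.0, 1.0, -1.0, -1.0]
w = fista_lasso(X, y, lam=0.8)
```

On this toy problem the objective reduces to 4(w₁ − 1)² + 0.8|w₁|, whose minimiser is w₁ = 0.9, while the uninformative second weight is driven to exactly zero, which is the sparsity Lasso is chosen for.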
As an improvement of the invention, in step seven the weight data produced by training are stored on a persistent medium for subsequent use. The specific process is: analyzing the DNS sample data; extracting the domain name ranking features; extracting the domain name TTL features; extracting the domain name address resolution features; extracting the domain name flux features; extracting the country distribution features; extracting the domain name spelling and pronunciation features; analyzing the encrypted traffic sample data; extracting the encryption version features; extracting the certificate ranking features; extracting the cipher suite features; extracting the communication server ranking features; extracting the encrypted traffic data packet features; and training on the data with Lasso regression to form the training result and store it.
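Persisting the trained weights can be as simple as writing them to a file next to the feature names they belong to. The sketch below uses JSON as an assumed storage format; the patent only requires a persistent medium, and the function and key names are hypothetical.

```python
import json

def save_weights(path, feature_names, weights):
    """Write the feature names and their learned weights to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"features": list(feature_names), "weights": list(weights)}, fh)

def load_weights(path):
    """Read the model back and check that names and weights line up."""
    with open(path, encoding="utf-8") as fh:
        model = json.load(fh)
    if len(model["features"]) != len(model["weights"]):
        raise ValueError("feature and weight counts differ")
    return model["features"], model["weights"]
```

Storing the feature names together with the weights guards against loading a model whose feature order no longer matches the extractor.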
As an improvement of the invention, in step eight the solved result is used to extract features from encrypted traffic in the live network and substitute them into the model; if the result tends positive the traffic is considered normal encrypted traffic, otherwise it is considered malicious encrypted traffic, and the absolute value of the result can be given as a confidence level that serves as a reference or measure of accuracy. The specific process is:
loading the weights formed from the training data during initialization;
entering the network card packet-receiving loop;
judging whether a received packet belongs to the DNS protocol; if so, extracting the relevant feature information and continuing to receive packets; if not, proceeding to the next step;
judging whether the received packet belongs to an encrypted traffic protocol; if so, extracting the relevant feature information and continuing to receive packets; if not, continuing to receive packets;
checking whether the malicious encrypted traffic features are matched; if so, raising an alarm and continuing processing; otherwise, continuing processing.
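The decision rule in step eight amounts to evaluating the linear model and reading off the sign and magnitude. A minimal sketch, assuming the weights and the feature vector are plain lists in the same feature order (the function name and labels are illustrative):

```python
def classify(weights, features):
    """Return (label, confidence) from the linear score w . x.
    A positive score is treated as normal traffic, anything else as malicious;
    the absolute value of the score is reported as the confidence."""
    score = sum(w * x for w, x in zip(weights, features))
    label = "normal" if score > 0 else "malicious"
    return label, abs(score)

normal_case = classify([0.5, -1.0], [1.0, 0.0])
malicious_case = classify([0.5, -1.0], [0.0, 1.0])
```

In a deployment, the alarm in the final step would be raised when the label is "malicious", optionally only above a chosen confidence threshold.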
Compared with the prior art, the invention has the following technical effects:
1) For the ever-growing volume of encrypted network traffic, the patent provides a means of inspecting it; for encrypted traffic, the patent provides a method of detecting unknown threats, and combining it with DNS-related detection further strengthens this capability;
2) the technical solution makes full use of features of the DNS protocol (before HTTPS communication takes place, a domain name request is generally issued in order to conceal the attacker's callback address or to avoid hard-coding it in the code) to pre-process the related malicious traffic and form part of the dimensions of the feature vector; these features are very important;
3) the technical solution makes full use of the unencrypted part of the HTTPS protocol, such as the packets of the HTTPS handshake, and checks their features; the important parts include the cipher suite type (including the digest algorithm, the key agreement algorithm, the session key algorithm and so on), the server name, the certificate owner and the issuer, which form part of the dimensions of the feature vector;
4) the technical solution fully learns the features of the encrypted communication streams of known malware and, by comparison with them, finds possible commonalities in order to determine whether a stream is malicious encrypted communication, forming part of the dimensions of the feature vector;
5) the technical solution uses a machine learning algorithm to model the features of malicious encrypted traffic in order to distinguish normal communication traffic from malicious encrypted traffic.
Drawings
FIG. 1 is a schematic flow chart of the processing in step seven;
FIG. 2 is a schematic flow chart of the verification processing in step eight.
Detailed Description
To aid understanding of the invention, the present embodiment is described in detail below with reference to the accompanying drawings.
Example 1: referring to FIG. 1, a multi-feature method for identifying malicious encrypted traffic by combining DNS with HTTPS comprises the following steps:
step one: extracting all sample DNS communication in the learning network and analyzing the DNS protocol;
step two: extracting the handshake portions (unencrypted content) of all malicious and non-malicious HTTPS communication in the learning network and analyzing them;
step three: extracting session-related feature information from malicious and non-malicious HTTPS sessions;
step four: correlating the DNS and HTTPS content by matching the IP addresses returned by DNS queries against the destination addresses of HTTPS connections;
step five: extracting the features of normal encrypted communication by learning from normal encrypted traffic, which can come from well-known HTTPS websites such as Baidu and Sina (Xinlang), and labeling them as positive; learning from abnormal encrypted traffic generated by malware, extracting the relevant features, and labeling them as negative;
step six: classifying the data using a regression method; considering the application scenario and computation speed, Lasso regression is used;
step seven: storing the weight data produced by training on a persistent medium for subsequent use;
step eight: using the solved result, extracting features from encrypted traffic in the live network and substituting them into the model; if the result tends positive the traffic is considered normal encrypted traffic, otherwise it is considered malicious encrypted traffic, and the absolute value of the result can be given as a confidence level that serves as a reference or measure of accuracy.
Step one: all sample DNS communication in the learning network is extracted and the DNS protocol is analyzed, specifically as follows:
checking whether the domain name (FQDN) requested via DNS and the actual domain name returned each appear among the ten million most common domain names, i.e. whether each ranks within the top ten million; each check takes the value 1 or 0 and forms a dimension of the features;
obtaining the TTL (Time to Live, i.e. the lifetime of the domain name record) value from the domain name query response, generally expressed in seconds (e.g. 100 seconds, 200 seconds, 300 seconds), to form a dimension of the features; in DNS requests initiated by the collected malicious traffic, the TTL values are generally uncommon ones;
obtaining the average number of addresses resolved in a single domain name query response, to form a dimension of the features; this number generally differs from that of normal DNS requests;
extracting the flux ("flash") characteristics of the requested domain name, i.e. the relationship between the domain name and the number of addresses it resolves to within a unit time period (generally one hour), to form a dimension of the features;
obtaining the countries of the returned addresses from the domain name query response (looked up in a specific IP geolocation library), to form a dimension of the features;
performing a reverse lookup of domain names (i.e. counting the domain names that correspond to a given IP address), to form a dimension of the features; in malicious encrypted traffic this count is generally large;
checking the spelling of the queried domain name, including the letter-to-digit ratio, the 2-gram spelling ratio (the frequency of pairs of consecutive letters, derived from the spelling of ordinary domain names; the lower and rarer the ratio, the more likely there is a problem), the 3-gram (three consecutive letters) spelling transition probability, and the 3-gram consonant ratio; each of these checks forms a dimension of the features. The 2-gram and 3-gram above refer to the number of consecutive characters. Because the domain names requested by malware are commonly DGA (domain generation algorithm) domain names, checking the composition of the domain name is very important;
Step two: extracting the handshake portion (unencrypted content) of all malicious/non-malicious HTTPS communications in the learning network and analyzing it, specifically as follows:
checking the protocol version, i.e. whether the encrypted communication uses TLS 1.0, TLS 1.1, TLS 1.2 or TLS 1.3; each version is mapped to a numerical value to form one dimension of the feature vector. Sample analysis shows that most malicious encrypted traffic uses a lower TLS version;
obtaining the certificate issuer information from the handshake and looking up the issuer's ranking among known certificate issuers, forming one dimension of the feature vector;
obtaining the cipher suite from the handshake and mapping its components to values: the public key algorithm, the key exchange algorithm, the symmetric encryption algorithm (such as AES, 3DES, RC2, RC4, RC5) and the digest algorithm (such as MD5, SHA-1, SHA-256) are each mapped to a value and form separate dimensions of the feature vector. According to sample observation, malicious encrypted traffic often uses weaker cipher suites for technical reasons; for example, its symmetric encryption often relies on largely abandoned algorithms such as RC4 and RC2 and less often on stronger algorithms such as AES;
and obtaining the address information of the communication server (Server Name) from the handshake and checking its ranking among common domain names, forming one dimension of the feature vector.
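A minimal sketch of how handshake fields might be mapped to numeric dimensions. The numeric codes, the dictionary keys and the ranking lists below are all hypothetical; the text only states that each item is mapped to a value. The public key and key exchange algorithms are left out for brevity.

```python
# Hypothetical numeric codes in [0, 1]; higher means stronger or more common.
TLS_VERSION_VALUE = {"TLS1.0": 0.0, "TLS1.1": 1 / 3, "TLS1.2": 2 / 3, "TLS1.3": 1.0}
SYMMETRIC_VALUE = {"RC2": 0.0, "RC4": 0.0, "RC5": 0.2, "3DES": 0.4, "AES": 1.0}
DIGEST_VALUE = {"MD5": 0.0, "SHA-1": 0.4, "SHA-256": 1.0}

def rank_feature(name, ranked_names):
    """Map a name to (0, 1] by its position in a ranked list; 0.0 if absent."""
    if name not in ranked_names:
        return 0.0
    return 1.0 - ranked_names.index(name) / len(ranked_names)

def handshake_features(hs, issuer_ranking, domain_ranking):
    """hs is a dict of fields already parsed from the unencrypted handshake."""
    return [
        TLS_VERSION_VALUE.get(hs["version"], 0.0),
        rank_feature(hs["issuer"], issuer_ranking),
        SYMMETRIC_VALUE.get(hs["symmetric"], 0.0),
        DIGEST_VALUE.get(hs["digest"], 0.0),
        rank_feature(hs["server_name"], domain_ranking),
    ]

# Placeholder ranking lists; real ones would come from external ranking data.
issuers = ["ExampleCA", "OtherCA"]
domains = ["example.com", "example.org"]

weak = {"version": "TLS1.0", "issuer": "self-signed", "symmetric": "RC4",
        "digest": "MD5", "server_name": "batty.duckdnsadsrf.org"}
weak_vector = handshake_features(weak, issuers, domains)
```

Unknown versions, algorithms or names fall back to 0.0 here, which treats anything unrecognized as abnormal; that default is a design choice of this sketch.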
Step three: extracting session-related feature information from malicious/non-malicious HTTPS protocol sessions, mainly including the following:
calculating the average payload length of the session, i.e. the average data packet length within the HTTPS protocol; only the layer-7 payload length is counted, and layer 4 and below are ignored (i.e. some handshake, acknowledgement and termination packets are ignored); this is one dimension of the feature vector;
calculating the average payload length of the client and of the server of the session separately, i.e. the average data packet length within the HTTPS protocol; as in the previous item, only the layer-7 payload length is counted, and layer 4 and below are ignored (i.e. some handshake, acknowledgement and termination packets are ignored); the average payload lengths of the client flow and of the server flow are each one dimension of the feature vector;
obtaining the average packet count of the session, i.e. the average number of data packets per malicious encrypted traffic session over all learning samples, as one dimension of the feature vector;
obtaining the ratio of outgoing to incoming packets of the session, i.e. the number of outgoing packets divided by the number of incoming packets, as one dimension of the feature vector; obtaining the ratio of outgoing to incoming bytes of the session, i.e. the number of outgoing bytes divided by the number of incoming bytes, as one dimension of the feature vector;
obtaining the average packet counts of the client and of the server of the session, i.e. the average numbers of data packets per malicious encrypted traffic session over all learning samples, each as one dimension of the feature vector;
calculating the average entropy of all encrypted data packets, where the entropy is computed from the distribution of the byte values 0-255, as one dimension of the feature vector;
and calculating the average entropy of all client and of all server encrypted data packets separately, where the entropy is computed from the distribution of the byte values 0-255, each of the two entropy values being one dimension of the feature vector.
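The session features above can be sketched as follows. This is an illustration under stated assumptions: a session is given as a list of `(direction, payload)` pairs holding only layer-7 bytes, `"out"` means client to server, and packets with an empty payload are skipped everywhere, including in the packet ratio. None of these representation choices come from the text.

```python
import math
from collections import Counter

def byte_entropy(payload):
    """Shannon entropy in bits per byte over the 0-255 byte-value distribution."""
    if not payload:
        return 0.0
    n = len(payload)
    return -sum((c / n) * math.log2(c / n) for c in Counter(payload).values())

def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0

def session_features(packets):
    """packets: list of (direction, payload_bytes); direction is 'out' or 'in'."""
    data = [(d, p) for d, p in packets if p]      # drop empty layer-7 payloads
    out = [p for d, p in data if d == "out"]
    inc = [p for d, p in data if d == "in"]
    out_bytes = sum(len(p) for p in out)
    in_bytes = sum(len(p) for p in inc)
    return {
        "avg_len": mean(len(p) for _, p in data),
        "avg_len_out": mean(len(p) for p in out),
        "avg_len_in": mean(len(p) for p in inc),
        "packet_ratio": len(out) / len(inc) if inc else 0.0,
        "byte_ratio": out_bytes / in_bytes if in_bytes else 0.0,
        "avg_entropy": mean(byte_entropy(p) for _, p in data),
        "avg_entropy_out": mean(byte_entropy(p) for p in out),
        "avg_entropy_in": mean(byte_entropy(p) for p in inc),
    }

session = [("out", b"ab"), ("in", b"abcd"), ("in", b""), ("out", b"abcdef")]
feats = session_features(session)
```

Well-encrypted payloads tend towards 8 bits per byte, so the entropy dimensions mainly separate properly encrypted data from weakly encoded or padded data.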
Step six: classifying the data with a regression method; in view of the relevant application scenarios and the computation speed, this patent uses the Lasso regression method, specifically as follows:
here X is the set of sample vectors, one vector per sample; the dimension of each vector depends on the number of extracted features, which may be a subset or all of the features described above. Y (or $f(x_k)$) is a scalar that takes only two values in this patent, {1, -1}, with 1 for positive samples and -1 for negative samples. The following concerns the main part of the regression algorithm, whose most important task is to learn the weights that minimize the objective function:
Objective function (with regularization term):
$$\min_{w}\ \sum_{k=1}^{N}\bigl(f(x_k)-w^{T}x_k\bigr)^{2}+\lambda\sum_{j=1}^{m}\lvert w_j\rvert$$
In the above formula, $f(x_k)$ is the label value, taking the values 1 and -1; $x_k$ is a sample; $w^{T}$ is the transpose of the coefficient vector of the linear equation, which is the quantity to be solved for; N is the number of samples and m is the dimension of the vectors; λ in the second part of the formula is the regularization coefficient. Because the formula contains the absolute value function, it is not differentiable everywhere, so auxiliary methods are needed to find the minimum;
Matrix form of the objective function (the above formula converted to matrix form):
$$\min_{w}\ \|w^{T}X-Y\|^{2}+\lambda\|w\|_{1}$$
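As a quick illustration, the objective function can be evaluated directly for a candidate weight vector. The sketch below assumes the samples are stored as rows of a plain Python list and that no intercept term is used; the function names are chosen here for illustration only.

```python
def predict(w, x):
    """Linear response w^T x for one sample."""
    return sum(wj * xj for wj, xj in zip(w, x))

def lasso_objective(w, X, Y, lam):
    """Squared error over all samples plus lam times the L1 norm of w."""
    squared_error = sum((predict(w, x) - y) ** 2 for x, y in zip(X, Y))
    l1_penalty = lam * sum(abs(wj) for wj in w)
    return squared_error + l1_penalty

# Two toy samples: one positive (label 1) and one negative (label -1).
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [1.0, -1.0]
value_at_fit = lasso_objective([1.0, -1.0], X, Y, lam=0.5)   # zero error, penalty only
value_at_zero = lasso_objective([0.0, 0.0], X, Y, lam=0.5)   # error only, no penalty
```

The penalty term is what pushes small weights to exactly zero, which is why the fitted model can end up using only a subset of the extracted features.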
The difficulty in minimizing this objective function is that the L1 norm is not differentiable at zero (because of the absolute value), so unlike an ordinary regression equation it has no closed-form solution; instead the FIST (Fast Iterative Shrinkage-Thresholding) method is needed, which can solve objective functions of the following form;
FIST method:
$$\min_{x} F(x)=\min_{x}\ \bigl(f(x)+g(x)\bigr)$$
where g(x) is a continuous convex function that may be non-smooth, and f(x) is a smooth function whose derivative must satisfy Lipschitz continuity, a stronger requirement than ordinary uniform continuity; that is, there exists a constant L (greater than zero) such that for any two distinct real numbers x and z in the domain D (this can be extended to other suitable spaces, not necessarily the reals):
$$\|\nabla f(x)-\nabla f(z)\|\le L\,\|x-z\|$$
The smallest L satisfying this condition is called the Lipschitz constant; if L < 1, the mapping is called a contraction mapping.
Here $\nabla f(x)$ denotes the gradient of f(x); the following inequality can be obtained:
$$f(x)\le f(z)+\langle\nabla f(z),\,x-z\rangle+\frac{L}{2}\|x-z\|^{2}$$
In the above formula, ⟨·,·⟩ denotes the inner product; the right-hand side is a Taylor-like expansion of the function f(x) at the point z.
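To make the constant L concrete, the following worked step assumes that the smooth part is the squared-error term of the objective function, with the samples as the columns of X and Y as a row vector (this choice of f and the symbols v and $\sigma_{\max}$ are assumptions of this illustration):

```latex
\begin{aligned}
f(w) &= \|w^{T}X - Y\|^{2}, \\
\nabla f(w) &= 2X\,(w^{T}X - Y)^{T}, \\
\|\nabla f(w) - \nabla f(v)\| &= \|2XX^{T}(w - v)\|
  \le 2\,\sigma_{\max}(X)^{2}\,\|w - v\| .
\end{aligned}
```

Under this assumption $L = 2\,\sigma_{\max}(X)^{2}$, where $\sigma_{\max}(X)$ is the largest singular value of X; any larger value, such as twice the sum of the squared entries of X, is also a valid (more conservative) constant.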
Let
$$f(w)=\|w^{T}X-Y\|^{2}$$
and let $g(w)=\lambda\|w\|_{1}$; expanding f(w) at $w^{(t)}$ (where t denotes that w has been iterated t times) then gives, from the above inequality:
$$F(w)\le f(w^{(t)})+\langle\nabla f(w^{(t)}),\,w-w^{(t)}\rangle+\frac{L}{2}\|w-w^{(t)}\|^{2}+\lambda\|w\|_{1}$$
However, the above formula is still not differentiable; using the upper-bound function of FIST, it is now transformed into the following formula:
$$w^{(t+1)}=\arg\min_{w}\ \frac{L}{2}\left\|w-\Bigl(w^{(t)}-\frac{1}{L}\nabla f(w^{(t)})\Bigr)\right\|^{2}+\lambda\|w\|_{1}$$
Finally, this formula is solved with the soft-thresholding (shrinkage) operator to obtain the result and thus the weights.
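The iteration can be sketched in a few lines. Note what this sketch is and is not: it implements the basic iterative shrinkage-thresholding update (gradient step followed by soft thresholding) without the momentum step used by accelerated variants, it stores samples as rows, it uses no intercept, and it takes twice the sum of squared entries of X as a conservative Lipschitz constant. All names and defaults are assumptions of this illustration.

```python
def soft_threshold(v, tau):
    """Soft-thresholding operator: shrink v towards zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def gradient(w, X, Y):
    """Gradient of sum_k (w.x_k - y_k)^2 with respect to w."""
    grad = [0.0] * len(w)
    for x, y in zip(X, Y):
        residual = sum(wj * xj for wj, xj in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * residual * xj
    return grad

def lasso_ista(X, Y, lam, iterations=500):
    """Minimize sum_k (w.x_k - y_k)^2 + lam * ||w||_1 by shrinkage-thresholding."""
    m = len(X[0])
    # Twice the sum of squared entries upper-bounds the gradient's Lipschitz constant.
    L = 2.0 * sum(xj * xj for x in X for xj in x) or 1.0
    w = [0.0] * m
    for _ in range(iterations):
        grad = gradient(w, X, Y)
        # Gradient step of size 1/L, then shrink each coordinate by lam/L.
        w = [soft_threshold(wj - gj / L, lam / L) for wj, gj in zip(w, grad)]
    return w

X = [[1.0, 0.0], [0.0, 1.0]]   # toy samples, one per row
Y = [1.0, -1.0]                # 1 = normal, -1 = malicious
weights = lasso_ista(X, Y, lam=0.5)
```

With a large enough regularization coefficient every weight is shrunk to exactly zero, which is the feature-selection behaviour that motivates the L1 penalty.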
Step seven: saving the weight data obtained from training to a persistent medium for subsequent use. Referring to fig. 1, the specific flow is as follows: analyzing the DNS sample data; extracting the domain name ranking feature; extracting the domain name TTL feature; extracting the domain name address resolution feature; extracting the domain name fast-flux ("flash") feature; extracting the country distribution feature; extracting the domain name spelling and pronunciation features; analyzing the encrypted traffic sample data; extracting the encryption version feature; extracting the certificate ranking feature; extracting the cipher suite features; extracting the communication server ranking feature; extracting the encrypted traffic data packet features; and training on the data with the Lasso regression method to form the training result, which is then saved.
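Persisting the trained weights can be as simple as the sketch below. The JSON file format, the file name and the feature names are assumptions made for illustration; the text only requires that the weights be saved to a persistent medium.

```python
import json
import os
import tempfile

def save_weights(path, feature_names, weights):
    """Write feature names and their learned weights to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"features": feature_names, "weights": weights}, fh)

def load_weights(path):
    """Read back the feature names and weights saved by save_weights."""
    with open(path, encoding="utf-8") as fh:
        model = json.load(fh)
    return model["features"], model["weights"]

# Round trip through a temporary directory with made-up values.
names = ["domain_rank", "tls_version", "avg_entropy"]
trained = [0.75, 0.5, -0.25]
model_path = os.path.join(tempfile.mkdtemp(), "lasso_weights.json")
save_weights(model_path, names, trained)
loaded_names, loaded_weights = load_weights(model_path)
```

Storing the feature names next to the weights guards against applying the weights to a feature vector assembled in a different order at detection time.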
Step eight: using the solved weights, performing feature extraction on the relevant encrypted traffic in the live network and substituting the features into the regression formula; if the result leans positive, the traffic is considered normal encrypted traffic, otherwise it is considered malicious encrypted traffic; the absolute value of the result can be given as a confidence, serving as a reference or measure of the accuracy. The flow is as follows:
loading the weights formed from the training data during initialization;
the system enters the network card packet receiving process;
judging whether the received data packet belongs to the DNS protocol; if so, extracting the relevant feature information and continuing to receive packets; if not, going to the next step;
judging whether the received data packet belongs to an encrypted traffic protocol; if so, extracting the relevant feature information and continuing to receive packets; if not, continuing to receive packets;
and checking whether the malicious encrypted traffic characteristics are met; if so, raising an alarm and continuing processing; otherwise continuing processing; see fig. 2 for details.
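The detection flow can be sketched as a small dispatch loop. Everything concrete here is an assumption for illustration: packets are plain dictionaries, protocols are recognized by port number (53 for DNS, 443 for HTTPS), DNS features are linked to a later HTTPS session through the resolved IP address, and the feature extractors are passed in as callables.

```python
def classify_packet(pkt):
    """Rough protocol test by port number (an assumption of this sketch)."""
    ports = (pkt["src_port"], pkt["dst_port"])
    if 53 in ports:
        return "dns"
    if 443 in ports:
        return "tls"
    return "other"

def score(weights, features):
    """Regression value w.x; positive leans normal, otherwise malicious."""
    return sum(w * f for w, f in zip(weights, features))

def process(packets, weights, extract_dns, extract_tls):
    """Return (server_ip, confidence) for every session judged malicious."""
    dns_by_ip = {}   # resolved IP address -> DNS feature list
    alerts = []
    for pkt in packets:
        kind = classify_packet(pkt)
        if kind == "dns":
            for ip in pkt.get("answers", []):
                dns_by_ip[ip] = extract_dns(pkt)
        elif kind == "tls":
            dns_features = dns_by_ip.get(pkt["dst_ip"])
            if dns_features is None:
                continue                      # no DNS context yet, keep receiving
            value = score(weights, dns_features + extract_tls(pkt))
            if value <= 0:                    # not positive: treat as malicious
                alerts.append((pkt["dst_ip"], abs(value)))
    return alerts

# Toy traffic and toy extractors; the weights are made up, not trained.
packets = [
    {"src_port": 53, "dst_port": 50000, "dst_ip": "10.0.0.5",
     "answers": ["192.3.205.98"]},
    {"src_port": 50001, "dst_port": 443, "dst_ip": "192.3.205.98"},
    {"src_port": 50002, "dst_port": 443, "dst_ip": "10.0.0.1"},
]
alerts = process(packets, [0.5, -1.0],
                 extract_dns=lambda p: [0.0],
                 extract_tls=lambda p: [0.4])
```

The returned absolute value plays the role of the confidence mentioned in step eight; a production system would parse real packets and accumulate per-session statistics before scoring.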
The following is an example of feature extraction for a variant of the Trojan "Kryptik" (this variant was first discovered in June 2019 and spreads by email; infected users risk having important documents encrypted for ransom. For simplicity, the method labels the value of each dimension with a floating-point number in [0, 1], where values closer to 1 are more normal and values closer to 0 are more anomalous):
1) according to the DNS request analysis, the Trojan requests the domain name batty.duckdnsadsrf.org, which is not among the ten million common domain names, so this feature is 0;
2) according to the DNS request analysis, the requested domain name is spelled abnormally: its 3-gram transition probability is too low and it contains a run of consecutive consonants ("dsrf"); both features therefore differ from normal domain name spelling, and each is labeled with the lowest transition probability;
3) according to the DNS request analysis, the domain name has only one returned address (192.3.205.98), which is normal, i.e. the fast-flux ("flash") feature shows no abnormal behavior, so it is labeled 1;
4) according to the DNS request analysis, the returned address is an overseas IP (United States), so the domestic/overseas feature is abnormal and is labeled 0;
5) the Trojan communicates with the zombie host over HTTPS using TLS 1.0; this version is abnormally low, so the feature is labeled 0;
6) in the cipher suite, the symmetric encryption algorithm is RC4, which is abnormal, so the feature is labeled 0;
7) the communication certificate is self-signed and is not among the common TLS/SSL certificates, so the feature is labeled 0;
8) the HTTPS communication server is not ranked within the normal ten million, so the feature is labeled 0;
9) the average HTTPS data packet size is 264 bytes, far below the average for a normal communication session (normally 500-;
10) the extracted features are substituted into the fitted regression formula for verification, and the traffic is determined to be malicious encrypted traffic.
It should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and all equivalent modifications and substitutions based on the above-mentioned technical solutions are within the scope of the present invention as defined in the claims.

Claims (6)

1. A multi-feature DNS combined with HTTPS malicious encrypted traffic identification method, characterized by comprising the following steps:
step one: extracting all sample DNS communications in the learning network and analyzing the DNS protocol;
step two: extracting the unencrypted content of the handshake portion of all malicious/non-malicious HTTPS communications in the learning network and analyzing the handshake portion;
step three: extracting session-related feature information from malicious/non-malicious HTTPS protocol sessions;
step four: associating the DNS-related content with the HTTPS-related content according to the IP address returned by the DNS query and the destination address of the HTTPS connection;
step five: learning from normal encrypted traffic to extract the features of normal encrypted communication, wherein the data may come from well-known HTTPS websites such as Baidu and Sina and is labeled positive; learning from abnormal encrypted traffic generated by malware, extracting the relevant features, and labeling them negative;
step six: classifying the data with a regression method, the Lasso regression method being used in view of the relevant application scenarios and the computation speed;
step seven: saving the weight data obtained from training to a persistent medium for subsequent use;
step eight: using the weight data obtained in step seven, performing feature extraction on the relevant encrypted traffic in the live network and substituting the features into the regression formula; if the result leans positive, the traffic is considered normal encrypted traffic, otherwise it is considered malicious encrypted traffic, and the absolute value of the result is given as a confidence serving as a reference or measure of the accuracy;
wherein step one, extracting all sample DNS communications in the learning network and analyzing the DNS protocol, is specifically as follows:
obtaining the domain name (FQDN) requested via DNS and the actual domain name returned, and determining whether each of the two domain names appears among the ten million common domain names, i.e. whether it ranks within the top ten million of the common domain names; the values 1 and 0 are taken for yes and no respectively, forming dimensions of the feature vector;
obtaining the TTL (Time to Live) of the domain name, i.e. its time-to-live value, from the domain name query response, the unit generally being seconds, forming a dimension of the feature vector;
obtaining the average number of resolved addresses from a single domain name query response, this number differing to some extent from that of a normal DNS request, forming a dimension of the feature vector;
obtaining the domain name fast-flux ("flash") characteristic, i.e. the relationship between a domain name and the number of addresses within a unit time period, forming a dimension of the feature vector;
obtaining, from the domain name query response, the countries of the returned addresses within a unit time, looked up with a specific IPGeo library, forming a dimension of the feature vector;
reverse-looking-up the domain names of the obtained addresses, i.e. the number of domain names corresponding to a given IP address, forming a dimension of the feature vector;
checking the spelling of the queried domain name, including the letter-to-digit ratio, the 2-gram (two consecutive letters) spelling ratio, the 3-gram (three consecutive letters) spelling ratio, the letter spelling transition probability and the 3-gram consonant ratio; each of these spelling checks forms a dimension of the feature vector.
2. The multi-feature DNS combined with HTTPS malicious encrypted traffic identification method according to claim 1, characterized in that step two, extracting the unencrypted content of the handshake portion of all malicious/non-malicious HTTPS communications in the learning network and analyzing the handshake portion, is specifically as follows:
checking the protocol version, i.e. whether the encrypted communication uses TLS 1.0, TLS 1.1, TLS 1.2 or TLS 1.3; each version is mapped to a numerical value to form a dimension of the feature vector, sample analysis showing that most malicious encrypted traffic uses a lower TLS version;
obtaining the certificate issuer information from the handshake and looking up the issuer's ranking among known certificate issuers, forming one dimension of the feature vector;
obtaining the cipher suite from the handshake and mapping its components to values, i.e. mapping the public key algorithm, the key exchange algorithm, the symmetric encryption algorithm and the digest algorithm each to a value, forming separate dimensions of the feature vector;
and obtaining the address information (Server Name) of the communication server from the handshake and checking its ranking among common domain names, forming one dimension of the feature vector.
3. The multi-feature DNS combined with HTTPS malicious encrypted traffic identification method according to claim 2, characterized in that step three, extracting session-related feature information from malicious/non-malicious HTTPS protocol sessions, comprises: calculating the average payload length of the session, i.e. the average data packet length within the HTTPS protocol, wherein only the layer-7 payload length is counted and layer 4 and below are ignored, as one dimension of the feature vector;
calculating the average payload length of the client and of the server of the session separately, i.e. the average data packet length within the HTTPS protocol, wherein, as in the previous item, only the layer-7 payload length is counted and layer 4 and below are ignored, i.e. some handshake, acknowledgement and termination packets are ignored, the average payload lengths of the client flow and of the server flow each being one dimension of the feature vector; obtaining the average packet count of the session, i.e. the average number of data packets per malicious encrypted traffic session over all learning samples, as one dimension of the feature vector; obtaining the ratio of outgoing to incoming packets of the session, i.e. the number of outgoing packets divided by the number of incoming packets, as one dimension of the feature vector; obtaining the ratio of outgoing to incoming bytes of the session, i.e. the number of outgoing bytes divided by the number of incoming bytes, as one dimension of the feature vector; obtaining the average packet counts of the client and of the server of the session, i.e. the average numbers of data packets per malicious encrypted traffic session over all learning samples, each as one dimension of the feature vector;
calculating the average entropy of all encrypted data packets, wherein the entropy is computed from the distribution of the byte values 0-255, as one dimension of the feature vector; and calculating the average entropy of all client and of all server encrypted data packets separately, wherein the entropy is computed from the distribution of the byte values 0-255, each of the two entropy values being one dimension of the feature vector.
4. The multi-feature DNS combined with HTTPS malicious encrypted traffic identification method according to claim 3, characterized in that step six, classifying the data with a regression method, namely the Lasso regression method, is specifically as follows:
X is the set of sample vectors, one vector per sample; the dimension of each vector depends on the number of extracted features, which may be a subset or all of the features mentioned above; Y, or $f(x_k)$, is a scalar that takes only two values in this scheme, {1, -1}, with 1 for positive samples and -1 for negative samples; the following concerns the main part of the regression algorithm, whose most important task is to learn the weights that minimize the objective function:
an objective function, including a regularization term:
$$\min_{w}\ \sum_{k=1}^{N}\bigl(f(x_k)-w^{T}x_k\bigr)^{2}+\lambda\sum_{j=1}^{m}\lvert w_j\rvert$$
in the above formula, $f(x_k)$ is the label value, taking the values 1 and -1; $x_k$ is a sample; $w^{T}$ is the transpose of the coefficient vector of the linear equation, which is the quantity to be solved for; N is the number of samples and m is the dimension of the vectors; λ in the second part of the formula is the regularization coefficient; because the formula contains the absolute value function, it is not differentiable everywhere, so auxiliary methods are needed to find the minimum;
the matrix form of the objective function, obtained by converting the above formula:
$$\min_{w}\ \|w^{T}X-Y\|^{2}+\lambda\|w\|_{1}$$
the difficulty in minimizing this objective function is that the L1 norm is not differentiable at zero because of the absolute value, so unlike an ordinary regression equation it has no closed-form solution; instead the Fast Iterative Shrinkage-Thresholding (FIST) method is needed, which can solve objective functions of the following form;
FIST method:
$$\min_{x} F(x)=\min_{x}\ \bigl(f(x)+g(x)\bigr)$$
wherein g(x) is a continuous convex function that may be non-smooth, and f(x) is a smooth function whose derivative must satisfy Lipschitz continuity, a stronger requirement than ordinary uniform continuity, i.e. there exists a constant L greater than zero such that for any two distinct real numbers x and z in the domain D, which can be extended to other suitable spaces and is not necessarily the reals:
$$\|\nabla f(x)-\nabla f(z)\|\le L\,\|x-z\|$$
the smallest L satisfying this condition is called the Lipschitz constant; if L < 1, the mapping is called a contraction mapping;
$\nabla f(x)$ denotes the gradient of f(x); the following inequality can be obtained:
$$f(x)\le f(z)+\langle\nabla f(z),\,x-z\rangle+\frac{L}{2}\|x-z\|^{2}$$
in the above formula, ⟨·,·⟩ denotes the inner product, and the right-hand side is a Taylor-like expansion of the function f(x) at the point z;
let
$$f(w)=\|w^{T}X-Y\|^{2}$$
and let $g(w)=\lambda\|w\|_{1}$; expanding f(w) at $w^{(t)}$, where t denotes that w has been iterated t times, gives:
$$F(w)\le f(w^{(t)})+\langle\nabla f(w^{(t)}),\,w-w^{(t)}\rangle+\frac{L}{2}\|w-w^{(t)}\|^{2}+\lambda\|w\|_{1}$$
however, the above formula is still not differentiable; using the upper-bound function of FIST, it is transformed into the following formula:
$$w^{(t+1)}=\arg\min_{w}\ \frac{L}{2}\left\|w-\Bigl(w^{(t)}-\frac{1}{L}\nabla f(w^{(t)})\Bigr)\right\|^{2}+\lambda\|w\|_{1}$$
and finally this formula is solved with the soft-thresholding (shrinkage) operator to obtain the result and thus the weights.
5. The multi-feature DNS combined with HTTPS malicious encrypted traffic identification method according to claim 3, characterized in that step seven, saving the weight data obtained from training to a persistent medium for subsequent use, has the following specific flow: analyzing the DNS sample data; extracting the domain name ranking feature; extracting the domain name TTL feature; extracting the domain name address resolution feature; extracting the domain name fast-flux ("flash") feature; extracting the country distribution feature; extracting the domain name spelling and pronunciation features; analyzing the encrypted traffic sample data; extracting the encryption version feature; extracting the certificate ranking feature; extracting the cipher suite features; extracting the communication server ranking feature; extracting the encrypted traffic data packet features; and training on the data with the Lasso regression method to form the training result, which is then saved.
6. The multi-feature DNS combined with HTTPS malicious encrypted traffic identification method according to claim 5, characterized in that step eight is as follows: using the solved weights, performing feature extraction on the relevant encrypted traffic in the live network and substituting the features into the regression formula; if the result leans positive, the traffic is considered normal encrypted traffic, otherwise it is considered malicious encrypted traffic, wherein the absolute value of the result can be given as a confidence serving as a reference or measure of the accuracy; loading the weights formed from the training data during initialization;
the system enters the network card packet receiving process; judging whether the received data packet belongs to the DNS protocol; if so, extracting the relevant feature information and continuing to receive packets; if not, going to the next step; judging whether the received data packet belongs to an encrypted traffic protocol; if so, extracting the relevant feature information and continuing to receive packets; if not, continuing to receive packets; and checking whether the malicious encrypted traffic characteristics are met; if so, raising an alarm and continuing processing; otherwise continuing processing.
CN201910734488.3A 2019-08-09 2019-08-09 Multi-feature DNS (Domain name System) combined HTTPS (Hypertext transfer protocol secure) malicious encrypted traffic identification method Active CN110493208B (en)


Publications (2)

Publication Number Publication Date
CN110493208A CN110493208A (en) 2019-11-22
CN110493208B true CN110493208B (en) 2021-12-14

Family

ID=68550459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910734488.3A Active CN110493208B (en) 2019-08-09 2019-08-09 Multi-feature DNS (Domain name System) combined HTTPS (Hypertext transfer protocol secure) malicious encrypted traffic identification method

Country Status (1)

Country Link
CN (1) CN110493208B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111526099B (en) * 2020-03-25 2022-08-16 华东师范大学 Internet of things application flow detection method based on deep learning
CN113595967A (en) * 2020-04-30 2021-11-02 深信服科技股份有限公司 Data identification method, equipment, storage medium and device
CN112073362B (en) * 2020-06-19 2022-04-26 北京邮电大学 APT (advanced persistent threat) organization flow identification method based on flow characteristics
CN111901300B (en) * 2020-06-24 2023-02-03 武汉绿色网络信息服务有限责任公司 Method and device for classifying network traffic
CN112261007B (en) * 2020-09-27 2022-07-05 北京六方云信息技术有限公司 Https malicious encryption traffic detection method and system based on machine learning and storage medium
CN112422589B (en) * 2021-01-25 2021-06-08 腾讯科技(深圳)有限公司 Domain name system request identification method, storage medium and electronic device
CN113438332B (en) * 2021-05-21 2022-08-23 中国科学院信息工程研究所 DoH service identification method and device
CN113726615B (en) * 2021-11-02 2022-02-15 北京广通优云科技股份有限公司 Encryption service stability judgment method based on network behaviors in IT intelligent operation and maintenance system
CN113938314B (en) * 2021-11-17 2023-11-28 北京天融信网络安全技术有限公司 Method and device for detecting encrypted traffic and storage medium
CN114465786B (en) * 2022-01-21 2023-10-20 积至(海南)信息技术有限公司 Monitoring method for encrypted network traffic
CN115834097B (en) * 2022-06-24 2024-03-22 电子科技大学 HTTPS malicious software flow detection system and method based on multiple views
CN115567289A (en) * 2022-09-23 2023-01-03 清华大学 Malicious domain name detection method and system based on federal graph model under encrypted DNS protocol
CN115296937B (en) * 2022-10-09 2023-04-18 中孚信息股份有限公司 Method and equipment for identifying real-time encrypted malicious traffic

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108833360B (en) * 2018-05-23 2019-11-08 四川大学 A kind of malice encryption method for recognizing flux based on machine learning
CN109450721B (en) * 2018-09-06 2023-04-18 南京聚铭网络科技有限公司 Network abnormal behavior identification method based on deep neural network
CN109450842B (en) * 2018-09-06 2023-06-13 南京聚铭网络科技有限公司 Network malicious behavior recognition method based on neural network
CN109450895B (en) * 2018-11-07 2021-07-02 北京锐安科技有限公司 Traffic identification method, traffic identification device, server and storage medium

Also Published As

Publication number Publication date
CN110493208A (en) 2019-11-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant