CN111884813A - Malicious certificate detection method - Google Patents


Info

Publication number
CN111884813A
Authority
CN
China
Prior art keywords
certificate
information
malicious
data
chain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010775718.3A
Other languages
Chinese (zh)
Other versions
CN111884813B (en)
Inventor
闫健恩
李佳欣
程亚楠
张兆心
黄俊凯
姚雨辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai
Priority to CN202010775718.3A
Publication of CN111884813A
Application granted
Publication of CN111884813B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3263 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving certificates, e.g. public key certificate [PKC] or attribute certificate [AC]; Public key infrastructure [PKI] arrangements
    • H04L9/3265 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving certificates, e.g. public key certificate [PKC] or attribute certificate [AC]; Public key infrastructure [PKI] arrangements using certificate chains, trees or paths; Hierarchical trust model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3247 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving digital signatures

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a malicious certificate detection method that addresses the low accuracy and narrow detection coverage of existing malicious certificate detection methods. The method comprises the following steps: performing basic content parsing and a conformance check on the certificate to judge whether it complies with RFC 5280; constructing a complete certificate chain from the trusted root certificates and intermediate certificates obtained from the CCADB, combined with CERT_ISSUER in the AIA extension information of the certificate, verifying the certificate signatures, and verifying the certificates along the whole certificate chain; extracting features from the previously obtained certificate content and related verification information; collecting benign certificate data and malicious certificate data and extracting the features of the certificates; and, after the data features are extracted, constructing a detection model and using it to detect malicious certificates. The invention can be widely applied to detecting malicious X.509 certificates.

Description

Malicious certificate detection method
Technical Field
The invention relates to the field of certificate encryption, in particular to a malicious certificate detection method.
Background
The X.509 certificate is the basis of the HTTPS protocol: it is authentication information, issued by a certificate authority, for the public key used to encrypt transmitted information. An X.509 certificate contains the signature algorithm used by the certificate authority when signing, the signature produced with the certificate authority's private key, and some basic information about the certificate authority, the certificate subject and the public key. During HTTPS communication, the server sends its X.509 certificate to the client in the handshake phase, and the client decides whether the certificate is trustworthy after parsing the certificate information and constructing the certificate chain. The indicators checked generally include whether the certificate has expired, whether the root certificate of the certificate is trusted, whether the certificate is in a certificate revocation list, and whether the online verification status of the certificate is valid. Because X.509 certificates guarantee the security of access and connections to some extent, more and more websites use HTTPS for information transfer. This has also created a problem: phishing websites and the control servers of some botnets have started to use certificates to encrypt their data as well, thereby evading detection by some malicious traffic analysis tools. The phishing activity trends report published by the APWG (Anti-Phishing Working Group) for the first quarter of 2020 shows that approximately 75% of phishing websites use X.509 certificates to transfer data and disguise themselves. The SSL Blacklist project collects information about certificates used in communication between botnet servers and bots that have been discovered from 2014 to date.
Certificates used for such malicious activities are called malicious certificates, and how to detect malicious X.509 certificates has become a difficult problem.
At present, methods for detecting malicious X.509 certificates mainly include the following. First, certificates are filtered against a blacklist; this has a low detection rate for newly appearing malicious certificates. Second, certificate information is analyzed statistically, and a certificate is judged malicious if certain information appears; this has low accuracy and is easily deceived. Third, classical machine learning methods are used to model benign and malicious certificates; the accuracy of these methods also leaves room for improvement. These methods differ in the techniques they use, and the range of malicious certificates they cover also differs; that is, existing work analyzes only one of phishing website certificates and botnet certificates and is not comprehensive enough.
Disclosure of Invention
To address the technical problems that existing malicious certificate detection methods have low accuracy and cover only a narrow range of malicious certificates, the invention provides a malicious certificate detection method with wider detection coverage, based on ensemble learning.
The invention provides a malicious certificate detection method, which comprises the following steps:
step one: performing basic content parsing and a conformance check on the certificate based on Cryptographic and RFC 5280, preliminarily obtaining the basic information of the certificate, judging whether it meets the specifications and constraints of RFC 5280, and recording the related information;
step two: constructing a complete certificate chain from the trusted root certificates and intermediate certificates obtained from the CCADB, combined with CERT_ISSUER in the AIA extension information of the certificate, verifying the certificate signatures and recording the corresponding information;
step three: if the certificate chain is successfully constructed, that is, no signature verification mismatch occurs and the final root certificate (a trusted self-signed certificate) is found, using OpenSSL to verify the chain of each certificate in the certificate chain until every certificate on the chain has been verified; if verification of the certificates on the chain does not succeed, proceeding directly to step four;
step four: extracting features from the previously obtained certificate content and related verification information;
step five: collecting benign certificate data and malicious certificate data and extracting the features of the certificates;
step six: after the data features are extracted, constructing a detection model and using it to detect malicious certificates.
Preferably, obtaining the basic information of the certificate in step one comprises the following steps:
step (1): importing certificates, which may be supplied in pem or cer format, and converting and storing them in a common format;
step (2): obtaining the basic information and any extension information of the X.509 certificate using Cryptographic;
step (3): checking the specification constraints laid down in RFC 5280 and recording the relevant check information.
Preferably, the certificate chain verification in step three is performed as follows:
step 1): importing the trusted root certificate store of Windows into the context to serve as the basic trusted certificate store;
step 2): loading the CRL information collected during certificate parsing into the context, so that CRL verification can be performed to check whether a certificate is in a certificate revocation list;
step 3): adding each certificate to the context in order from the root certificate to the end-entity certificate and verifying the chain of the certificate just added, while adding any newly found CRL information to the context to support verification; this is equivalent to verifying the certificate chain of every certificate in the whole chain, and the corresponding information is recorded at the same time.
Preferably, the features extracted in step four include:
A. basic information of the certificate;
B. specification verification information of the certificate;
C. authentication information of certificate chains.
Preferably, the specific method of step five is as follows:
the benign certificate data consists of the certificates corresponding to the Alexa top one million domain names; these certificates are obtained directly from scans.io, and about 700,000 certificates can be obtained. The malicious certificates are X.509 certificates used by phishing websites and botnet control servers. To obtain them, the corresponding domain names and certificate fingerprints are first obtained from the PhishTank and SSL Blacklist websites; where a domain name is obtained, the certificate is fetched over an HTTPS connection, with multiple processes used for acceleration; where a fingerprint is obtained, the certificate is downloaded directly from crt.sh using the fingerprint.
Preferably, the specific method for constructing the detection model in step six is as follows:
subsampling the benign certificate data so that the benign data and the malicious certificate data are equal in quantity; constructing models with XGBoost, LightGBM and CatBoost and selecting the optimal hyper-parameters; the data is shuffled randomly when merged, the training and validation data are split with a validation ratio of 0.15, and model generalization is checked by cross-validation.
Preferably, the construction of the model comprises the following steps:
step 1: importing the data and handling missing values, without normalizing the data;
step 2: splitting the data, with a validation set ratio of 0.15;
step 3: training the model and finding the optimal hyper-parameters.
Preferably, the detection of a malicious certificate in step six is implemented as follows:
step a): performing basic verification of the input certificate and checking it against the specification constraints of RFC 5280;
step b): constructing the certificate chain and verifying it;
step c): extracting features based on the two preceding steps to obtain the specific features of the certificate;
step d): feeding the obtained features into the trained model and obtaining the model's judgment of whether the certificate is malicious.
The beneficial effects of the invention are as follows. To address the insufficient accuracy and narrow malicious certificate coverage of existing detection methods, the invention extracts features based on RFC 5280 and on information produced during certificate chain verification, collects a wider range of malicious certificate data, and uses ensemble learning to construct a malicious certificate detection model with higher accuracy and wider coverage. It thus effectively addresses the problems currently faced in malicious certificate detection.
Drawings
FIG. 1 is a model workflow diagram of the present invention;
FIG. 2 is a flow chart of the construction of the certificate chain of the present invention;
FIG. 3 is a certificate verification data flow diagram of the present invention.
Detailed Description
The present invention is further described below with reference to the drawings and examples so that those skilled in the art can easily practice the present invention.
Example 1: FIGS. 1-3 show the workflow diagram of the model, the construction flow diagram of the certificate chain, and the certificate verification data flow diagram, respectively.
The invention first performs basic content parsing and a conformance check on the certificate based on Cryptographic and RFC 5280, preliminarily obtains the basic information of the certificate, judges whether the certificate conforms to the specifications and constraints of RFC 5280, and records the related information. The process mainly comprises the following steps:
Step (1): importing certificates, which may be supplied in pem or cer format, and converting and storing them in a common format.
Step (2): obtaining the basic information and any extension information of the X.509 certificate using Cryptographic.
Step (3): checking the specification constraints laid down in RFC 5280 and recording the relevant check information.
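Step (1) above can be illustrated with a minimal sketch that uses only the Python standard library. It assumes that "converting and storing the formats" means normalizing every input to DER bytes; the helper names `to_der` and `to_pem` are hypothetical and are not taken from the patent. A .cer file may hold either DER or PEM data, so the sketch detects the encoding from the content rather than from the file extension.

```python
import ssl

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def to_der(data: bytes) -> bytes:
    """Return DER bytes for a certificate supplied as PEM text or raw DER."""
    # Only look at leading whitespace for detection; DER bytes are left untouched.
    if data.lstrip().startswith(PEM_HEADER):
        return ssl.PEM_cert_to_DER_cert(data.decode("ascii").strip())
    return data


def to_pem(data: bytes) -> str:
    """Return PEM text for a certificate supplied as PEM text or raw DER."""
    return ssl.DER_cert_to_PEM_cert(to_der(data))
```

Storing a single canonical encoding means that later steps (parsing, fingerprinting, chain building) never need to care which format was supplied.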
Second, a complete certificate chain is constructed from the trusted root certificates and intermediate certificates obtained from the CCADB (Common CA Database), combined with CERT_ISSUER in the AIA (Authority Information Access) extension information of the certificate. The CCADB is a repository of root and intermediate certificates whose maintenance is led by Mozilla and which is supported by Microsoft and Google. There are two main ways to find the parent certificate of a given X.509 certificate: looking it up in a certificate store, or fetching it using the CERT_ISSUER information present in the certificate extensions. The invention combines the two and can therefore construct the certificate chain effectively. Constructing the chain also involves signature verification: a certificate and its parent are confirmed only if the signature verifies. Specifically, the parent certificate that was found, together with the signature value, the signature algorithm and the tbs information of the current certificate (the data that was input when signing), is used to verify whether the signature of the current certificate was produced by the private key of the parent certificate. Information generated in this process is also recorded for later feature extraction.
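The chain construction just described can be sketched as follows. This is a simplified model under stated assumptions, not the patented implementation: certificates are reduced to a small `Cert` record, the CCADB store is a dictionary keyed by subject, the AIA lookup is a caller-supplied function `fetch_issuer`, and `verify_signature` is a placeholder for real verification of the signature over the tbs bytes with the parent's public key. All of these names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    key_id: str       # stands in for the certificate's public key
    signed_with: str  # stands in for the key that produced the signature


def verify_signature(child: Cert, parent: Cert) -> bool:
    # Placeholder: a real check verifies child's signature over its tbs
    # bytes using parent's public key and the stated signature algorithm.
    return child.signed_with == parent.key_id


def build_chain(
    leaf: Cert,
    store: Dict[str, Cert],
    fetch_issuer: Callable[[Cert], Optional[Cert]],
    max_depth: int = 10,
) -> Tuple[List[Cert], bool, List[str]]:
    """Walk from the leaf to a trusted self-signed root, recording notes."""
    chain, notes, current = [leaf], [], leaf
    for _ in range(max_depth):
        if current.subject == current.issuer:  # self-signed: end of the walk
            trusted = store.get(current.subject) == current
            ok = trusted and verify_signature(current, current)
            notes.append("trusted_root" if ok else "untrusted_self_signed")
            return chain, ok, notes
        # First try the certificate store, then fall back to the AIA lookup.
        parent = store.get(current.issuer) or fetch_issuer(current)
        if parent is None:
            notes.append("issuer_not_found")
            return chain, False, notes
        if not verify_signature(current, parent):
            notes.append("signature_mismatch")
            return chain, False, notes
        chain.append(parent)
        current = parent
    notes.append("max_depth_exceeded")
    return chain, False, notes
```

The returned notes correspond to the "information generated in this process" that is kept for feature extraction, whether or not construction succeeds.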
If the certificate chain is successfully constructed, that is, no signature verification mismatch occurs during construction and the final root certificate (a trusted self-signed certificate) is found, OpenSSL is used to verify the chain of each certificate in the certificate chain until every certificate on the chain has been verified. The specific process is as follows:
step 1): importing the trusted root certificate store of Windows into the context to serve as the basic trusted certificate store;
step 2): loading the CRL information collected during certificate parsing into the context, so that CRL verification can be performed to check whether a certificate is in a certificate revocation list;
step 3): adding each certificate to the context in order from the root certificate to the end-entity certificate and verifying the chain of the certificate just added, while adding any newly found CRL information to the context to support verification; this is equivalent to verifying the certificate chain of every certificate in the whole chain. The corresponding information is also recorded during this process.
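The root-to-leaf ordering of steps 1) to 3) can be sketched in plain Python. This is only an illustration of the control flow, assuming each certificate is a dictionary with hypothetical keys `subject`, `issuer`, `serial` and an optional `crl_serials` list (serial numbers this certificate's CRL revokes). It does not call OpenSSL and omits signature, validity and policy checks.

```python
from typing import Dict, List, Set, Tuple


def verify_root_to_leaf(
    chain: List[Dict],
    trusted_roots: Set[str],
    revoked: Set[Tuple[str, int]],
) -> List[Dict]:
    """Verify a chain ordered root first, extending the context as we go."""
    context_revoked = set(revoked)       # step 2): CRL data known up front
    verified_subjects: Set[str] = set()  # certificates accepted so far
    log = []
    for cert in chain:                   # step 3): root first, leaf last
        if cert["subject"] == cert["issuer"]:
            anchored = cert["subject"] in trusted_roots  # step 1)
        else:
            anchored = cert["issuer"] in verified_subjects
        is_revoked = (cert["issuer"], cert["serial"]) in context_revoked
        ok = anchored and not is_revoked
        if ok:
            verified_subjects.add(cert["subject"])
        # Newly found CRL information is added for the certificates below.
        context_revoked.update(
            (cert["subject"], s) for s in cert.get("crl_serials", ())
        )
        log.append({"subject": cert["subject"], "anchored": anchored,
                    "revoked": is_revoked, "ok": ok})
    return log
```

Because each certificate is checked only against certificates already accepted above it, a problem in an intermediate certificate shows up at that level of the log and also fails everything beneath it.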
Whether or not verification succeeds, features must be extracted from the previously obtained certificate content and related verification information. A total of 169 features are extracted, covering information and flags generated in the processes above. The extracted features fall broadly into three categories:
A. Basic information of the certificate
B. Specification verification information of the certificate
C. Verification information of the certificate chain
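The three feature categories can be merged into one flat record per certificate, for example as sketched below. The patent does not list the 169 features, so the feature names used in the usage example are purely hypothetical, as are the helper names `merge_features` and `to_row`.

```python
from typing import Dict, List, Mapping, Optional


def merge_features(
    basic: Mapping[str, object],
    spec: Mapping[str, object],
    chain: Mapping[str, object],
) -> Dict[str, Optional[float]]:
    """Flatten categories A, B and C into one numeric feature mapping."""
    merged: Dict[str, Optional[float]] = {}
    for prefix, group in (("basic", basic), ("spec", spec), ("chain", chain)):
        for name, value in group.items():
            # Booleans become 0.0/1.0; None marks a value that is missing.
            merged[f"{prefix}.{name}"] = None if value is None else float(value)
    return merged


def to_row(features: Mapping[str, Optional[float]], columns: List[str]) -> list:
    """Order the features by a fixed column list so every row lines up."""
    return [features.get(column) for column in columns]
```

For instance, `merge_features({"validity_days": 90}, {"serial_ok": True}, {"chain_length": None})` yields a mapping with the keys `basic.validity_days`, `spec.serial_ok` and `chain.chain_length`, where the last value stays `None` because chain verification produced nothing for it.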
With certificate parsing, verification and feature extraction in place, the data is collected. The benign certificate data used by the method consists of the certificates corresponding to the Alexa top one million domain names, and the malicious certificates are X.509 certificates used by phishing websites and botnet control servers. The certificates for the Alexa websites are obtained directly from scans.io and number approximately 700,000. To obtain the phishing website and botnet control server certificates, the corresponding domain names and certificate fingerprints are first obtained from the PhishTank and SSL Blacklist websites. Where a domain name is obtained, the certificate is fetched over an HTTPS connection, with multiple processes used for acceleration; where a fingerprint is obtained, the certificate is downloaded directly from crt.sh using the fingerprint. About 2000 malicious certificates are finally obtained. Once the benign and malicious certificate data are available, features are extracted from the certificates by the process described above.
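Fetching a certificate over an HTTPS connection with several worker processes, as described above, might look like the following standard-library sketch. The function names are hypothetical, and the use of SHA-1 for the fingerprint is an assumption about the format of the collected fingerprints. Certificate validation is deliberately disabled in the TLS context because the goal is to collect the certificate even when it is invalid.

```python
import hashlib
import socket
import ssl
from concurrent.futures import ProcessPoolExecutor
from typing import Dict, List, Optional


def sha1_fingerprint(der: bytes) -> str:
    """Hex SHA-1 fingerprint of a DER-encoded certificate."""
    return hashlib.sha1(der).hexdigest()


def fetch_leaf_der(domain: str, port: int = 443,
                   timeout: float = 5.0) -> Optional[bytes]:
    """Return the server's leaf certificate as DER, or None on failure."""
    context = ssl.create_default_context()
    context.check_hostname = False     # collect the certificate as presented
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((domain, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=domain) as tls:
                return tls.getpeercert(binary_form=True)
    except (OSError, ValueError):
        return None


def fetch_many(domains: List[str], workers: int = 8) -> Dict[str, Optional[bytes]]:
    """Fetch leaf certificates for many domains using multiple processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(zip(domains, pool.map(fetch_leaf_der, domains)))
```

When run as a script, `fetch_many` should be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.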
After the data features are extracted, the detection model is constructed and implemented. The numbers of benign and malicious certificates differ greatly, so the data imbalance problem is pronounced. This is addressed by subsampling the benign certificate data so that it matches the malicious certificate data in quantity; the data used for model training thus comprises 4000 samples, 2000 benign and 2000 malicious. The merged data is shuffled randomly, the training and validation data are split with a validation ratio of 0.15, and model generalization is checked by cross-validation. As for model selection, conventional machine learning methods may suffer from low accuracy, while deep learning may overfit because the amount of data is too small. The models are therefore built with XGBoost, LightGBM and CatBoost, which are popular in both industry and academia. All three are gradient boosting tree algorithms, though they differ somewhat in training time and final performance, and they are among the better models in ensemble learning. XGBoost was proposed by Tianqi Chen et al. in a competition, LightGBM by Guolin Ke et al., and CatBoost by Liudmila Prokhorenkova et al. All are based on decision trees: multiple weak decision tree models are combined into a strong classifier. The trees are built sequentially, and each new tree is constructed to improve model accuracy given the training results of the trees built before it, until the model reaches the target accuracy or the maximum number of trees.
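The balancing, shuffling and 0.15 validation split described above can be sketched with the standard library alone. The function name `balance_and_split` and the fixed seed are illustrative assumptions; the source performs the split with `train_test_split` from sklearn.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[object, int]  # (feature row, label); 0 = benign, 1 = malicious


def balance_and_split(
    benign: Sequence,
    malicious: Sequence,
    val_ratio: float = 0.15,
    seed: int = 0,
) -> Tuple[List[Sample], List[Sample]]:
    """Subsample to equal class sizes, shuffle, and split off a validation set."""
    rng = random.Random(seed)
    n = min(len(benign), len(malicious))
    samples = [(row, 0) for row in rng.sample(list(benign), n)]
    samples += [(row, 1) for row in rng.sample(list(malicious), n)]
    rng.shuffle(samples)  # mix the two classes before splitting
    n_val = round(len(samples) * val_ratio)
    return samples[n_val:], samples[:n_val]
```

With 2000 benign and 2000 malicious rows this yields 3400 training and 600 validation samples. Subsampling the majority class discards most of the roughly 700,000 benign certificates, which is the price paid here for a balanced training set.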
The three models differ in the optimization and iteration strategies of their algorithms, so the trained models differ somewhat in performance. The method uses the Python implementations of the three models to find the one that performs best on this problem: a search range is given for the model parameters, and the optimal parameters within that range are found by grid search with GridSearchCV, a parameter search tool in sklearn. Each of the three models is used to build a model and select the optimal hyper-parameters, and the specific process comprises the following steps:
step 1: importing the data and handling missing values; for the convenience of later certificate detection, and because all three models are tree-based, the data is not normalized; since few values are missing and the amount of data is large, rows with missing values are simply discarded;
step 2: splitting the data; the malicious certificate data and an equal amount of benign certificate data are mixed and shuffled, and the training and validation sets are split with the train_test_split tool in sklearn using a ratio of 0.15;
step 3: training the model and finding the optimal hyper-parameters; the model is built with its Python implementation library, a parameter search range is given, and the optimal model parameters within that range are found by grid search with GridSearchCV.
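The idea behind the grid search with cross-validation in step 3 can be shown in a self-contained form. The sketch below is not GridSearchCV and does not train XGBoost, LightGBM or CatBoost: it swaps in a one-feature threshold classifier (`stump_fit_score`, a hypothetical stand-in) so that the exhaustive search over a parameter grid, scored by k-fold validation accuracy, can run with the standard library only.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val


def stump_fit_score(params: Dict, x_tr, y_tr, x_va, y_va) -> float:
    """Fit the direction of a threshold rule on train, score it on validation."""
    f, t = params["feature"], params["threshold"]
    hits = sum(int(x[f] > t) == y for x, y in zip(x_tr, y_tr))
    above_is_positive = hits >= len(y_tr) / 2  # learned from training data only
    correct = sum(
        (int(x[f] > t) if above_is_positive else int(x[f] <= t)) == y
        for x, y in zip(x_va, y_va)
    )
    return correct / len(y_va)


def grid_search(grid: Dict[str, Sequence], fit_score: Callable,
                X: List, y: List, k: int = 5) -> Tuple[Dict, float]:
    """Try every parameter combination; keep the best mean validation score."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [
            fit_score(params, [X[i] for i in tr], [y[i] for i in tr],
                      [X[i] for i in va], [y[i] for i in va])
            for tr, va in k_fold_indices(len(X), k)
        ]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_params, best_score = params, mean
    return best_params, best_score
```

The cost grows with the product of the grid sizes times the number of folds, which is why the search range for each parameter is given explicitly rather than left open.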
Once the model is trained, detection of malicious certificates is carried out as follows:
step a): performing basic verification of the input certificate and checking it against the specification constraints of RFC 5280;
step b): constructing the certificate chain and verifying it;
step c): extracting features based on the two preceding steps to obtain the specific features of the certificate;
step d): feeding the obtained features into the trained model and obtaining the model's judgment of whether the certificate is malicious.
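Steps a) to d) compose into a short pipeline. The sketch below only shows the order of the calls; the four callables are hypothetical placeholders for the parsing, chain verification, feature extraction and trained-model components, and none of their names come from the patent.

```python
from typing import Callable, Sequence, Tuple


def detect(
    cert_bytes: bytes,
    parse_and_check: Callable[[bytes], Tuple[dict, dict]],
    build_and_verify_chain: Callable[[bytes], dict],
    extract_features: Callable[[dict, dict, dict], Sequence],
    predict: Callable[[Sequence], int],
) -> bool:
    """Return True if the model judges the certificate to be malicious."""
    basic, spec = parse_and_check(cert_bytes)        # step a)
    chain_info = build_and_verify_chain(cert_bytes)  # step b), kept even on failure
    row = extract_features(basic, spec, chain_info)  # step c)
    return bool(predict(row))                        # step d)
```

Keeping the components as parameters makes it possible to exercise the pipeline with stubs before a trained model is available.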
The entire certificate detection process is now complete.
Example 2
During basic parsing and verification of the input X.509 certificate, the basic information stored in the certificate is extracted, such as the certificate subject, the certificate issuer, the certificate extensions and the public key usage. As a novel aspect, a number of restrictions in RFC 5280 are also checked, for example whether decipher_only and encipher_only in the key usage are set only when key_agreement is true, and whether serial_number is a positive integer represented in no more than 20 bytes. The restrictions that the RFC 5280 document places on conforming certificates are carefully collected and checked, and the relevant information is recorded. This completes the basic parsing and information checking of the certificate.
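The two example restrictions can be expressed as small checks. This is a minimal sketch with hypothetical function names and flag strings; it operates on already-parsed values rather than on a certificate object, and it covers only these two rules out of the many in RFC 5280.

```python
from typing import List


def check_serial_number(serial: int) -> List[str]:
    """Flag serial numbers that are not positive or exceed 20 octets."""
    problems = []
    if serial <= 0:
        problems.append("serial_not_positive")
    else:
        # A DER INTEGER is two's complement, so a positive value needs one
        # spare sign bit: the octet count is bit_length // 8 + 1.
        if serial.bit_length() // 8 + 1 > 20:
            problems.append("serial_longer_than_20_octets")
    return problems


def check_key_usage(key_agreement: bool, encipher_only: bool,
                    decipher_only: bool) -> List[str]:
    """Flag encipher_only/decipher_only bits set without key_agreement."""
    problems = []
    if (encipher_only or decipher_only) and not key_agreement:
        problems.append("only_bits_without_key_agreement")
    return problems
```

Each returned flag can be recorded as one binary feature, so a certificate that violates a rule is distinguishable from a compliant one at the feature extraction stage.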
Example 3
The certificate chain is the complete list of certificates from the end-entity certificate to the root certificate, in which the signature of every certificate except the root can be verified with the public key of its parent certificate. Besides basic signature verification, verification of the certificate chain includes checking whether the certificate usage is compliant, whether the certificate policies are satisfied, whether the name constraints of the certificate are satisfied, whether the policy constraints of the certificate are satisfied, and so on. These items are checked during verification of the certificate chain, which also yields related information. The process is performed for each certificate in the chain, so it can also reveal whether an intermediate certificate has problems.
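Of the checks listed above, the name constraint check is the easiest to show in isolation. The sketch below covers DNS names only and assumes the RFC 5280 rule that a DNS name satisfies a constraint when it equals the constraint or adds labels to its left. The function names are hypothetical, and constraints written with a leading dot are not handled.

```python
from typing import Sequence


def within_subtree(name: str, base: str) -> bool:
    """True if name equals base or is formed by adding labels to its left."""
    name, base = name.lower().rstrip("."), base.lower().rstrip(".")
    return name == base or name.endswith("." + base)


def dns_name_allowed(name: str, permitted: Sequence[str],
                     excluded: Sequence[str]) -> bool:
    """Apply permitted and excluded DNS subtrees; exclusions take precedence."""
    if any(within_subtree(name, base) for base in excluded):
        return False
    if permitted and not any(within_subtree(name, base) for base in permitted):
        return False
    return True
```

Matching on a leading "." plus the base, rather than on a bare suffix, keeps a name such as notexample.com from being accepted under a constraint for example.com.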
Example 4
Most of the information in a certificate can be obtained through basic parsing, specification verification and verification of the certificate chain. This information is then extracted and processed for later input into the model. After the data features of the benign and malicious certificates are extracted, different ensemble learning models are trained and tuned to find the model with the highest accuracy. Experimental results show that the accuracy of the ensemble learning model can reach about 97% and that certificates used by both phishing websites and botnet servers can be detected, which effectively addresses some of the problems in malicious certificate detection.
The above description is only for the purpose of illustrating preferred embodiments of the present invention and is not to be construed as limiting the present invention, and it is apparent to those skilled in the art that various modifications and variations can be made in the present invention. All changes, equivalents, modifications and the like which come within the scope of the invention as defined by the appended claims are intended to be embraced therein.

Claims (8)

1. A malicious certificate detection method, characterized by comprising the following steps:
step one: performing basic content parsing and a conformance check on the certificate based on Cryptographic and RFC 5280, preliminarily obtaining the basic information of the certificate, judging whether it meets the specifications and constraints of RFC 5280, and recording the related information;
step two: constructing a complete certificate chain from the trusted root certificates and intermediate certificates obtained from the CCADB, combined with CERT_ISSUER in the AIA extension information of the certificate, verifying the certificate signatures and recording the corresponding information;
step three: if the certificate chain is successfully constructed, that is, no signature verification mismatch occurs and the final root certificate (a trusted self-signed certificate) is found, using OpenSSL to verify the chain of each certificate in the certificate chain until every certificate on the chain has been verified; if verification of the certificates on the chain does not succeed, proceeding directly to step four;
step four: extracting features from the previously obtained certificate content and related verification information;
step five: collecting benign certificate data and malicious certificate data and extracting the features of the certificates;
step six: after the data features are extracted, constructing a detection model and using it to detect malicious certificates.
2. The malicious certificate detection method according to claim 1, wherein acquiring the basic information of the certificate in step one comprises the following steps:
step (1): importing the input certificates in pem and cer formats, converting the formats and storing them;
step (2): acquiring the basic information and existing extension information of the X.509 certificate using Cryptographic;
step (3): checking the specification constraints involved in RFC 5280, and recording the relevant checking information.
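As an illustration of step (1) of claim 2, the following minimal sketch normalizes an input certificate to DER or PEM using only Python's standard `ssl` module. The function names `to_der` and `to_pem` are hypothetical and not taken from the patent, and the sketch assumes that a cer file may hold either encoding. It does not parse or validate the certificate content.

```python
import ssl

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def to_der(raw: bytes) -> bytes:
    """Return DER bytes for a certificate supplied as either PEM or DER."""
    stripped = raw.strip()
    if stripped.startswith(PEM_HEADER):
        # PEM is base64-encoded DER wrapped in header and footer lines.
        return ssl.PEM_cert_to_DER_cert(stripped.decode("ascii"))
    # Assumption: anything without a PEM header is already DER.
    return raw


def to_pem(raw: bytes) -> str:
    """Return a PEM string for a certificate supplied as either PEM or DER."""
    return ssl.DER_cert_to_PEM_cert(to_der(raw))
```

Storing every certificate in one encoding keeps the later parsing and chain-building steps independent of the input file type.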
3. The malicious certificate detection method according to claim 1, wherein the certificate chain verification in step three is performed as follows:
step 1): importing the root certificate store trusted under Windows into the context to serve as the basic trusted certificate store;
step 2): loading the CRL information collected during certificate parsing into the context, for performing CRL verification on the certificates and checking whether a certificate is in a certificate revocation list;
step 3): adding each certificate to the context in order from the root certificate to the end certificate in the chain, and verifying the certificate chain of the currently added certificate; newly found CRL information is also added to the context to facilitate verification. This is equivalent to verifying the certificate chain of every certificate in the whole chain, while recording the corresponding information.
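The chain construction of step two and the root-to-end ordering of step 3) can be sketched as follows. This is a simplified illustration under stated assumptions: certificates are reduced to a hypothetical `Cert` record holding only subject and issuer names, parents are found by name matching alone, and no signature or CRL check is performed (in the claimed method that work is done with OpenSSL). The names `build_chain` and `verification_order` are illustrative only.

```python
from typing import Iterable, List, NamedTuple, Optional


class Cert(NamedTuple):
    subject: str
    issuer: str


def build_chain(leaf: Cert, intermediates: Iterable[Cert],
                trusted_roots: Iterable[Cert]) -> Optional[List[Cert]]:
    """Return [leaf, ..., root], or None if no trusted self-signed root is reached."""
    roots = list(trusted_roots)
    by_subject = {c.subject: c for c in list(intermediates) + roots}
    trusted = {c.subject for c in roots}
    chain, seen, current = [leaf], {leaf.subject}, leaf
    while current.subject != current.issuer:  # stop at a self-signed certificate
        parent = by_subject.get(current.issuer)
        if parent is None or parent.subject in seen:
            return None  # missing issuer, or a loop between certificates
        chain.append(parent)
        seen.add(parent.subject)
        current = parent
    return chain if current.subject in trusted else None


def verification_order(chain: List[Cert]) -> List[List[Cert]]:
    """List the partial chains to verify, growing from the root to the end certificate."""
    ordered = list(reversed(chain))
    return [ordered[: i + 1] for i in range(len(ordered))]
```

Each partial chain returned by `verification_order` corresponds to one verification pass after a certificate has been added to the context.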
4. The malicious certificate detection method according to claim 1, wherein the features extracted in the fourth step include:
A. basic information of the certificate;
B. specification verification information of the certificate;
C. verification information of the certificate chain.
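One possible way to combine the three feature groups A, B and C into a single numeric record is sketched below. The function name `flatten_features` and all feature names used in the usage note are hypothetical; the patent does not list the individual features in this claim.

```python
from typing import Dict, Mapping, Optional, Union

Value = Optional[Union[bool, int, float]]


def flatten_features(basic: Mapping[str, Value],
                     spec_checks: Mapping[str, Value],
                     chain_checks: Mapping[str, Value]) -> Dict[str, float]:
    """Merge the three groups into one flat mapping of float features."""
    features: Dict[str, float] = {}
    groups = (("basic", basic), ("spec", spec_checks), ("chain", chain_checks))
    for prefix, group in groups:
        for name, value in group.items():
            # Booleans become 0.0 or 1.0; unknown values become NaN.
            features[f"{prefix}_{name}"] = float("nan") if value is None else float(value)
    return features
```

For example, `flatten_features({"validity_days": 90}, {"version_ok": True}, {"chain_length": 3})` yields the keys `basic_validity_days`, `spec_version_ok` and `chain_chain_length`. The group prefix avoids name clashes between groups.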
5. The malicious certificate detection method according to claim 1, wherein the specific method of step five comprises:
the benign certificate data are the certificates corresponding to the Alexa top one million domain names, and the certificates corresponding to the Alexa websites are acquired directly from scans.io; the malicious certificates are X.509 certificates used by phishing websites and botnet control servers. To acquire the certificates of phishing websites and botnet control servers, the corresponding domain names and certificate fingerprint information are first obtained from the PhishTank and SSL Blacklist websites; if domain name information is obtained, the certificate information is acquired over an HTTPS connection, with multi-process acceleration; if fingerprint information is obtained, the certificate information is downloaded directly from crt.sh using the fingerprint.
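The two acquisition paths for malicious certificates can be sketched as a small dispatcher. This is an illustration under assumptions, not the patented implementation: a fingerprint is assumed to be 40 or 64 hexadecimal characters (optionally colon-separated), the fingerprint downloader is left to the caller because the source does not give the crt.sh request format, and multi-process acceleration is omitted. Only `ssl.get_server_certificate` is a real library call; the other names are hypothetical.

```python
import re
import ssl
from typing import Callable, Optional

# Assumption: SHA-1 (40 hex chars) or SHA-256 (64 hex chars) fingerprints.
_HEX_FINGERPRINT = re.compile(r"^(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def classify_indicator(indicator: str) -> str:
    """Return 'fingerprint' for a hex digest, otherwise 'domain'."""
    value = indicator.strip().replace(":", "")
    return "fingerprint" if _HEX_FINGERPRINT.match(value) else "domain"


def fetch_by_domain(domain: str, port: int = 443) -> str:
    """Fetch the server certificate as PEM over a TLS connection."""
    return ssl.get_server_certificate((domain, port))


def collect(indicator: str,
            by_domain: Callable[[str], str] = fetch_by_domain,
            by_fingerprint: Optional[Callable[[str], str]] = None) -> str:
    """Route an indicator to the matching acquisition path."""
    if classify_indicator(indicator) == "domain":
        return by_domain(indicator.strip())
    if by_fingerprint is None:
        raise ValueError("no fingerprint downloader configured")
    # Normalize the fingerprint to lowercase hex without separators.
    return by_fingerprint(indicator.strip().replace(":", "").lower())
```

Passing the fetchers as parameters keeps the routing logic testable without network access.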
6. The malicious certificate detection method according to claim 1, wherein the specific method for constructing the detection model in step six comprises:
processing the benign certificate data so that the quantities of benign and malicious certificate data are consistent; constructing models with XGBoost, LightGBM and CatBoost and selecting the optimal hyper-parameters; the data are randomly shuffled when merged, the training and validation data are divided at a ratio of 0.15, and model generalization is verified by cross-validation.
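The data preparation described in claim 6 can be sketched with the standard library alone. The sketch assumes that making the quantities consistent means randomly downsampling the larger class, which the claim does not state explicitly; the function names, the seed and the fold count `k=5` are likewise assumptions.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def balance_shuffle_split(benign: Sequence, malicious: Sequence,
                          val_ratio: float = 0.15, seed: int = 0):
    """Balance the classes, shuffle the merged data, and hold out a validation set."""
    rng = random.Random(seed)
    n = min(len(benign), len(malicious))
    # Label 0 = benign, 1 = malicious; downsample each class to n samples.
    data = [(x, 0) for x in rng.sample(list(benign), n)]
    data += [(x, 1) for x in rng.sample(list(malicious), n)]
    rng.shuffle(data)
    n_val = int(round(len(data) * val_ratio))
    return data[n_val:], data[:n_val]  # (training set, validation set)


def k_fold_indices(n: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]
```

Because the data are shuffled before splitting, the interleaved fold assignment in `k_fold_indices` still produces random folds.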
7. The malicious certificate detection method according to claim 6, wherein the model is constructed by the following steps:
step 1: importing the data and processing missing values, without normalizing the data;
step 2: dividing data, namely dividing a verification set in a proportion of 0.15;
step 3: training the model and finding the optimal hyper-parameters.
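The hyper-parameter search of step 3 could, for instance, be an exhaustive grid search, sketched below. The patent does not say which search strategy is used, so grid search is an assumption here. The `evaluate` callback is hypothetical: it stands in for training one of the models with the given parameters and returning its validation accuracy.

```python
import itertools
from typing import Callable, Dict, Mapping, Sequence, Tuple


def grid_search(param_grid: Mapping[str, Sequence],
                evaluate: Callable[[Dict], float]) -> Tuple[Dict, float]:
    """Try every parameter combination and return the best one with its score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g. mean cross-validated accuracy
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A call such as `grid_search({"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1]}, evaluate)` evaluates all six combinations; the parameter names here are examples only.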
8. The malicious certificate detection method according to claim 1, wherein verifying a malicious certificate in step six comprises the following steps:
step a): performing basic verification and RFC 5280-based specification constraint checking on the certificate to be examined;
step b): constructing the certificate chain and verifying it;
step c): extracting features based on the above two steps to obtain the specific features of the certificate;
step d): inputting the acquired features into the trained model, and obtaining the model's judgment of whether the certificate is a malicious certificate.
CN202010775718.3A 2020-08-05 2020-08-05 Malicious certificate detection method Active CN111884813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010775718.3A CN111884813B (en) 2020-08-05 2020-08-05 Malicious certificate detection method

Publications (2)

Publication Number Publication Date
CN111884813A true CN111884813A (en) 2020-11-03
CN111884813B CN111884813B (en) 2022-03-25

Family

ID=73211704

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010775718.3A Active CN111884813B (en) 2020-08-05 2020-08-05 Malicious certificate detection method

Country Status (1)

Country Link
CN (1) CN111884813B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103117862A (en) * 2013-02-18 2013-05-22 无锡矽鼎科技有限公司 Method for using X.509 digital certificate of openssl for verifying Java certificate
CN106603519A (en) * 2016-12-07 2017-04-26 中国科学院信息工程研究所 SSL/TLS encrypted malicious service discovery method based on certificate characteristic generalization and server change behavior
US20170118196A1 (en) * 2015-10-23 2017-04-27 Oracle International Corporation Enforcing server authentication based on a hardware token
CN110049052A (en) * 2019-04-23 2019-07-23 哈尔滨工业大学(威海) The malice domain name detection method of label and attribute similarity based on dom tree
CN110113349A (en) * 2019-05-15 2019-08-09 北京工业大学 A kind of malice encryption traffic characteristics analysis method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DAVID W. CHADWICK ET AL: ""Role-based_access_control_with_X.509_attribute_certificates"", 《IEEE INTERNET COMPUTING》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347475A (en) * 2020-11-11 2021-02-09 北京航空航天大学 Malicious certificate automatic detection system and method based on deep learning technology
CN112347475B (en) * 2020-11-11 2022-05-17 北京航空航天大学 Malicious certificate automatic detection system and method based on deep learning technology

Also Published As

Publication number Publication date
CN111884813B (en) 2022-03-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant