CN115314268A - Malicious encrypted traffic detection method and system based on traffic fingerprints and behaviors
- Publication number
- CN115314268A (application CN202210896050.7A)
- Authority
- CN
- China
- Prior art keywords
- data stream
- word
- word component
- recognition model
- component matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/33—User authentication using certificates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3263—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving certificates, e.g. public key certificate [PKC] or attribute certificate [AC]; Public key infrastructure [PKI] arrangements
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Abstract
The invention provides a malicious encrypted traffic detection method and system based on traffic fingerprints and behaviors. An encryption suite and a digital certificate are extracted from the message load part to generate a traffic fingerprint vector, and the traffic fingerprint vector is recognized together with the word components to determine whether the data stream is an attack, which improves the recognition success rate. Dimensionality-reduction sampling of the data stream yields a discretized data stream and reduces the amount of subsequent computation. Calling the syntactic model and the semantic analysis model completes sentence segmentation and redundancy filtering of the data stream, so that feature extraction is automated. The convolutional neural network and the random forest classification further highlight the required feature vectors and integrate different classification capabilities, which addresses the prior-art difficulty of detecting attacks that change constantly.
Description
Technical Field
The present application relates to the field of network security technologies, and in particular, to a malicious encrypted traffic detection method and system based on traffic fingerprints and behaviors.
Background
Networks are developing rapidly, and users attach great importance to network security. Although most traffic is encrypted, malicious code can be introduced during data encryption and transmitted in encrypted form, so the ability to identify malicious encrypted traffic is important.
Meanwhile, latent malicious code can be highly destructive, and its means and forms of attack change constantly, which makes it difficult to detect. An improved approach that strengthens machine learning capability is needed.
Therefore, a targeted method and system for malicious encrypted traffic detection based on traffic fingerprints and behaviors is urgently needed.
Disclosure of Invention
The invention aims to provide a malicious encrypted traffic detection method and system based on traffic fingerprints and behaviors, addressing two problems in the prior art: malicious encrypted traffic is not identified reliably, and attacks whose means and forms change constantly are difficult to detect.
In a first aspect, the present application provides a malicious encrypted traffic detection method based on traffic fingerprints and behaviors, where the method includes:
receiving a data stream sent by an acquisition terminal, extracting the field content of a message header from the data stream, identifying different clients, and generating an independent identifier for each client;
extracting an encryption suite and a digital certificate from a message load part, and generating a flow fingerprint vector by the identifier together with the encryption suite and the digital certificate;
decrypting the data stream according to the encryption suite, and sampling the data stream according to time domain continuity to obtain a discrete data stream after dimensionality reduction;
obtaining the discrete data stream, calling a syntactic model of a server, and carrying out sentence breaking to obtain a first word component;
inputting the first word components into a semantic analysis model of a server one by one, and receiving returned word meanings corresponding to the first word components;
filtering redundant information from word meanings according to a first rule to obtain a second word component corresponding to the filtered word components, and inputting the flow fingerprint vector and the second word component into a matrix template together to obtain a first word component matrix;
inputting the first word component matrix into an input layer of a recognition model, and calculating standard deviations of different parts of speech, wherein the standard deviations are used for determining the width of a sliding window of a subsequent convolutional layer; the identification model is a model architecture based on a random forest and a convolutional neural network;
the output of the input layer is sent into a convolutional layer of the recognition model, local word components in the text are selected by utilizing sliding windows with different sizes, a second word component matrix is obtained by splicing the local word components, and the second word component matrix is sent into a pooling layer of the recognition model;
the pooling layer selects characteristic values for distinguishing the word meanings effectively by selecting a pooling function, and a third word component matrix is obtained by splicing again;
the processed third word component matrix is passed to the random forest of the recognition model for classification: the random forest draws n rounds of samples from the third word component matrix to obtain n training sets, trains on each training set with a specified number of feature values chosen by random column sampling to obtain n decision trees, and the n decision trees produce a classification result by voting;
and judging whether the data stream sent by the acquisition terminal comprises an attack vector or not according to the classification result, if so, blocking the data stream, and otherwise, allowing the data stream.
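The fingerprint step in the list above (client identifier plus encryption suite plus digital certificate) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `HandshakeInfo` container, the choice of the source IP as the identifying header field, and the use of SHA-256 to obtain a fixed-length vector are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class HandshakeInfo:
    """Hypothetical container for the fields extracted from one message."""
    src_ip: str                      # header field used to tell clients apart
    src_port: int                    # header field, not used for the identifier
    cipher_suites: tuple[int, ...]   # encryption suite code points
    certificate_der: bytes           # raw digital certificate bytes


def client_identifier(info: HandshakeInfo) -> str:
    """Derive a per-client identifier (assumption: keyed on source IP only)."""
    return hashlib.sha256(info.src_ip.encode()).hexdigest()[:16]


def fingerprint_vector(info: HandshakeInfo, dim: int = 16) -> list[float]:
    """Combine identifier, encryption suite and certificate into `dim` floats."""
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 digest size)")
    parts = [
        client_identifier(info).encode(),
        ",".join(str(code) for code in info.cipher_suites).encode(),
        hashlib.sha256(info.certificate_der).digest(),
    ]
    digest = hashlib.sha256(b"|".join(parts)).digest()
    # Scale each digest byte into [0, 1] so the vector can sit in a matrix.
    return [byte / 255.0 for byte in digest[:dim]]


example = HandshakeInfo("10.0.0.1", 443, (0x1301, 0x1302), b"example-cert")
vector = fingerprint_vector(example)
```

A hash-based vector only supports exact matching of fingerprints; a learned or hand-designed encoding that keeps the individual fields visible may serve the later classification stages better.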
In a second aspect, the present application provides a malicious encrypted traffic detection system based on traffic fingerprints and behaviors, the system including:
the preprocessing module is used for receiving a data stream sent by an acquisition terminal, extracting the field content of the message header from the data stream, identifying different clients and generating a separate identifier for each client; extracting an encryption suite and a digital certificate from the message load part, and generating a traffic fingerprint vector from the identifier together with the encryption suite and the digital certificate;
the decryption module is used for decrypting the data stream according to the encryption suite and sampling the data stream according to time domain continuity to obtain a discrete data stream after dimension reduction;
the AI module is used for acquiring the discrete data stream, calling a syntactic model of the server, and performing sentence break to obtain a first word component; inputting the first word components into a semantic analysis model of a server one by one, and receiving a returned word meaning corresponding to the first word components; filtering redundant information from word meaning according to a first rule to obtain a corresponding second word component after filtering, and inputting the flow fingerprint vector and the second word component into a matrix template together to obtain a first word component matrix;
the recognition module comprises a recognition model, the recognition model is a model framework based on a random forest and a convolutional neural network, and is used for receiving the first word component matrix output by the AI module, inputting the first word component matrix into an input layer of the recognition model, and calculating standard deviations of different parts of speech, wherein the standard deviations are used for determining the width of a sliding window of a subsequent convolutional layer; the output of the input layer is sent into a convolutional layer of the recognition model, local word components in the text are selected by utilizing sliding windows with different sizes, a second word component matrix is obtained by splicing the local word components, and the second word component matrix is sent into a pooling layer of the recognition model; the pooling layer selects characteristic values for distinguishing the word meanings through selecting a pooling function, and a third word component matrix is obtained through splicing again;
the processed third word component matrix is passed to the random forest of the recognition model for classification: the random forest draws n rounds of samples from the third word component matrix to obtain n training sets, trains on each training set with a specified number of feature values chosen by random column sampling to obtain n decision trees, and the n decision trees produce a classification result by voting;
and the execution module is used for judging whether the data stream sent by the acquisition terminal comprises an attack vector according to the classification result, blocking the data stream if the data stream comprises the attack vector, and allowing the data stream if the data stream does not comprise the attack vector.
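The sampling performed by the decryption module above can be sketched as keeping one packet per time window, which preserves time order while reducing the volume of data. The text does not specify the sampling rule, so the fixed window length, the "first packet wins" policy, and the `(timestamp, payload)` packet representation are assumptions for illustration only.

```python
from typing import Iterable


def sample_by_time(
    packets: Iterable[tuple[float, bytes]], interval: float
) -> list[tuple[float, bytes]]:
    """Keep the first packet seen in each window of `interval` seconds.

    `packets` is assumed to be sorted by timestamp.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    sampled: list[tuple[float, bytes]] = []
    current_window = None
    for timestamp, payload in packets:
        window = int(timestamp // interval)
        if window != current_window:
            # First packet of a new window: keep it, skip the rest.
            sampled.append((timestamp, payload))
            current_window = window
    return sampled


stream = [(0.0, b"a"), (0.4, b"b"), (1.1, b"c"), (1.9, b"d"), (3.0, b"e")]
discrete = sample_by_time(stream, interval=1.0)
```

Here the five input packets fall into windows 0, 0, 1, 1 and 3, so three packets remain.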
In a third aspect, the present application provides a malicious encrypted traffic detection system based on traffic fingerprints and behaviors, the system including a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform, according to instructions in the program code, the method of any one of the four possible implementations of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium for storing program code for performing the method of any one of the four possible implementations of the first aspect.
Advantageous effects
The invention provides a malicious encrypted traffic detection method and system based on traffic fingerprints and behaviors. An encryption suite and a digital certificate are extracted from the message load part to generate a traffic fingerprint vector, and the traffic fingerprint vector is recognized together with the word components to determine whether the data stream is an attack, which improves the recognition success rate. Dimensionality-reduction sampling of the data stream yields a discretized data stream and reduces the amount of subsequent computation. By calling the syntactic model and the semantic analysis model, sentence segmentation and redundancy filtering of the data stream are completed automatically, so that feature extraction is automated. The convolutional neural network and the random forest classification further highlight the required feature vectors and integrate different classification capabilities, which addresses the prior-art difficulty of detecting attacks that change constantly.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below; those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a general flow chart of a malicious encrypted traffic detection method based on traffic fingerprints and behaviors in accordance with the present invention;
FIG. 2 is an architecture diagram of a malicious encrypted traffic detection system based on traffic fingerprints and behaviors according to the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more easily understand the advantages and features of the present invention and so that the scope of protection of the present invention is defined more clearly.
Fig. 1 is a general flowchart of a malicious encrypted traffic detection method based on traffic fingerprints and behaviors, which includes:
receiving a data stream sent by an acquisition terminal, extracting the field content of the header of a message from the data stream, identifying different clients, and generating an independent identifier for each client;
extracting an encryption suite and a digital certificate from a message load part, and generating a traffic fingerprint vector by the identifier together with the encryption suite and the digital certificate;
decrypting the data stream according to the encryption suite, and sampling the data stream according to time domain continuity to obtain a discrete data stream after dimensionality reduction;
obtaining the discrete data stream, calling a syntactic model of a server, and carrying out sentence breaking to obtain a first word component;
inputting the first word components into a semantic analysis model of a server one by one, and receiving returned word meanings corresponding to the first word components;
filtering redundant information from word meanings according to a first rule to obtain a second word component corresponding to the filtered word components, and inputting the flow fingerprint vector and the second word component into a matrix template together to obtain a first word component matrix;
inputting the first word component matrix into an input layer of a recognition model, and calculating standard deviations of different parts of speech, wherein the standard deviations are used for determining the width of a sliding window of a subsequent convolutional layer; the identification model is a model architecture based on a random forest and a convolutional neural network;
the output of the input layer is sent into a convolutional layer of the recognition model, local word components in the text are selected by utilizing sliding windows with different sizes, a second word component matrix is obtained by splicing the local word components, and the second word component matrix is sent into a pooling layer of the recognition model;
the pooling layer selects characteristic values for distinguishing the word meanings through selecting a pooling function, and a third word component matrix is obtained through splicing again;
the processed third word component matrix is passed to the random forest of the recognition model for classification: the random forest draws n rounds of samples from the third word component matrix to obtain n training sets, trains on each training set with a specified number of feature values chosen by random column sampling to obtain n decision trees, and the n decision trees produce a classification result by voting;
and judging whether the data stream sent by the acquisition terminal comprises an attack vector according to the classification result, if so, blocking the data stream, and otherwise, allowing the data stream.
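The convolution and pooling steps in the flow above can be sketched in plain Python. The rule that maps a standard deviation to a window width, the use of a window average in place of a learned convolution filter, and max pooling as the pooling function are all assumptions chosen to keep the example small; the text fixes none of them.

```python
from statistics import pstdev

Matrix = list[list[float]]


def window_widths(pos_counts: dict[str, list[int]], max_width: int = 5) -> list[int]:
    """Map each part-of-speech series to a window width (hypothetical rule):
    width = round(std) + 1, clamped to the range [2, max_width]."""
    widths = {
        min(max_width, max(2, round(pstdev(series)) + 1))
        for series in pos_counts.values()
    }
    return sorted(widths)


def conv_windows(matrix: Matrix, width: int) -> Matrix:
    """Average every run of `width` consecutive rows.
    The average stands in for a learned convolution filter."""
    cols = len(matrix[0])
    return [
        [sum(row[c] for row in matrix[i:i + width]) / width for c in range(cols)]
        for i in range(len(matrix) - width + 1)
    ]


def max_pool(features: Matrix) -> list[float]:
    """Column-wise max pooling over all window positions."""
    return [max(column) for column in zip(*features)]


def extract(matrix: Matrix, widths: list[int]) -> list[float]:
    """Splice the pooled features of every window width that fits."""
    spliced: list[float] = []
    for width in widths:
        if width <= len(matrix):
            spliced.extend(max_pool(conv_windows(matrix, width)))
    return spliced


word_matrix = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
features = extract(word_matrix, [2, 3])
```

Using several window widths lets the pooled vector capture both short and long runs of neighboring word components, which is the role the differently sized sliding windows play in the description.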
In some preferred embodiments, when the recognition model is trained, the entropy loss function is minimized by backpropagation to avoid overfitting; training is complete once the accuracy of the recognition model meets a threshold requirement. The trained model can then be used for data verification.
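A training loop of this kind (minimize an entropy loss by gradient updates, stop once accuracy meets a threshold) can be sketched as follows. To stay self-contained, the sketch uses a single logistic unit trained with binary cross-entropy rather than a convolutional network, so the gradient step is the one-layer special case of backpropagation; the learning rate, the threshold, and the function names are assumptions.

```python
import math


def forward(w: list[float], b: float, x: list[float]) -> float:
    """Sigmoid output of a single logistic unit."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(p: float, y: int) -> float:
    """Binary cross-entropy for one sample, with a small epsilon for stability."""
    eps = 1e-12
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def predict(w: list[float], b: float, x: list[float]) -> int:
    return 1 if forward(w, b, x) >= 0.5 else 0


def train(xs, ys, lr=0.5, acc_threshold=0.95, max_epochs=1000):
    """Gradient descent on mean cross-entropy; stops when accuracy >= threshold."""
    n = len(xs)
    w, b, losses = [0.0] * len(xs[0]), 0.0, []
    for _ in range(max_epochs):
        probs = [forward(w, b, x) for x in xs]
        losses.append(sum(cross_entropy(p, y) for p, y in zip(probs, ys)) / n)
        # d(loss)/dz for sigmoid + cross-entropy is simply (p - y).
        errors = [p - y for p, y in zip(probs, ys)]
        w = [wj - lr * sum(e * x[j] for e, x in zip(errors, xs)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
        accuracy = sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / n
        if accuracy >= acc_threshold:
            break
    return w, b, losses


samples = [[-1.0], [-0.5], [0.5], [1.0]]
labels = [0, 0, 1, 1]
weights, bias, history = train(samples, labels)
```

In practice the accuracy check would be made on held-out data rather than on the training samples, since accuracy on the training set says little about overfitting.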
In some preferred embodiments, the classification capability of each decision tree is targeted: the specified number of feature values is obtained according to the different classifications, and the decision trees classify the same feature vector matrix from different angles, thereby integrating different classification capabilities. The resulting classification performance is higher than that of a single classifier.
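The way each decision tree comes to see the same matrix "from a different angle" can be sketched with the two sampling steps of a random forest: rows drawn with replacement (one training set per round) and a random subset of feature columns per tree. The function name, the seed parameter, and the return layout are hypothetical; fitting the trees themselves is left out.

```python
import random


def draw_training_sets(rows, labels, n_sets, n_features, seed=0):
    """Return `n_sets` tuples of (column_indices, sampled_rows, sampled_labels).

    Rows are drawn with replacement (bootstrap); columns are drawn
    without replacement, `n_features` per training set.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    if not 1 <= n_features <= n_cols:
        raise ValueError("n_features must be between 1 and the number of columns")
    rng = random.Random(seed)
    training_sets = []
    for _ in range(n_sets):
        cols = sorted(rng.sample(range(n_cols), n_features))
        picks = [rng.randrange(n_rows) for _ in range(n_rows)]
        xs = [[rows[i][c] for c in cols] for i in picks]
        ys = [labels[i] for i in picks]
        training_sets.append((cols, xs, ys))
    return training_sets


feature_rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.5, 2.5, 3.5]]
row_labels = [0, 1, 1, 0]
sets = draw_training_sets(feature_rows, row_labels, n_sets=5, n_features=2, seed=42)
```

Each of the five training sets could then be handed to any decision tree learner; because the column subsets differ, the resulting trees specialize on different features.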
The average generalization error of a decision tree in a random forest is related to the regression function.
In some preferred embodiments, the voting includes performing weighted accumulation on the output result of each decision tree.
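Weighted accumulation of the tree outputs can be sketched as summing a weight per predicted label and returning the label with the largest total. Where the weights come from is not stated in the text; the example assumes one weight per tree, for instance its validation accuracy.

```python
from collections import defaultdict


def weighted_vote(predictions: list[str], weights: list[float]) -> str:
    """Accumulate each tree's weight under its predicted label; highest total wins."""
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need one weight per prediction")
    totals: dict[str, float] = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


tree_outputs = ["attack", "benign", "benign"]
result = weighted_vote(tree_outputs, [0.9, 0.3, 0.4])
```

With these weights the single confident "attack" vote (0.9) outweighs the two "benign" votes (0.7 in total), whereas a plain majority vote would return "benign".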
Fig. 2 is an architecture diagram of a malicious encrypted traffic detection system based on traffic fingerprints and behaviors, the system includes:
the preprocessing module is used for receiving a data stream sent by an acquisition terminal, extracting the field content of the message header from the data stream, identifying different clients and generating a separate identifier for each client; extracting an encryption suite and a digital certificate from the message load part, and generating a traffic fingerprint vector from the identifier together with the encryption suite and the digital certificate;
the decryption module is used for decrypting the data stream according to the encryption suite and sampling the data stream according to time domain continuity to obtain a discrete data stream after dimension reduction;
the AI module is used for acquiring the discrete data stream, calling a syntactic model of the server, and performing sentence breaking to obtain a first word component; inputting the first word components into a semantic analysis model of a server one by one, and receiving returned word meanings corresponding to the first word components; filtering redundant information from word meaning according to a first rule to obtain a corresponding second word component after filtering, and inputting the flow fingerprint vector and the second word component into a matrix template together to obtain a first word component matrix;
the recognition module comprises a recognition model, the recognition model is a model framework based on a random forest and a convolutional neural network, and is used for receiving the first word component matrix output by the AI module, inputting the first word component matrix into an input layer of the recognition model, and calculating standard deviations of different parts of speech, wherein the standard deviations are used for determining the width of a sliding window of a subsequent convolutional layer; the output of the input layer is sent into a convolutional layer of the recognition model, local word components in the text are selected by utilizing sliding windows with different sizes, a second word component matrix is obtained by splicing the local word components, and the second word component matrix is sent into a pooling layer of the recognition model; the pooling layer selects characteristic values for distinguishing the word meanings effectively by selecting a pooling function, and a third word component matrix is obtained by splicing again;
the processed third word component matrix is passed to the random forest of the recognition model for classification: the random forest draws n rounds of samples from the third word component matrix to obtain n training sets, trains on each training set with a specified number of feature values chosen by random column sampling to obtain n decision trees, and the n decision trees produce a classification result by voting;
and the execution module is used for judging whether the data stream sent by the acquisition terminal comprises an attack vector according to the classification result, blocking the data stream if the data stream comprises the attack vector, and allowing the data stream if the data stream does not comprise the attack vector.
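The way the five modules hand data to one another can be sketched as a small pipeline. The class name, the callable signatures, and the stub implementations in the demo are all hypothetical; only the order of the stages and the final block or allow decision come from the description.

```python
from dataclasses import dataclass
from typing import Callable

Matrix = list[list[float]]


@dataclass
class DetectionSystem:
    """Wires the modules together; each field is a pluggable stage."""
    preprocess: Callable[[bytes], list[float]]            # -> traffic fingerprint vector
    decrypt_and_sample: Callable[[bytes], bytes]          # -> discrete data stream
    build_matrix: Callable[[list[float], bytes], Matrix]  # AI module -> word component matrix
    classify: Callable[[Matrix], str]                     # recognition module -> label

    def handle(self, stream: bytes) -> str:
        """Execution module: block the stream if it is classified as an attack."""
        fingerprint = self.preprocess(stream)
        discrete = self.decrypt_and_sample(stream)
        matrix = self.build_matrix(fingerprint, discrete)
        label = self.classify(matrix)
        return "block" if label == "attack" else "allow"


# Stub stages for demonstration only; real stages would replace each lambda.
demo = DetectionSystem(
    preprocess=lambda s: [float(len(s))],
    decrypt_and_sample=lambda s: s[::2],
    build_matrix=lambda fp, d: [fp, [float(byte) for byte in d]],
    classify=lambda m: "attack" if 255.0 in m[1] else "benign",
)
decision = demo.handle(b"\xff\x00")
```

Keeping each stage behind a plain callable makes it straightforward to swap a stub for a real model or to test one module in isolation.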
The application provides a malicious encrypted traffic detection system based on traffic fingerprints and behaviors, the system including a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method according to any of the embodiments of the first aspect according to instructions in the program code.
The present application provides a computer readable storage medium for storing program code for performing the method of any of the embodiments of the first aspect.
In specific implementation, the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program may include some or all of the steps in the embodiments of the present invention when executed. The storage medium can be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like.
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
For parts that are the same or similar across the embodiments of this specification, the embodiments may be referred to one another. In particular, the system embodiments are described briefly because they are substantially similar to the method embodiments; for the relevant points, refer to the description of the method embodiments.
The above-described embodiments of the present invention should not be construed as limiting the scope of the present invention.
Claims (7)
1. A malicious encrypted traffic detection method based on traffic fingerprints and behaviors, the method comprising:
receiving a data stream sent by an acquisition terminal, extracting the field content of the header of a message from the data stream, identifying different clients, and generating an independent identifier for each client;
extracting an encryption suite and a digital certificate from a message load part, and generating a flow fingerprint vector by the identifier together with the encryption suite and the digital certificate;
decrypting the data stream according to the encryption suite, and sampling the data stream according to time domain continuity to obtain a discrete data stream after dimensionality reduction;
obtaining the discrete data stream, calling a syntactic model of a server, and carrying out sentence breaking to obtain a first word component;
inputting the first word components into a semantic analysis model of a server one by one, and receiving returned word meanings corresponding to the first word components;
filtering redundant information from word meaning according to a first rule to obtain a corresponding second word component after filtering, and inputting the flow fingerprint vector and the second word component into a matrix template together to obtain a first word component matrix;
inputting the first word component matrix into an input layer of a recognition model, and calculating standard deviations of different parts of speech, wherein the standard deviations are used for determining the width of a sliding window of a subsequent convolutional layer; the identification model is a model architecture based on a random forest and a convolutional neural network;
the output of the input layer is sent into a convolutional layer of the recognition model, local word components in the text are selected by utilizing sliding windows with different sizes, a second word component matrix is obtained by splicing the local word components, and the second word component matrix is sent into a pooling layer of the recognition model;
the pooling layer selects characteristic values for distinguishing the word meanings through selecting a pooling function, and a third word component matrix is obtained through splicing again;
the processed third word component matrix is passed to the random forest of the recognition model for classification: the random forest draws n rounds of samples from the third word component matrix to obtain n training sets, trains on each training set with a specified number of feature values chosen by random column sampling to obtain n decision trees, and the n decision trees produce a classification result by voting;
and judging whether the data stream sent by the acquisition terminal comprises an attack vector according to the classification result, if so, blocking the data stream, and otherwise, allowing the data stream.
2. The method of claim 1, wherein: when the recognition model is trained, the entropy loss function is minimized by backpropagation to avoid overfitting, and training of the recognition model is complete when the accuracy of the recognition model meets a threshold requirement.
3. The method of claim 1, wherein: each decision tree has a targeted classification capability, the specified number of characteristic values is obtained according to the different classifications, and the decision trees classify the same characteristic vector matrix from different angles, thereby integrating the different classification capabilities.
4. The method of claim 2 or 3, wherein: the voting comprises performing a weighted accumulation of the output results of the decision trees.
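Weighted accumulation of the tree outputs can be sketched as follows. The patent does not say how the weights are chosen, so the per-tree weights here are hypothetical inputs (they might, for example, reflect each tree's validation performance), and `weighted_vote` is an illustrative name.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Sum the weight behind each predicted label and return the heaviest label."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per decision tree")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

With unequal weights a single strongly weighted tree can outvote a numerical majority, which is the difference from plain majority voting.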
5. A malicious encrypted traffic detection system based on traffic fingerprints and behaviors, the system comprising:
the preprocessing module is used for receiving a data stream sent by an acquisition terminal, extracting the header field content of a message from the data stream, identifying different clients and generating a separate identifier for each client; extracting an encryption suite and a digital certificate from the message payload, and generating a flow fingerprint vector from the identifier together with the encryption suite and the digital certificate;
the decryption module is used for decrypting the data stream according to the encryption suite and sampling the data stream according to time domain continuity to obtain a discrete data stream after dimension reduction;
the AI module is used for acquiring the discrete data stream, calling a syntactic model of the server, and performing sentence segmentation to obtain first word components; inputting the first word components one by one into a semantic analysis model of the server, and receiving the returned word meanings corresponding to the first word components; filtering redundant information out of the word meanings according to a first rule to obtain the corresponding filtered second word components, and inputting the flow fingerprint vector together with the second word components into a matrix template to obtain a first word component matrix;
the recognition module comprises a recognition model, the recognition model is a model framework based on a random forest and a convolutional neural network, and is used for receiving the first word component matrix output by the AI module, inputting the first word component matrix into an input layer of the recognition model, and calculating standard deviations of different parts of speech, wherein the standard deviations are used for determining the width of a sliding window of a subsequent convolutional layer; the output of the input layer is sent into a convolutional layer of the recognition model, local word components in the text are selected by utilizing sliding windows with different sizes, a second word component matrix is obtained by splicing the local word components, and the second word component matrix is sent into a pooling layer of the recognition model; the pooling layer selects characteristic values for distinguishing the word meanings effectively by selecting a pooling function, and a third word component matrix is obtained by splicing again;
the processed third word component matrix is transmitted to a random forest of the recognition model for classification; the random forest performs n rounds of extraction on the third word component matrix to obtain n training sets, each of the n training sets is used for training with random column sampling of a specified number of characteristic values to obtain n decision trees, and the n decision trees produce a classification result by voting;
and the execution module is used for judging whether the data stream sent by the acquisition terminal comprises an attack vector according to the classification result, blocking the data stream if the data stream comprises the attack vector, and allowing the data stream if the data stream does not comprise the attack vector.
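The convolution and pooling steps performed by the recognition module can be sketched in a few lines. The patent does not state how the standard deviation is mapped to window widths or which kernel and pooling function are used, so `widths_from_std` (a base width plus the rounded standard deviation), the uniform averaging kernel, and max pooling are all illustrative assumptions, as are the function names. The sketch also reduces the spliced result to a flat feature vector rather than a matrix.

```python
import statistics


def widths_from_std(values, base=2):
    """Hypothetical rule: derive two window widths from the population std deviation."""
    spread = statistics.pstdev(values)
    return sorted({base, base + round(spread)})


def convolve(matrix, width):
    """Slide a window of `width` rows over the matrix; each output is the window mean."""
    if width > len(matrix):
        raise ValueError("window is wider than the matrix")
    outputs = []
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        values = [v for row in window for v in row]
        outputs.append(sum(values) / len(values))  # uniform kernel (assumption)
    return outputs


def max_pool(feature_map):
    """Keep the strongest response of one feature map."""
    return max(feature_map)


def extract_features(matrix, widths):
    """Convolve with every window width, pool each map, and splice the results."""
    return [max_pool(convolve(matrix, width)) for width in widths]
```

The spliced output of `extract_features` plays the role of the input handed to the random forest; a trained network would learn its kernel weights instead of averaging.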
6. A malicious encrypted traffic detection system based on traffic fingerprints and behaviors, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method of any of claims 1-4 according to instructions in the program code.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium is configured to store program code for implementing the method of any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210896050.7A CN115314268B (en) | 2022-07-27 | 2022-07-27 | Malicious encryption traffic detection method and system based on traffic fingerprint and behavior |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115314268A (en) | 2022-11-08 |
CN115314268B (en) | 2023-12-12 |
Family
ID=83858890
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210896050.7A Active CN115314268B (en) | 2022-07-27 | 2022-07-27 | Malicious encryption traffic detection method and system based on traffic fingerprint and behavior |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115314268B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7712134B1 (en) * | 2006-01-06 | 2010-05-04 | Narus, Inc. | Method and apparatus for worm detection and containment in the internet core |
EP2343864A2 (en) * | 2010-01-08 | 2011-07-13 | Juniper Networks, Inc. | High availability for network security devices |
CN107483488A (en) * | 2017-09-18 | 2017-12-15 | 济南互信软件有限公司 | Malicious HTTP detection method and system |
CN110784429A (en) * | 2018-07-11 | 2020-02-11 | 北京京东尚科信息技术有限公司 | Malicious traffic detection method and device and computer readable storage medium |
CN112738039A (en) * | 2020-12-18 | 2021-04-30 | 北京中科研究院 | Malicious encrypted flow detection method, system and equipment based on flow behavior |
CN113949531A (en) * | 2021-09-14 | 2022-01-18 | 北京邮电大学 | Malicious encrypted flow detection method and device |
CN114172748A (en) * | 2022-02-10 | 2022-03-11 | 中国矿业大学(北京) | Encrypted malicious traffic detection method |
CN115086060A (en) * | 2022-06-30 | 2022-09-20 | 深信服科技股份有限公司 | Flow detection method, device and equipment and readable storage medium |
CN115238799A (en) * | 2022-07-27 | 2022-10-25 | 天津市国瑞数码安全系统股份有限公司 | AI-based random forest malicious traffic detection method and system |
Non-Patent Citations (1)
Title |
---|
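Shen Hao (沈昊): "Research on Detection Methods for Malicious Web Page Traffic on Cloud Platforms", China Excellent Master's Theses Full-text Database, Information Science and Technology Series, no. 9 * |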
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115333802A (en) * | 2022-07-27 | 2022-11-11 | 北京国瑞数智技术有限公司 | Malicious program detection method and system based on neural network |
CN115333802B (en) * | 2022-07-27 | 2024-08-13 | 北京国瑞数智技术有限公司 | Malicious program detection method and system based on neural network |
CN115941361A (en) * | 2023-02-16 | 2023-04-07 | 科来网络技术股份有限公司 | Malicious traffic identification method, device and equipment |
CN115941361B (en) * | 2023-02-16 | 2023-05-09 | 科来网络技术股份有限公司 | Malicious traffic identification method, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN115314268B (en) | 2023-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109635296B (en) | New word mining method, device computer equipment and storage medium | |
CN115314268B (en) | Malicious encryption traffic detection method and system based on traffic fingerprint and behavior | |
US11483340B2 (en) | System for malicious HTTP traffic detection with multi-field relation | |
CN111740946B (en) | Webshell message detection method and device | |
CN113688240B (en) | Threat element extraction method, threat element extraction device, threat element extraction equipment and storage medium | |
CN115238799A (en) | AI-based random forest malicious traffic detection method and system | |
CN113450147A (en) | Product matching method, device and equipment based on decision tree and storage medium | |
Khan | Detection of phishing websites using deep learning techniques | |
CN116346397A (en) | Network request abnormality detection method and device, equipment, medium and product thereof | |
CN113067792A (en) | XSS attack identification method, device, equipment and medium | |
Abaimov et al. | A survey on the application of deep learning for code injection detection | |
CN111414621B (en) | Malicious webpage file identification method and device | |
CN116055067B (en) | Weak password detection method, device, electronic equipment and medium | |
CN113918936A (en) | SQL injection attack detection method and device | |
CN114528908B (en) | Network request data classification model training method, classification method and storage medium | |
CN115774762A (en) | Instant messaging information processing method, device, equipment and storage medium | |
CN115392238A (en) | Equipment identification method, device, equipment and readable storage medium | |
CN114881012A (en) | Article title and content intelligent rewriting system and method based on natural language processing | |
CN115333802B (en) | Malicious program detection method and system based on neural network | |
CN116414976A (en) | Document detection method and device and electronic equipment | |
CN113645222A (en) | Message flow detection method, system, device and computer readable storage medium | |
CN112597498A (en) | Webshell detection method, system and device and readable storage medium | |
CN114338058A (en) | Information processing method, device and storage medium | |
CN111861379A (en) | Chat data detection method and device | |
CN113065348B (en) | Internet negative information monitoring method based on Bert model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||