CN112822167B

CN112822167B - Abnormal TLS encrypted traffic detection method and system

Info

Publication number: CN112822167B
Application number: CN202011614293.4A
Authority: CN
Inventors: 樊树胜; 贺本彪
Original assignee: Hangzhou Zhongdian Anke Modern Technology Co ltd
Current assignee: Hangzhou Zhongdian Anke Modern Technology Co ltd
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2023-04-07
Anticipated expiration: 2040-12-31
Also published as: CN112822167A

Abstract

The invention provides a method for detecting abnormal TLS encrypted traffic, comprising the following steps: S1: acquiring traffic message data sets of abnormal encrypted traffic and of normal traffic, respectively; S2: preprocessing the information collected from the acquired TLS traffic message data set of the normal traffic; S3: establishing a decision tree prediction model, feeding data samples into a random forest model, and training the decision tree prediction model; S4: classifying or predicting preprocessed new data with a classification algorithm according to the generated rules. By using a random forest algorithm, the method can identify abnormal traffic within encrypted traffic messages and can provide technical support for user privacy protection and network security; in this identification process, the TLS handshake information in the abnormal encrypted traffic messages plays an important role.

Description

Abnormal TLS encrypted traffic detection method and system

Technical Field

The invention relates to the field of network technology security, in particular to a method and a system for detecting abnormal TLS encrypted traffic.

Background

The statements of background art in this application, as they pertain to the present application, are provided solely for the purpose of illustration and description to facilitate an understanding of the present application, and are not to be construed as admissions or conjectures of applicants as prior art at the date of filing the present application for the first time.

With the development of artificial intelligence, some internet companies, in order to analyze user behavior, collect users' private data without the users' consent and encrypt the traffic messages generated during data collection. Some users also use illegal software to bypass GFW detection; because such software adopts a special encryption mode, traffic-limiting operations cannot currently be applied to its traffic. To identify such abnormal traffic data, this document collects traffic messages in a specific environment and analyzes the data, reaching an abnormal traffic message identification rate of 99.3%.

As the technology develops, traffic encryption has become increasingly mature and is applied on an ever larger scale: most websites are authenticated with a CA certificate and protect traffic data by non-plaintext transmission. This also makes identifying abnormal traffic much more difficult, and abnormal traffic identification tools developed for plaintext transmission cannot identify abnormal encrypted traffic.

On the one hand, the built-in services of some monopoly companies invade user privacy. For example, the default configuration of the applet company collects user location information; some mobile phone apps force users to provide location information, address book information and the like, and otherwise cannot be used; and the Microsoft Windows 10 system silently collects some private data without the users' consent and transmits it to Microsoft servers as encrypted traffic. Users unknowingly become machines that supply data to these large companies.

On the other hand, after the mainstream approaches of bypassing GFW detection with Shadowsocks and v2rayN, an approach that bypasses GFW detection by means of Trojan has now appeared on the market; its encryption uses conventional HTTPS access with CA certificate authentication. Because traffic encrypted this way is highly similar to the traffic of normal web page access, the prior art cannot identify it, which poses a new challenge to the domestic network security environment.

Disclosure of Invention

In order to solve these problems, the method collects key information from encrypted traffic, screens the key information fields by feature engineering, trains a random forest algorithm on the screened data, and uses the trained algorithm to monitor bypass traffic and identify abnormal encrypted traffic.

The invention aims to provide an abnormal TLS encrypted traffic detection method, which is characterized by comprising the following steps:

S1: acquiring traffic message data sets of abnormal encrypted traffic and of normal traffic, respectively;

S2: preprocessing the information collected from the acquired TLS traffic message data set of the normal traffic;

S3: establishing a decision tree prediction model, feeding data samples into a random forest model, and training the decision tree prediction model;

S4: classifying or predicting preprocessed new data with a classification algorithm according to the generated rules.

Optionally, in step S1, the sources of the abnormal traffic specifically include:

TLS encrypted traffic messages that contain user information and are generated while the system is in a silent state; and

traffic messages in which the destination IP address of the handshake request differs from the real IP address of the network resource the user is accessing.

Optionally, in step S1, acquiring the traffic data set of the normal traffic includes:

configuring all IP addresses contacted in the silent state as IP addresses that are not allowed to be accessed, then simulating a user normally accessing various types of mainstream websites through a browser, and marking the messages generated during this period as normal traffic.

Optionally, in step S2, the information collected from the TLS traffic message data set includes:

information collected from the Client hello message, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS record time, TLS content type, TLS handshake type, TLS cipher suite, TLS extension length, TLS extension type, TLS version number and TLS random number, 11 parameters in total;

information collected from each of the Server hello, Certificate, Certificate request and Server key exchange messages, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS record time, TLS content type, TLS handshake type, TLS cipher suite, TLS version, TLS session ID and TLS random number;

information collected from each of the Certificate, Client key exchange and Certificate verify messages, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS content type, TLS handshake type, TLS version and TLS key length;

information collected from the Change cipher spec message, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS content type, TLS handshake type and TLS version.

Optionally, the step S3 specifically includes:

S3.1) carrying out recursive analysis on the training set to generate an inverted decision tree structure;

S3.2) analyzing the paths of the tree from the root node to the leaf nodes to generate a series of rules;

S3.3) generating t decision trees, which then form a random forest model.

Optionally, in step S4, in the classification algorithm of the random forest, m = 7, calculated as follows:

。

The invention also provides an abnormal TLS encrypted traffic detection system, which comprises the following units:

an obtaining unit, configured to acquire traffic message data sets of abnormal encrypted traffic and of normal traffic, respectively;

an information preprocessing unit, configured to preprocess the information collected from the acquired TLS traffic message data set of the normal traffic;

a model establishing and training unit, configured to establish a decision tree prediction model, feed data samples into a random forest model, and train the decision tree prediction model;

a classification prediction unit, configured to classify or predict preprocessed new data with a classification algorithm according to the generated rules.

Optionally, the model building and training unit further includes:

the decision tree generating module is used for carrying out recursive analysis on the training set to generate an inverted decision tree structure;

the rule generating module is used for analyzing the path of the tree from the root node to the leaf node to generate a series of rules;

and the random forest model generation module is used for generating t decision trees and then forming a random forest model.

The invention also provides a computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, carries out the steps of any of the methods described above.

The invention also provides a terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of any of the above methods when executing the program.

Compared with the prior art, the scheme implemented by the invention has at least the following beneficial effects: the invention exploits the ability of random forests to compute parameter importance, selecting only a small number of important feature dimensions to approximately represent the original data, and thereby reduces the dimensionality of the data. In addition, the method can be tailored to the characteristics of the abnormal data: when the data has many different features, the method is used for feature selection, and the key features are selected for use in the algorithm so as to obtain an accurate prediction result.

The invention utilizes the natural parallelism of the random forest, can well process large-scale data, and can be easily used in a distributed environment.

The invention can also identify abnormal traffic within encrypted traffic messages by using a random forest algorithm and can provide technical support for user privacy protection and network security; in the identification process, the TLS handshake information in the abnormal encrypted traffic messages plays an important role. The random forest algorithm is well suited to identifying abnormal encrypted traffic messages.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:

fig. 1 (a) - (c) are schematic diagrams of a normal, complete https request between a client and a server: the three-way handshake, the four-way wave (connection termination) and the data transmission process;

fig. 2 (a) shows a general format of an http request message;

FIG. 2 (b) shows the common http request header code and meaning specification;

fig. 3 (a) shows the general format of an http response message;

FIG. 3 (b) shows a common http status code and meaning specification;

FIG. 3 (c) shows a common http response header and meaning specification;

FIG. 4 illustrates a flow diagram of an embodiment of an anomalous TLS encrypted traffic detection method of the present invention;

fig. 5 shows an example of storing, in step S2, TLS handshake data of each time as a piece of data in a report in the method for detecting an abnormal TLS encrypted traffic according to the present invention;

fig. 6 is a flowchart illustrating a specific embodiment of step S3 in the method for detecting an abnormal TLS encrypted traffic according to the present invention;

FIG. 7 shows the flow of the classification algorithm using a random forest in an embodiment of the present invention;

figure 8 illustrates a block diagram of an embodiment of the anomalous TLS encrypted traffic detection system of the present invention.

Detailed description of the preferred embodiments

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and "a plurality" typically includes at least two.

It should be understood that the term "and/or" as used herein is merely one type of association that describes an associated object, meaning that three relationships may exist, e.g., a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a good or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such good or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in an article or device comprising the element.

Alternative embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Fig. 1 shows a schematic diagram of the interaction process between a client and a server for a normally complete https request.

As shown in fig. 1 (a) - (c), in a normal, complete https request the client and the server interact through a three-way handshake, data transmission, and a four-way wave.

Fig. 1 (a) shows the full 7 steps of an HTTP request and response.

Fig. 1 (b) shows a TCP three-way handshake process. A connection must be established between two parties before either party can send data to the other party. Among the TCP/IP protocols, the TCP protocol provides reliable connection services, and the connection is initialized through three-way handshake. The purpose of the three-way handshake is to synchronize the sequence and acknowledgement numbers of both parties and exchange TCP window size information.

First handshake: the client requests a connection. The client sends a connection request segment with the SYN bit set to 1 and the Sequence Number set to x; the client then enters the SYN_SENT state and waits for the server's acknowledgement;

Second handshake: the server receives the client's SYN segment and must acknowledge it, setting the Acknowledgement Number to x + 1 (Sequence Number + 1); at the same time, the server sends its own SYN request with the SYN bit set to 1 and the Sequence Number set to y; the server puts all of this information into one segment (the SYN + ACK segment) and sends it to the client, and the server then enters the SYN_RECV state;

Third handshake: the client receives the SYN + ACK segment from the server, sets the Acknowledgement Number to y + 1, and sends an ACK segment to the server; once this segment has been sent and received, the client and the server are both in the ESTABLISHED state, completing the TCP three-way handshake.

The three-way handshake effectively prevents an expired connection request segment from suddenly reaching the server again and causing an error.
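The sequence and acknowledgement number bookkeeping of the three handshakes can be sketched as follows. This is a minimal illustration and not part of the described method; the function name `three_way_handshake` and the example initial sequence numbers 100 and 300 are assumptions chosen for the example.

```python
def three_way_handshake(x, y):
    """Return the three segments exchanged for client sequence number x
    and server sequence number y (both hypothetical inputs)."""
    # First handshake: client sends SYN with Sequence Number x.
    syn = {"from": "client", "flags": {"SYN"}, "seq": x, "ack": None}
    # Second handshake: server acknowledges with x + 1 and sends its own SYN (seq y).
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"}, "seq": y, "ack": syn["seq"] + 1}
    # Third handshake: client acknowledges the server's SYN with y + 1.
    ack = {"from": "client", "flags": {"ACK"}, "seq": x + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


segments = three_way_handshake(x=100, y=300)
for segment in segments:
    print(segment["from"], sorted(segment["flags"]), segment["seq"], segment["ack"])
```

Each side acknowledges the other's Sequence Number plus 1, which is how both parties confirm that their sequence numbers are synchronized.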

FIG. 2 shows components and an illustration of an HTTP request message.

As shown in fig. 2 (a), an HTTP request message is composed of 4 parts, i.e., a request line (request line), a request header (header), a null line, and request data.

The request header adds some additional information to the request message, and is composed of name/value pairs, wherein each row is paired, and the name and the value are separated by colon.

As shown in fig. 2 (b), a common http request header and a meaning specification are shown.
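The four-part layout of a request message (request line, request headers, blank line, request data) can be illustrated with a small parser. This is a simplified sketch under stated assumptions: the function name `parse_http_request` and the example message are hypothetical, and real-world parsing must handle many cases that are ignored here.

```python
def parse_http_request(raw):
    """Split a raw HTTP request into request line, headers and data."""
    # The blank line separates the header section from the request data.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Each header row is one name/value pair separated by a colon.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}


example = ("POST /login HTTP/1.1\r\n"
           "Host: example.com\r\n"
           "Content-Length: 7\r\n"
           "\r\n"
           "user=ab")
parsed = parse_http_request(example)
print(parsed["method"], parsed["target"], parsed["headers"])
```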

Fig. 1 (c) shows the TCP four-way wave process. After the client and the server have established a TCP connection through the three-way handshake, the connection must be closed once data transmission is complete. Closing a TCP connection takes a "four-way wave".

First wave: host 1 (which may be either the client or the server) sets a Sequence Number and sends a FIN segment to host 2; host 1 then enters the FIN_WAIT_1 state; this indicates that host 1 has no more data to send to host 2;

Second wave: host 2 receives the FIN segment sent by host 1 and returns an ACK segment whose Acknowledgement Number is the Sequence Number plus 1; host 1 enters the FIN_WAIT_2 state; host 2 thereby tells host 1 that it "agrees" to the close request;

Third wave: host 2 sends a FIN segment to host 1 to request closing the connection, and host 2 enters the LAST_ACK state;

Fourth wave: host 1 receives the FIN segment sent by host 2, sends an ACK segment to host 2, and enters the TIME_WAIT state; host 2 closes the connection once it receives the ACK segment from host 1; if host 1 receives no further reply after waiting for 2MSL, host 2 has closed normally, and host 1 can then close the connection as well.

The TCP protocol is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. TCP is full duplex, which means that when host 1 sends a FIN segment, it only indicates that host 1 has no more data to send, i.e. host 1 tells host 2 that all of its data has been sent; at this point, however, host 1 can still receive data from host 2. When host 2 returns an ACK segment, it indicates that it knows host 1 has no more data to send, but host 2 can still send data to host 1. When host 2 also sends a FIN segment, it indicates that host 2 has no more data to send either, after which both sides close the TCP connection. This is why there are four waves.
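The state changes on both hosts during the four waves can be written as a small transition table. This is an illustrative sketch only: the event names are hypothetical labels, and the CLOSE_WAIT state, which the text above does not mention, is the standard TCP state host 2 occupies between acknowledging host 1's FIN and sending its own.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",   # first wave, host 1
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",    # second wave, host 1
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",     # fourth wave, host 1 replies with ACK
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",   # second wave, host 2 replies with ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",      # third wave, host 2
    ("LAST_ACK", "recv ACK"): "CLOSED",          # fourth wave, host 2
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


host1_path = run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
host2_path = run(["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(host1_path))
print(" -> ".join(host2_path))
```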

As shown in fig. 3 (a), the HTTP response packet is mainly composed of a status line, a response header, an empty line, and response data.

The status line consists of 3 parts: protocol version, status code, and status code description.

The protocol version is consistent with that of the request message, and the status code description is a brief description of the status code. The status code is a 3-digit number.

1xx: indication information-indicating that the request has been received, processing continues.

2xx: success-meaning that the request has been successfully received, understood, accepted.

3xx: redirect-further action must be taken to complete the request.

4xx: client error-request has syntax error or request cannot be fulfilled.

5xx: server side error-the server fails to fulfill a legitimate request.

For example, a common status code is shown in fig. 3 (b), and a common response header is shown in fig. 3 (c).
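As a small illustration of the status line and the five status code classes listed above, the following sketch splits a status line into its 3 parts and maps the first digit of the code to its class. The function name `parse_status_line` and the class labels are assumptions made for this example.

```python
STATUS_CLASSES = {
    1: "indication information",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server side error",
}


def parse_status_line(line):
    """Return (protocol version, status code, description, class)."""
    version, code, description = line.split(" ", 2)
    code = int(code)
    # The first of the 3 digits determines the class of the response.
    return version, code, description, STATUS_CLASSES[code // 100]


result = parse_status_line("HTTP/1.1 404 Not Found")
print(result)
```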

Through the above three-way handshake and four-way waving steps, HTTP requests and responses are completed, and data transfer is possible.

Fig. 4 shows a flowchart of a first embodiment of a method of the abnormal TLS encrypted traffic detection method of the present invention.

As shown in fig. 4, the method for detecting the abnormal TLS encrypted traffic of the present invention includes the following steps:

S1: acquiring traffic message data sets of abnormal encrypted traffic and of normal traffic, respectively, which specifically comprises the following steps:

S1.1) acquiring an abnormal traffic message data set;

the sources of the abnormal traffic are,

for example: TLS encrypted traffic messages that contain user information and are generated while the system is in a silent state;

as another example: traffic messages generated by illegal software, in which the destination IP address of the handshake request differs from the real IP address of the network resource the user is accessing. The destination address of such a traffic message is the address of a proxy server, and the message is forwarded by the proxy server in TLS-encrypted form.

S1.2) acquiring a normal flow message data set;

The above steps S1.1) and S1.2) may be performed in parallel.

S2: preprocessing the information collected from the TLS traffic message data set of the normal traffic;

for example,

information collected from the Client hello message, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS record time, TLS content type, TLS handshake type, TLS cipher suite, TLS extension length, TLS extension type, TLS version number and TLS random number, 11 parameters in total.

Information collected from each of the Server hello, Certificate, Certificate request and Server key exchange messages, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS record time, TLS content type, TLS handshake type, TLS cipher suite, TLS version, TLS session ID and TLS random number.

Information collected from each of the Certificate, Client key exchange and Certificate verify messages, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS content type, TLS handshake type, TLS version and TLS key length.

Information collected from the Change cipher spec message, specifically: packet length, packet arrival interval time sequence, TLS record length, TLS content type, TLS handshake type and TLS version.

The result of saving each TLS stream data as one piece of data is shown in fig. 5.

Optionally, the experimental data does not collect information such as destination ip, source ip, destination port, source port, mac address, protocol number, and message generation time, so as to prevent overfitting of the data and avoid the influence of the characteristics of the noisy data.
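One way to picture step S2 is a routine that flattens the per-message fields of one TLS handshake into a single record while dropping the excluded fields. This is a sketch under assumptions: the field names, the `flatten_flow` helper and the example values are all hypothetical and are not taken from fig. 5.

```python
# Fields the text says are deliberately not collected (hypothetical key names).
EXCLUDED = {"dst_ip", "src_ip", "dst_port", "src_port", "mac", "protocol", "timestamp"}


def flatten_flow(messages):
    """messages: list of (message_type, {field: value}) for one TLS handshake.
    Returns one flat record keyed by 'message_type.field'."""
    record = {}
    for message_type, fields in messages:
        for name, value in fields.items():
            if name in EXCLUDED:
                continue  # skip address-like fields to limit overfitting
            record[message_type + "." + name] = value
    return record


flow = [
    ("client_hello", {"packet_length": 571, "tls_version": 0x0303, "src_ip": "10.0.0.2"}),
    ("server_hello", {"packet_length": 1440, "tls_version": 0x0303, "dst_port": 443}),
]
row = flatten_flow(flow)
print(row)
```

Prefixing each field with its message type keeps, for example, the Client hello packet length and the Server hello packet length as separate features.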

S3: establishing a decision tree prediction model, introducing a data sample into a random forest model, and training the decision tree prediction model.

As shown in fig. 6, the step S3 specifically includes:

Each tree of the random forest is constructed independently, as far as possible without depending on the construction of the other trees. The key to constructing each decision tree is deciding which judgment condition to place at each decision node.

In the invention, the decision tree is constructed recursively as follows:

S3.1.1) let N be the number of training samples; the input to a single decision tree is then N training samples drawn at random with replacement from the training set.

S3.1.2) let M be the number of input features of the training samples and choose m far smaller than M; when splitting at each node of each decision tree, randomly select m input features from the M input features, and then select the best of these m features to split on. m does not change during the construction of the decision tree.

S3.1.3) each tree is split in this way until all training examples at a node belong to the same class. Pruning is not required.
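The two sources of randomness in steps S3.1.1) and S3.1.2) can be sketched with the standard library. This is a minimal illustration, not the implementation of the invention; the helper names and the fixed seed are assumptions, and M = 78 with m = 7 are used only as example values.

```python
import random


def bootstrap_sample(samples, rng):
    """S3.1.1: draw N samples with replacement from the N training samples."""
    n = len(samples)
    return [samples[rng.randrange(n)] for _ in range(n)]


def candidate_features(num_features, m, rng):
    """S3.1.2: pick m distinct feature indices out of M for one node split."""
    return rng.sample(range(num_features), m)


rng = random.Random(0)               # fixed seed so the sketch is repeatable
training_set = list(range(10))       # stand-in for 10 training samples
bag = bootstrap_sample(training_set, rng)
features = candidate_features(78, 7, rng)
print(bag)
print(features)
```

Because sampling is done with replacement, some training samples typically appear more than once in a tree's input while others are left out, which is part of what makes the trees differ from one another.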

Information entropy is often used as a quantitative measure of the information content of a system, and can therefore serve as an optimization objective or as a criterion for parameter selection. In generating the decision tree, this scheme uses entropy as the criterion for choosing the optimal attribute on which to divide the samples.

The larger the information entropy, the higher the uncertainty of the event. When a decision tree is constructed using information entropy, the decrease in information entropy brought about by each judgment condition is compared; the judgment condition that causes the largest decrease is selected and placed at the root node. These steps are performed recursively until a complete decision tree is constructed.
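The comparison of entropy decrease between candidate judgment conditions can be illustrated as follows. This sketch uses the usual Shannon entropy with base-2 logarithm; the function names and the toy labels are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy decrease obtained by splitting `labels` into `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups if g)
    return entropy(labels) - remainder


labels = ["abnormal"] * 4 + ["normal"] * 4
# A condition that separates the classes perfectly removes all uncertainty.
perfect = information_gain(labels, [["abnormal"] * 4, ["normal"] * 4])
# A condition that leaves both groups evenly mixed removes none.
useless = information_gain(labels, [["abnormal", "abnormal", "normal", "normal"]] * 2)
print(perfect, useless)
```

The condition with the larger gain would be placed closer to the root of the tree.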

In short, a randomly generated decision tree does not know which parameter to judge first, so the first step is to tell the decision tree the order of judgment. For example, when judging whether a piece of code is good or bad, whether it implements the required function is judged first; whether it has comments, whether it contains redundancy, and the like are judged after that.

S3.2) analyzing the relationship between the parameters and the results in the decision tree generated in step S3.1) to generate rules for identifying abnormal traffic.

S3.3) generating t decision trees, and then forming a random forest model.

Preferably, a random forest classification algorithm is used; its flow is shown in fig. 7.
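How t decision trees can be combined into one classification is sketched below with majority voting, the usual way a random forest classifier aggregates its trees. The text does not spell out the aggregation rule, so majority voting is an assumption here, and the three stand-in "trees", their thresholds and the feature names are hypothetical.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Let every tree vote on the sample and return the most common label."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical stand-ins for trained decision trees (t = 3).
trees = [
    lambda s: "abnormal" if s["client_hello.packet_length"] > 500 else "normal",
    lambda s: "abnormal" if s["server_hello.packet_length"] > 1400 else "normal",
    lambda s: "normal",
]

sample = {"client_hello.packet_length": 571, "server_hello.packet_length": 1440}
label = forest_predict(trees, sample)
print(label)
```

With an odd number of trees and two classes, the vote cannot tie.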

In the classification algorithm of fig. 7, m takes a value of, for example, 7, and is calculated as follows:

In step S2, the Client hello message contributes 11 parameters; 10 parameters are taken from each of the Server hello, Certificate, Certificate request and Server key exchange messages, i.e. 10 × 4 = 40 parameters; 7 parameters are taken from each of the Certificate, Client key exchange and Certificate verify messages, i.e. 3 × 7 = 21 parameters; and 6 parameters are taken from the Change cipher spec message. In total, 11 + 40 + 3 × 7 + 6 = 78 parameters are used as the features of the decision tree.
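The feature count can be checked with a few lines of arithmetic. The formula that turns the total into m = 7 is not reproduced in this text; the last line below uses the common random forest heuristic floor(log2(M)) + 1, which happens to give 7 for M = 78, but treating it as the formula intended here is an assumption.

```python
import math

# (number of message types, parameters per message) as listed in step S2.
PARAMS = {
    "Client hello": (1, 11),
    "Server hello / Certificate / Certificate request / Server key exchange": (4, 10),
    "Certificate / Client key exchange / Certificate verify": (3, 7),
    "Change cipher spec": (1, 6),
}

M = sum(count * per_message for count, per_message in PARAMS.values())
print(M)  # total number of features

# Assumed heuristic, not confirmed by the text: m = floor(log2(M)) + 1.
m_guess = int(math.log2(M)) + 1
print(m_guess)
```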

Therefore, the invention exploits the ability of random forests to compute parameter importance, selecting only a small number of important feature dimensions to approximately represent the original data, and thereby reduces the dimensionality of the data. In addition, the method can be tailored to the characteristics of the abnormal data: when the data has many different features, the method is used for feature selection, and the key features are selected for use in the algorithm so as to obtain an accurate prediction result.

As shown in fig. 8, the system for detecting an abnormal TLS encrypted traffic provided by the present invention may include the following units:

As shown in fig. 8, the model building and training unit further comprises:

the decision tree generation module is used for carrying out recursive analysis on the training set to generate an inverted decision tree structure;

The present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described abnormal TLS encrypted traffic detection method. The computer-readable storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, DVD, CD-ROMs, microdrive, and magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, DRAMs, VRAMs, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.

The invention also provides computer equipment comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the method for detecting the abnormal TLS encrypted traffic. In the embodiment of the present invention, the processor is a control center of a computer system, and may be a processor of a physical machine or a processor of a virtual machine.

The foregoing description is only exemplary of the preferred embodiments of the invention and is not intended to limit the invention in any way as to its nature or form. Although the present invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. However, any simple modification, equivalent replacement, improvement and the like of the above embodiments according to the technical spirit of the present invention should be included in the protection scope of the present invention without departing from the spirit and principle of the present invention.

Claims

1. An abnormal TLS encrypted traffic detection method is characterized by comprising the following steps:

in step S2, the collected TLS traffic packet data set includes:

the Client hello message acquisition information specifically comprises the following steps: the method comprises the steps of data packet length, data packet arrival interval time sequence, TLS recording length, TLS recording time, TLS content type, TLS handshake type, TLS cipher suite, TLS extension length, TLS extension type, TLS version number and TLS random number;

collecting information in the message of Server hello, certificate option, certificate request option and Server key exchange option, which comprises the following steps: the method comprises the following steps of (1) data packet length, data packet arrival interval time sequence, TLS record length, TLS record time, TLS content type, TLS handshake type, TLS password suite, TLS version, TLS session ID and TLS random number;

the information acquisition method includes the following steps that information is acquired in a message of a Certificate option, a Client key exchange and a Certificate version option, and specifically includes the following steps: the method comprises the following steps of (1) data packet length, data packet arrival interval time sequence, TLS record length, TLS content type, TLS handshake type, TLS version and TLS key length;

information collected from the Change Cipher Spec message, specifically comprising: packet length, packet inter-arrival time sequence, TLS record length, TLS content type, TLS handshake type, and TLS version;

wherein, the step S3 specifically includes:

S3.3) generating t decision trees, which together form a random forest model;

wherein, step S3.1) specifically includes:

S3.1.1) letting N be the number of training samples, wherein the number of input samples for a single decision tree is N, and N training samples are randomly drawn from the training set;

S3.1.2) letting M be the number of input features of a training sample and m be a number far smaller than M, wherein, when splitting at each node of each decision tree, m input features are randomly selected from the M input features, and the best of these m input features is then selected for the split, m remaining unchanged throughout construction of the decision tree;

S3.1.3) each tree continues to split in this way until all training examples at a node belong to the same class, and no pruning is required;

S4: classifying or predicting the preprocessed new data using a classification algorithm according to the generated rules.
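As a non-limiting illustration, the tree-growing and classification steps recited above (S3.1.1 to S3.1.3, S3.3 and S4) may be sketched in Python as follows. The function names, the use of Gini impurity as the criterion for the "best" feature, sampling with replacement, majority voting across trees, and the synthetic feature values are all assumptions made for this example; the claim does not specify them.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) pair with the lowest weighted Gini, or None."""
    best, best_score = None, None
    for f in feature_ids:
        # Every distinct value except the largest is a candidate threshold,
        # so both sides of a split are always non-empty.
        for threshold in sorted({row[f] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or score < best_score:
                best, best_score = (f, threshold), score
    return best


def grow_tree(rows, labels, m, rng):
    """Grow one unpruned tree, splitting until each node is pure (S3.1.3)."""
    if len(set(labels)) == 1:
        return labels[0]  # pure node becomes a leaf
    candidates = rng.sample(range(len(rows[0])), m)  # S3.1.2: m of the M features
    split = best_split(rows, labels, candidates)
    if split is None:
        # Fallback (assumption): the m sampled features are constant here,
        # so stop with a majority-vote leaf.
        return Counter(labels).most_common(1)[0][0]
    f, threshold = split
    left = [i for i, row in enumerate(rows) if row[f] <= threshold]
    right = [i for i, row in enumerate(rows) if row[f] > threshold]
    return (
        f,
        threshold,
        grow_tree([rows[i] for i in left], [labels[i] for i in left], m, rng),
        grow_tree([rows[i] for i in right], [labels[i] for i in right], m, rng),
    )


def train_forest(rows, labels, t, m, seed=0):
    """Build t trees (S3.3), each from N samples drawn from the N training rows."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(t):
        # S3.1.1: draw N samples; drawing with replacement is an assumption.
        picks = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in picks], [labels[i] for i in picks], m, rng))
    return forest


def predict(forest, row):
    """S4: classify a new row by majority vote over all trees (assumed voting rule)."""
    votes = []
    for node in forest:
        while isinstance(node, tuple):  # descend until a leaf label is reached
            f, threshold, left, right = node
            node = left if row[f] <= threshold else right
        votes.append(node)
    return Counter(votes).most_common(1)[0][0]


# Synthetic toy data: four hypothetical numeric features per flow.
# The values carry no real-world meaning.
ROWS = [
    [1, 2, 1, 3], [2, 1, 3, 2], [3, 3, 2, 1], [2, 2, 2, 2], [1, 3, 3, 1],
    [11, 12, 11, 13], [12, 11, 13, 12], [13, 13, 12, 11], [12, 12, 12, 12], [11, 13, 13, 11],
]
LABELS = ["normal"] * 5 + ["abnormal"] * 5

forest = train_forest(ROWS, LABELS, t=15, m=2, seed=1)
print(predict(forest, [0, 0, 0, 0]))
print(predict(forest, [20, 20, 20, 20]))
```

In this sketch t = 15 and m = 2 are arbitrary illustrative values; in practice M would be the number of features extracted from the handshake messages listed in step S2.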

2. The abnormal TLS encrypted traffic detection method of claim 1, wherein:

in step S1, the abnormal encrypted traffic specifically includes:

TLS encrypted traffic messages that contain user information while the system is in a standby state; and

messages for which the destination IP address of the handshake request differs from the real IP address of the network resource accessed by the user.

3. The abnormal TLS encrypted traffic detection method of claim 1, wherein:

in step S1, acquiring the traffic data set of normal traffic includes:

4. The abnormal TLS encrypted traffic detection method of claim 1, wherein:

in step S4, in the random forest classification algorithm, m = 7, calculated as:

m = log₂M + 1.
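As a non-limiting illustration, the calculation of m from M may be sketched as follows. The function name `features_per_split` is hypothetical, and truncating the logarithm to an integer is an assumption, since the claim does not state how a non-integer result is handled.

```python
import math


def features_per_split(total_features: int) -> int:
    """Compute m = log2(M) + 1, truncating log2(M) to an integer (assumption)."""
    if total_features < 1:
        raise ValueError("M must be at least 1")
    return int(math.log2(total_features)) + 1


print(features_per_split(64))
```

Under this truncation assumption, any M from 64 to 127 yields m = 7.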

5. An abnormal TLS encrypted traffic detection system, characterized by comprising the following units:

the collected TLS traffic packet data set comprises:

information collected from the Server Hello, Certificate (optional), Certificate Request (optional), and Server Key Exchange (optional) messages, specifically comprising: packet length, packet inter-arrival time sequence, TLS record length, TLS record time, TLS content type, TLS handshake type, TLS cipher suite, TLS version, TLS session ID, and TLS random number;

wherein, the model establishing and training unit specifically comprises:

a recursion module, configured to recursively analyze the training set to generate an inverted-tree decision structure;

an analysis module, configured to analyze the paths of the tree from the root node to the leaf nodes and to generate a series of rules;

the generation model module is used for generating t decision trees and then forming a random forest model;

wherein, the recursion module specifically comprises:

let N be the number of training samples, the number of input samples of a single decision tree is N, and N training samples are randomly extracted from the training set;

let M be the number of input features of a training sample and m be a number far smaller than M, so that, when splitting at each node of each decision tree, m input features are randomly selected from the M input features and the best of these m input features is then selected for the split, m remaining unchanged throughout construction of the decision tree;

each tree is split in such a way until all training examples of the node belong to the same class and pruning is not needed;
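As a non-limiting illustration, the analysis module's conversion of root-to-leaf paths into rules may be sketched as follows. The nested-tuple tree representation `(feature, threshold, left, right)`, the function name `extract_rules`, and the example tree with its thresholds are hypothetical and are not taken from the claim.

```python
def extract_rules(node, feature_names, conditions=()):
    """Yield one (conditions, label) rule for every root-to-leaf path."""
    if not isinstance(node, tuple):
        # Leaf reached: the accumulated conditions form one complete rule.
        yield list(conditions), node
        return
    feature, threshold, left, right = node
    name = feature_names[feature]
    yield from extract_rules(left, feature_names, conditions + (f"{name} <= {threshold}",))
    yield from extract_rules(right, feature_names, conditions + (f"{name} > {threshold}",))


# Hypothetical two-level tree over two of the collected features.
FEATURE_NAMES = ["packet_length", "tls_record_length"]
TREE = (0, 500, "normal", (1, 300, "abnormal", "normal"))

RULES = list(extract_rules(TREE, FEATURE_NAMES))
for conditions, label in RULES:
    print(" AND ".join(conditions), "=>", label)
```

Each printed line corresponds to one path from the root node to a leaf node, so a tree with three leaves yields three rules.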

6. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.

7. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-4 when executing the program.