CN117082118A

CN117082118A - Network connection method based on data derivation and port prediction

Info

Publication number: CN117082118A
Application number: CN202310451970.2A
Authority: CN
Inventors: 赵干杰; 徐伟
Original assignee: Shenzhen Guanglian Computing Co ltd
Current assignee: Shenzhen Guanglian Computing Co ltd
Priority date: 2023-04-23
Filing date: 2023-04-23
Publication date: 2023-11-17

Abstract

A network connection method based on data derivation and port prediction acquires network traffic data of a target host, the network traffic data comprising a source address, a target address and a protocol type. Using deep-learning-based artificial intelligence techniques, the network traffic data of the target host is analyzed and mined to quickly determine the range of ports the target host is likely to have open, and connections are then attempted within that range. This reduces the time and resources spent on connection attempts and improves the efficiency of network connection.

Description

Network connection method based on data derivation and port prediction

Technical Field

The present application relates to the technical field of intelligent network connection, and in particular to a network connection method based on data derivation and port prediction.

Background

Conventional network connection methods try every port that the target host might have open, one by one, which wastes considerable time and bandwidth. When network traffic is heavy, it is also difficult to determine the target port.

Thus, an optimized network connection scheme is desired.

Disclosure of Invention

The present application has been made to solve the above technical problem. Embodiments of the application provide a network connection method based on data derivation and port prediction, which acquires network traffic data of a target host, the network traffic data comprising a source address, a target address and a protocol type. Using deep-learning-based artificial intelligence techniques, the network traffic data of the target host is analyzed and mined to quickly determine the range of ports the target host is likely to have open, and connections are then attempted within that range. This reduces the time and resources spent on connection attempts and improves the efficiency of network connection.

In a first aspect, a network connection method based on data derivation and port prediction is provided, which includes:

acquiring network traffic data of a target host, wherein the network traffic data comprises a source address, a target address and a protocol type;

preprocessing the network traffic data to obtain preprocessed network traffic data;

counting the usage information of each port of the target host based on the preprocessed network traffic data;

performing word segmentation on the usage information of each port of the target host, and then passing the result through a context encoder comprising a word embedding layer to obtain a plurality of port information semantic understanding feature vectors;

arranging the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix, and then passing it through a convolutional neural network model serving as a feature extractor to obtain a port information semantic association feature matrix;

taking each port information semantic understanding feature vector as a query feature vector, and calculating the product between it and the port information semantic association feature matrix to obtain a plurality of classification feature vectors;

passing each of the plurality of classification feature vectors through a classifier to obtain a plurality of probability values; and

recommending the port of the target host corresponding to the maximum probability value among the plurality of probability values.

In the above network connection method based on data derivation and port prediction, preprocessing the network traffic data to obtain preprocessed network traffic data includes: performing data deduplication, data filtering and data format conversion on the network traffic data to obtain the preprocessed network traffic data.

In the above network connection method based on data derivation and port prediction, performing word segmentation on the usage information of each port of the target host and then obtaining a plurality of port information semantic understanding feature vectors through a context encoder comprising a word embedding layer includes: performing word segmentation on the usage information of each port of the target host to convert it into a word sequence composed of a plurality of words; mapping each word in the word sequence to a word vector using the embedding layer of the context encoder to obtain a sequence of word vectors; and performing global context semantic encoding on the sequence of word vectors using the context encoder to obtain the plurality of port information semantic understanding feature vectors.

In the above network connection method based on data derivation and port prediction, performing global context semantic encoding on the sequence of word vectors using the context encoder comprising the word embedding layer to obtain the plurality of port information semantic understanding feature vectors includes: arranging the sequence of word vectors one-dimensionally to obtain a global word vector; calculating the product between the global word vector and the transpose of each word vector in the sequence of word vectors to obtain a plurality of self-attention association matrices; standardizing each of the plurality of self-attention association matrices to obtain a plurality of standardized self-attention association matrices; passing each of the standardized self-attention association matrices through a Softmax classification function to obtain a plurality of probability values; and weighting each word vector in the sequence of word vectors with the corresponding probability value to obtain the plurality of port information semantic understanding feature vectors.

In the above network connection method based on data derivation and port prediction, arranging the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix and then obtaining the port information semantic association feature matrix through a convolutional neural network model serving as a feature extractor includes: using each layer of the convolutional neural network model to perform, in its forward pass, convolution processing, pooling processing along the channel dimension, and nonlinear activation processing on its input data, wherein the input of the first layer of the convolutional neural network model is the two-dimensional feature matrix and the output of the last layer is the port information semantic association feature matrix.

In the above network connection method based on data derivation and port prediction, passing the plurality of classification feature vectors through a classifier to obtain a plurality of probability values includes: performing fully connected encoding on the plurality of classification feature vectors using a plurality of fully connected layers of the classifier to obtain a plurality of encoded classification feature vectors; and passing the plurality of encoded classification feature vectors through a Softmax classification function of the classifier to obtain the plurality of probability values.

The above network connection method based on data derivation and port prediction further comprises a training step: training the context encoder including the word embedding layer, the convolutional neural network model as a feature extractor, and the classifier.

In the above network connection method based on data derivation and port prediction, the training step includes: acquiring training data of a target host, wherein the training data comprises a source address, a target address and a protocol type; preprocessing the training data to obtain preprocessed training network traffic data; counting training usage information of each port of the target host based on the preprocessed training network traffic data; performing word segmentation on the training usage information of each port of the target host and then passing the result through the context encoder comprising the word embedding layer to obtain a plurality of training port information semantic understanding feature vectors; arranging the plurality of training port information semantic understanding feature vectors into a training two-dimensional feature matrix and then passing it through the convolutional neural network model serving as the feature extractor to obtain a training port information semantic association feature matrix; taking each training port information semantic understanding feature vector as a query feature vector and calculating the product between it and the training port information semantic association feature matrix to obtain a plurality of training classification feature vectors; performing Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors to obtain a plurality of optimized training classification feature vectors; passing the plurality of optimized training classification feature vectors through the classifier to obtain a plurality of classification loss function values; and training the context encoder comprising the word embedding layer, the convolutional neural network model serving as the feature extractor, and the classifier based on the plurality of classification loss function values by back-propagation in the direction of gradient descent.

In the above network connection method based on data derivation and port prediction, performing Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors to obtain a plurality of optimized training classification feature vectors includes: performing Gumbel normal periodic re-parameterization on the training classification feature vectors using the following optimization formula to obtain the plurality of optimized training classification feature vectors; wherein the optimization formula is:

wherein v_i represents the feature value at each position of the training classification feature vectors, μ and σ are respectively the mean and the variance of the set of feature values at each position of the training classification feature vectors, log represents the base-2 logarithm, arcsin(·) represents the arcsine function, arccos(·) represents the arccosine function, and v_i′ represents the feature value at each position of the optimized training classification feature vectors.

Compared with the prior art, the network connection method based on data derivation and port prediction provided by the present application acquires network traffic data of a target host, the network traffic data comprising a source address, a target address and a protocol type. Using deep-learning-based artificial intelligence techniques, the network traffic data of the target host is analyzed and mined to quickly determine the range of ports the target host is likely to have open, and connections are then attempted within that range. This reduces the time and resources spent on connection attempts and improves the efficiency of network connection.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below evidently show only some embodiments of the present application; a person skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a schematic diagram of an application scenario of a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 2 is a flowchart of a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 3 is a schematic diagram of the architecture of a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 4 is a flowchart of the sub-steps of step 140 in a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 5 is a flowchart of the sub-steps of step 143 in a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 6 is a flowchart of the sub-steps of step 170 in a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 7 is a flowchart of the sub-steps of step 190 in a network connection method based on data derivation and port prediction according to an embodiment of the present application.

Fig. 8 is a block diagram of a network connection system based on data derivation and port prediction according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the present application.

Unless defined otherwise, all technical and scientific terms used in the embodiments of the application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application.

In describing embodiments of the present application, unless otherwise specified and limited, the term "connected" should be construed broadly: it may be an electrical connection, communication between two elements, a direct connection, or an indirect connection via an intermediate medium. Those skilled in the art will understand the specific meaning of the term according to the circumstances.

It should be noted that the terms "first\second\third" in the embodiments of the present application merely distinguish similar objects and do not represent a specific order of the objects. Where permitted, the specific order or sequence of "first\second\third" may be interchanged, so that the embodiments of the application described herein can be practiced in sequences other than those illustrated or described herein.

To solve the above technical problem, the optimal port is determined by a method based on data derivation and port prediction. Specifically, the method analyzes and mines the network traffic data of the target host to quickly determine the range of ports the target host is likely to have open, and then attempts to establish a connection within that range, thereby reducing the time and resources spent on connection attempts and improving the efficiency of network connection.

Specifically, in the technical solution of the present application, network traffic data of the target host is first acquired, the network traffic data comprising a source address, a target address and a protocol type. These fields are acquired so that the network traffic data can be analyzed and mined to identify port usage and the corresponding protocol types. The source address is the IP address of the sender, the target address is the IP address of the receiver, and the protocol type is the transport protocol used, such as TCP or UDP. Such data provide basic information about network communication, such as the two communicating parties and the protocol type, and can be used to quickly determine the range of ports the target host is likely to have open so that a connection can be attempted. The network traffic data of the target host also helps in detecting network security threats, identifying suspicious network activity or malware, and taking corresponding protective measures. Acquiring the network traffic data of the target host is therefore an important basis for network connection and security management.
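As an illustration of the record structure described above, the following minimal sketch parses one flow record. The comma-separated input layout, the `FlowRecord` and `parse_flow` names, and the presence of a destination `port` field are assumptions made for this example; the application itself lists only the source address, target address and protocol type.

```python
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class FlowRecord:
    src: str    # source address: IP address of the sender
    dst: str    # target address: IP address of the receiver
    proto: str  # protocol type, e.g. "TCP" or "UDP"
    port: int   # hypothetical field: destination port of the flow


def parse_flow(line):
    """Parse a hypothetical 'src,dst,proto,port' line into a FlowRecord."""
    src, dst, proto, port = (part.strip() for part in line.split(","))
    # ipaddress.ip_address raises ValueError for a malformed address.
    ipaddress.ip_address(src)
    ipaddress.ip_address(dst)
    return FlowRecord(src, dst, proto.upper(), int(port))


record = parse_flow("192.168.1.5, 10.0.0.9, tcp, 443")
```

Validating addresses at parse time keeps malformed rows out of the later statistics; a real capture source would likely need a different parser.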

Next, the network traffic data is preprocessed and cleaned to obtain preprocessed network traffic data; the preprocessing and cleaning include data deduplication, data filtering, data format conversion and the like. The purpose is to remove noise, redundant information and erroneous data, thereby improving the quality and accuracy of the data. Network traffic data typically contains a large amount of cluttered and redundant information, such as duplicate traffic records, invalid packets and invalid protocols, which interferes with analysis and mining and leads to erroneous results. Preprocessing and cleaning therefore first remove invalid and duplicate data, such as invalid data packets and redundant data records. Second, the data format is converted and normalized so that the data can be counted and analyzed more easily. Finally, the data is denoised and corrected, for example by checking against the protocol specification whether each record conforms to the standard.
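The three preprocessing operations named above (deduplication, filtering and format conversion) might be sketched as follows. This is only an illustration under stated assumptions: records are dictionaries with hypothetical keys `ts`, `src`, `dst`, `proto` and `port`, and only TCP and UDP are treated as valid protocols.

```python
VALID_PROTOCOLS = {"TCP", "UDP"}  # assumption: other protocols are filtered out


def preprocess(records):
    """Deduplicate, filter and normalise raw flow records (a list of dicts)."""
    seen = set()
    cleaned = []
    for rec in records:
        try:
            # Data format conversion: trim strings, upper-case the protocol,
            # and cast the timestamp and port to numbers.
            row = {
                "ts": float(rec["ts"]),
                "src": str(rec["src"]).strip(),
                "dst": str(rec["dst"]).strip(),
                "proto": str(rec["proto"]).strip().upper(),
                "port": int(rec["port"]),
            }
        except (KeyError, TypeError, ValueError):
            continue  # data filtering: drop malformed records
        if row["proto"] not in VALID_PROTOCOLS or not 1 <= row["port"] <= 65535:
            continue  # data filtering: drop invalid protocols and ports
        key = tuple(row.values())
        if key in seen:
            continue  # data deduplication: the same record captured twice
        seen.add(key)
        cleaned.append(row)
    return cleaned
```

The timestamp is part of the deduplication key so that genuinely repeated connections to the same port still count toward usage frequency later on.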

Then, based on the preprocessed network traffic data, the usage information of each port of the target host is counted. Counting the usage information of each port helps in understanding the communication patterns and network behavior of the target host, so that the range of ports it is likely to have open can be predicted more accurately. The statistics reveal the protocol type, service type, usage frequency and so on associated with each port, which makes the meaning of the network traffic data clearer and allows a prediction model to be built in a targeted manner.
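Per-port statistics of this kind can be gathered with a simple counter. The sketch below assumes each cleaned record is a dictionary with hypothetical `port` and `proto` keys and collects only usage frequency and protocol counts; service type and other statistics mentioned above are omitted.

```python
from collections import Counter, defaultdict


def port_usage(records):
    """Return {port: {"count": n, "protocols": Counter}} for one target host."""
    usage = defaultdict(lambda: {"count": 0, "protocols": Counter()})
    for rec in records:
        entry = usage[rec["port"]]
        entry["count"] += 1                     # usage frequency of the port
        entry["protocols"][rec["proto"]] += 1   # protocol types seen on the port
    return dict(usage)


stats = port_usage([
    {"port": 80, "proto": "TCP"},
    {"port": 80, "proto": "TCP"},
    {"port": 53, "proto": "UDP"},
])
```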

Further, the usage information of each port of the target host is subjected to word segmentation and then passed through a context encoder comprising a word embedding layer to obtain a plurality of port information semantic understanding feature vectors. This step converts the discrete port usage information into a continuous semantic space so that features can be extracted and their meaning understood more easily. Through word segmentation and embedding, port usage information with similar semantics is converted into similar vector representations in a continuous vector space, which facilitates subsequent processing and classification.

In particular, word segmentation splits the port usage information into separate semantic units, such as "TCP", "80" and "HTTP", so that its meaning and internal relationships can be better understood. The context encoder comprising the word embedding layer then maps these semantic units into a high-dimensional semantic space to obtain the plurality of port information semantic understanding feature vectors. These feature vectors retain the semantic relevance of the original port usage information and lie close in semantic space to the feature vectors of semantically similar port usage information, which facilitates subsequent processing and classification.
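The word segmentation and embedding lookup can be illustrated with the short sketch below. It covers only tokenization and the embedding layer, not the global context encoding. The regular-expression tokenizer, the function names, the embedding width of 4 and the random initialisation are all assumptions for illustration; in a trained model the embedding table would be learned.

```python
import random
import re


def tokenize(usage_text):
    """Split a port-usage description into alphabetic and numeric tokens."""
    return re.findall(r"[A-Za-z]+|\d+", usage_text)


def build_vocab(token_lists):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def init_embeddings(vocab_size, dim, seed=0):
    """Randomly initialised embedding table; a trained model would learn these."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(tokens, vocab, table):
    """Look up the word vector of each token."""
    return [table[vocab[tok]] for tok in tokens]


tokens = tokenize("TCP 80 HTTP")
vocab = build_vocab([tokens])
table = init_embeddings(len(vocab), dim=4)
word_vectors = embed(tokens, vocab, table)
```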

Next, the plurality of port information semantic understanding feature vectors are arranged into a two-dimensional feature matrix, which is passed through a convolutional neural network model serving as a feature extractor to obtain a port information semantic association feature matrix. That is, after the port information semantic understanding feature vectors of all ports of the target host are arranged, at the data-structure level, into a two-dimensional feature matrix, a convolutional neural network model, which performs well at extracting local neighborhood association features, extracts the association pattern features of the usage information of the ports in the high-dimensional semantic space. From another point of view, network traffic data contains a large amount of discrete port usage information that is difficult to process and classify directly, so a model such as a neural network is needed to convert the discrete port information into a continuous numerical feature representation and to extract the semantic associations between different ports.

Specifically, arranging the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix yields a continuous numerical feature representation that preserves, in its spatial layout, the semantic relevance of the original vectors. Using this feature matrix as the input of the convolutional neural network model allows the semantic association features between different ports to be further extracted.
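The arrangement into a matrix and a single feature-extractor layer (convolution, pooling along the channel dimension, nonlinear activation) can be sketched in plain Python as follows. Kernel values, zero padding, mean pooling and ReLU are assumptions chosen for the example; the application does not specify them, and a practical model would use a deep-learning framework with learned kernels.

```python
def arrange_2d(vectors):
    """Stack equal-length feature vectors as rows of a two-dimensional matrix."""
    width = len(vectors[0])
    if any(len(v) != width for v in vectors):
        raise ValueError("all feature vectors must have the same length")
    return [list(v) for v in vectors]


def conv2d_same(matrix, kernel):
    """Single-channel 2-D convolution (cross-correlation) with zero padding."""
    rows, cols = len(matrix), len(matrix[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += matrix[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def cnn_layer(matrix, kernels):
    """Convolution, mean pooling along the channel dimension, then ReLU."""
    channels = [conv2d_same(matrix, kern) for kern in kernels]
    rows, cols = len(matrix), len(matrix[0])
    pooled = [
        [sum(ch[r][c] for ch in channels) / len(channels) for c in range(cols)]
        for r in range(rows)
    ]
    return [[max(0.0, x) for x in row] for row in pooled]


IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # hypothetical kernel for the demo
features = cnn_layer(arrange_2d([[1, -2], [3, 4]]), [IDENTITY])
```

With the identity kernel the convolution leaves the matrix unchanged, so the only visible effect in this demo is the ReLU clamping the negative entry to zero.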

Then, each port information semantic understanding feature vector is taken as a query feature vector, and the matrix product between it and the port information semantic association feature matrix is calculated to obtain a plurality of classification feature vectors. In essence, each product expresses the similarity and association between the usage information of one port and the global features of the usage information of all ports of the target host. That is, each classification feature vector represents the correlation and similarity between one port information semantic understanding feature vector and the port information semantic association feature matrix, reflects the semantic associations between different ports, and contains rich information for classification and prediction.
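The query step amounts to a vector-matrix product per port. In the sketch below, each query vector has length d and the association matrix is assumed to have shape d x m so that the product is defined; the application does not state the matrix shape, so this is an assumption of the example.

```python
def classification_vectors(port_vectors, assoc_matrix):
    """Multiply each 1 x d query vector by the d x m association matrix."""
    d = len(assoc_matrix)
    m = len(assoc_matrix[0])
    result = []
    for vec in port_vectors:
        if len(vec) != d:
            raise ValueError("query vector length must match matrix row count")
        result.append(
            [sum(vec[i] * assoc_matrix[i][j] for i in range(d)) for j in range(m)]
        )
    return result


queries = [[1, 0], [0, 2]]   # two hypothetical port feature vectors
matrix = [[1, 2], [3, 4]]    # hypothetical association feature matrix
class_vecs = classification_vectors(queries, matrix)
```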

Further, the plurality of classification feature vectors are each passed through a classifier to obtain a plurality of probability values. That is, the classifier converts each classification feature vector into a probability value, and classification and prediction are performed according to these probability values. For example, a probability value may be taken as the predicted likelihood that the target host has a given port open: if the probability exceeds a predetermined threshold, the target host is considered likely to have the port open; otherwise, it is considered not to have the port open. In this way, the port usage of the target host can be predicted and judged, and corresponding precautionary measures can be taken.
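A classifier of this kind can be sketched as one fully connected layer followed by Softmax, with a threshold on the resulting probability. The two-class output (open / not open), the weights, and the 0.5 threshold are assumptions for illustration; the application mentions a plurality of fully connected layers and only a "predetermined threshold".

```python
import math


def softmax(logits):
    """Numerically stable Softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_vec, weights, bias):
    """One fully connected layer followed by Softmax; returns class probabilities."""
    logits = [
        sum(w * x for w, x in zip(row, feature_vec)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def is_probably_open(p_open, threshold=0.5):
    """Apply the predetermined threshold to the 'open' probability."""
    return p_open > threshold


# Hypothetical weights: class 0 = "open", class 1 = "not open".
probs = classify([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```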

Finally, the port corresponding to the maximum of the plurality of probability values is recommended. In this way, by analyzing and mining the network traffic data of the target host, the range of ports the target host is likely to have open can be determined quickly, which reduces the time and resources spent on connection attempts and improves the efficiency of network connection.
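The recommendation step reduces to taking the port with the largest probability, after which a connection can be attempted. The sketch below uses Python's standard `socket.create_connection` for the attempt; the function names, the dictionary layout and the one-second timeout are assumptions, and the connection helper is shown for illustration only.

```python
import socket


def recommend_port(probabilities):
    """Return the port with the highest predicted probability.

    `probabilities` maps port number -> probability (hypothetical layout).
    """
    if not probabilities:
        raise ValueError("no probability values to choose from")
    return max(probabilities, key=probabilities.get)


def try_connect(host, port, timeout=1.0):
    """Attempt a TCP connection to the recommended port; True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


best = recommend_port({22: 0.10, 80: 0.70, 443: 0.20})
```

Trying the single most probable port first, rather than scanning every port in turn, is what saves the connection attempts described above.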

In particular, in the technical solution of the present application, the classification feature vector is obtained by taking each port information semantic understanding feature vector as a query feature vector and calculating its matrix product with the port information semantic association feature matrix. Each port information semantic understanding feature vector is obtained by semantic encoding of the usage information of one port of the target host, whereas the port information semantic association feature matrix is obtained by further convolutional encoding of the two-dimensional feature matrix into which those vectors are arranged. The port information semantic understanding feature vectors and the port information semantic association feature matrix therefore differ in association order and feature depth. This may make the feature distribution of the classification feature vector discontinuous, that is, the feature distribution may contain local distribution gaps, which affects the training speed of the model.

Based on this, the applicant of the present application performs Gumbel normal periodic re-parameterization on the classification feature vector V to obtain an optimized classification feature vector V′, specifically expressed as:

where μ and σ are respectively the mean and the variance of the feature value set v_i ∈ V, and v_i′ ∈ V′.

Here, the Gumbel normal periodic re-parameterization converts the feature value v_i at each position of the classification feature vector V into an angular feature expression of a probability distribution and, through the random periodic operation of the Gumbel distribution, introduces a random periodic distribution into the normal distribution of the feature value set. This yields a periodic, continuously differentiable approximation of the original feature distribution that carries randomness. When the optimized classification feature vector V′ is used in training, the periodic re-parameterization of the features improves the capacity of the loss-function gradient for dynamic, continuous fluctuation during back-propagation through the model. This in turn improves the dynamic adaptability of the context encoder comprising the word embedding layer and of the convolutional neural network model during training, and thereby compensates for the effect that local discontinuities in the feature distribution of the classification feature vector have on training speed.

Fig. 1 is a schematic diagram of an application scenario of a network connection method based on data derivation and port prediction according to an embodiment of the present application. As shown in Fig. 1, in this application scenario, network traffic data of a target host (e.g., C as illustrated in Fig. 1) is first acquired, the network traffic data comprising a source address, a target address and a protocol type. The acquired network traffic data is then input to a server (e.g., S as illustrated in Fig. 1) on which a network connection algorithm based on data derivation and port prediction is deployed. The server processes the network traffic data of the target host with this algorithm to generate a plurality of probability values, and recommends the port of the target host corresponding to the maximum probability value among the plurality of probability values.

Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.

In one embodiment of the present application, Fig. 2 is a flowchart of a network connection method based on data derivation and port prediction. As shown in Fig. 2, the network connection method 100 based on data derivation and port prediction according to an embodiment of the present application includes: 110, acquiring network traffic data of a target host, wherein the network traffic data comprises a source address, a target address and a protocol type; 120, preprocessing the network traffic data to obtain preprocessed network traffic data; 130, counting the usage information of each port of the target host based on the preprocessed network traffic data; 140, performing word segmentation on the usage information of each port of the target host, and then passing the result through a context encoder comprising a word embedding layer to obtain a plurality of port information semantic understanding feature vectors; 150, arranging the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix, and then passing it through a convolutional neural network model serving as a feature extractor to obtain a port information semantic association feature matrix; 160, taking each port information semantic understanding feature vector as a query feature vector, and calculating the product between it and the port information semantic association feature matrix to obtain a plurality of classification feature vectors; 170, passing each of the plurality of classification feature vectors through a classifier to obtain a plurality of probability values; and 180, recommending the port of the target host corresponding to the maximum probability value among the plurality of probability values.

Fig. 3 is a schematic diagram of the architecture of a network connection method based on data derivation and port prediction according to an embodiment of the present application. As shown in Fig. 3, in this architecture, network traffic data of a target host is first acquired, the network traffic data comprising a source address, a target address and a protocol type; the network traffic data is then preprocessed to obtain preprocessed network traffic data; next, the usage information of each port of the target host is counted based on the preprocessed network traffic data; the usage information of each port of the target host is then subjected to word segmentation and passed through a context encoder comprising a word embedding layer to obtain a plurality of port information semantic understanding feature vectors; the plurality of port information semantic understanding feature vectors are then arranged into a two-dimensional feature matrix, which is passed through a convolutional neural network model serving as a feature extractor to obtain a port information semantic association feature matrix; next, each port information semantic understanding feature vector is taken as a query feature vector, and the product between it and the port information semantic association feature matrix is calculated to obtain a plurality of classification feature vectors; the classification feature vectors are then each passed through a classifier to obtain a plurality of probability values; and finally, the port of the target host corresponding to the maximum probability value among the plurality of probability values is recommended.

Specifically, in step 110, network traffic data of the target host is obtained, where the network traffic data includes a source address, a target address, and a protocol type. To solve the technical problem described above, the optimal port is determined by a method based on data derivation and port prediction. By analyzing and mining the network traffic data of the target host, this method quickly determines the range of ports that the target host may have opened and attempts to establish connections only within that range, thereby reducing the time and resource consumption of connection attempts and improving the efficiency of network connection.

Specifically, in the technical scheme of the application, network traffic data of the target host, including a source address, a target address and a protocol type, is first obtained so that it can be analyzed and mined to identify the port usage and the corresponding protocol types.

In these data, the source address indicates the IP address of the sender, the target address indicates the IP address of the receiver, and the protocol type indicates the transport protocol used, such as TCP or UDP. Such data provide basic information about a network communication, such as the two communicating parties and the protocol type, and can be used to quickly determine the range of ports that the target host may have opened, so that connection attempts can be limited to that range. Meanwhile, the network traffic data of the target host also helps to detect network security threats, identify suspicious network activity or malicious software, and take corresponding security protection measures. Thus, acquiring network traffic data of the target host is an important basis for network connection and security management.

Specifically, in step 120, the network traffic data is preprocessed to obtain preprocessed network traffic data. The preprocessing and cleaning includes data deduplication, data filtering, data format conversion and the like, and serves to remove noise, redundant information and erroneous data, thereby improving the quality and accuracy of the data.

Network traffic data typically contains a large amount of noisy data and redundant information, such as duplicate traffic records, invalid packets and invalid protocols, which interfere with analysis and mining and lead to erroneous results. Therefore, preprocessing and cleaning first removes invalid and duplicate data, for example invalid data packets and redundant data records. Second, the data format is converted and normalized so that the data can be counted and analyzed more easily. Finally, the data is denoised and error-checked, for example by checking against the protocol specification whether each record conforms to the standard.

In a specific embodiment of the present application, preprocessing the network traffic data to obtain preprocessed network traffic data includes: performing data deduplication, data filtering and data format conversion on the network traffic data to obtain the preprocessed network traffic data.
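The three preprocessing operations can be illustrated with a minimal Python sketch. The record layout is an assumption made only for illustration: the application names a source address, a target address and a protocol type, while the `ts` (timestamp) and `dst_port` fields, the `VALID_PROTOCOLS` set and the function name `preprocess` are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical record layout; only src, dst and protocol are named in the text.
Record = Dict[str, object]

VALID_PROTOCOLS = {"TCP", "UDP"}  # assumed filter set, not from the source


def preprocess(raw_records: Iterable[Record]) -> List[Record]:
    """Deduplicate, filter and normalize raw traffic records."""
    seen = set()
    cleaned: List[Record] = []
    for rec in raw_records:
        # Format conversion: trim and upper-case strings, cast the port to int.
        try:
            src = str(rec["src"]).strip()
            dst = str(rec["dst"]).strip()
            proto = str(rec["protocol"]).strip().upper()
            port = int(rec["dst_port"])
        except (KeyError, ValueError, TypeError):
            continue  # filtering: drop malformed records
        # Filtering: drop unknown protocols and out-of-range ports.
        if proto not in VALID_PROTOCOLS or not 0 < port <= 65535:
            continue
        # Deduplication: an identical record (same timestamp) is kept once.
        key = (rec.get("ts"), src, dst, proto, port)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"ts": rec.get("ts"), "src": src, "dst": dst,
                        "protocol": proto, "dst_port": port})
    return cleaned
```

Including the timestamp in the deduplication key keeps repeated connections to the same port as separate records, so later frequency statistics are not flattened.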

Specifically, in step 130, based on the preprocessed network traffic data, the usage information of each port of the target host is counted. Counting the usage information of each port helps to understand the communication pattern and network behavior of the target host, so that the range of ports the target host may have opened can be predicted more accurately. From these statistics, the protocol type, service type, usage frequency and similar properties associated with each port can be learned, so that the meaning of the network traffic data is better understood and a prediction model can be built in a targeted manner.
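As a rough illustration of this statistics step, the sketch below counts how often each port appears and with which protocol, and renders each port's statistics as a short text line that a later word segmentation step could consume. The field names `dst_port` and `protocol`, the helper names, and the text format are assumptions for illustration, not part of the application.

```python
from collections import Counter, defaultdict


def port_usage(records):
    """Return {port: {"count": n, "protocols": Counter}} for the target host."""
    stats = defaultdict(lambda: {"count": 0, "protocols": Counter()})
    for rec in records:
        entry = stats[rec["dst_port"]]
        entry["count"] += 1                       # usage frequency
        entry["protocols"][rec["protocol"]] += 1  # protocol breakdown
    return dict(stats)


def usage_to_text(port, entry):
    """Render one port's statistics as text (assumed format) for tokenization."""
    proto, _ = entry["protocols"].most_common(1)[0]
    return f"port {port} protocol {proto} count {entry['count']}"
```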

Specifically, in step 140, the usage information of each port of the target host is subjected to word segmentation processing and then passed through a context encoder including a word embedding layer to obtain a plurality of port information semantic understanding feature vectors. This step maps the discrete port usage information into a continuous semantic space so that features can be extracted and their meaning understood more effectively. Through word segmentation and embedding, port usage records with similar meaning are converted into similar vector representations in a low-dimensional space, which facilitates subsequent processing and classification.

Fig. 4 is a flowchart of the sub-steps of step 140 in a network connection method based on data derivation and port prediction according to an embodiment of the present application. As shown in fig. 4, performing word segmentation processing on the usage information of each port of the target host and then obtaining a plurality of port information semantic understanding feature vectors through a context encoder including a word embedding layer includes: 141, performing word segmentation processing on the usage information of each port of the target host to convert the usage information of each port of the target host into a word sequence composed of a plurality of words; 142, mapping each word in the word sequence to a word vector using the embedding layer of the context encoder including the word embedding layer to obtain a sequence of word vectors; and 143, performing global-based context semantic encoding on the sequence of word vectors using the context encoder including the word embedding layer to obtain the plurality of port information semantic understanding feature vectors.

Further, fig. 5 is a flowchart of the sub-steps of step 143 in the network connection method based on data derivation and port prediction according to an embodiment of the present application. As shown in fig. 5, performing global-based context semantic encoding on the sequence of word vectors using the context encoder including the word embedding layer to obtain the plurality of port information semantic understanding feature vectors includes: 1431, arranging the sequence of word vectors one-dimensionally to obtain a global word vector; 1432, calculating the product between the global word vector and the transpose of each word vector in the sequence of word vectors to obtain a plurality of self-attention association matrices; 1433, normalizing each of the plurality of self-attention association matrices to obtain a plurality of normalized self-attention association matrices; 1434, passing each of the plurality of normalized self-attention association matrices through a Softmax classification function to obtain a plurality of probability values; and 1435, weighting each word vector in the sequence of word vectors by the corresponding probability value to obtain the plurality of port information semantic understanding feature vectors.
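Sub-steps 1431 to 1435 can be sketched in plain Python as follows. Two points are assumptions, since the text does not specify them: the normalization in 1433 is taken to be a scaling by 1/sqrt(d), and each normalized matrix is reduced to a single score by averaging its entries before the Softmax in 1434 is applied across words. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def encode(word_vectors):
    """Weight each word vector by a global self-attention style probability."""
    d = len(word_vectors[0])
    # 1431: arrange all word vectors into one global word vector.
    global_vec = [x for vec in word_vectors for x in vec]
    scores = []
    for vec in word_vectors:
        # 1432: product of the global vector with the transposed word vector.
        assoc = [[g * w for w in vec] for g in global_vec]
        # 1433: normalization, assumed here to be scaling by 1/sqrt(d).
        scale = 1.0 / math.sqrt(d)
        normed = [[a * scale for a in row] for row in assoc]
        # Assumed reduction: the mean matrix entry is the word's score.
        scores.append(sum(map(sum, normed)) / (len(normed) * d))
    # 1434: Softmax turns the scores into probability values.
    weights = softmax(scores)
    # 1435: weight each word vector by its probability value.
    encoded = [[w * x for x in vec] for w, vec in zip(weights, word_vectors)]
    return encoded, weights
```

With the toy input `[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]`, the third word vector receives the largest weight because it overlaps most with the global vector.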

The context encoder aims to mine hidden patterns between contexts in the word sequence. Optionally, the encoder may be a CNN (Convolutional Neural Network), a Recursive NN (Recursive Neural Network), a language model, or the like. CNN-based methods extract local features well but handle long-term dependencies in sentences poorly, so encoders based on Bi-LSTM (bidirectional Long Short-Term Memory) are widely used. A Recursive NN processes a sentence as a tree structure rather than a sequence and in theory has stronger representation capability, but it suffers from difficult sample annotation, vanishing gradients in deep structures, and poor parallelism, so it is seldom used in practice. The Transformer is a widely used network structure that combines characteristics of the CNN and the RNN (Recurrent Neural Network), extracts global features well, and has a certain advantage over the RNN in parallel computation.

Specifically, in step 150, the plurality of port information semantic understanding feature vectors are arranged into a two-dimensional feature matrix, which is then passed through a convolutional neural network model serving as a feature extractor to obtain a port information semantic association feature matrix.

That is, after the port information semantic understanding feature vectors of all the ports of the target host are arranged into a two-dimensional feature matrix at the data structure level, a convolutional neural network model, which excels at extracting local neighborhood correlation features, is used to extract the correlation pattern features of the usage information of all the ports in the high-dimensional semantic space. From another point of view, because the network traffic data contains a large amount of discrete port usage information that is difficult to process and classify directly, a model such as a neural network is needed to convert the discrete port information into a continuous numerical feature representation and to extract the semantic relevance between different ports.

Arranging the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix and then obtaining the port information semantic association feature matrix through the convolutional neural network model serving as a feature extractor includes: using each layer of the convolutional neural network model serving as the feature extractor to perform, in the forward pass of that layer, convolution processing, pooling processing along the channel dimension, and nonlinear activation processing on the input data, wherein the input of the first layer of the convolutional neural network model is the two-dimensional feature matrix and the output of the last layer of the convolutional neural network model is the port information semantic association feature matrix.
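The per-layer processing described above (convolution, pooling along the channel dimension, nonlinear activation) can be sketched for a single layer as follows. This is only an illustration under stated assumptions: the kernels are supplied by the caller rather than learned, the channel pooling is taken to be a mean, the activation is taken to be ReLU, and the convolution uses no padding, so the output is smaller than the input. The function names are hypothetical.

```python
def conv2d_valid(matrix, kernel):
    """2D cross-correlation without padding, as used in deep learning layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [
        [
            sum(matrix[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def cnn_layer(matrix, kernels):
    """One layer: convolution, mean pooling across channels, then ReLU."""
    # Convolution: each kernel produces one output channel.
    channels = [conv2d_valid(matrix, k) for k in kernels]
    h, w = len(channels[0]), len(channels[0][0])
    # Pooling along the channel dimension (assumed to be a mean).
    pooled = [
        [sum(ch[i][j] for ch in channels) / len(channels) for j in range(w)]
        for i in range(h)
    ]
    # Nonlinear activation (assumed to be ReLU).
    return [[max(0.0, x) for x in row] for row in pooled]
```

In a real model the kernel shapes and padding would be chosen so that the output dimensions stay compatible with the query vectors used in the following step.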

The convolutional neural network (CNN) is an artificial neural network that is widely used in fields such as image recognition. A convolutional neural network may include an input layer, hidden layers, and an output layer, where the hidden layers may include convolutional layers, pooling layers, activation layers, fully connected layers and the like. Each layer performs its operation on the data it receives and passes the result to the next layer, so that the initial input yields the final result after passing through multiple layers.

The convolutional neural network model has excellent performance in the aspect of image local feature extraction by taking a convolutional kernel as a feature filtering factor, and has stronger feature extraction generalization capability and fitting capability compared with the traditional image feature extraction algorithm based on statistics or feature engineering.

Specifically, in step 160, each port information semantic understanding feature vector is taken as a query feature vector, and the matrix product between the query feature vector and the port information semantic association feature matrix is calculated to obtain a plurality of classification feature vectors.

Here, taking each port information semantic understanding feature vector as a query feature vector and calculating its matrix product with the port information semantic association feature matrix essentially measures the similarity and association between the usage information of each individual port and the global features of the usage information of all ports of the target host. That is, each classification feature vector represents the correlation and similarity between the corresponding port information semantic understanding feature vector and the port information semantic association feature matrix, reflects the semantic association between different ports, and contains rich information for classification and prediction.

In one embodiment of the application, the product between each port information semantic understanding feature vector and the port information semantic association feature matrix is calculated by the following product formula to obtain the plurality of classification feature vectors; wherein the product formula is:

$V_m = M_1 \otimes V_i$

wherein $M_1$ represents the port information semantic association feature matrix, $V_i$ represents each port information semantic understanding feature vector, $V_m$ represents the plurality of classification feature vectors, and $\otimes$ represents vector multiplication.
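A minimal sketch of this query step is given below. It assumes the product is an ordinary matrix-by-column-vector multiplication, with the association matrix on the left and the query vector on the right; the text does not fix the ordering, and the function names are hypothetical.

```python
def mat_vec(matrix, vec):
    """Multiply an (m x n) matrix by a length-n vector, giving a length-m vector."""
    if any(len(row) != len(vec) for row in matrix):
        raise ValueError("each matrix row must have the same length as the vector")
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def classification_vectors(assoc_matrix, query_vectors):
    """One classification feature vector per port (one per query vector)."""
    return [mat_vec(assoc_matrix, q) for q in query_vectors]
```

Each port contributes one query vector, so the number of classification feature vectors equals the number of ports being scored.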

Specifically, in step 170 and step 180, the plurality of classification feature vectors are respectively passed through a classifier to obtain a plurality of probability values, and the port of the target host corresponding to the maximum probability value among the plurality of probability values is recommended.

That is, the classifier converts each classification feature vector into a probability value, and classification and prediction are performed according to these probability values. For example, each probability value may be interpreted as the likelihood that the target host has opened the corresponding port: if the probability exceeds a predetermined threshold, the target host is considered likely to have opened the port; otherwise the target host is considered not to have opened it. In this way, the port usage of the target host can be predicted and judged, and corresponding precautionary measures can be taken.
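The threshold rule can be illustrated with a short sketch that keeps the ports whose predicted probability exceeds a threshold and orders them so that the most likely port is tried first. The function name `candidate_ports` and the default threshold of 0.5 are assumptions for illustration; the application only speaks of a predetermined threshold.

```python
def candidate_ports(port_probs, threshold=0.5):
    """Return ports with probability above threshold, most likely first.

    port_probs maps a port number to its predicted open probability.
    """
    likely = [(port, p) for port, p in port_probs.items() if p > threshold]
    likely.sort(key=lambda item: item[1], reverse=True)
    return [port for port, _ in likely]
```

The first element of the returned list corresponds to the recommended port of step 180, and the remaining elements give a reduced range for further connection attempts.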

Fig. 6 is a flowchart of the sub-steps of step 170 in a network connection method based on data derivation and port prediction according to an embodiment of the present application. As shown in fig. 6, respectively passing the plurality of classification feature vectors through a classifier to obtain a plurality of probability values includes: 171, performing full-connection encoding on the plurality of classification feature vectors using a plurality of fully connected layers of the classifier to obtain a plurality of encoded classification feature vectors; and 172, passing the plurality of encoded classification feature vectors through a Softmax classification function of the classifier to obtain the plurality of probability values.

In a specific example of the present application, the classifier processes the plurality of classification feature vectors to generate a plurality of probability values according to a classification formula:

$\mathrm{softmax}\{(W_n, B_n) : \cdots : (W_1, B_1) \mid X_i\}$, wherein $X_i$ represents each of the plurality of classification feature vectors, $W_1$ to $W_n$ are weight matrices, and $B_1$ to $B_n$ are bias matrices.
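The classifier and the final recommendation can be sketched as a stack of fully connected layers followed by Softmax. Several details are assumptions made for illustration: a ReLU is placed between hidden layers, the last layer emits two logits (not open, open), and the probability value of a port is taken to be the second Softmax output. The function names are hypothetical.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: weights is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(x, layers):
    """Apply (W_1, B_1) ... (W_n, B_n) in order, then Softmax; return P(open)."""
    for idx, (w, b) in enumerate(layers):
        x = dense(x, w, b)
        if idx < len(layers) - 1:
            x = [max(0.0, v) for v in x]  # assumed ReLU between hidden layers
    return softmax(x)[1]


def recommend(ports, vectors, layers):
    """Return the port with the maximum probability value, plus all values."""
    probs = [classify(v, layers) for v in vectors]
    best = max(range(len(ports)), key=probs.__getitem__)
    return ports[best], probs
```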

Further, the network connection method based on data derivation and port prediction further comprises a training step 190 of training the context encoder including the word embedding layer, the convolutional neural network model serving as the feature extractor, and the classifier. Fig. 7 is a flowchart of the sub-steps of step 190 in the network connection method based on data derivation and port prediction according to an embodiment of the present application. As shown in fig. 7, the training step 190 includes: 191, acquiring training data of a target host, wherein the training data comprises a source address, a target address and a protocol type; 192, preprocessing the training data to obtain preprocessed training network traffic data; 193, based on the preprocessed training network traffic data, counting training usage information of each port of the target host; 194, performing word segmentation processing on the training usage information of each port of the target host, and then obtaining a plurality of training port information semantic understanding feature vectors through the context encoder comprising the word embedding layer; 195, arranging the plurality of training port information semantic understanding feature vectors into a training two-dimensional feature matrix, and then obtaining a training port information semantic association feature matrix through the convolutional neural network model serving as the feature extractor; 196, taking each training port information semantic understanding feature vector as a query feature vector, calculating the product between the query feature vector and the training port information semantic association feature matrix to obtain a plurality of training classification feature vectors; 197, performing Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors to obtain a plurality of optimized training classification feature vectors; 198, passing the plurality of optimized training classification feature vectors through the classifier to obtain a plurality of classification loss function values; and 199, training the context encoder including the word embedding layer, the convolutional neural network model serving as the feature extractor, and the classifier based on the plurality of classification loss function values by back propagation in the direction of gradient descent.
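The idea of computing a classification loss and updating parameters in the direction of gradient descent can be illustrated on a deliberately small stand-in model. The sketch below trains a single logistic unit with a cross-entropy loss; it does not reproduce the full encoder, convolutional network and classifier, it omits the re-parameterization of the training classification feature vectors, and the choice of cross-entropy, the learning rate and the function names are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(w, b, x, y, lr=0.5):
    """One gradient descent step on the cross-entropy loss of one sample."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = p - y  # derivative of the loss with respect to the logit
    new_w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
    new_b = b - lr * grad
    return new_w, new_b, loss


def train(samples, dim, epochs=50, lr=0.5):
    """Train on (feature_vector, label) pairs; return weights and loss history."""
    w, b = [0.0] * dim, 0.0
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, y in samples:
            w, b, loss = train_step(w, b, x, y, lr)
            total += loss
        history.append(total / len(samples))
    return w, b, history
```

Here a label of 1 stands for a port that was actually open and 0 for a port that was not; the mean loss in `history` should fall as training proceeds.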

To this end, during training the applicant of the present application performs Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors $V$ to obtain a plurality of optimized training classification feature vectors $V'$, using the following optimization formula; wherein the optimization formula is:

wherein $v_i$ represents the feature value at each position of the plurality of training classification feature vectors, $\mu$ and $\sigma$ are respectively the mean and the variance of the set of feature values at all positions of the plurality of training classification feature vectors, $\log$ represents the base-2 logarithm, $\arcsin(\cdot)$ represents the arcsine function, $\arccos(\cdot)$ represents the arccosine function, and $v_i'$ represents the feature value at each position of the plurality of optimized training classification feature vectors.

Here, the Gumbel normal periodic re-parameterization converts the feature value $v_i$ at each position of the plurality of training classification feature vectors $V$ into an angular feature representation of its probability distribution, so that a random periodic operation based on the Gumbel distribution introduces a random periodic distribution into the normal distribution of the feature value set. This yields a periodic, continuously differentiable approximation of the original feature distribution. Through this periodic re-parameterization of the features, the gradient of the loss function propagates backward through the model more continuously during training, which improves the dynamic adaptability of the context encoder containing the word embedding layer and of the convolutional neural network model in the training process, and compensates for the influence of local discontinuities in the feature distribution of the classification feature vectors on the training speed.

In summary, the network connection method 100 based on data derivation and port prediction according to an embodiment of the present application has been illustrated. The method obtains network traffic data of a target host, where the network traffic data includes a source address, a target address, and a protocol type, and uses deep learning based artificial intelligence technology to analyze and mine the network traffic data, so that the range of ports the target host may have opened is quickly determined and connections are attempted only within that range. This reduces the time and resource consumption of connection attempts and improves the efficiency of network connection.

In one embodiment of the present application, fig. 8 is a block diagram of a network connection system based on data derivation and port prediction according to an embodiment of the present application. As shown in fig. 8, a network connection system 200 based on data derivation and port prediction according to an embodiment of the present application includes: a data acquisition module 210, configured to acquire network traffic data of a target host, where the network traffic data includes a source address, a target address, and a protocol type; a preprocessing module 220, configured to preprocess the network traffic data to obtain preprocessed network traffic data; a usage information statistics module 230, configured to count usage information of each port of the target host based on the preprocessed network traffic data; a context encoding module 240, configured to perform word segmentation processing on the usage information of each port of the target host and then obtain a plurality of port information semantic understanding feature vectors through a context encoder including a word embedding layer; a feature extraction module 250, configured to arrange the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix and then obtain a port information semantic association feature matrix through a convolutional neural network model serving as a feature extractor; a classification feature vector calculation module 260, configured to take each port information semantic understanding feature vector as a query feature vector and calculate the product between the query feature vector and the port information semantic association feature matrix to obtain a plurality of classification feature vectors; a probability value generating module 270, configured to pass the plurality of classification feature vectors through a classifier to obtain a plurality of probability values; and a recommending module 280, configured to recommend the port of the target host corresponding to the maximum probability value among the plurality of probability values.

In a specific example, in the above network connection system based on data derivation and port prediction, the preprocessing module is configured to: perform data deduplication, data filtering and data format conversion on the network traffic data to obtain the preprocessed network traffic data.

In a specific example, in the above network connection system based on data derivation and port prediction, the context encoding module includes: the word segmentation processing unit is used for carrying out word segmentation processing on the use information of each port of the target host so as to convert the use information of each port of the target host into a word sequence consisting of a plurality of words; a word mapping unit, configured to map each word in the word sequence to a word vector using an embedding layer of the context encoder including a word embedding layer to obtain a sequence of word vectors; and an encoding unit, configured to perform global-based context semantic encoding on the sequence of word vectors using the context encoder including the word embedding layer to obtain the plurality of port information semantic understanding feature vectors.

In a specific example, in the above network connection system based on data derivation and port prediction, the encoding unit includes: a one-dimensional arrangement subunit, configured to perform one-dimensional arrangement on the sequence of word vectors to obtain a global word vector; a self-attention subunit, configured to calculate a product between the global word vector and a transpose vector of each word vector in the sequence of word vectors to obtain a plurality of self-attention association matrices; the normalization subunit is used for respectively performing normalization processing on each self-attention correlation matrix in the plurality of self-attention correlation matrices to obtain a plurality of normalized self-attention correlation matrices; the classification function calculation subunit is used for obtaining a plurality of probability values through a Softmax classification function by each normalized self-attention correlation matrix in the normalized self-attention correlation matrices; and the weighting subunit is used for weighting each word vector in the sequence of the word vectors by taking each probability value in the plurality of probability values as a weight so as to obtain the plurality of port information semantic understanding feature vectors.

In a specific example, in the above network connection system based on data derivation and port prediction, the feature extraction module is configured to: use each layer of the convolutional neural network model serving as the feature extractor to perform, in the forward pass of that layer, convolution processing, pooling processing along the channel dimension, and nonlinear activation processing on the input data, wherein the input of the first layer of the convolutional neural network model is the two-dimensional feature matrix and the output of the last layer of the convolutional neural network model is the port information semantic association feature matrix.

In a specific example, in the above network connection system based on data derivation and port prediction, the probability value generating module includes: the full-connection coding unit is used for performing full-connection coding on the plurality of classification feature vectors by using a plurality of full-connection layers of the classifier to obtain a plurality of coding classification feature vectors; and a classification result unit, configured to pass the plurality of encoded classification feature vectors through a Softmax classification function of the classifier to obtain the plurality of probability values.

In a specific example, in the network connection system based on data derivation and port prediction, the system further includes a training module for training the context encoder including the word embedding layer, the convolutional neural network model as the feature extractor, and the classifier.

In a specific example, in the above network connection system based on data derivation and port prediction, the training module includes: a training data acquisition unit, configured to acquire training data of the target host, wherein the training data comprises a source address, a target address and a protocol type; a training preprocessing unit, configured to preprocess the training data to obtain preprocessed training network traffic data; a training usage information statistics unit, configured to count the training usage information of each port of the target host based on the preprocessed training network traffic data; a training context encoding unit, configured to perform word segmentation processing on the training usage information of each port of the target host and then obtain a plurality of training port information semantic understanding feature vectors through the context encoder comprising the word embedding layer; a training feature extraction unit, configured to arrange the plurality of training port information semantic understanding feature vectors into a training two-dimensional feature matrix and obtain a training port information semantic association feature matrix through the convolutional neural network model serving as the feature extractor; a training classification feature vector calculation unit, configured to take each training port information semantic understanding feature vector as a query feature vector and calculate the product between the query feature vector and the training port information semantic association feature matrix to obtain a plurality of training classification feature vectors; a training optimization unit, configured to perform Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors to obtain a plurality of optimized training classification feature vectors; a classification loss function value calculation unit, configured to pass the plurality of optimized training classification feature vectors through the classifier to obtain a plurality of classification loss function values; and a training unit, configured to train the context encoder including the word embedding layer, the convolutional neural network model serving as the feature extractor, and the classifier based on the plurality of classification loss function values by back propagation in the direction of gradient descent.

In a specific example, in the above network connection system based on data derivation and port prediction, the training optimization unit is configured to: perform Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors using the following optimization formula to obtain a plurality of optimized training classification feature vectors; wherein the optimization formula is:

Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the above-described data-derivation-and-port-prediction-based network connection system have been described in detail in the above description of the data-derivation-and-port-prediction-based network connection method with reference to fig. 1 to 7, and thus, repetitive descriptions thereof will be omitted.

As described above, the data derivation and port prediction based network connection system 200 according to the embodiment of the present application may be implemented in various terminal devices, such as a server or the like for data derivation and port prediction based network connection. In one example, the data derivation and port prediction based network connection system 200 according to embodiments of the present application may be integrated into a terminal device as a software module and/or hardware module. For example, the data derivation and port prediction based network connection system 200 may be a software module in the operating system of the terminal device or may be an application developed for the terminal device; of course, the network connection system 200 based on data derivation and port prediction can also be one of a plurality of hardware modules of the terminal device.

Alternatively, in another example, the data-derivation and port-prediction based network connection system 200 and the terminal device may be separate devices, and the data-derivation and port-prediction based network connection system 200 may be connected to the terminal device through a wired and/or wireless network, and transmit the interworking information according to an agreed data format.

The present application also provides a computer program product comprising instructions which, when executed, cause an apparatus to perform operations corresponding to the above-described method.

In one embodiment of the present application, there is also provided a computer-readable storage medium storing a computer program for executing the above-described method.

It should be appreciated that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

The methods, systems, and computer program products of embodiments of the present application are described with reference to flowcharts and/or block diagrams. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not intended to be limiting, and these advantages, benefits, effects, etc. are not to be considered as essential to the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not necessarily limited to practice with the above described specific details.

The block diagrams of the devices, apparatuses, equipment and systems referred to in the present application are only illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, equipment and systems may be connected, arranged and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including but not limited to," and are used interchangeably therewith. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."

It is also noted that in the apparatuses, devices and methods of the present application, the components or steps may be decomposed and/or recombined. Such decomposition and/or recombination should be considered as equivalent aspects of the present application.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device comprising the element.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims

1. A method for network connection based on data derivation and port prediction, comprising:

2. The data derivation and port prediction based network connection method of claim 1 wherein preprocessing the network traffic data to obtain preprocessed network traffic data comprises:

performing data deduplication, data filtering and data format conversion on the network traffic data to obtain the preprocessed network traffic data.
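
The three preprocessing operations named in claim 2 can be sketched as follows. This is a minimal illustration only: the record layout, the `ALLOWED_PROTOCOLS` filter rule, and the dictionary output format are assumptions made for the example and are not specified in the application.

```python
from typing import Iterable

# Hypothetical record layout: (source address, target address, protocol type).
Record = tuple[str, str, str]

# Assumption for illustration: only these protocol types are kept by the filter.
ALLOWED_PROTOCOLS = {"TCP", "UDP"}


def preprocess(records: Iterable[Record]) -> list[dict[str, str]]:
    """Deduplicate, filter, and convert raw traffic records to dictionaries."""
    seen: set[Record] = set()
    cleaned: list[dict[str, str]] = []
    for src, dst, proto in records:
        key = (src.strip(), dst.strip(), proto.strip().upper())
        if key in seen:  # data deduplication
            continue
        seen.add(key)
        if key[2] not in ALLOWED_PROTOCOLS:  # data filtering
            continue
        # data format conversion: tuple -> uniform dictionary
        cleaned.append({"src": key[0], "dst": key[1], "protocol": key[2]})
    return cleaned


raw = [
    ("10.0.0.1", "10.0.0.9", "tcp"),
    ("10.0.0.1", "10.0.0.9", "TCP"),   # duplicate after case normalization
    ("10.0.0.2", "10.0.0.9", "icmp"),  # removed by the assumed filter rule
]
result = preprocess(raw)
```

With the sample input, only the first record survives, because the second is a duplicate and the third uses a protocol outside the assumed allow-list.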

3. The method for data derivation and port prediction based network connection of claim 2, wherein performing word segmentation on the usage information of each port of the target host and then obtaining a plurality of port information semantic understanding feature vectors through a context encoder including a word embedding layer comprises:

performing word segmentation on the usage information of each port of the target host so as to convert the usage information of each port of the target host into a word sequence composed of a plurality of words;

mapping each word in the word sequence to a word vector using the embedding layer of the context encoder including the word embedding layer to obtain a sequence of word vectors; and

performing global context semantic encoding on the sequence of word vectors by using the context encoder including the word embedding layer to obtain the plurality of port information semantic understanding feature vectors.
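
The first two steps of claim 3 (word segmentation and embedding lookup) can be sketched as below. The usage-information string, the whitespace tokenizer, and the randomly initialized embedding table are all assumptions for illustration; the application does not specify the text format of the usage information, the segmentation tool, or how the embedding layer is initialized.

```python
import random


def segment(usage: str) -> list[str]:
    """Split usage information into words (assumed: whitespace tokenization)."""
    return usage.lower().split()


def build_embedding(vocab: set[str], dim: int, seed: int = 0) -> dict[str, list[float]]:
    """Create a toy embedding table; a trained embedding layer would replace this."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in sorted(vocab)}


# Hypothetical usage information for one port.
usage = "port 443 protocol tcp connections 120"
words = segment(usage)                          # word sequence
table = build_embedding(set(words), dim=4)      # word embedding layer (toy)
word_vectors = [table[word] for word in words]  # sequence of word vectors
```

Each word in the sequence is mapped to a fixed-length vector, which is the input expected by the context encoding step.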

4. The data-derivation and port-prediction-based network connection method of claim 3, wherein performing global context semantic encoding on the sequence of word vectors by using the context encoder including the word embedding layer to obtain the plurality of port information semantic understanding feature vectors comprises:

arranging the sequence of word vectors one-dimensionally to obtain a global word vector;

calculating the product between the global word vector and the transpose vector of each word vector in the sequence of word vectors to obtain a plurality of self-attention association matrices;

standardizing each self-attention association matrix in the plurality of self-attention association matrices to obtain a plurality of standardized self-attention association matrices;

passing each standardized self-attention association matrix in the plurality of standardized self-attention association matrices through a Softmax classification function to obtain a plurality of probability values; and

weighting each word vector in the sequence of word vectors by taking each probability value in the plurality of probability values as a weight so as to obtain the plurality of port information semantic understanding feature vectors.
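
One possible reading of the steps of claim 4 is sketched below. The claim does not state how each association matrix is standardized or how a matrix is reduced to a single probability value, so two assumptions are made here and marked in the comments: standardization is taken to be scaling by the square root of the vector dimension, and each matrix is reduced to its mean before Softmax is applied across words.

```python
import math


def attention_weighting(word_vectors: list[list[float]]):
    """Weight each word vector by a Softmax probability derived from its
    association matrix with the global word vector."""
    dim = len(word_vectors[0])
    # One-dimensional arrangement of all word vectors: the global word vector.
    global_vec = [x for vec in word_vectors for x in vec]

    scores = []
    for vec in word_vectors:
        # Product of the global vector and the transposed word vector:
        # an (n*dim) x dim self-attention association matrix.
        matrix = [[g * w for w in vec] for g in global_vec]
        # Assumption: "standardization" is scaling by sqrt(dim).
        scaled = [[m / math.sqrt(dim) for m in row] for row in matrix]
        # Assumption: each matrix is reduced to one score by taking its mean.
        scores.append(sum(sum(row) for row in scaled) / (len(scaled) * dim))

    # Softmax over the per-word scores gives one probability value per word.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Weight each word vector by its probability value.
    weighted = [[p * x for x in vec] for p, vec in zip(probs, word_vectors)]
    return probs, weighted


probs, weighted = attention_weighting([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In this toy input the third word vector has the largest association with the global vector, so it receives the largest weight.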

5. The method for data derivation and port prediction based network connection of claim 4, wherein arranging the plurality of port information semantic understanding feature vectors into a two-dimensional feature matrix and then passing it through a convolutional neural network model serving as a feature extractor to obtain a port information semantic association feature matrix comprises: using each layer of the convolutional neural network model serving as the feature extractor to perform, in the forward pass of that layer, convolution processing, pooling processing along the channel dimension, and nonlinear activation processing on its input data, wherein the output of the last layer of the convolutional neural network model serving as the feature extractor is the port information semantic association feature matrix, and the input of the first layer of the convolutional neural network model serving as the feature extractor is the two-dimensional feature matrix.
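
A single layer of the kind described in claim 5 can be sketched as follows. The kernel values, the use of mean pooling across channels, and ReLU as the nonlinear activation are assumptions for illustration; the application does not fix these choices.

```python
def conv2d_valid(matrix, kernel):
    """Plain 2-D convolution (cross-correlation) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    rows = len(matrix) - kh + 1
    cols = len(matrix[0]) - kw + 1
    return [
        [
            sum(matrix[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def cnn_layer(matrix, kernels):
    """Convolution, pooling along the channel dimension, then activation."""
    # Convolution processing: one output channel per kernel.
    channels = [conv2d_valid(matrix, kernel) for kernel in kernels]
    rows, cols = len(channels[0]), len(channels[0][0])
    # Pooling along the channel dimension (assumed: mean over channels).
    pooled = [
        [sum(ch[r][c] for ch in channels) / len(channels) for c in range(cols)]
        for r in range(rows)
    ]
    # Nonlinear activation (assumed: ReLU).
    return [[max(0.0, value) for value in row] for row in pooled]


feature_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]    # toy two-dimensional feature matrix
kernels = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]         # two hypothetical 2x2 kernels
output = cnn_layer(feature_matrix, kernels)            # [[6.0, 8.0], [12.0, 14.0]]
```

Stacking several such layers, with the output of one layer fed as the input of the next, gives the feature extractor structure the claim describes.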

6. The method for data derivation and port prediction based network connection of claim 5, wherein passing the plurality of classification feature vectors through a classifier to obtain a plurality of probability values, respectively, comprises:

performing fully connected encoding on the plurality of classification feature vectors by using a plurality of fully connected layers of the classifier to obtain a plurality of encoded classification feature vectors; and

passing the plurality of encoded classification feature vectors through a Softmax classification function of the classifier to obtain the plurality of probability values.
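
The classifier of claim 6 can be sketched as a stack of fully connected layers followed by Softmax. The layer sizes, the weight values, and the two-class output (for example, port open versus not open) are assumptions for illustration only.

```python
import math


def dense(vec, weights, bias):
    """One fully connected layer: weights @ vec + bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable Softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(vec, layers):
    """Fully connected encoding through every layer, then Softmax."""
    encoded = vec
    for weights, bias in layers:
        encoded = dense(encoded, weights, bias)
    return softmax(encoded)


# Two hypothetical fully connected layers with hand-picked weights.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
probabilities = classify([2.0, 1.0], layers)
```

Applying `classify` to each classification feature vector in turn yields the plurality of probability values, one distribution per port.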

7. The data-deriving-and-port-prediction-based network connection method according to claim 6, further comprising the training step of: training the context encoder including the word embedding layer, the convolutional neural network model as a feature extractor, and the classifier.

8. The data derivation and port prediction based network connection method of claim 7, wherein the training step comprises:

acquiring training data of a target host, wherein the training data comprises a source address, a target address and a protocol type;

preprocessing the training data to obtain preprocessed training network traffic data;

counting the training usage information of each port of the target host based on the preprocessed training network traffic data;

performing word segmentation on the training usage information of each port of the target host and then obtaining a plurality of training port information semantic understanding feature vectors through a context encoder comprising a word embedding layer;

arranging the plurality of training port information semantic understanding feature vectors into a training two-dimensional feature matrix, and then obtaining a training port information semantic association feature matrix through a convolutional neural network model serving as a feature extractor;

taking each training port information semantic understanding feature vector as a query feature vector, and calculating the product between the query feature vector and the training port information semantic association feature matrix to obtain a plurality of training classification feature vectors;

performing Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors to obtain a plurality of optimized training classification feature vectors;

passing the plurality of optimized training classification feature vectors through a classifier to obtain a plurality of classification loss function values; and

training the context encoder comprising the word embedding layer, the convolutional neural network model serving as the feature extractor, and the classifier based on the plurality of classification loss function values, by propagation in the direction of gradient descent.
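
The last two steps of claim 8 (computing a classification loss and updating parameters in the direction of gradient descent) can be sketched as below. For brevity the sketch trains only a single-layer Softmax classifier with a cross-entropy loss, which is an assumption; the claimed method trains the context encoder and the convolutional neural network jointly with the classifier, and the re-parameterization step is omitted here because its formula is not reproduced in this text.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(weights, vec, label, lr):
    """One gradient-descent update; returns the loss before the update."""
    logits = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    probs = softmax(logits)
    loss = -math.log(probs[label])  # cross-entropy classification loss (assumed)
    for k, row in enumerate(weights):
        # Gradient of the loss with respect to logit k is probs[k] - one_hot[k].
        grad_logit = probs[k] - (1.0 if k == label else 0.0)
        for j in range(len(row)):
            row[j] -= lr * grad_logit * vec[j]  # step against the gradient
    return loss


weights = [[0.0, 0.0], [0.0, 0.0]]  # toy classifier parameters
losses = [train_step(weights, [1.0, 2.0], label=1, lr=0.1) for _ in range(5)]
```

The recorded loss falls from one step to the next, which is the behavior expected when the parameters move in the direction of gradient descent.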

9. The data-derivation and port-prediction based network connection method of claim 8, wherein performing Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors to obtain a plurality of optimized training classification feature vectors comprises: performing Gumbel normal periodic re-parameterization on the plurality of training classification feature vectors by using the following optimization formula to obtain the plurality of optimized training classification feature vectors;

Wherein, the optimization formula is:

wherein v_i represents the feature value at each position of the plurality of training classification feature vectors, μ and σ are the mean and variance of the set of feature values at each position of the plurality of training classification feature vectors, log represents the base-2 logarithmic function, arcsin(·) represents the arcsine function, arccos(·) represents the arccosine function, and v_i' represents the feature value at each position of the plurality of optimized training classification feature vectors.