CN115695025A - Training method and device of network security situation prediction model - Google Patents

Training method and device of network security situation prediction model

Info

Publication number
CN115695025A
Authority
CN
China
Prior art keywords
network
situation
model
network security
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211378514.1A
Other languages
Chinese (zh)
Inventor
葛康康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202211378514.1A priority Critical patent/CN115695025A/en
Publication of CN115695025A publication Critical patent/CN115695025A/en
Pending legal-status Critical Current

Abstract

The application provides a training method and device for a network security situation prediction model. The method comprises the following steps: generating model training samples according to acquired network security status data of electronic equipment; inputting the model training samples into a network security situation prediction model to be trained; calling a convolutional network layer to process the model training samples to obtain a network situation feature map of the model training samples associated in the time dimension; calling an attention mechanism layer to process the network situation features in the network situation feature map to obtain attention enhancement features; calling a deep neural network layer to process the attention enhancement features to obtain predicted network situation labels corresponding to the model training samples; calculating a loss value of the network security situation prediction model to be trained based on the predicted network situation labels and the real network situation labels; and, when the loss value is within a preset range, taking the trained network security situation prediction model as the final network security situation prediction model. The method and the device can improve network security.

Description

Training method and device of network security situation prediction model
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a training method and a training device for a network security situation prediction model.
Background
With the large-scale application of new technologies such as cloud computing, big data, the Internet of Things and artificial intelligence in the domestic Internet, network security events are increasing day by day.
Faced with increasingly complicated and diversified network attacks, traditional protection measures such as firewalls, access control, vulnerability scanning and intrusion detection provide protection from the angles of both active defense and passive defense. Although these security devices can record security events and security logs, they are independent of one another, and the security information is scattered and cannot be shared. It is therefore difficult for a network security administrator to monitor the global network condition, and an appropriate decision cannot be made when an attack occurs, resulting in low network security.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present application is to provide a method and an apparatus for training a network security situation prediction model, so as to greatly improve network security.
In a first aspect, an embodiment of the present application provides a method for training a network security situation prediction model, where the method includes:
generating a model training sample according to the acquired network security status data of the electronic equipment; each model training sample corresponds to a real network situation label;
inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolution network layer, an attention mechanism layer and a deep neural network layer;
calling the convolutional network layer to process the model training sample to obtain a network situation characteristic diagram of the model training sample in a time dimension;
calling the attention mechanism layer to process the network situation characteristics in the network situation characteristic diagram to obtain attention enhancement characteristics;
calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample;
calculating to obtain a loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label;
and under the condition that the loss value is within a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model.
Optionally, the generating a model training sample according to the acquired network security status data of the electronic device includes:
collecting network security status data of the electronic equipment;
cleaning the network security status data to obtain processed network security status data;
and normalizing the processed network security condition data to obtain the model training sample.
Optionally, the invoking the convolutional network layer to process the model training sample to obtain a network situation feature map of the model training sample in a time dimension includes:
calling the convolution network layer to perform convolution operation on the model training sample so as to obtain network situation characteristics and time dimension characteristics of the model training sample;
and carrying out fusion processing on the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic diagram associated in the time dimension.
Optionally, the invoking the attention mechanism layer to process the network situation features in the network situation feature map to obtain the attention enhancement feature includes:
calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain global situation features corresponding to each channel;
obtaining the weight corresponding to each channel according to the correlation index among the channels;
and processing the global situation characteristics and the corresponding weights of each channel to obtain attention enhancement characteristics.
Optionally, the network security situation prediction model to be trained further includes a parameter optimization layer, and
after the loss value of the to-be-trained network security situation prediction model is calculated and obtained based on the predicted network situation label and the real network situation label, the method further includes:
under the condition that the loss value is not within a preset range, calling the parameter optimization layer to determine a model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by adopting a genetic algorithm;
and adjusting the model parameters of the to-be-trained network security situation prediction model based on the model parameter adjustment value.
Optionally, after the trained network security situation prediction model to be trained is used as a final network security situation prediction model, the method further includes:
acquiring target network security status data of target equipment;
preprocessing the target network security condition data to obtain model input data;
inputting the model input data into the network security situation prediction model;
calling the convolutional network layer to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in a time dimension;
calling the attention mechanism layer to process the target network situation characteristic diagram to obtain a target attention enhancement characteristic;
and calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
In a second aspect, an embodiment of the present application provides an apparatus for training a network security situation prediction model, where the apparatus includes:
the model sample generation module is used for generating a model training sample according to the acquired network security condition data of the electronic equipment; each model training sample corresponds to a real network situation label;
the model sample input module is used for inputting the model training sample to a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolution network layer, an attention mechanism layer and a deep neural network layer;
the situation characteristic diagram acquisition module is used for calling the convolution network layer to process the model training sample to obtain a network situation characteristic diagram of the model training sample in a time dimension;
the enhanced feature acquisition module is used for calling the attention mechanism layer to process the network situation features in the network situation feature map to obtain the attention enhanced features;
the prediction label acquisition module is used for calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample;
the loss value calculation module is used for calculating and obtaining the loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label;
and the prediction model acquisition module is used for taking the trained network security situation prediction model to be trained as a final network security situation prediction model under the condition that the loss value is within a preset range.
Optionally, the model sample generation module includes:
the network condition data acquisition unit is used for acquiring network safety condition data of the electronic equipment;
the safety condition data acquisition unit is used for cleaning the network safety condition data to obtain processed network safety condition data;
and the model training sample acquisition unit is used for carrying out normalization processing on the processed network security condition data to obtain the model training sample.
Optionally, the situation characteristic map obtaining module includes:
the situation characteristic acquisition unit is used for calling the convolution network layer to carry out convolution operation on the model training sample so as to obtain the network situation characteristic and the time dimension characteristic of the model training sample;
and the situation characteristic map acquisition unit is used for carrying out fusion processing on the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic map associated in the time dimension.
Optionally, the enhanced feature acquisition module includes:
the global feature acquisition unit is used for calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain the global situation features corresponding to each channel;
the channel weight obtaining unit is used for obtaining the weight corresponding to each channel according to the correlation index among the channels;
and the enhanced feature acquisition unit is used for processing the global situation features and the corresponding weights of each channel to obtain attention enhanced features.
Optionally, the network security situation prediction model to be trained further includes a parameter optimization layer, and
the device further comprises:
the parameter adjustment value determining module is used for calling the parameter optimization layer to determine a model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by adopting a genetic algorithm under the condition that the loss value is not within a preset range;
and the model parameter adjusting module is used for adjusting the model parameters of the to-be-trained network security situation prediction model based on the model parameter adjusting values.
Optionally, the apparatus further comprises:
the target condition data acquisition module is used for acquiring target network security condition data of the target equipment;
the model input data acquisition module is used for preprocessing the target network security condition data to obtain model input data;
the model input data input module is used for inputting the model input data into the network security situation prediction model;
the target characteristic diagram acquisition module is used for calling the convolution network layer to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in a time dimension;
the target enhancement feature acquisition module is used for calling the attention mechanism layer to process the target network situation feature map to obtain target attention enhancement features;
and the network situation label acquisition module is used for calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
In a third aspect, an embodiment of the present application provides an electronic device, including:
the network security situation prediction model training method comprises a processor, a memory and a computer program which is stored in the memory and can run on the processor, wherein the processor executes the program to realize the network security situation prediction model training method.
In a fourth aspect, the present application provides a computer-readable storage medium, where instructions, when executed by a processor of an electronic device, enable the electronic device to perform any one of the above methods for training a network security situation prediction model.
Compared with the prior art, the embodiment of the application has the following advantages:
in the embodiment of the application, model training samples are generated according to the acquired network security status data of the electronic equipment, and each model training sample corresponds to a real network situation label. Inputting the model training sample into a to-be-trained network security situation prediction model, wherein the to-be-trained network security situation prediction model comprises: a convolutional network layer, an attention mechanism layer, and a deep neural network layer. And calling a convolutional network layer to process the model training sample to obtain a network situation characteristic diagram associated with the model training sample in the time dimension. And calling an attention mechanism layer to process the network situation characteristics in the network situation characteristic diagram to obtain the attention enhancement characteristics. And calling a deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample. And calculating to obtain a loss value of the network security situation prediction model to be trained based on the predicted network situation label and the real network situation label. And under the condition that the loss value is within a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model. According to the embodiment of the application, the deep relation between data is trained by combining the characteristics of complexity and time variability of network security situation change, so that a model for predicting the network security situation is obtained through training, and the network security can be greatly improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
Fig. 1 is a flowchart illustrating steps of a method for training a network security situation prediction model according to an embodiment of the present application;
fig. 2 is a flowchart illustrating steps of a method for obtaining model training samples according to an embodiment of the present disclosure;
fig. 3 is a flowchart illustrating steps of a method for acquiring a network situation characteristic diagram according to an embodiment of the present application;
fig. 4 is a flowchart illustrating steps of a method for obtaining attention-enhancing features according to an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating steps of a method for adjusting model parameters according to an embodiment of the present disclosure;
fig. 6 is a flowchart illustrating steps of a method for obtaining a target prediction network situation prediction tag according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a training apparatus for a network security situation prediction model according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In a network situation analysis scene, network security situation prediction refers to analyzing and fusing situation values obtained by evaluating historical network data situations, mining deep-level relationships among data, predicting the development trend of future situations by using theoretical methods such as expert knowledge and the like, and providing decision basis for security management personnel. The network security situation change has the characteristics of complexity, nonlinearity and time-varying property, and the neural network technology has higher fault tolerance, stronger nonlinear mapping capability and generalization capability for a complex system. Compared with the traditional machine learning model, the deep learning model has great potential in the field of network security situation prediction.
The embodiment of the application learns deep relationships among data by combining the characteristics of complexity and time-varying property of network security situation changes, so as to train a model for predicting the network security situation and greatly improve network security.
Next, a training process of the network security situation prediction model provided in the embodiment of the present application is described in detail with reference to specific embodiments.
Referring to fig. 1, a flowchart illustrating steps of a training method of a network security situation prediction model provided in an embodiment of the present application is shown, and as shown in fig. 1, the training method of the network security situation prediction model may include the following steps:
step 101: generating a model training sample according to the acquired network security status data of the electronic equipment; each model training sample corresponds to a real network situation label.
In this embodiment, the network security status data may include: data contents such as IT asset information, topology information, vulnerability information and the like.
When the network security situation prediction model is trained, network security condition data of the electronic equipment can be collected, and a model training sample is generated according to the network security condition data of the electronic equipment. In specific implementation, after the network security status data of the electronic device is collected, the network security status data can be cleaned, and then normalization processing is performed, so that a model training sample can be obtained. In this example, each model training sample corresponds to a real network situation label, which is a network security situation category label labeled for the model training sample.
The process of obtaining the model training samples can be described in detail below in conjunction with fig. 2.
Referring to fig. 2, a flowchart illustrating steps of a method for obtaining a model training sample according to an embodiment of the present application is shown, and as shown in fig. 2, the method for obtaining a model training sample may include: step 201, step 202 and step 203.
Step 201: and collecting network security status data of the electronic equipment.
In this embodiment, when training the network security situation prediction model, network security status data of the electronic device, such as IT asset information, topology information, and vulnerability information, may be collected.
After the network security status data of the electronic device is collected, step 202 is executed.
Step 202: and cleaning the network safety condition data to obtain processed network safety condition data.
After the network security status data of the electronic device is collected, a cleaning operation may be performed on the network security status data to obtain processed network security status data. Specifically, missing value processing, duplicate value removal, noise reduction, outlier removal, and the like may be performed on the network security status data to obtain processed network security status data.
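As an illustration of this cleaning operation, a minimal Python sketch is given below; the use of pandas, the column handling and the 3-sigma outlier threshold are assumptions made for the example and are not specified in this embodiment:

```python
import numpy as np
import pandas as pd

def clean_status_data(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning of raw network security status records (assumed layout)."""
    df = df.drop_duplicates()                          # remove duplicate records
    df = df.dropna(how="all")                          # drop rows with no usable values
    df = df.fillna(df.median(numeric_only=True))       # fill remaining missing numeric values
    # simple outlier removal: keep values within 3 standard deviations (assumed threshold)
    for col in df.select_dtypes(include=np.number).columns:
        mu, sigma = df[col].mean(), df[col].std()
        if sigma > 0:
            df = df[(df[col] - mu).abs() <= 3 * sigma]
    return df.reset_index(drop=True)
```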
After the network security status data is processed by performing a cleaning operation on the network security status data, step 203 is performed.
Step 203: and normalizing the processed network security condition data to obtain the model training sample.
After the network security condition data is cleaned to obtain processed network security condition data, the processed network security condition data can be normalized to obtain a model training sample.
In specific implementation, data normalization can reduce the variance of the features to a certain range, reduce the influence of abnormal values, and improve the convergence rate of the model. Existing normalization methods mainly normalize to [0,1] or to [-1,1]; the latter is selected in this embodiment, and the min-max normalization method is adopted to normalize the feature data to between -1 and 1. Specifically, for the feature data H_x = [h_x1, h_x2, ..., h_xn] (x = 1,2,3,4,5), where n represents the total number of samples, the result of mapping h_xi to the interval [-1,1] is h'_xi, and the calculation formula is as follows:
h'_xi = 2 × (h_xi − min(H_x)) / (max(H_x) − min(H_x)) − 1    (1)
In the above formula (1), min(H_x) represents the minimum value of the x-th dimension feature H_x, and max(H_x) represents the maximum value of the x-th dimension feature H_x.
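Formula (1) can be implemented directly; the following Python sketch assumes the feature data is arranged as a NumPy matrix with one column per feature dimension:

```python
import numpy as np

def minmax_to_pm1(features: np.ndarray) -> np.ndarray:
    """Scale each feature column H_x to the interval [-1, 1] as in formula (1)."""
    col_min = features.min(axis=0)
    col_max = features.max(axis=0)
    span = np.where(col_max > col_min, col_max - col_min, 1.0)  # avoid division by zero
    return 2.0 * (features - col_min) / span - 1.0

# usage: model_training_samples = minmax_to_pm1(cleaned_feature_matrix)
```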
It can be understood that, when performing model training, the number of required model training samples is large, in this example, a model sample corresponding to the network security status data of each electronic device may be used as one training sample, and then, in order to obtain a large number of model training samples, network security status data of a preset number of electronic devices may be obtained, and then the network security status data of each electronic device is processed to obtain a model training sample corresponding to each electronic device, so that the preset number of model training samples may be obtained.
After generating model training samples from the collected network security condition data of the electronic device, step 102 is performed.
Step 102: inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolutional network layer, an attention mechanism layer, and a deep neural network layer.
The to-be-trained network security situation prediction model is a pre-built model which is not trained yet and is used for predicting the network security situation. In this example, the network security situation prediction model to be trained may include: a convolutional network layer, an attention mechanism layer, and a deep neural network layer.
After the model training samples are generated according to the acquired network security condition data of the electronic device, the model training samples can be input into the to-be-trained network security situation prediction model.
After inputting the model training samples into the network security situation prediction model to be trained, step 103 is executed.
Step 103: and calling the convolutional network layer to process the model training sample to obtain a network situation characteristic diagram of the model training sample in a time dimension.
After the model training sample is input into the to-be-trained network security situation prediction model, the convolution network layer can be called to process the model training sample so as to obtain a network situation characteristic diagram of the model training sample associated in the time dimension. The method is characterized in that the incidence relation of a model training sample in a time dimension is mined through a convolutional neural network so as to sufficiently learn the dependency relation between data. The specific process for obtaining the network situation feature map associated with the model training sample in the time dimension can be described in detail below with reference to fig. 3.
Referring to fig. 3, a flowchart illustrating steps of a network situation feature map obtaining method provided in an embodiment of the present application is shown, and as shown in fig. 3, the network situation feature map obtaining method may include: step 301 and step 302.
Step 301: and calling the convolution network layer to perform convolution operation on the model training sample so as to obtain the network situation characteristic and the time dimension characteristic of the model training sample.
In this embodiment, a 2-layer convolutional neural network may be used for the convolution operation. The convolutional layer in the convolutional neural network is a main part which is different from other traditional neural networks, the convolutional layer is equivalent to a feature extractor to extract features, parameter sharing is realized, the number of parameters is reduced, and the process expression can be shown as the following formula (2):
y=δ(W*x+b) (2)
in the above formula (2), y is the output of the convolutional network layer, δ is the activation function, W is the weight matrix, * denotes the convolution operation, x is the input of the convolutional layer, and b is the bias variable.
Among them, the most critical part of the convolution operation is the convolution kernel, also called a filter. The convolution kernel is essentially an array of numerical parameters of fixed size. The operation of the convolution kernel can be expressed as shown in the following formula (3), formula (4) and formula (5):
Y^{t+1}(i, j) = ∑_{c=1}^{C} ∑_{x=1}^{k} ∑_{y=1}^{k} [ Y_c^{t}(s·i + x, s·j + y) · w_c^{t+1}(x, y) ] + b    (3)
(i, j) ∈ {0, 1, ..., L^{t+1}}    (4)
L^{t+1} = (L^{t} + 2p − k) / s + 1    (5)
In the above formulas, Y^t is the input of the (t+1)-th layer, Y^{t+1} is the situation feature map, i.e., the output of the (t+1)-th layer, w^{t+1} is the convolution kernel of the (t+1)-th layer, b is the bias variable, and L^{t+1} is the size of Y^{t+1}. Y(i, j) is the pixel value corresponding to the coordinate point (i, j) in the feature map; k, s and p are respectively the size, stride and padding of the convolution kernel in the convolutional layer, and C is the number of channels of the feature map. The summation in the above formula is equivalent to solving a cross-correlation.
The convolution layer extracts local features of input information by utilizing convolution operation, and the final result is influenced by different input data, so that different feature activations are obtained and become feature mappings. The convolution operation is performed in the same convolution layer by using the same group of convolution kernel parameters, and when the weight is updated, only the group of convolution kernel parameters need to be updated, which is called weight sharing. After the convolution operation, the features are processed by using a nonlinear activation function, and the nonlinear expression capability of the learning features is enhanced.
In the convolutional neural network of the present embodiment, a GELU function is employed as the activation function. The activation function is responsible for mapping the input of the neural network neuron to the output end and introducing nonlinear factors, and the features are mapped to a nonlinear space, so that more effective extraction of the features is completed. If a linear activation function is used, the neural network can only achieve a linear combination of the input information. The present embodiment uses the GELU activation function as the activation function of the neural network, and the formula is as follows:
GELU(x) = 0.5x · (1 + tanh[ √(2/π) · (x + 0.044715x³) ])    (6)
in the convolutional neural network of the present embodiment, the loss function is a cross entropy loss function, and the formula is as follows:
H(p, q) = −∑_{k=1}^{n} p_k(x) · log q_k(x)    (7)
where n is the number of categories in the classification problem, p_k(x) is the real network security situation category value, q_k(x) is the predicted network security situation category value, and k denotes the k-th category. The algorithm in this embodiment also performs L2 regularization by adding a regularization term, with the formula as follows:
L = H(p, q) + (λ / (2m)) · ∑_{ω} ω²    (8)
where λ is the regularization coefficient, m is the number of parameters participating in regularization, and ω is a parameter term participating in regularization.
L2 regularization can limit the range of parameter values, preventing overfitting from occurring.
The cross entropy can measure the difference degree of different probability distributions in the same random variable, and the difference degree is expressed as the difference between the real probability distribution and the prediction probability distribution in the learning of the neural network model, and the smaller the value of the cross entropy is, the better the model prediction effect is.
When used with the GELU activation function, the cross entropy loss function can better solve the problem that the squared loss function updates the weights too slowly, and it has the desirable property that the magnitude of the error is linearly related to the speed of the weight update: the larger the error, the faster the weights are updated, and the smaller the error, the slower the weights are updated.
In the classification problem, the cross entropy loss function is used together with a Softmax activation function to work more effectively: the Softmax function processes the output so that the predicted values of the classes sum to 1, and the loss is then calculated through the cross entropy function.
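As a hedged illustration of formulas (7) and (8), the following PyTorch-style sketch computes the softmax cross entropy with an added L2 term; the regularization coefficient lam is an assumed value rather than one given in this embodiment:

```python
import torch
import torch.nn.functional as F

def ce_with_l2(logits, targets, params, lam=1e-4):
    """Cross entropy loss (formula (7)) plus an L2 penalty (formula (8))."""
    ce = F.cross_entropy(logits, targets)           # softmax followed by cross entropy
    l2 = sum((p ** 2).sum() for p in params)        # sum of squared parameters
    m = sum(p.numel() for p in params)              # number of regularized parameters
    return ce + lam / (2 * m) * l2

# usage: loss = ce_with_l2(model(x), y, list(model.parameters()))
```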
In this embodiment, 2 convolutional layers are used, and 1 pooling layer is placed after each convolutional layer to prevent overfitting. The main function of pooling is to compress the amount of data and parameters and reduce the spatial feature dimension, achieving down-sampling of the data on the basis of extracting the key local information of each feature; the pooling layer is therefore also called a down-sampling layer and is usually placed after the convolutional layer. The pooling layer can effectively improve the calculation speed, improve the robustness of the extracted features, improve the generalization capability of the model and prevent overfitting. The pooling operation can be shown by the following formula (9):
Y_pool(i, j) = q · avg_{(x,y)∈R_{ij}} X(x, y) + (1 − q) · max_{(x,y)∈R_{ij}} X(x, y)    (9)
where Y_pool(i, j) is the pooled output data over the pooling region R_{ij}, and q is the pooling type: q = 1 corresponds to average pooling and q = 0 corresponds to maximum pooling.
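A minimal sketch of the convolutional feature extractor described above (two convolutional layers, each followed by a GELU activation and a pooling layer) is given below, assuming a PyTorch implementation; the channel counts and kernel sizes are illustrative assumptions, not values given in the text:

```python
import torch
import torch.nn as nn

class SituationConvBlock(nn.Module):
    """Two convolutional layers, each followed by GELU and max pooling (illustrative sizes)."""
    def __init__(self, in_channels: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.GELU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.GELU(),
            nn.MaxPool2d(2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)   # network situation feature map
```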
After the model training samples are obtained, a convolution network layer can be called to perform convolution operation on the model training samples so as to obtain network situation characteristics and time dimension characteristics of the model training samples. The time dimension characteristic may be used to indicate time information of the network situation characteristic.
After the convolutional network layer is invoked to perform a convolution operation on the model training sample to obtain the network situation features and the time dimension features of the model training sample, step 302 is performed.
Step 302: and fusing the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic diagram associated in a time dimension.
After the convolution network layer is called to perform convolution operation on the model training sample to obtain the network situation features and the time dimension features of the model training sample, fusion processing can be performed on the network situation features and the time dimension features to obtain a network situation feature map associated under the time dimension.
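The fusion step can be read, for example, as a channel-wise concatenation of the two groups of features; the following sketch is only an assumed illustration of such a fusion and not the exact operation of this embodiment:

```python
import torch

def fuse_time_and_situation(situation_feats: torch.Tensor,
                            time_feats: torch.Tensor) -> torch.Tensor:
    """Assumed fusion: concatenate situation features and time-dimension features channel-wise.
    Both tensors are expected to share the same batch and spatial dimensions."""
    return torch.cat([situation_feats, time_feats], dim=1)
```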
After the convolutional network layer is called to process the model training sample to obtain a network situation feature map associated with the model training sample in the time dimension, step 104 is executed.
Step 104: and calling the attention mechanism layer to process the network situation characteristics in the network situation characteristic diagram to obtain attention enhancement characteristics.
After the convolutional network layer is called to process the model training sample to obtain a network situation characteristic diagram associated with the model training sample in the time dimension, the attention mechanism layer can be called to process the network situation characteristics in the network situation characteristic diagram to obtain the attention enhancement characteristics.
The attention mechanism layer is described as follows:
the squeeze and channel network attention module first performs a squeeze operation on the feature map, using global average pooling to convert the entire spatial feature of a channel into a global spatial feature as a representation of the channel. Can be represented by the following formula:
Figure BDA0003927781820000123
wherein X m (i, j) denotes the mth feature map X m Channel characteristic value at position (i, j), F sq () Indicating a squeeze operation, i.e., a GAP operation.
After obtaining the global description features, the correlation between channels is captured by an excitation operation. A structure containing two fully connected layers is adopted: the first fully connected layer is responsible for dimensionality reduction, and after a ReLU activation the second fully connected layer restores the original dimension; a gating mechanism in Sigmoid form is then introduced, which flexibly learns the nonlinear relationships among channels and yields a weight value between 0 and 1. Each original feature is multiplied by the weight of its corresponding channel to obtain a new feature map. The operation process is shown in formula (11):
u = F_ex(z, W) = f( W_U · δ( W_X · z ) ),  X̃_m = u_m · X_m    (11)
where f(·) and δ(·) denote the Sigmoid function and the ReLU function respectively, W_X is the weight of the layer that reduces the number of channels of the low-dimensional feature map, and W_U is the weight that restores the number of channels at a certain ratio. Each channel is activated by training these two weights, yielding a one-dimensional excitation weight. u denotes the finally obtained channel statistics, u_m represents the m-th channel scaling descriptor, and multiplying it channel by channel with X_m gives the Hadamard product.
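A hedged PyTorch sketch of the channel attention operation described by formulas (10) and (11) is given below; the reduction ratio is an assumed hyperparameter:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze (global average pooling) and excitation (two FC layers) per formulas (10)-(11)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)            # F_sq: global spatial feature per channel
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),   # W_X: dimensionality reduction
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),   # W_U: restore original dimension
            nn.Sigmoid(),                                 # gating, weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        u = self.excite(self.squeeze(x).view(b, c))       # channel weights u
        return x * u.view(b, c, 1, 1)                     # channel-wise (Hadamard) reweighting
```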
The acquisition process for the attention enhancing feature may be described in detail below in conjunction with fig. 4.
Referring to fig. 4, a flowchart illustrating steps of an attention enhancement feature acquisition method provided in an embodiment of the present application is shown, and as shown in fig. 4, the attention enhancement feature acquisition method may include: step 401, step 402 and step 403.
Step 401: and calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain the global situation features corresponding to each channel.
In this embodiment, the obtained network situation feature map is a feature map formed by three-dimensional features, each dimensional network situation feature corresponds to one channel, and after the network situation feature map is obtained, an attention mechanism layer may be invoked to perform vector conversion processing on the network situation features in each channel, so as to obtain a global situation feature corresponding to each channel.
After the attention mechanism layer is called to perform vector conversion processing on the network situation features in each channel to obtain the global situation features corresponding to each channel, step 402 is executed.
Step 402: and obtaining the weight corresponding to each channel according to the correlation index among the channels.
In this embodiment, the weight corresponding to each channel may be obtained according to the correlation index between the channels. For the manner of obtaining the weight, reference may be made to the description of the attention mechanism layer, and this embodiment is not described herein again.
After obtaining the weight corresponding to each channel according to the correlation index between the channels, step 403 is performed.
Step 403: and processing the global situation characteristics and the corresponding weights of each channel to obtain attention enhancement characteristics.
After the weight corresponding to each channel is obtained according to the correlation index between the channels, the processing can be performed according to the global situation characteristic and the corresponding weight of each channel to obtain the attention enhancement characteristic. Specifically, the global situation feature of each channel may be multiplied by the weight of the corresponding channel, so as to obtain a new feature map, where the new feature map includes the attention enhancing feature.
After the attention mechanism layer is invoked to process the network situation features in the network situation feature map to obtain the attention enhancement features, step 105 is executed.
Step 105: and calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample.
The prediction network situation label refers to a prediction network situation category of a model training sample output by a to-be-trained network security situation prediction model.
After the attention mechanism layer is called to process the network situation features in the network situation feature map to obtain the attention enhancement features, the deep neural network layer can be called to process the attention enhancement features to obtain the predicted network situation labels corresponding to the model training samples.
In this embodiment, the deep neural network layer may be a deep neural network optimized by using a genetic algorithm. A basic deep neural network performs learning and training by using the error back-propagation algorithm, and has a simple structure, high plasticity and strong data-fitting capability. The deep neural network mainly comprises an input layer, hidden layers and an output layer. During training, the neural network continuously adjusts the weights and thresholds between the input layer and the hidden layer and between the hidden layer and the output layer; training stops when the output value of the neural network is consistent with the target value or the set number of generations is reached, and the resulting neural network has strong generalization capability.
The invention adopts a genetic algorithm to replace the BP algorithm, and optimizes the deep neural network through the genetic algorithm. The genetic algorithm is designed according to the law of biological evolution in nature; its working principle is to first encode the input data, and then carry out selection, crossover and mutation operations with certain probabilities until the individual with the maximum fitness is selected and output as the target value, at which point the operation stops.
In the genetic algorithm adopted by the invention, the reciprocal of the square error is used as a fitness function to measure the size of the individual adaptability in the population, and the formula is as follows:
F_j = 1 / E,  E = ∑_{i} ( P(W, x_i) − y_i )²    (12)
In the above formula (12), E is the error function, P is the overall output, W is the weight vector, x is the input vector, F is the fitness, j is the generation number, and y_i is the theoretical output value.
The traditional genetic algorithm usually adopts the 'roulette wheel' mode, in which the probability of selecting individuals in the population is random; this selection mode may lose the optimal individuals and produces a large error in actual operation. The selection operator is therefore improved: the population individuals are first rearranged using a ranking method, and the probability of selecting an individual after rearrangement is:
P_b = p0 · (1 − p0)^(b−1) / S,  S = 1 − (1 − p0)^a    (13)
In the above formula (13), a is the population size of the genetic algorithm, p0 is the probability that the optimal individual is selected, S is the value used to normalize p0, and b is the position of the individual after the population has been rearranged.
The genetic algorithm adopted in the invention also improves the crossover operator. Conventional genetic algorithms typically set the crossover probability to a constant between 0.3 and 0.8 during operation. If the crossover probability is set too high, the global search capability of the genetic algorithm is improved but the adaptability of the chromosomes is reduced; if it is set too low, both the global search capability and the convergence speed of the genetic algorithm are reduced. The invention improves the crossover operator so that the crossover probability can be adjusted according to the change of fitness during the generation-selection process of the algorithm. The improved crossover probability is:
(Formula (14): improved adaptive crossover probability P_j, expressed in terms of F, mean, n and n_max)
In the above formula (14), F is the maximum fitness of the two crossed individuals in the population, mean is the average fitness of the whole population, n is the current generation number of the genetic algorithm, and n_max is the maximum generation number of the evolution operator during operation. When the genetic algorithm is initialized, the minimum crossover probability P_jmin may be set to 0.3 and the maximum crossover probability P_jmax to 0.8.
The genetic algorithm adopted in this embodiment also improves the mutation operator. The mutation probability of a traditional genetic algorithm is generally set to a constant between 0.001 and 0.1 during operation. In the initial stage of genetic algorithm operation, the fitness of population individuals is low relative to the average fitness, so the mutation probability needs to be kept small in order to preserve individuals with excellent genes in their chromosomes. In the later stage of genetic algorithm operation, the fitness of population individuals is relatively high compared with the average fitness, so the mutation probability needs to be set to a larger value to improve the local search capability of the genetic algorithm. The invention improves the mutation operator so that the mutation probability can be adjusted according to the change of fitness during the operation of the genetic algorithm. The improved mutation probability is shown as follows:
(Formula (15): improved adaptive mutation probability P_b, expressed in terms of F, mean, n and n_max)
In the above formula (15), F is the maximum fitness of the two mutated individuals in the population, mean is the average fitness of the whole population, n is the current generation number of the genetic algorithm, and n_max is the maximum generation number of the evolution operator during operation. When the genetic algorithm is initialized, the minimum mutation probability P_bmin may be set to 0.001 and the maximum mutation probability P_bmax to 0.1.
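The improved selection, crossover and mutation probabilities can be sketched in Python as follows. The ranking-selection function follows formula (13); since the exact expressions of formulas (14) and (15) are not reproduced above, the crossover and mutation functions below are plausible stand-ins that respect the stated variables and the bounds P_jmin = 0.3, P_jmax = 0.8, P_bmin = 0.001 and P_bmax = 0.1, rather than the exact equations of this embodiment:

```python
import numpy as np

P_J_MIN, P_J_MAX = 0.3, 0.8     # crossover probability bounds stated in the text
P_B_MIN, P_B_MAX = 0.001, 0.1   # mutation probability bounds stated in the text

def rank_selection_probs(pop_size: int, p0: float = 0.09) -> np.ndarray:
    """Formula (13): ranking selection over individuals sorted by fitness; p0 is assumed."""
    raw = p0 * (1.0 - p0) ** np.arange(pop_size)   # probability decays with rank position b
    return raw / raw.sum()                          # normalization, equal to S = 1 - (1 - p0)^a

def adaptive_crossover_prob(f_cross: float, f_mean: float, gen: int, gen_max: int) -> float:
    """Stand-in for formula (14): crossover probability adapted by fitness and generation."""
    if f_cross < f_mean:                            # below-average fitness keeps the largest probability
        return P_J_MAX
    return max(P_J_MIN, P_J_MAX - (P_J_MAX - P_J_MIN) * gen / gen_max)

def adaptive_mutation_prob(f_mut: float, f_mean: float, gen: int, gen_max: int) -> float:
    """Stand-in for formula (15): mutation probability grows from P_B_MIN toward P_B_MAX."""
    if f_mut < f_mean:                              # keep excellent genes while fitness is below average
        return P_B_MIN
    return min(P_B_MAX, P_B_MIN + (P_B_MAX - P_B_MIN) * gen / gen_max)
```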
In this embodiment, the parameters of the deep neural network used are: the number of network layers is 5, the number of hidden nodes is 100, the learning rate is 0.001, the batch size is 64, and the number of iterations is 200. The classifier adopted in this embodiment may be a Softmax classification function or the like.
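Using the parameters listed above, a PyTorch-style sketch of the deep neural network layer is given below; the input dimension and the number of situation categories are illustrative assumptions, and "5 layers" is read here as 5 fully connected layers:

```python
import torch.nn as nn

def build_dnn_head(in_features: int = 128, num_classes: int = 5) -> nn.Sequential:
    """5 fully connected layers with 100 hidden nodes each, ending in a Softmax classifier."""
    layers, width = [], in_features
    for _ in range(4):                      # 4 hidden layers + 1 output layer = 5 layers (assumed reading)
        layers += [nn.Linear(width, 100), nn.ReLU()]
        width = 100
    layers += [nn.Linear(width, num_classes), nn.Softmax(dim=1)]
    return nn.Sequential(*layers)

# assumed training settings from the text: learning rate 0.001, batch size 64, 200 iterations
```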
And executing step 106 after calling the deep neural network layer to process the attention enhancement features to obtain the predicted network situation label corresponding to the model training sample.
Step 106: and calculating to obtain a loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label.
After the deep neural network layer is called to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample, a loss value of the to-be-trained network security situation prediction model can be calculated and obtained based on the prediction network situation label and the real network situation label.
In a specific implementation, the calculation manner of the loss value may be a calculation manner corresponding to the cross entropy loss function, and the like, specifically, the specific calculation manner of the loss value may be determined according to the service requirement, which is not limited in this embodiment.
After calculating the loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label, executing step 107.
Step 107: and under the condition that the loss value is within a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model.
After the loss value of the to-be-trained network security situation prediction model is calculated based on the prediction network situation label and the real network situation label, whether the loss value is within a preset range or not can be judged.
If the loss value is within the preset range, the trained network security situation prediction model to be trained can be used as a final network security situation prediction model, and the network security situation prediction model can be applied to a subsequent scene for predicting the network security situation of the electronic equipment.
If the loss value is not within the preset range, the model parameters of the network security situation prediction model to be trained can be adjusted according to the loss value, and the network security situation prediction model to be trained with the adjusted model parameters is continuously trained. The adjustment process for the model parameters can be described in detail below in conjunction with fig. 5.
Referring to fig. 5, a flowchart illustrating steps of a model parameter adjustment method provided in an embodiment of the present application is shown, and as shown in fig. 5, the model parameter adjustment method may include: step 501 and step 502.
In this embodiment, the to-be-trained network security situation prediction model may further include: the parameter optimization layer, after the step 106, may further include:
step 501: and under the condition that the loss value is not within a preset range, calling the parameter optimization layer to determine a model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by adopting a genetic algorithm.
In this embodiment, after the loss value of the to-be-trained network security situation prediction model is calculated, if the loss value is not within the preset range, the parameter optimization layer may be invoked to determine the model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by using a genetic algorithm.
After the parameter optimization layer is called and a genetic algorithm is adopted to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value, step 502 is executed.
Step 502: and adjusting the model parameters of the to-be-trained network security situation prediction model based on the model parameter adjustment value.
After the parameter optimization layer is called and a model parameter adjustment value of the to-be-trained network security situation prediction model is determined according to the loss value by adopting a genetic algorithm, model parameters of the to-be-trained network security situation prediction model can be adjusted based on the model parameter adjustment value.
After the model parameters of the to-be-trained network security situation prediction model are adjusted, the to-be-trained network security situation prediction model after the model parameters are adjusted can be trained until the model converges.
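The interaction of steps 106, 107, 501 and 502 can be summarized by the schematic loop below; compute_loss and ga_adjust_parameters are hypothetical helper callables used only for illustration, and the preset loss range is an assumed value:

```python
from typing import Callable, Tuple

def train_situation_model(model: Callable,
                          samples,
                          labels,
                          compute_loss: Callable,
                          ga_adjust_parameters: Callable,
                          loss_range: Tuple[float, float] = (0.0, 0.05),
                          max_rounds: int = 200):
    """Schematic loop for steps 106/107 and 501/502; all callables are hypothetical placeholders."""
    for _ in range(max_rounds):
        loss = compute_loss(model(samples), labels)       # step 106: loss from predicted vs. real labels
        if loss_range[0] <= loss <= loss_range[1]:        # step 107: loss within the preset range
            return model
        ga_adjust_parameters(model, loss)                 # steps 501-502: GA-based parameter adjustment
    return model
```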
After the final network security situation prediction model is obtained through training, an inference task can be executed by using the network security situation prediction model. The model inference process can be described in detail below in conjunction with FIG. 6.
Referring to fig. 6, a flowchart illustrating steps of a target prediction network situation label obtaining method provided in an embodiment of the present application is shown, and as shown in fig. 6, the target prediction network situation label obtaining method may include: step 601, step 602, step 603, step 604, step 605 and step 606.
Step 601: and acquiring target network security status data of the target equipment.
In this embodiment, the target device refers to an electronic device that needs to perform network security situation prediction.
The target network security status data refers to network security status data within a historical time of the target device.
When the network security situation of the target device is predicted, the target network security situation data of the target device can be acquired.
After the target network security status data of the target device is acquired, step 602 is executed.
Step 602: and preprocessing the target network security condition data to obtain model input data.
After the target network security status data of the target device is acquired, the target network security status data may be preprocessed to obtain model input data, and specifically, the target network security status data may be subjected to data cleaning and normalization processing to obtain the model input data.
After preprocessing the target network security status data to obtain model input data, step 603 is performed.
Step 603: and inputting the model input data into the network security situation prediction model.
After the model input data is obtained by preprocessing the target network security condition data, the model input data can be input into the network security situation prediction model to predict the network security situation.
After inputting the model input data to the network security posture prediction model, step 604 is performed.
Step 604: and calling the convolutional network layer to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in a time dimension.
After the model input data are input into the network security situation prediction model, a convolutional network layer can be called to process the model input data, and a target network situation characteristic diagram associated with the model input data in a time dimension is obtained. For the processing procedure, the implementation procedure of the network situation characteristic diagram may be referred to, and details are not described herein again.
After the convolutional network layer is called to process the model input data to obtain a target network situation feature map associated with the model input data in the time dimension, step 605 is executed.
Step 605: and calling the attention mechanism layer to process the target network situation characteristic diagram to obtain a target attention enhancement characteristic.
After the convolution network layer is called to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in the time dimension, the attention mechanism layer can be called to process the target network situation characteristic diagram to obtain a target attention enhancement characteristic. For the processing procedure, reference may be made to the implementation procedure of the attention enhancement feature, and details of this embodiment are not described herein again.
After invoking the attention mechanism layer to process the target network situation feature map to obtain the target attention enhancement feature, step 606 is executed.
Step 606: and calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
After the attention mechanism layer is called to process the target network situation characteristic graph to obtain the target attention enhancement characteristic, the deep neural network layer can be called to process the target attention enhancement characteristic to obtain a target prediction network situation label corresponding to the target device.
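Steps 601 to 606 can be summarized as a short inference pipeline; the sketch below assumes the preprocessing helpers sketched earlier (clean_status_data and minmax_to_pm1) and a trained PyTorch model, and is illustrative only:

```python
import torch

def predict_situation(model, raw_status_dataframe):
    """Steps 601-606: preprocess target status data, run the trained model, return the predicted label."""
    cleaned = clean_status_data(raw_status_dataframe)                 # step 602: cleaning
    inputs = torch.as_tensor(minmax_to_pm1(cleaned.to_numpy(dtype=float)),
                             dtype=torch.float32)                     # step 602: normalization
    with torch.no_grad():
        scores = model(inputs)                                        # steps 603-606: forward pass
    return scores.argmax(dim=1)                                       # target predicted situation label
```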
According to the embodiment of the application, the overall design of the network security situation perception system is provided on the basis of analyzing key technologies such as data acquisition, correlation analysis, situation assessment and security response in detail, the key modules are designed in detail and technically realized, and the network security is greatly improved.
According to the training method of the network security situation prediction model, model training samples are generated according to the acquired network security status data of the electronic equipment, and each model training sample corresponds to a real network situation label. The model training sample is input into a to-be-trained network security situation prediction model, where the to-be-trained network security situation prediction model comprises a convolutional network layer, an attention mechanism layer, and a deep neural network layer. The convolutional network layer is called to process the model training sample to obtain a network situation characteristic diagram associated with the model training sample in the time dimension. The attention mechanism layer is called to process the network situation characteristics in the network situation characteristic diagram to obtain the attention enhancement characteristics. The deep neural network layer is called to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample. A loss value of the to-be-trained network security situation prediction model is calculated based on the prediction network situation label and the real network situation label. Under the condition that the loss value is within a preset range, the trained network security situation prediction model to be trained is taken as the final network security situation prediction model. According to the embodiment of the application, the deep relationships among the data are learned in view of the complexity and time variability of network security situation changes, so that a model for predicting the network security situation is obtained through training, and network security can be greatly improved.
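Chaining the convolutional network layer, the attention mechanism layer and the deep neural network layer, the training procedure summarized above can be sketched as a supervised loop that computes a loss value between the predicted and real situation labels and stops once the loss falls within the preset range. Cross-entropy loss, the gradient-based Adam update and the threshold value are assumptions; the optional genetic-algorithm parameter optimization layer of the embodiment is sketched separately further below.

import torch
import torch.nn as nn

def train(model, loader, loss_threshold=0.05, max_epochs=100):
    # model chains the convolutional, attention and deep neural network layers.
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for samples, real_labels in loader:
            predicted = model(samples)                  # predicted network situation logits
            loss = criterion(predicted, real_labels)    # loss from predicted vs. real labels
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        # Stop when the average loss value falls within the preset range.
        if epoch_loss / max(len(loader), 1) < loss_threshold:
            break
    return model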
Referring to fig. 7, a schematic structural diagram of a training apparatus for a network security situation prediction model provided in an embodiment of the present application is shown, and as shown in fig. 7, the training apparatus 700 for a network security situation prediction model may include the following modules:
the model sample generation module 710 is used for generating a model training sample according to the acquired network security status data of the electronic device; each model training sample corresponds to a real network situation label;
the model sample input module 720 is configured to input the model training sample to the to-be-trained network security situation prediction model; the network security situation prediction model to be trained comprises: a convolution network layer, an attention mechanism layer and a deep neural network layer;
the situation characteristic map obtaining module 730 is configured to call the convolutional network layer to process the model training sample, so as to obtain a network situation characteristic map of the model training sample in a time dimension;
the enhanced feature obtaining module 740 is configured to invoke the attention mechanism layer to process the network situation features in the network situation feature map, so as to obtain an attention enhanced feature;
a predicted label obtaining module 750, configured to invoke the deep neural network layer to process the attention enhancing feature, so as to obtain a predicted network situation label corresponding to the model training sample;
a loss value calculation module 760, configured to calculate a loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label;
and the prediction model obtaining module 770 is configured to, under the condition that the loss value is within the preset range, use the trained network security situation prediction model to be trained as a final network security situation prediction model.
Optionally, the model sample generation module includes:
the network condition data acquisition unit is used for acquiring network safety condition data of the electronic equipment;
the safety condition data acquisition unit is used for cleaning the network safety condition data to obtain processed network safety condition data;
and the model training sample acquisition unit is used for carrying out normalization processing on the processed network security condition data to obtain the model training sample.
Optionally, the situation characteristic map obtaining module includes:
the situation characteristic acquisition unit is used for calling the convolution network layer to carry out convolution operation on the model training sample so as to obtain the network situation characteristic and the time dimension characteristic of the model training sample;
and the situation characteristic map acquisition unit is used for carrying out fusion processing on the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic map associated in a time dimension.
Optionally, the enhanced feature acquisition module includes:
the global feature acquisition unit is used for calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain the global situation features corresponding to each channel;
the channel weight obtaining unit is used for obtaining the weight corresponding to each channel according to the correlation index among the channels;
and the enhanced feature acquisition unit is used for processing the global situation features and the corresponding weights of each channel to obtain attention enhanced features.
Optionally, the network security situation prediction model to be trained further includes a parameter optimization layer for optimizing model parameters, and
the device further comprises:
the parameter adjustment value determining module is used for calling the parameter optimization layer to determine a model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by adopting a genetic algorithm under the condition that the loss value is not within a preset range;
and the model parameter adjusting module is used for adjusting the model parameters of the to-be-trained network security situation prediction model based on the model parameter adjusting values.
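When the loss value is not within the preset range, the parameter adjustment value determining module above applies a genetic algorithm to obtain a model parameter adjustment value. The encoding and genetic operators are not spelled out in the embodiment; the sketch below assumes a flat real-valued parameter vector with loss-based selection, uniform crossover and Gaussian mutation as hypothetical operator choices.

import numpy as np

def genetic_adjust(params, loss_fn, pop_size=20, generations=10, mutation_scale=0.05):
    # params: flat vector of current model parameters.
    # loss_fn: evaluates a candidate parameter vector and returns its loss value.
    rng = np.random.default_rng(0)
    # Initial population: the current parameters plus mutated copies.
    pop = params + mutation_scale * rng.standard_normal((pop_size, params.size))
    pop[0] = params
    for _ in range(generations):
        losses = np.array([loss_fn(ind) for ind in pop])
        parents = pop[np.argsort(losses)[: pop_size // 2]]   # selection: keep the best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(params.size) < 0.5              # uniform crossover
            child = np.where(mask, a, b)
            child += mutation_scale * rng.standard_normal(params.size)  # Gaussian mutation
            children.append(child)
        pop = np.vstack([parents, children])
    best = pop[np.argmin([loss_fn(ind) for ind in pop])]
    # Model parameter adjustment value: offset from the current parameters to the best candidate.
    return best - params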
Optionally, the apparatus further comprises:
the target condition data acquisition module is used for acquiring target network security condition data of the target equipment;
the model input data acquisition module is used for preprocessing the target network security condition data to obtain model input data;
the model input data input module is used for inputting the model input data into the network security situation prediction model;
the target characteristic diagram acquisition module is used for calling the convolution network layer to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in a time dimension;
the target enhanced feature acquisition module is used for calling the attention mechanism layer to process the target network situation feature map to obtain target attention enhanced features;
and the network situation label acquisition module is used for calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
According to the training device of the network security situation prediction model, model training samples are generated according to the acquired network security status data of the electronic equipment, and each model training sample corresponds to a real network situation label. The model training sample is input into a to-be-trained network security situation prediction model, where the to-be-trained network security situation prediction model comprises a convolutional network layer, an attention mechanism layer, and a deep neural network layer. The convolutional network layer is called to process the model training sample to obtain a network situation characteristic diagram associated with the model training sample in the time dimension. The attention mechanism layer is called to process the network situation characteristics in the network situation characteristic diagram to obtain the attention enhancement characteristics. The deep neural network layer is called to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample. A loss value of the to-be-trained network security situation prediction model is calculated based on the prediction network situation label and the real network situation label. Under the condition that the loss value is within a preset range, the trained network security situation prediction model to be trained is taken as the final network security situation prediction model. According to the embodiment of the application, the deep relationships among the data are learned in view of the complexity and time variability of network security situation changes, so that a model for predicting the network security situation is obtained through training, and network security can be greatly improved.
An embodiment of the present application further provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the training method of the network security situation prediction model is implemented when the computer program is executed by the processor.
Fig. 8 shows a schematic structural diagram of an electronic device 800 according to an embodiment of the present application. As shown in fig. 8, the electronic device 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 802 or loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the electronic device 800 can also be stored. The CPU 801, the ROM 802, and the RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the electronic device 800 are connected to the I/O interface 805, including: an input unit 806 such as a keyboard, a mouse, a microphone, and the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The various procedures and processing described above may be performed by the CPU 801. For example, the method of any of the above embodiments may be implemented as a computer software program tangibly embodied on a computer-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. When loaded into the RAM 803 and executed by the CPU 801, the computer program may perform one or more steps of the methods described above.
Additionally, an embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for training the network security situation prediction model.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminals (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the true scope of the embodiments of the present application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "include", "including" or any other variations thereof are intended to cover non-exclusive inclusions, such that a process, method, article, or terminal that includes a list of elements does not include only those elements but also other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or terminal that comprises the element.
The above detailed description is given to a training method of a network security situation prediction model, a training apparatus of a network security situation prediction model, an electronic device, and a computer-readable storage medium, and specific examples are applied herein to explain the principles and embodiments of the present application, and the descriptions of the above embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (14)

1. A training method of a network security situation prediction model is characterized by comprising the following steps:
generating a model training sample according to the acquired network security status data of the electronic equipment; each model training sample corresponds to a real network situation label;
inputting the model training sample into a to-be-trained network security situation prediction model; the network security situation prediction model to be trained comprises: a convolution network layer, an attention mechanism layer and a deep neural network layer;
calling the convolutional network layer to process the model training sample to obtain a network situation characteristic diagram of the model training sample in a time dimension;
calling the attention mechanism layer to process the network situation characteristics in the network situation characteristic diagram to obtain attention enhancement characteristics;
calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample;
calculating to obtain a loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label;
and under the condition that the loss value is within a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model.
2. The method of claim 1, wherein generating model training samples from the collected network security condition data of the electronic device comprises:
collecting network security status data of the electronic equipment;
cleaning the network security status data to obtain processed network security status data;
and normalizing the processed network security condition data to obtain the model training sample.
3. The method of claim 1, wherein the invoking the convolutional network layer to process the model training sample to obtain a network situation feature map of the model training sample in a time dimension comprises:
calling the convolution network layer to perform convolution operation on the model training sample so as to obtain network situation characteristics and time dimension characteristics of the model training sample;
and fusing the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic diagram associated in a time dimension.
4. The method of claim 1, wherein the invoking the attention mechanism layer to process the network situation features in the network situation feature map to obtain attention enhancement features comprises:
calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain global situation features corresponding to each channel;
obtaining the weight corresponding to each channel according to the correlation index among the channels;
and processing the global situation characteristics and the corresponding weights of each channel to obtain attention enhancement characteristics.
5. The method of claim 1, wherein the network security situation prediction model to be trained further comprises a parameter optimization layer for optimizing model parameters, and
after the loss value of the to-be-trained network security situation prediction model is calculated and obtained based on the predicted network situation label and the real network situation label, the method further includes:
under the condition that the loss value is not within a preset range, calling the parameter optimization layer to determine a model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by adopting a genetic algorithm;
and adjusting the model parameters of the to-be-trained network security situation prediction model based on the model parameter adjustment value.
6. The method according to claim 1, wherein after taking the trained network security situation prediction model to be trained as the final network security situation prediction model, the method further comprises:
acquiring target network security status data of target equipment;
preprocessing the target network security condition data to obtain model input data;
inputting the model input data into the network security situation prediction model;
calling the convolution network layer to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in a time dimension;
calling the attention mechanism layer to process the target network situation characteristic diagram to obtain a target attention enhancement characteristic;
and calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
7. An apparatus for training a network security situation prediction model, the apparatus comprising:
the model sample generation module is used for generating a model training sample according to the acquired network security condition data of the electronic equipment; each model training sample corresponds to a real network situation label;
the model sample input module is used for inputting the model training sample to a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolution network layer, an attention mechanism layer and a deep neural network layer;
the situation characteristic diagram acquisition module is used for calling the convolution network layer to process the model training sample to obtain a network situation characteristic diagram of the model training sample in a time dimension;
the enhanced feature acquisition module is used for calling the attention mechanism layer to process the network situation features in the network situation feature map to obtain the attention enhanced features;
the prediction label acquisition module is used for calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample;
the loss value calculation module is used for calculating and obtaining the loss value of the to-be-trained network security situation prediction model based on the predicted network situation label and the real network situation label;
and the prediction model acquisition module is used for taking the trained network security situation prediction model to be trained as a final network security situation prediction model under the condition that the loss value is within a preset range.
8. The apparatus of claim 7, wherein the model sample generation module comprises:
the network condition data acquisition unit is used for acquiring network safety condition data of the electronic equipment;
the safety condition data acquisition unit is used for cleaning the network safety condition data to obtain processed network safety condition data;
and the model training sample acquisition unit is used for carrying out normalization processing on the processed network security condition data to obtain the model training sample.
9. The apparatus of claim 7, wherein the situational feature map obtaining module comprises:
the situation characteristic acquisition unit is used for calling the convolution network layer to carry out convolution operation on the model training sample so as to obtain the network situation characteristic and the time dimension characteristic of the model training sample;
and the situation characteristic map acquisition unit is used for carrying out fusion processing on the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic map associated in a time dimension.
10. The apparatus of claim 7, wherein the enhanced feature acquisition module comprises:
the global feature acquisition unit is used for calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain the global situation features corresponding to each channel;
the channel weight obtaining unit is used for obtaining the weight corresponding to each channel according to the correlation index among the channels;
and the enhanced feature acquisition unit is used for processing the global situation features and the corresponding weights of each channel to obtain attention enhanced features.
11. The apparatus of claim 7, wherein the network security situation prediction model to be trained further comprises a parameter optimization layer for optimizing model parameters, and
the device further comprises:
the parameter adjustment value determining module is used for calling the parameter optimization layer to determine a model parameter adjustment value of the to-be-trained network security situation prediction model according to the loss value by adopting a genetic algorithm under the condition that the loss value is not within a preset range;
and the model parameter adjusting module is used for adjusting the model parameters of the to-be-trained network security situation prediction model based on the model parameter adjusting values.
12. The apparatus of claim 7, further comprising:
the target condition data acquisition module is used for acquiring target network security condition data of the target equipment;
the model input data acquisition module is used for preprocessing the target network security condition data to obtain model input data;
the model input data input module is used for inputting the model input data into the network security situation prediction model;
the target characteristic diagram acquisition module is used for calling the convolution network layer to process the model input data to obtain a target network situation characteristic diagram associated with the model input data in a time dimension;
the target enhancement feature acquisition module is used for calling the attention mechanism layer to process the target network situation feature map to obtain target attention enhancement features;
and the network situation label acquisition module is used for calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
13. An electronic device, comprising:
a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the method for training a network security situation prediction model of any one of claims 1 to 6 when executing the program.
14. A computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of training a network security situation prediction model of any one of claims 1 to 6.
CN202211378514.1A 2022-11-04 2022-11-04 Training method and device of network security situation prediction model Pending CN115695025A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211378514.1A CN115695025A (en) 2022-11-04 2022-11-04 Training method and device of network security situation prediction model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211378514.1A CN115695025A (en) 2022-11-04 2022-11-04 Training method and device of network security situation prediction model

Publications (1)

Publication Number Publication Date
CN115695025A true CN115695025A (en) 2023-02-03

Family

ID=85049686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211378514.1A Pending CN115695025A (en) 2022-11-04 2022-11-04 Training method and device of network security situation prediction model

Country Status (1)

Country Link
CN (1) CN115695025A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116318907A (en) * 2023-02-28 2023-06-23 宿州市耀舱电子商务有限公司 Method and system for analyzing computer network situation based on big data and neural network
CN116318907B (en) * 2023-02-28 2023-12-08 上海熙宥信息科技有限公司 Method and system for analyzing computer network situation based on big data and neural network

Similar Documents

Publication Publication Date Title
Rather et al. Recurrent neural network and a hybrid model for prediction of stock returns
Asadi et al. A new hybrid artificial neural networks for rainfall–runoff process modeling
CN113362160B (en) Federal learning method and device for credit card anti-fraud
CN110110707A (en) Artificial intelligence CNN, LSTM neural network dynamic identifying system
CN111585948B (en) Intelligent network security situation prediction method based on power grid big data
CN109376913A (en) The prediction technique and device of precipitation
CN111538761A (en) Click rate prediction method based on attention mechanism
CN107730040A (en) Power information system log information comprehensive characteristics extracting method and device based on RBM
CN113378160A (en) Graph neural network model defense method and device based on generative confrontation network
CN112949821B (en) Network security situation awareness method based on dual-attention mechanism
CN112087442A (en) Time sequence related network intrusion detection method based on attention mechanism
CN108154256A (en) The determining method and device of forecasting risk value, storage medium
WO2023179099A1 (en) Image detection method and apparatus, and device and readable storage medium
CN114255121A (en) Credit risk prediction model training method and credit risk prediction method
CN115695025A (en) Training method and device of network security situation prediction model
Liu et al. AGRM: attention-based graph representation model for telecom fraud detection
Ding et al. Efficient BiSRU combined with feature dimensionality reduction for abnormal traffic detection
CN109871711B (en) Ocean big data sharing and distributing risk control model and method
CN116545679A (en) Industrial situation security basic framework and network attack behavior feature analysis method
CN117009785A (en) Security monitoring method, device, server and system
CN112019529A (en) New forms of energy power network intrusion detection system
CN112529637B (en) Service demand dynamic prediction method and system based on context awareness
CN111784381B (en) Power customer subdivision method and system based on privacy protection and SOM network
CN115859344A (en) Secret sharing-based safe sharing method for data of federal unmanned aerial vehicle group
CN114842425A (en) Abnormal behavior identification method for petrochemical process and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination