CN115695025B - Training method and device for network security situation prediction model - Google Patents

Training method and device for network security situation prediction model

Info

Publication number: CN115695025B
Application number: CN202211378514.1A
Authority: CN (China)
Other versions: CN115695025A (Chinese)
Prior art keywords: network, situation, model, network security, layer
Inventor: 葛康康
Current Assignee: China Telecom Corp Ltd
Original Assignee: China Telecom Corp Ltd
Legal status: Active (granted)

Application CN202211378514.1A filed by China Telecom Corp Ltd, with priority to CN202211378514.1A
Publication of application CN115695025A
Application granted; publication of CN115695025B


Abstract

The application provides a training method and device for a network security situation prediction model. The method comprises the following steps: generating model training samples according to collected network security status data of electronic equipment; inputting a model training sample into a network security situation prediction model to be trained; invoking a convolutional network layer to process the model training sample to obtain a network situation feature map associated with the model training sample in the time dimension; invoking an attention mechanism layer to process the network situation features in the network situation feature map to obtain attention enhancement features; invoking a deep neural network layer to process the attention enhancement features to obtain a predicted network situation label corresponding to the model training sample; calculating a loss value of the network security situation prediction model to be trained based on the predicted network situation label and the real network situation label; and, in the case that the loss value is within a preset range, taking the trained network security situation prediction model to be trained as the final network security situation prediction model. The application can improve network security.

Description

Training method and device for network security situation prediction model
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a training method and device of a network security situation prediction model.
Background
With the large-scale adoption of new technologies and applications such as cloud computing, big data, the Internet of Things and artificial intelligence across the domestic Internet industry, network security events are increasing.
To counter increasingly complex and diversified network attacks, traditional protection measures such as firewalls, access control, vulnerability scanning and intrusion detection provide both active and passive defense. Although these security devices can record security events and security logs, they operate independently of each other, so security information remains dispersed and cannot be shared. It is therefore difficult for a network security administrator to monitor the global network situation, and appropriate decisions cannot be made when an attack occurs, resulting in lower network security.
Disclosure of Invention
The embodiment of the application aims to provide a training method and a training device for a network security situation prediction model so as to greatly improve network security.
In a first aspect, an embodiment of the present application provides a training method for a network security situation prediction model, where the method includes:
generating a model training sample according to the collected network security condition data of the electronic equipment; each model training sample corresponds to a real network situation label;
inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolutional network layer, an attention mechanism layer and a deep neural network layer;
invoking the convolution network layer to process the model training sample to obtain a network situation feature map of the model training sample in a time dimension;
invoking the attention mechanism layer to process the network situation characteristics in the network situation characteristic map to obtain attention enhancement characteristics;
invoking the deep neural network layer to process the attention enhancement features to obtain a predicted network situation label corresponding to the model training sample;
Calculating to obtain a loss value of the network security situation prediction model to be trained based on the prediction network situation label and the real network situation label;
And under the condition that the loss value is in a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model.
Optionally, the generating a model training sample according to the collected network security status data of the electronic device includes:
collecting network security status data of the electronic equipment;
performing cleaning operation on the network security condition data to obtain processed network security condition data;
and normalizing the processed network security condition data to obtain the model training sample.
Optionally, the invoking the convolutional network layer to process the model training sample to obtain a network situation feature map of the model training sample in the time dimension includes:
invoking the convolutional network layer to perform a convolution operation on the model training sample to obtain network situation features and time dimension features of the model training sample;
and performing fusion processing on the network situation features and the time dimension features to obtain a network situation feature map associated in the time dimension.
Optionally, the invoking the attention mechanism layer to process the network situation features in the network situation feature map to obtain an attention enhancement feature includes:
invoking the attention mechanism layer to perform vector conversion processing on the network situation characteristics in each channel to obtain global situation characteristics corresponding to each channel;
Obtaining the weight corresponding to each channel according to the association index between the channels;
and processing the global situation characteristics and the corresponding weights of each channel to obtain the attention-enhancing characteristics.
Optionally, the network security situation prediction model to be trained further includes a parameter optimization layer, and
after the loss value of the network security situation prediction model to be trained is calculated based on the predicted network situation label and the real network situation label, the method further includes:
Under the condition that the loss value is not in a preset range, calling the parameter optimization layer to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm;
And adjusting model parameters of the network security situation prediction model to be trained based on the model parameter adjustment value.
Optionally, after the trained network security situation prediction model to be trained is used as a final network security situation prediction model, the method further includes:
Acquiring target network security condition data of target equipment;
preprocessing the target network security condition data to obtain model input data;
inputting the model input data to the network security situation prediction model;
invoking the convolution network layer to process the model input data to obtain a target network situation feature map associated with the model input data in a time dimension;
invoking the attention mechanism layer to process the target network situation feature map to obtain target attention enhancement features;
And calling the deep neural network layer to process the target attention enhancement feature to obtain a target prediction network situation label corresponding to the target equipment.
In a second aspect, an embodiment of the present application provides a training apparatus for a network security situation prediction model, where the apparatus includes:
The model sample generation module is used for generating a model training sample according to the collected network security condition data of the electronic equipment; each model training sample corresponds to a real network situation label;
the model sample input module is used for inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolutional network layer, an attention mechanism layer and a deep neural network layer;
the situation feature map acquisition module is used for calling the convolution network layer to process the model training sample to obtain a network situation feature map of the model training sample in the time dimension;
the enhanced feature acquisition module is used for calling the attention mechanism layer to process the network situation features in the network situation feature map so as to obtain attention enhanced features;
the prediction label acquisition module is used for calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample;
the loss value calculation module is used for calculating the loss value of the network security situation prediction model to be trained based on the prediction network situation label and the real network situation label;
The prediction model acquisition module is used for taking the trained network security situation prediction model to be trained as a final network security situation prediction model under the condition that the loss value is in a preset range.
Optionally, the model sample generation module includes:
the network condition data acquisition unit is used for acquiring network security condition data of the electronic equipment;
the security condition data acquisition unit is used for performing a cleaning operation on the network security condition data to obtain processed network security condition data;
The model training sample acquisition unit is used for carrying out normalization processing on the processed network security condition data to obtain the model training sample.
Optionally, the situation feature map obtaining module includes:
The situation characteristic acquisition unit is used for calling the convolution network layer to execute convolution operation on the model training sample so as to obtain network situation characteristics and time dimension characteristics of the model training sample;
and the situation feature map acquisition unit is used for carrying out fusion processing on the network situation features and the time dimension features to obtain a network situation feature map associated under the time dimension.
Optionally, the enhanced feature acquisition module includes:
the global feature acquisition unit is used for calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain global situation features corresponding to each channel;
the channel weight acquisition unit is used for acquiring the weight corresponding to each channel according to the association index among the channels;
And the enhancement feature acquisition unit is used for processing the global situation features and the corresponding weights of each channel to obtain attention enhancement features.
Optionally, the network security situation prediction model to be trained further includes a parameter optimization layer, and
the apparatus further comprises:
The parameter adjustment value determining module is used for calling the parameter optimization layer to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm under the condition that the loss value is not in a preset range;
And the model parameter adjustment module is used for adjusting the model parameters of the network security situation prediction model to be trained based on the model parameter adjustment value.
Optionally, the apparatus further comprises:
The target condition data acquisition module is used for acquiring target network security condition data of the target equipment;
The model input data acquisition module is used for preprocessing the target network security condition data to obtain model input data;
the model input data input module is used for inputting the model input data into the network security situation prediction model;
The target feature map acquisition module is used for calling the convolution network layer to process the model input data to obtain a target network situation feature map associated with the model input data in a time dimension;
The target enhanced feature acquisition module is used for calling the attention mechanism layer to process the target network situation feature map so as to obtain target attention enhancement features;
And the network situation label acquisition module is used for calling the deep neural network layer to process the target attention enhancement characteristic so as to obtain a target prediction network situation label corresponding to the target equipment.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the training method of the network security situation prediction model of any one of the above when executing the program.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, enables the electronic device to perform the training method of the network security situation prediction model described in any one of the above.
Compared with the prior art, the embodiment of the application has the following advantages:
According to the embodiment of the application, model training samples are generated according to the collected network security condition data of the electronic equipment, and each model training sample corresponds to one real network situation label. The model training sample is input into a network security situation prediction model to be trained, where the network security situation prediction model to be trained comprises: a convolutional network layer, an attention mechanism layer, and a deep neural network layer. The convolutional network layer is called to process the model training sample to obtain a network situation feature map associated with the model training sample in the time dimension. The attention mechanism layer is called to process the network situation features in the network situation feature map to obtain attention enhancement features. The deep neural network layer is called to process the attention enhancement features to obtain a predicted network situation label corresponding to the model training sample. A loss value of the network security situation prediction model to be trained is calculated based on the predicted network situation label and the real network situation label. In the case that the loss value is within a preset range, the trained network security situation prediction model to be trained is taken as the final network security situation prediction model. By taking into account the complexity and time variability of network security situation changes, the embodiment of the application learns the deep relations among the data to train a model for predicting the network security situation, so that network security can be greatly improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
Fig. 1 is a step flowchart of a training method of a network security situation prediction model according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating steps of a method for obtaining a model training sample according to an embodiment of the present application;
Fig. 3 is a flowchart of steps of a method for obtaining a network situation feature map according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating steps of a method for obtaining an attention-enhancing feature according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating steps of a method for adjusting model parameters according to an embodiment of the present application;
fig. 6 is a flowchart of steps of a method for obtaining a predicted label of a predicted network situation of a target according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a training device of a network security situation prediction model according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will become more readily apparent, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In a network situation analysis scenario, network security situation prediction means that the situation values obtained by evaluating historical network data are analyzed and fused, the deep relations among the data are mined, and the development trend of the future situation is predicted by using theoretical methods such as expert knowledge, so as to provide a decision basis for security managers. Network security situation changes are complex, nonlinear and time-varying, while neural network technology offers high fault tolerance as well as strong nonlinear mapping and generalization capabilities for complex systems. Compared with traditional machine learning models, deep learning models therefore have great potential in the field of network security situation prediction.
The embodiment of the application combines the complexity and time-varying characteristics of network security situation changes and mines the deep relations among data to train a model for predicting the network security situation, thereby greatly improving network security.
The training process of the network security situation prediction model provided by the embodiment of the application is described in detail below with reference to specific embodiments.
Referring to fig. 1, a step flowchart of a training method of a network security situation prediction model provided by an embodiment of the present application is shown, and as shown in fig. 1, the training method of the network security situation prediction model may include the following steps:
Step 101: generating a model training sample according to the collected network security condition data of the electronic equipment; each of the model training samples corresponds to a real network situation label.
In this embodiment, the network security status data may include: data content such as IT asset information, topology information, vulnerability information and the like.
When the network security situation prediction model is trained, network security situation data of the electronic equipment can be collected, and a model training sample is generated according to the network security situation data of the electronic equipment. In a specific implementation, after the network security condition data of the electronic equipment are collected, the network security condition data can be cleaned, and then normalization processing is performed, so that a model training sample can be obtained. In this example, each model training sample corresponds to a real network situation label, which is a network security situation category label marked for the model training sample.
The process of obtaining the model training samples may be described in detail below in conjunction with fig. 2.
Referring to fig. 2, a flowchart illustrating steps of a model training sample obtaining method according to an embodiment of the present application is shown, where, as shown in fig. 2, the model training sample obtaining method may include: step 201, step 202 and step 203.
Step 201: and collecting network security condition data of the electronic equipment.
In this embodiment, when training the network security situation prediction model, network security status data of the electronic device, such as IT asset information, topology information, vulnerability information, and the like of the electronic device, may be collected.
After the network security status data of the electronic device is collected, step 202 is performed.
Step 202: and cleaning the network security condition data to obtain processed network security condition data.
After the network security condition data of the electronic device is collected, a cleaning operation may be performed on the network security condition data to obtain processed network security condition data. Specifically, operations such as missing value processing, duplicate value removal, noise reduction, outlier removal and the like may be performed on the network security condition data to obtain processed network security condition data.
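As an illustration of this cleaning step, the following is a minimal sketch in Python, assuming the collected network security status data has been loaded into a pandas DataFrame; the function name and the 3-sigma outlier rule are assumptions for illustration:

```python
import pandas as pd

def clean_security_data(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning of collected network security status data."""
    df = df.drop_duplicates()                     # duplicate value removal
    df = df.fillna(df.median(numeric_only=True))  # missing value processing
    numeric = df.select_dtypes("number")
    # outlier removal: keep rows within 3 standard deviations for every numeric column
    mask = ((numeric - numeric.mean()).abs() <= 3 * numeric.std()).all(axis=1)
    return df[mask]
```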
After the network security condition data is processed by performing the cleaning operation on the network security condition data, step 203 is performed.
Step 203: and normalizing the processed network security condition data to obtain the model training sample.
After the network security condition data is cleaned to obtain processed network security condition data, the processed network security condition data can be normalized to obtain a model training sample.
In a specific implementation, data normalization can reduce the variance of the features to a certain range, reduce the influence of abnormal values and improve the convergence rate of the model. The common normalization modes map data to either [0,1] or [-1,1]; the latter is selected in this embodiment, and the feature data is normalized to between -1 and 1 by the min-max normalization method. Specifically, for the feature data H_x = [h_x1, h_x2, ..., h_xn] (x = 1, 2, 3, 4, 5), where n represents the total number of samples, the result of mapping h_xi to the interval [-1,1] is h'_xi, which is calculated as follows:
h'_xi = 2 × (h_xi - min(H_x)) / (max(H_x) - min(H_x)) - 1 (1)
In the above formula (1), min(H_x) represents the minimum value of the x-th dimensional feature H_x, and max(H_x) represents the maximum value of the x-th dimensional feature H_x.
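As a concrete illustration of formula (1), the following is a minimal sketch of the [-1, 1] min-max normalization in Python, assuming the cleaned feature data is held in a NumPy array with one row per sample and one column per feature (the function name and array layout are assumptions):

```python
import numpy as np

def min_max_normalize(features: np.ndarray) -> np.ndarray:
    """Map each feature column H_x to the interval [-1, 1] as in formula (1)."""
    col_min = features.min(axis=0)
    col_max = features.max(axis=0)
    span = np.where(col_max > col_min, col_max - col_min, 1.0)  # guard against constant columns
    return 2.0 * (features - col_min) / span - 1.0
```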
It may be appreciated that a relatively large number of model training samples is required for model training. In this example, the sample corresponding to the network security status data of each electronic device may be used as one training sample. Therefore, to obtain numerous model training samples, the network security status data of a preset number of electronic devices may be acquired, and the network security status data of each electronic device may be processed to obtain the model training sample corresponding to that device, so that the preset number of model training samples can be obtained.
After generating model training samples from the collected network security status data of the electronic device, step 102 is performed.
Step 102: inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises: a convolutional network layer, an attention mechanism layer, and a deep neural network layer.
The network security situation prediction model to be trained refers to a model which is built in advance and is not trained yet and is used for predicting the network security situation. In this example, the network security posture prediction model to be trained may include: a convolutional network layer, an attention mechanism layer, and a deep neural network layer.
After generating the model training sample according to the collected network security condition data of the electronic device, the model training sample can be input into a network security situation prediction model to be trained.
After inputting the model training samples into the network security posture prediction model to be trained, step 103 is performed.
Step 103: and calling the convolution network layer to process the model training sample to obtain a network situation feature map of the model training sample in the time dimension.
After the model training samples are input into the network security situation prediction model to be trained, a convolution network layer can be called to process the model training samples so as to obtain a network situation feature map associated with the model training samples in the time dimension. The association relation of the model training samples in the time dimension is mined through the convolutional neural network, so that the dependency relation between data is fully learned. The specific process of obtaining a network situation feature map in which model training samples are associated in the time dimension may be described in detail below in conjunction with fig. 3.
Referring to fig. 3, a step flowchart of a network situation feature map obtaining method provided by an embodiment of the present application is shown, where, as shown in fig. 3, the network situation feature map obtaining method may include: step 301 and step 302.
Step 301: and calling the convolution network layer to execute convolution operation on the model training sample so as to obtain the network situation characteristic and the time dimension characteristic of the model training sample.
In this embodiment, a 2-layer convolutional neural network may be used for the convolution operation. The convolution layer is the main part that distinguishes a convolutional neural network from other traditional neural networks: it acts as a feature extractor, which realizes parameter sharing and reduces the number of parameters. The process can be expressed by the following formula (2):
y=δ(W*x+b) (2)
In the above formula (2), y is the output of the convolutional network layer, δ is the activation function, W is the weight matrix, * is the convolution operation, x is the input of the convolution layer, and b is the bias variable.
The most critical part of the convolution operation is the convolution kernel, also called a filter. The convolution kernel is essentially a numerical parameter array of fixed size. The working algorithm of the convolution kernel may be expressed as shown in formulas (3), (4) and (5):
In the above formulas, Y_t is the input of the (t+1)-th layer, Y_{t+1} is the situation feature map output by the (t+1)-th layer, b is the bias variable, and L_{t+1} is the size of Y_{t+1}. Y(i, j) is the pixel value at the (i, j) coordinate point in the corresponding feature map; p, k and s are the size, stride and padding of the convolution kernel in the convolution layer, respectively; and C is the number of channels of the feature map. The summation in the above formulas is equivalent to a single cross-correlation computation.
The convolution layer extracts local features of the input information by utilizing convolution operation, and the difference of the input data can influence the final result, so that different feature activations are obtained, and the feature mapping is realized. The same group of convolution kernel parameters are used for carrying out convolution operation in the same layer of convolution layer, and when the weight is updated, the group of convolution kernel parameters are only required to be updated, which is called weight sharing. And processing the features by using a nonlinear activation function after the convolution operation, and enhancing the nonlinear expression capability of the learning features.
In the convolutional neural network of the present embodiment, the GELU function is employed as the activation function. The activation function maps the input of a neuron to the output and introduces nonlinear factors, mapping the features into a nonlinear space so that the features can be extracted more effectively. If a linear activation function were used, the neural network could only realize a linear combination of the input information. The GELU activation function used as the activation function of the neural network in this embodiment is defined as follows:
GELU(x) = x·Φ(x) (6)
where Φ(x) is the cumulative distribution function of the standard normal distribution.
In the convolutional neural network of the present embodiment, the loss function adopts the cross entropy loss function:
L = -Σ_(k=1..n) p_k(x)·log(q_k(x)) (7)
Wherein n is the number of categories in the classification problem, p_k(x) is the true network security situation category value, q_k(x) is the predicted network security situation category value, and k represents the k-th category type. The algorithm in this embodiment also performs L2 regularization by adding a regularization term, where the formula is as follows:
where λ is the regularization coefficient, m is the number of parameters participating in regularization, and ω is a parameter term participating in regularization.
L2 regularization can limit the range of parameter values, preventing overfitting from occurring.
The cross entropy can measure the difference degree of different probability distributions in the same random variable, and is expressed as the difference between the true probability distribution and the predicted probability distribution in the learning of the neural network model, and the smaller the value of the cross entropy is, the better the model prediction effect is.
When the GELU activation function is used, the cross entropy loss function better solves the problem that the squared loss function updates weights too slowly. It has the desirable property that the weight update speed is linearly related to the magnitude of the error: the larger the error, the faster the weights are updated, and the smaller the error, the slower they are updated.
In classification problems, the cross entropy loss function works effectively together with the Softmax activation function: the Softmax function processes the output result so that the predicted values of the classes sum to 1, and the loss is then calculated by the cross entropy function.
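To make the loss construction concrete, here is a minimal sketch assuming a PyTorch implementation; the regularization coefficient, its normalization and the choice of which parameters to penalize are assumptions rather than values fixed by the embodiment:

```python
import torch
import torch.nn.functional as F

def regularized_loss(logits: torch.Tensor, targets: torch.Tensor, model, lam: float = 1e-4):
    """Softmax cross entropy plus an L2 penalty on the weight matrices."""
    ce = F.cross_entropy(logits, targets)  # applies Softmax internally, then cross entropy
    l2 = sum((p ** 2).sum() for p in model.parameters() if p.dim() > 1)
    return ce + lam * l2
```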
In this embodiment, a 2-layer convolutional neural network is used, and to prevent overfitting, 1 pooling layer is used after each convolution layer. The main function of the pooling operation is to compress the amount of data and the number of parameters while retaining the critical local information in each feature, reduce the spatial feature dimension, and implement downsampling of the data; for this reason the pooling layer, also called the downsampling layer, is usually placed after the convolution layer. The pooling layer can effectively improve the calculation speed, improve the robustness of the extracted features, improve the generalization capability of the model and prevent overfitting. The pooling operation may be represented by formula (9):
where the left-hand side denotes the pooled output data and q is the pooling type: average pooling is used when q is 1, and maximum pooling when q is 0.
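As an illustration of the 2-layer convolutional feature extractor described above (each convolution layer followed by GELU activation and a pooling layer), here is a minimal sketch assuming a PyTorch implementation; the channel counts and kernel sizes are illustrative assumptions, not values specified by the embodiment:

```python
import torch.nn as nn

class ConvFeatureExtractor(nn.Module):
    """Two convolution layers, each followed by GELU activation and a pooling layer."""
    def __init__(self, in_channels: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.GELU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.GELU(),
            nn.MaxPool2d(2),
        )

    def forward(self, x):
        # x: (batch, channels, time, feature) -> network situation feature map
        return self.features(x)
```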
After the model training sample is obtained, a convolution network layer can be called to execute convolution operation on the model training sample so as to obtain the network situation characteristics and the time dimension characteristics of the model training sample. The time dimension feature may be used to indicate time information of the network situation feature.
After invoking the convolutional network layer to perform a convolutional operation on the model training samples to obtain the network situation features and the time dimension features of the model training samples, step 302 is performed.
Step 302: and carrying out fusion processing on the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic diagram associated under the time dimension.
After the convolution network layer is called to execute convolution operation on the model training sample to obtain the network situation feature and the time dimension feature of the model training sample, fusion processing can be carried out on the network situation feature and the time dimension feature to obtain the associated network situation feature map under the time dimension.
After invoking the convolutional network layer to process the model training samples to obtain the network situation feature map associated with the model training samples in the time dimension, step 104 is performed.
Step 104: and calling the attention mechanism layer to process the network situation characteristics in the network situation characteristic map so as to obtain attention enhancement characteristics.
After the convolutional network layer is called to process the model training samples to obtain a network situation feature map which is related to the model training samples in the time dimension, the attention mechanism layer can be called to process the network situation features in the network situation feature map to obtain attention enhancement features.
The attention mechanism layer is described as follows:
The squeeze-and-excitation channel attention module first performs a squeeze operation on the feature map, using global average pooling to convert the entire spatial feature of a channel into a global spatial feature that serves as the representation of that channel. This can be represented by the following formula (10):
F_sq(X_m) = (1 / (H × W)) Σ_(i=1..H) Σ_(j=1..W) X_m(i, j) (10)
In formula (10), X_m(i, j) represents the channel eigenvalue of the m-th feature map X_m at position (i, j), H and W are the height and width of the feature map, and F_sq(·) represents the squeeze operation, that is, the global average pooling (GAP) operation.
After the global description feature is obtained, the correlation between channels is captured by the excitation operation. A structure containing two fully connected layers is adopted: the first fully connected layer is responsible for dimension reduction and is activated by the ReLU function, and the second fully connected layer restores the original dimension. A gating mechanism in the form of a Sigmoid function is introduced, which flexibly learns the nonlinear relations among the channels to obtain a weight between 0 and 1; each original feature is then multiplied by the weight of the corresponding channel to obtain a new feature map. The operation process is shown in formula (11):
In formula (11), f(·) and δ(·) represent the Sigmoid function and the ReLU function, respectively; W_U is the weight of the convolution layer that increases the number of channels of the low-dimensional feature map by a certain ratio, and W_X is the weight of the convolution layer that reduces the number of channels. By training these two weights, each channel is activated to obtain a one-dimensional excitation weight. u represents the channel statistic that is finally obtained, and u_m denotes the m-th channel scaling descriptor, which is multiplied by X_m channel by channel to obtain the Hadamard product.
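A minimal sketch of this channel attention (squeeze-and-excitation style) mechanism, assuming a PyTorch implementation; the reduction ratio and class name are illustrative assumptions:

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze (global average pooling) + excitation (two FC layers) + channel-wise scaling."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)           # GAP: one global statistic per channel
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # dimension reduction (role of W_X)
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),  # dimension restoration (role of W_U)
            nn.Sigmoid(),                                # gate: a weight in (0, 1) per channel
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        u = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * u  # attention-enhanced feature map
```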
The process of obtaining the attention-enhancing feature may be described in detail below in connection with fig. 4.
Referring to fig. 4, a flowchart illustrating steps of a method for obtaining an attention-enhancing feature according to an embodiment of the present application is shown, and as shown in fig. 4, the method for obtaining an attention-enhancing feature may include: step 401, step 402 and step 403.
Step 401: and calling the attention mechanism layer to perform vector conversion processing on the network situation characteristics in each channel to obtain the global situation characteristics corresponding to each channel.
In this embodiment, the obtained network situation feature map is a feature map formed by three-dimensional features, each dimension of network situation feature corresponds to one channel, and after the network situation feature map is obtained, the attention mechanism layer may be called to perform vector conversion processing on the network situation feature in each channel, so as to obtain a global situation feature corresponding to each channel.
After invoking the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain global situation features corresponding to each channel, step 402 is executed.
Step 402: and obtaining the weight corresponding to each channel according to the association index among the channels.
In this embodiment, the weight corresponding to each channel may be obtained according to the association index between channels. The method for obtaining the weight may refer to the description of the attention mechanism layer, and this embodiment is not described herein.
After obtaining the weight corresponding to each channel according to the association index between channels, step 403 is performed.
Step 403: and processing the global situation characteristics and the corresponding weights of each channel to obtain the attention-enhancing characteristics.
After the weight corresponding to each channel is obtained according to the association index between channels, the processing can be performed according to the global situation characteristic and the corresponding weight of each channel so as to obtain the attention enhancement characteristic. Specifically, the global situation feature of each channel may be multiplied by the weight of the corresponding channel, so as to obtain a new feature map, where the new feature map includes the attention enhancing feature.
Step 105 is performed after invoking the attention mechanism layer to process the network situation feature in the network situation feature map to obtain an attention enhanced feature.
Step 105: and calling the deep neural network layer to process the attention enhancement features to obtain a predicted network situation label corresponding to the model training sample.
The predicted network situation label refers to a predicted network situation category of a model training sample output by a network security situation prediction model to be trained.
After the attention mechanism layer is called to process the network situation characteristics in the network situation characteristic map to obtain attention enhancement characteristics, the deep neural network layer can be called to process the attention enhancement characteristics to obtain predicted network situation labels corresponding to the model training samples.
In this embodiment, the deep neural network layer may be a deep neural network optimized by a genetic algorithm. A basic deep neural network performs learning and training by the error back-propagation algorithm, and has a simple structure, strong plasticity and strong data-fitting capability. The deep neural network mainly comprises an input layer, a hidden layer and an output layer. During training, the neural network continuously adjusts the weights and thresholds between the input layer and the hidden layer and between the hidden layer and the output layer; training stops when the output value of the neural network is consistent with the target value or the set number of iterations is reached, so that the neural network has strong generalization capability.
The invention adopts a genetic algorithm instead of the BP algorithm and optimizes the deep neural network through the genetic algorithm. The genetic algorithm is designed according to the laws of biological evolution in nature. Its working principle is to first encode the input data and then perform selection, crossover and mutation operations with certain probabilities until the individual with the largest fitness is selected and output as the target value, at which point the operation stops.
In the genetic algorithm adopted by the invention, the inverse of the squared error is used as the fitness function, and the fitness of individuals in the population is measured according to the following formula:
In the above formula (12), E is the error function, P is the overall output, W is the weight vector, x is the input vector, F is the fitness, j is the number of selections, and y_i is the theoretical output value.
A traditional genetic algorithm often adopts a "roulette" selection mode, in which the probability of an individual in the population being selected is random; this selection mode is likely to lose the optimal individual and can produce large errors in actual operation. Therefore, the invention improves the selection operator: the individuals in the population are first rearranged by a sorting method, and the probability of an individual being selected after the rearrangement is:
In the above formula (13), a is the population size of the genetic algorithm, p_0 is the probability that the optimal individual may be selected, S is the value obtained by normalizing p_0, and b is the position of the n-th individual after the population is rearranged.
The genetic algorithm adopted in the invention also improves the crossover operator. A conventional genetic algorithm typically sets the crossover probability to a constant between 0.3 and 0.8. If the crossover probability is set too high, the global search capability of the genetic algorithm improves but the adaptability of the chromosomes decreases; if it is set too low, both the global search capability and the convergence speed decrease. The invention improves the crossover operator so that the crossover probability is adjusted according to the change of fitness over the generations of the algorithm. The improved crossover probability is:
In the above formula (14), F is the maximum fitness of the two crossed individuals in the population, mean is the average fitness of the whole population, n is the current number of generations of the genetic algorithm, and n_max is the maximum number of generations of the evolution operator. When the genetic algorithm is initialized, the minimum crossover probability P_jmin can be set to 0.3 and the maximum crossover probability P_jmax can be set to 0.8.
The genetic algorithm used in this embodiment also improves the mutation operator. A conventional genetic algorithm generally sets the mutation probability to a constant between 0.001 and 0.1. In the early stage of the genetic algorithm, the fitness of individuals in the population is relatively low compared with the average fitness, so the mutation probability needs to be set to a small value to retain the individuals with excellent genes in their chromosomes. In the later stage, the fitness of individuals is relatively higher than the average fitness, so the mutation probability needs to be set to a larger value to improve the local search capability of the genetic algorithm. The invention improves the mutation operator so that the mutation probability is adjusted according to the change of fitness during the operation of the genetic algorithm. The improved mutation probability is shown in the following formula:
In the above formula (15), F is the maximum fitness of the two mutated individuals in the population, mean is the average fitness of the whole population, n is the current number of generations of the genetic algorithm, and n_max is the maximum number of generations of the evolution operator. When the genetic algorithm is initialized, the minimum mutation probability P_bmin can be set to 0.001 and the maximum mutation probability P_bmax can be set to 0.1.
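The following sketch shows the general shape of such a genetic algorithm with rank-based selection and adaptive crossover/mutation probabilities. Since formulas (13)-(15) are not reproduced above, the linear interpolation between the minimum and maximum probabilities over the generations is an assumption standing in for the improved operators, and the helper names are hypothetical:

```python
import random

def adaptive_prob(p_min: float, p_max: float, gen: int, max_gen: int) -> float:
    """Assumed stand-in for the improved operators: probability grows with the generation count."""
    return p_min + (p_max - p_min) * gen / max_gen

def evolve(population, fitness_fn, crossover_fn, mutate_fn, max_gen: int = 100):
    """Rank individuals, then select, cross over and mutate with adaptive probabilities."""
    for gen in range(max_gen):
        ranked = sorted(population, key=fitness_fn, reverse=True)
        p_cross = adaptive_prob(0.3, 0.8, gen, max_gen)    # P_jmin / P_jmax
        p_mut = adaptive_prob(0.001, 0.1, gen, max_gen)    # P_bmin / P_bmax
        next_gen = ranked[:2]                              # keep the best individuals
        pool = ranked[: max(2, len(ranked) // 2)]          # rank-biased selection pool
        while len(next_gen) < len(population):
            a, b = random.sample(pool, 2)
            child = crossover_fn(a, b) if random.random() < p_cross else a
            if random.random() < p_mut:
                child = mutate_fn(child)
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness_fn)
```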
In this embodiment, the parameters of the deep neural network used are: the number of network layers is 5, the number of hidden nodes is 100, the learning rate is 0.001, the batch size is 64, and the number of iterations is 200. The classifier used in this embodiment may be a Softmax classification function or the like.
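A minimal sketch of a deep neural network with the parameters listed above (5 layers, 100 hidden nodes), assuming a PyTorch implementation; the input dimension and number of situation classes are hypothetical, and Softmax is applied by the classifier/loss at the output:

```python
import torch.nn as nn

def build_dnn(input_dim: int = 32, num_classes: int = 4, hidden: int = 100) -> nn.Sequential:
    """Five-layer fully connected network with 100 hidden nodes per hidden layer."""
    return nn.Sequential(
        nn.Linear(input_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, num_classes),  # class scores; Softmax applied in the loss/classifier
    )
```

The learning rate of 0.001, batch size of 64 and 200 iterations mentioned above would then be supplied to whatever procedure drives the parameter updates.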
And after the deep neural network layer is called to process the attention enhancement features to obtain the predicted network situation labels corresponding to the model training samples, executing step 106.
Step 106: and calculating to obtain a loss value of the network security situation prediction model to be trained based on the prediction network situation label and the real network situation label.
After the deep neural network layer is called to process the attention enhancement features to obtain the predicted network situation labels corresponding to the model training samples, the loss value of the network security situation predicted model to be trained can be calculated based on the predicted network situation labels and the real network situation labels.
In a specific implementation, the calculation mode of the loss value may be a calculation mode corresponding to a cross entropy loss function, etc., specifically, the specific calculation mode of the loss value may be determined according to a service requirement, which is not limited in this embodiment.
After calculating the loss value of the network security posture prediction model to be trained based on the predicted network posture label and the real network posture label, step 107 is performed.
Step 107: and under the condition that the loss value is in a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model.
After the loss value of the network security situation prediction model to be trained is obtained through calculation based on the prediction network situation label and the real network situation label, whether the loss value is in a preset range or not can be judged.
If the loss value is within the preset range, the trained network security situation prediction model to be trained can be used as a final network security situation prediction model, and the network security situation prediction model can be applied to a subsequent scene of predicting the network security situation of the electronic equipment.
If the loss value is not in the preset range, the model parameters of the network security situation prediction model to be trained can be adjusted according to the loss value, and the network security situation prediction model to be trained with the model parameters adjusted can be continuously trained. The process of adjusting the model parameters may be described in detail below in conjunction with fig. 5.
Referring to fig. 5, a flowchart illustrating steps of a method for adjusting model parameters according to an embodiment of the present application is shown, where, as shown in fig. 5, the method for adjusting model parameters may include: step 501 and step 502.
In this embodiment, the network security situation prediction model to be trained may further include: the parameter optimization layer, after the step 106, may further include:
Step 501: and under the condition that the loss value is not in a preset range, calling the parameter optimization layer to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm.
In this embodiment, after the calculated loss value of the network security situation prediction model to be trained, if the loss value is not within the preset range, the parameter optimization layer may be invoked to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by using a genetic algorithm.
After the parameter optimization layer is invoked to determine the model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm, step 502 is executed.
Step 502: and adjusting model parameters of the network security situation prediction model to be trained based on the model parameter adjustment value.
After the parameter optimization layer is called to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm, the model parameters of the network security situation prediction model to be trained can be adjusted based on the model parameter adjustment value.
After the model parameters of the network security situation prediction model to be trained are adjusted, the network security situation prediction model to be trained after the model parameters are adjusted can be trained until the model converges.
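Putting steps 101 through 107 together, the following sketch outlines one possible training loop with the loss-threshold check and the parameter-adjustment branch; the threshold value, round limit and the adjust_parameters callback (which would wrap the genetic-algorithm-based parameter optimization layer) are assumptions for illustration:

```python
def train(model, samples, labels, loss_fn, adjust_parameters,
          loss_threshold: float = 0.05, max_rounds: int = 200):
    """Schematic training loop: stop once the loss falls within the preset range,
    otherwise let the parameter optimization layer propose new model parameters."""
    for _ in range(max_rounds):
        predictions = model(samples)           # conv layer -> attention layer -> DNN layer
        loss = loss_fn(predictions, labels)
        if float(loss) <= loss_threshold:      # loss value within the preset range
            return model                       # final network security situation prediction model
        adjust_parameters(model, float(loss))  # e.g. genetic-algorithm-based adjustment
    return model
```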
After training to obtain a final network security situation prediction model, the network security situation prediction model can be used to perform reasoning tasks. The model reasoning process can be described in detail below in connection with FIG. 6.
Referring to fig. 6, a flowchart illustrating steps of a method for obtaining a target predicted network situation label according to an embodiment of the present application is shown, where, as shown in fig. 6, the method for obtaining a target predicted network situation label may include: step 601, step 602, step 603, step 604, step 605 and step 606.
Step 601: and acquiring target network security condition data of the target equipment.
In this embodiment, the target device refers to an electronic device that needs to perform network security situation prediction.
The target network security status data refers to network security status data over a historical time of the target device.
When predicting the network security situation of the target device, the target network security situation data of the target device can be obtained.
After the target network security status data of the target device is obtained, step 602 is performed.
Step 602: and preprocessing the target network security condition data to obtain model input data.
After the target network security status data of the target device is obtained, the target network security status data may be preprocessed to obtain model input data, and specifically, data cleaning and normalization processing may be performed on the target network security status data to obtain model input data.
After preprocessing the target network security status data to obtain model input data, step 603 is performed.
Step 603: and inputting the model input data into the network security situation prediction model.
After preprocessing the target network security status data to obtain model input data, the model input data can be input into a network security situation prediction model to predict the network security situation.
After the model input data is input to the network security posture prediction model, step 604 is performed.
Step 604: and calling the convolution network layer to process the model input data to obtain a target network situation feature map associated with the model input data in the time dimension.
After the model input data is input into the network security situation prediction model, a convolution network layer can be called to process the model input data, and a target network situation feature map associated with the model input data in a time dimension is obtained. For this processing procedure, reference may be made to the implementation procedure of the network situation feature map, and this embodiment will not be described herein.
After invoking the convolutional network layer to process the model input data to obtain a target network situation feature map associated with the model input data in the time dimension, step 605 is performed.
Step 605: and calling the attention mechanism layer to process the target network situation feature map so as to obtain target attention enhancement features.
After the convolution network layer is called to process the model input data to obtain a target network situation feature map related to the model input data in the time dimension, the attention mechanism layer can be called to process the target network situation feature map to obtain target attention enhancement features. For this process, reference may be made to the implementation of the above-mentioned attention-enhancing feature, and this embodiment will not be described herein.
After invoking the attention mechanism layer to process the target network situation feature map to obtain the target attention enhancement feature, step 606 is performed.
Step 606: and calling the deep neural network layer to process the target attention enhancement feature to obtain a target prediction network situation label corresponding to the target equipment.
After the attention mechanism layer is called to process the target network situation feature map to obtain target attention enhancement features, the deep neural network layer can be called to process the target attention enhancement features to obtain target prediction network situation labels corresponding to target devices.
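A minimal sketch of this inference flow (steps 601 to 606): preprocess the target device's data, run it through the trained model, and read off the predicted situation label. The preprocessing reuses the hypothetical helpers sketched earlier, and the tensor layout and class names are assumptions:

```python
import torch

def predict_situation(model, raw_target_data, class_names):
    """Preprocess target network security status data and predict its situation label."""
    cleaned = clean_security_data(raw_target_data)                     # cleaning (step 602)
    normalized = min_max_normalize(cleaned.to_numpy(dtype="float32"))  # normalization (step 602)
    inputs = torch.from_numpy(normalized).unsqueeze(0).unsqueeze(0)    # shape (1, 1, T, F)
    with torch.no_grad():
        logits = model(inputs)                                         # steps 603 to 606
    return class_names[int(logits.argmax(dim=-1))]
```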
Based on a detailed analysis of key technologies such as data acquisition, association analysis, situation assessment and security response, the embodiment of the application provides the overall design of a network security situation awareness system and carries out detailed design and technical realization of its key modules, thereby greatly improving network security.
According to the training method of the network security situation prediction model provided by the embodiment of the application, model training samples are generated from the collected network security condition data of the electronic device, and each model training sample corresponds to one real network situation label. The model training samples are input into a network security situation prediction model to be trained, which includes a convolutional network layer, an attention mechanism layer, and a deep neural network layer. The convolutional network layer is called to process the model training samples to obtain a network situation feature map associated with the model training samples in the time dimension. The attention mechanism layer is called to process the network situation features in the network situation feature map to obtain attention enhancement features. The deep neural network layer is called to process the attention enhancement features to obtain predicted network situation labels corresponding to the model training samples. A loss value of the network security situation prediction model to be trained is calculated based on the predicted network situation labels and the real network situation labels. When the loss value is within a preset range, the trained model is taken as the final network security situation prediction model. By accounting for the complexity and time variability of network security situation changes and learning the deep relations among the data, the embodiment of the application obtains a model capable of predicting the network security situation, which can greatly improve network security.
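For illustration, the training procedure summarized above can be sketched as follows. PyTorch, the cross-entropy loss, the Adam-style gradient update, and the concrete threshold standing in for the "preset range" are assumptions of this example; the embodiment itself adjusts model parameters through a genetic-algorithm-based parameter optimization layer when the loss value is outside the preset range (a sketch of that step is given after the parameter optimization module description below).

```python
import torch
import torch.nn as nn

def train(model, loader, loss_threshold: float = 0.05, max_epochs: int = 50):
    """Illustrative training loop; loss_threshold stands in for the preset range."""
    criterion = nn.CrossEntropyLoss()                    # one common choice of loss
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for samples, true_labels in loader:
            logits = model(samples)                      # predicted network situation labels
            loss = criterion(logits, true_labels)
            optimizer.zero_grad()
            loss.backward()                              # gradient update shown for brevity;
            optimizer.step()                             # the embodiment instead describes a
                                                         # genetic-algorithm optimization layer
            epoch_loss += loss.item()
        if epoch_loss / len(loader) <= loss_threshold:
            break                                        # loss within the preset range
    return model                                         # kept as the final prediction model
```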
Referring to fig. 7, a schematic structural diagram of a training device for a network security situation prediction model according to an embodiment of the present application is shown. As shown in fig. 7, the training device 700 for a network security situation prediction model may include the following modules:
The model sample generation module 710 is configured to generate a model training sample according to the collected network security status data of the electronic device; each model training sample corresponds to a real network situation label;
The model sample input module 720 is configured to input the model training sample to a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises the following steps: a convolutional network layer, an attention mechanism layer and a deep neural network layer;
The situation feature map obtaining module 730 is configured to invoke the convolutional network layer to process the model training sample, so as to obtain a network situation feature map of the model training sample in a time dimension;
The enhanced feature obtaining module 740 is configured to invoke the attention mechanism layer to process the network situation feature in the network situation feature map, so as to obtain an attention enhanced feature;
The prediction tag obtaining module 750 is configured to invoke the deep neural network layer to process the attention enhancement feature, so as to obtain a prediction network situation tag corresponding to the model training sample;
the loss value calculation module 760 is configured to calculate, based on the predicted network situation label and the real network situation label, a loss value of the network security situation prediction model to be trained;
the prediction model obtaining module 770 is configured to take the trained network security situation prediction model to be trained as the final network security situation prediction model when the loss value is within a preset range.
Optionally, the model sample generation module includes:
the network condition data acquisition unit is used for acquiring network security condition data of the electronic equipment;
the security condition data acquisition unit is used for performing a cleaning operation on the network security condition data to obtain processed network security condition data;
The model training sample acquisition unit is used for carrying out normalization processing on the processed network security condition data to obtain the model training sample.
Optionally, the situation feature map obtaining module includes:
The situation characteristic acquisition unit is used for calling the convolution network layer to execute convolution operation on the model training sample so as to obtain network situation characteristics and time dimension characteristics of the model training sample;
and the situation feature map acquisition unit is used for carrying out fusion processing on the network situation features and the time dimension features to obtain a network situation feature map associated under the time dimension.
Optionally, the enhanced feature acquisition module includes:
the global feature acquisition unit is used for calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain global situation features corresponding to each channel;
the channel weight acquisition unit is used for acquiring the weight corresponding to each channel according to the association index among the channels;
And the enhancement feature acquisition unit is used for processing the global situation features and the corresponding weights of each channel to obtain attention enhancement features.
Optionally, the network security situation prediction model to be trained further includes a parameter optimization layer, and the apparatus further includes:
The parameter adjustment value determining module is used for calling the parameter optimization layer to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm under the condition that the loss value is not in a preset range;
And the model parameter adjustment module is used for adjusting the model parameters of the network security situation prediction model to be trained based on the model parameter adjustment value.
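Since the embodiment does not detail the genetic operators, the sketch below is a hypothetical illustration in which the model parameters are treated as a flat list of numbers, fitness is the loss value returned by a caller-supplied loss_fn, and single-point crossover with Gaussian mutation is assumed; the function genetic_adjust and its arguments are placeholders rather than names from the embodiment. The lowest-loss candidate would then serve as the model parameter adjustment value applied by the model parameter adjustment module.

```python
import random

def genetic_adjust(model_params, loss_fn, population_size=20, generations=30,
                   mutation_scale=0.05):
    """Illustrative genetic-algorithm search for a model parameter adjustment value.

    model_params: flat list of at least two numeric parameters.
    loss_fn: maps a candidate parameter list to its loss value (lower is better).
    """
    # Initial population: mutated copies of the current parameters.
    population = [
        [p + random.gauss(0.0, mutation_scale) for p in model_params]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        scored = sorted(population, key=loss_fn)          # fitness = loss value
        parents = scored[: population_size // 2]          # keep the better half
        children = []
        while len(children) < population_size - len(parents):
            a, b = random.sample(parents, 2)
            point = random.randrange(1, len(a))           # single-point crossover
            child = a[:point] + b[point:]
            child = [g + random.gauss(0.0, mutation_scale) for g in child]  # mutation
            children.append(child)
        population = parents + children
    return min(population, key=loss_fn)                   # best candidate found
```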
Optionally, the apparatus further comprises:
The target condition data acquisition module is used for acquiring target network security condition data of the target equipment;
The model input data acquisition module is used for preprocessing the target network security condition data to obtain model input data;
the model input data input module is used for inputting the model input data into the network security situation prediction model;
The target feature map acquisition module is used for calling the convolution network layer to process the model input data to obtain a target network situation feature map associated with the model input data in a time dimension;
The target enhanced feature acquisition module is used for calling the attention mechanism layer to process the target network situation feature map to obtain target attention enhancement features;
and the network situation label acquisition module is used for calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
According to the training device for the network security situation prediction model provided by the embodiment of the application, model training samples are generated from the collected network security condition data of the electronic device, and each model training sample corresponds to one real network situation label. The model training samples are input into a network security situation prediction model to be trained, which includes a convolutional network layer, an attention mechanism layer, and a deep neural network layer. The convolutional network layer is called to process the model training samples to obtain a network situation feature map associated with the model training samples in the time dimension. The attention mechanism layer is called to process the network situation features in the network situation feature map to obtain attention enhancement features. The deep neural network layer is called to process the attention enhancement features to obtain predicted network situation labels corresponding to the model training samples. A loss value of the network security situation prediction model to be trained is calculated based on the predicted network situation labels and the real network situation labels. When the loss value is within a preset range, the trained model is taken as the final network security situation prediction model. By accounting for the complexity and time variability of network security situation changes and learning the deep relations among the data, the embodiment of the application obtains a model capable of predicting the network security situation, which can greatly improve network security.
The embodiment of the application also provides an electronic device, which comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the training method of the network security situation prediction model.
Fig. 8 shows a schematic structural diagram of an electronic device 800 according to an embodiment of the present application. As shown in fig. 8, the electronic device 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes according to computer program instructions stored in a Read Only Memory (ROM) 802 or computer program instructions loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 can also be stored. The CPU 801, ROM 802, and RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Various components in electronic device 800 are connected to I/O interface 805, including: an input unit 806, such as a keyboard, mouse, microphone, etc.; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, etc.; and a communication unit 809, such as a network card, modem, wireless communication transceiver, or the like. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The various processes and operations described above may be performed by the CPU 801. For example, the method of any of the embodiments described above may be implemented as a computer software program tangibly embodied on a computer-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the CPU 801, one or more steps of the above-described method may be performed.
Additionally, the embodiment of the application also provides a computer-readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the training method of the network security situation prediction model.
In this specification, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical or similar parts among the embodiments, reference may be made to one another.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminals (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it is further noted that relational terms such as first and second, and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or terminal that comprises the element.
The training method of a network security situation prediction model, the training device of a network security situation prediction model, the electronic device, and the computer-readable storage medium provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the above description of the examples is only intended to help understand the method and its core ideas. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the ideas of the present application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. A training method of a network security situation prediction model, characterized by comprising the following steps:
generating a model training sample according to the collected network security condition data of the electronic equipment; each model training sample corresponds to a real network situation label;
inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises the following steps: a convolutional network layer, an attention mechanism layer and a deep neural network layer;
invoking the convolution network layer to process the model training sample to obtain a network situation feature map of the model training sample in a time dimension;
invoking the attention mechanism layer to process the network situation characteristics in the network situation characteristic map to obtain attention enhancement characteristics;
invoking the deep neural network layer to process the attention enhancement features to obtain a predicted network situation label corresponding to the model training sample;
Calculating to obtain a loss value of the network security situation prediction model to be trained based on the prediction network situation label and the real network situation label;
under the condition that the loss value is in a preset range, taking the trained network security situation prediction model to be trained as a final network security situation prediction model;
the step of calling the convolution network layer to process the model training sample to obtain a network situation feature map of the model training sample in a time dimension comprises the following steps:
invoking the convolution network layer to execute convolution operation on the model training sample so as to obtain network situation characteristics and time dimension characteristics of the model training sample;
carrying out fusion processing on the network situation characteristics and the time dimension characteristics to obtain a network situation characteristic diagram associated under the time dimension;
The calling the attention mechanism layer processes the network situation characteristics in the network situation characteristic diagram to obtain attention enhancement characteristics, and the method comprises the following steps:
invoking the attention mechanism layer to perform vector conversion processing on the network situation characteristics in each channel to obtain global situation characteristics corresponding to each channel;
Obtaining the weight corresponding to each channel according to the association index between the channels;
and processing the global situation characteristics and the corresponding weights of each channel to obtain the attention-enhancing characteristics.
2. The method of claim 1, wherein generating model training samples from the collected network security status data of the electronic device comprises:
collecting network security status data of the electronic equipment;
performing cleaning operation on the network security condition data to obtain processed network security condition data;
and normalizing the processed network security condition data to obtain the model training sample.
3. The method of claim 1, wherein the network security situation prediction model to be trained further comprises a parameter optimization layer, and after the loss value of the network security situation prediction model to be trained is calculated based on the predicted network situation label and the real network situation label, the method further comprises the following steps:
Under the condition that the loss value is not in a preset range, calling the parameter optimization layer to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm;
And adjusting model parameters of the network security situation prediction model to be trained based on the model parameter adjustment value.
4. The method according to claim 1, further comprising, after said taking the trained network security posture prediction model to be trained as a final network security posture prediction model:
Acquiring target network security condition data of target equipment;
preprocessing the target network security condition data to obtain model input data;
inputting the model input data to the network security situation prediction model;
invoking the convolution network layer to process the model input data to obtain a target network situation feature map associated with the model input data in a time dimension;
invoking the attention mechanism layer to process the target network situation feature map to obtain target attention enhancement features;
And calling the deep neural network layer to process the target attention enhancement feature to obtain a target prediction network situation label corresponding to the target equipment.
5. A training device for a network security posture prediction model, the device comprising:
The model sample generation module is used for generating a model training sample according to the collected network security condition data of the electronic equipment; each model training sample corresponds to a real network situation label;
the model sample input module is used for inputting the model training sample into a network security situation prediction model to be trained; the network security situation prediction model to be trained comprises the following steps: a convolutional network layer, an attention mechanism layer and a deep neural network layer;
the situation feature map acquisition module is used for calling the convolution network layer to process the model training sample to obtain a network situation feature map of the model training sample in the time dimension;
the enhanced feature acquisition module is used for calling the attention mechanism layer to process the network situation features in the network situation feature map so as to obtain attention enhanced features;
the prediction label acquisition module is used for calling the deep neural network layer to process the attention enhancement features to obtain a prediction network situation label corresponding to the model training sample;
the loss value calculation module is used for calculating the loss value of the network security situation prediction model to be trained based on the prediction network situation label and the real network situation label;
the prediction model acquisition module is used for taking the trained network security situation prediction model to be trained as a final network security situation prediction model under the condition that the loss value is in a preset range;
the situation characteristic map acquisition module comprises:
The situation characteristic acquisition unit is used for calling the convolution network layer to execute convolution operation on the model training sample so as to obtain network situation characteristics and time dimension characteristics of the model training sample;
The situation feature map acquisition unit is used for carrying out fusion processing on the network situation features and the time dimension features to obtain a network situation feature map associated under the time dimension;
the enhanced feature acquisition module includes:
the global feature acquisition unit is used for calling the attention mechanism layer to perform vector conversion processing on the network situation features in each channel to obtain global situation features corresponding to each channel;
the channel weight acquisition unit is used for acquiring the weight corresponding to each channel according to the association index among the channels;
And the enhancement feature acquisition unit is used for processing the global situation features and the corresponding weights of each channel to obtain attention enhancement features.
6. The apparatus of claim 5, wherein the model sample generation module comprises:
the network condition data acquisition unit is used for acquiring network security condition data of the electronic equipment;
the safety condition data acquisition unit is used for cleaning the network safety condition data to obtain processed network safety condition data;
The model training sample acquisition unit is used for carrying out normalization processing on the processed network security condition data to obtain the model training sample.
7. The apparatus of claim 5, wherein the network security situation prediction model to be trained further comprises a parameter optimization layer, and the apparatus further comprises:
The parameter adjustment value determining module is used for calling the parameter optimization layer to determine a model parameter adjustment value of the network security situation prediction model to be trained according to the loss value by adopting a genetic algorithm under the condition that the loss value is not in a preset range;
And the model parameter adjustment module is used for adjusting the model parameters of the network security situation prediction model to be trained based on the model parameter adjustment value.
8. The apparatus of claim 5, wherein the apparatus further comprises:
The target condition data acquisition module is used for acquiring target network security condition data of the target equipment;
The model input data acquisition module is used for preprocessing the target network security condition data to obtain model input data;
the model input data input module is used for inputting the model input data into the network security situation prediction model;
The target feature map acquisition module is used for calling the convolution network layer to process the model input data to obtain a target network situation feature map associated with the model input data in a time dimension;
The target enhanced feature acquisition module is used for calling the attention mechanism layer to process the target network situation feature map to obtain target attention enhancement features;
and the network situation label acquisition module is used for calling the deep neural network layer to process the target attention enhancement features to obtain a target prediction network situation label corresponding to the target equipment.
9. An electronic device, comprising:
a processor, a memory and a computer program stored on the memory and executable on the processor, the processor implementing the training method of the network security posture prediction model of any one of claims 1 to 4 when the program is executed.
10. A computer readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the training method of the network security posture prediction model of any of claims 1 to 4.
CN202211378514.1A 2022-11-04 2022-11-04 Training method and device for network security situation prediction model Active CN115695025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211378514.1A CN115695025B (en) 2022-11-04 2022-11-04 Training method and device for network security situation prediction model

Publications (2)

Publication Number Publication Date
CN115695025A CN115695025A (en) 2023-02-03
CN115695025B true CN115695025B (en) 2024-05-14

Family

ID=85049686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211378514.1A Active CN115695025B (en) 2022-11-04 2022-11-04 Training method and device for network security situation prediction model

Country Status (1)

Country Link
CN (1) CN115695025B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116318907B (en) * 2023-02-28 2023-12-08 上海熙宥信息科技有限公司 Method and system for analyzing computer network situation based on big data and neural network
CN117439800B (en) * 2023-11-21 2024-06-04 河北师范大学 Network security situation prediction method, system and equipment
CN118316732B (en) * 2024-06-06 2024-08-16 广东技术师范大学 Modeling method of network security potential state evaluation model based on D-S evidence theory

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401473A (en) * 2020-04-09 2020-07-10 中国人民解放军国防科技大学 Infrared target classification method based on attention mechanism convolutional neural network
WO2020221278A1 (en) * 2019-04-29 2020-11-05 北京金山云网络技术有限公司 Video classification method and model training method and apparatus thereof, and electronic device
CN114021140A (en) * 2021-10-20 2022-02-08 深圳融安网络科技有限公司 Method and device for predicting network security situation and computer readable storage medium
CN114257395A (en) * 2021-11-01 2022-03-29 清华大学 Customized network security situation perception method and device based on collaborative learning

Also Published As

Publication number Publication date
CN115695025A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
CN115695025B (en) Training method and device for network security situation prediction model
Jiang et al. Network intrusion detection based on PSO-XGBoost model
Rather et al. Recurrent neural network and a hybrid model for prediction of stock returns
CN111585948B (en) Intelligent network security situation prediction method based on power grid big data
Zayegh et al. Neural network principles and applications
CN107730040B (en) RBM-based log information comprehensive feature extraction method and device for power information system
CN111353153A (en) GEP-CNN-based power grid malicious data injection detection method
CN112087442B (en) Time sequence related network intrusion detection method based on attention mechanism
CN112039903B (en) Network security situation assessment method based on deep self-coding neural network model
CN114615010B (en) Edge server-side intrusion prevention system design method based on deep learning
CN112949821B (en) Network security situation awareness method based on dual-attention mechanism
CN111783845A (en) Hidden false data injection attack detection method based on local linear embedding and extreme learning machine
Chen et al. PR-KELM: Icing level prediction for transmission lines in smart grid
CN114255121A (en) Credit risk prediction model training method and credit risk prediction method
CN113239949A (en) Data reconstruction method based on 1D packet convolutional neural network
CN116527346A (en) Threat node perception method based on deep learning graph neural network theory
CN118193954A (en) Power distribution network abnormal data detection method and system based on edge calculation
CN109871711B (en) Ocean big data sharing and distributing risk control model and method
CN117009785A (en) Security monitoring method, device, server and system
CN116684138A (en) DRSN and LSTM network intrusion detection method based on attention mechanism
CN116545679A (en) Industrial situation security basic framework and network attack behavior feature analysis method
Ke et al. Network security situation prediction method based on support vector machine optimized by artificial Bee colony algorithms
CN112927248B (en) Point cloud segmentation method based on local feature enhancement and conditional random field
CN113283520B (en) Feature enhancement-based depth model privacy protection method and device for membership inference attack
CN115545339A (en) Transformer substation safety operation situation assessment method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant