CN117834309A - Vulnerability assessment method based on contrast graph clustering and reinforcement learning - Google Patents
- Publication number
- CN117834309A, CN202410251919.1A
- Authority
- CN
- China
- Prior art keywords
- representing
- vulnerability
- feature
- network environment
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1433—Vulnerability analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
Abstract
The invention discloses a vulnerability assessment method based on contrast graph clustering and reinforcement learning, comprising the following steps: inputting a network security vulnerability data set into a multi-dimensional conditional variational autoencoder to learn common feature representations and specific feature representations, and inputting network environment parameter samples into a text encoder to generate a feature representation of the network environment; clustering based on a similarity measure to generate a cluster graph; weighting the common feature representations using a dynamic sample weighting strategy; having an agent take the difference between the vulnerabilities and the vulnerability-environment sample pairs as the reward function for training the initial evaluation score, and calculating the initial evaluation score of a vulnerability with an evaluation module; and inputting the initial vulnerability score and the actual network environment into a decision module comprising a memory bank to generate the final vulnerability evaluation score. The method can accurately predict the severity of a vulnerability according to the actual network environment, and improves both the accuracy of predicting the likelihood that a vulnerability will be exploited and the accuracy of assessing high-risk vulnerabilities.
Description
Technical Field
The invention relates to the technical field of network security, and in particular to a vulnerability assessment method based on contrast graph clustering and reinforcement learning.
Background
With the continuing innovation and development of internet technology, network security problems are becoming more serious: network attacks are growing in scale and becoming more organized, and attack methods keep changing and are increasingly diverse and structured. In this context, vulnerabilities have become a significant problem in network security. A vulnerability is a weakness or error in a system that an attacker can exploit to carry out attacks and intrusions. Existing vulnerability early-warning mechanisms lag behind events, and the process from discovering and repairing a vulnerability to notifying users often takes a long time.
The Common Vulnerability Scoring System (CVSS) has become a widely adopted open framework for assessing the severity of security vulnerabilities in software systems and applications. CVSS was developed by the Forum of Incident Response and Security Teams (FIRST) to provide a standardized, quantitative method for assessing the potential impact of vulnerabilities on affected systems, enabling organizations to prioritize their security response and remediation work more effectively. The CVSS score is calculated from a combination of metric values and yields a numerical score from 0.0 to 10.0, with higher values indicating greater severity. Specifically, the CVSS score is calculated from a set of vulnerability characteristics that fall into three main categories:
Basic index feature set (the CVSS base metrics): these metrics describe the inherent properties of the vulnerability, including its exploitability and its potential impact on the affected system if it is successfully exploited. The base metrics are attack vector, attack complexity, privileges required, user interaction, confidentiality impact, integrity impact, availability impact, and scope.
Time index feature set (the CVSS temporal metrics): these metrics reflect the current state of the vulnerability over time, taking into account factors such as the availability of exploits or patches, the maturity of remediation work, and the confidence in the vulnerability analysis. The temporal metrics are exploit code maturity, remediation level, and report confidence.
Environmental index feature set (the CVSS environmental metrics): these metrics take into account the specific environment in which the vulnerable system is located, including factors such as the importance of the affected system to the organization, potential collateral damage, and the organization's security requirements. The environmental metrics are the modified base metrics, confidentiality requirement, integrity requirement, and availability requirement.
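The three metric groups and the 0.0 to 10.0 score range described above can be written down as a small lookup structure. The following is a minimal Python sketch: the dictionary name `CVSS_METRIC_GROUPS` and the function `severity_rating` are illustrative names, and the qualitative rating bands are those of the CVSS v3.x specification rather than anything stated in this patent.

```python
# Metric groups as listed in the background section (CVSS v3.x naming).
CVSS_METRIC_GROUPS = {
    "base": [
        "attack vector", "attack complexity", "privileges required",
        "user interaction", "scope", "confidentiality impact",
        "integrity impact", "availability impact",
    ],
    "temporal": [
        "exploit code maturity", "remediation level", "report confidence",
    ],
    "environmental": [
        "modified base metrics", "confidentiality requirement",
        "integrity requirement", "availability requirement",
    ],
}


def severity_rating(score: float) -> str:
    """Map a CVSS score in [0.0, 10.0] to the v3.x qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

A rating of this kind describes a single vulnerability only, which is the limitation the following paragraph raises.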
Although CVSS is widely used to evaluate vulnerability risk and generate scores reflecting severity, the CVSS framework only provides severity and impact scores for individual vulnerabilities; at present it cannot evaluate the vulnerability risk of an entire system according to the actual network environment. An improved vulnerability risk assessment algorithm built on CVSS is therefore needed, one that assesses vulnerability severity under uncertainty in the network environment parameters, so as to help security managers measure the vulnerability risk of the whole system and improve the efficiency of network security management.
Disclosure of Invention
In order to achieve the above object, the present inventors provide a vulnerability assessment method based on contrast graph clustering and reinforcement learning, comprising the steps of:
S1. Inputting a public network security vulnerability data set into a multi-dimensional conditional variational autoencoder, which learns, in a latent feature space, common feature representations and specific feature representations of the various indexes of a vulnerability, and inputting network environment parameter samples into a text encoder to generate a feature representation of the network environment;
S2. Clustering the specific feature representations of the various vulnerability indexes and the feature representation of the network environment based on a similarity measure to generate a cluster graph;
S3. Under the guidance of the high-confidence information of the cluster graph, using network-environment-aware contrastive learning to generate a dynamic sample weighting strategy for the network environment, and weighting the common feature representations of the various vulnerability indexes;
S4. Having the agent take the difference between the vulnerabilities identified by the client and the vulnerability-environment sample pairs as the reward function for training the initial evaluation score, and calculating the initial evaluation score of the vulnerability using a lightweight evaluation module;
S5. Inputting the initial vulnerability score and the actual network environment into a decision module comprising a memory bank to generate the final vulnerability evaluation score.
As a preferred mode of the present invention, the step S1 further includes the steps of:
S101. The given public network security vulnerability data set contains several types of vulnerability. Each type of vulnerability in the data set is input into the multi-dimensional conditional variational autoencoder, which comprises a basic index feature variational autoencoder, a time index feature variational autoencoder, and an environmental index feature variational autoencoder, with one expression per autoencoder:
[three formulas omitted in source text]
wherein the expressions are built from the following terms: the mathematical expectation; the basic index features, the time index features, and the environmental index features; the type description of the vulnerability; the posterior distributions of the basic, time, and environmental index features given the vulnerability type description; the posterior distribution of all vulnerability features given the various index features; the logarithm function; the prior distribution of the basic index features given the time and environmental index features; the prior distribution of the time index features given the basic and environmental index features; the prior distribution of the environmental index features given the basic and time index features; the relative entropy (KL divergence) between the posterior distribution and the prior distribution; and JSD, the Jensen-Shannon divergence, which measures the degree of difference between the posterior distribution and the prior distribution;
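The autoencoder objectives above rely on two divergence terms, the relative entropy and the JSD. As an illustration only, the following Python sketch computes both for discrete distributions; the function names are hypothetical, and the patent's actual loss, whose formula is not available in this text, may combine these terms differently.

```python
import math


def kl_divergence(p, q):
    """Relative entropy KL(p || q) of two discrete distributions (natural log).

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetrized KL against the mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Unlike the relative entropy, the JSD is symmetric and bounded above by log 2, which makes it a convenient measure of how far a posterior has drifted from its prior.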
S102. The multi-dimensional conditional variational autoencoder learns, in the latent feature space, the common feature representation and the specific feature representation of each class of vulnerability index, according to the expression:
[formula omitted in source text]
wherein the common feature mapping layer and the specific feature mapping layer each consist of a 3 x 3 convolution; applied to the basic index features, the time index features, and the environmental index features, they yield for each index class a common feature representation and a specific feature representation in the latent space;
The network environment parameter samples are input to a text encoder to generate the feature representation of the network environment, according to the expression:
[formula omitted in source text]
wherein the output is the feature representation of the network environment, and the text encoder is composed of a plurality of self-attention modules.
As a preferred mode of the present invention, the step S2 further includes the steps of:
S201. Performing a similarity measurement between the specific feature representations of the various vulnerability indexes and the feature representation of the network environment, with one similarity function per index class:
[similarity formula for the basic index features omitted in source text]
wherein the function gives the similarity between the basic index features and the network environment features; it is computed from any two of the basic index features and the mathematical expectation of those two features with respect to the network environment features, uses two hyperparameters that adjust the weights of the sample feature attributes, and is built from the exponential function with the natural base and the cosine similarity;
[similarity formula for the time index features omitted in source text]
wherein the function gives the similarity between the time index features and the network environment features; it is computed from any two of the time index features and the mathematical expectation of those two features with respect to the network environment features, and uses two hyperparameters that adjust the weights of the time sample feature attributes;
[similarity formula for the environmental index features omitted in source text]
wherein the function gives the similarity between the environmental index features and the network environment features; it is computed from any two of the environmental index features and the mathematical expectation of those two features with respect to the network environment features, and uses two hyperparameters that adjust the weights of the environmental sample feature attributes;
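To make the ingredients of S201 concrete, the following Python sketch combines a cosine similarity, an exponential, and two weighting hyperparameters. The exact combination is an assumption, since the formula itself is not available in this text: `env_aware_similarity`, `alpha`, and `beta` are hypothetical names, and averaging the two feature-to-environment similarities merely stands in for the expectation term.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def env_aware_similarity(f_i, f_j, env, alpha=1.0, beta=1.0):
    """Hypothetical similarity: exp of a weighted sum of cosine terms.

    alpha weights the feature-to-feature term, beta weights the mean
    similarity of the two features to the environment representation.
    """
    pair_term = cosine_similarity(f_i, f_j)
    env_term = 0.5 * (cosine_similarity(f_i, env) + cosine_similarity(f_j, env))
    return math.exp(alpha * pair_term + beta * env_term)
```

Under this form, two features score highest when they point in the same direction as each other and as the environment representation.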
S202. According to the similarity measurement results, for the K features with the most similar results, a structural encoder and an attribute encoder, both composed of graph convolutional neural networks, are used to generate the cluster graphs, according to the expressions:
[three formulas omitted in source text]
wherein the three resulting cluster graphs represent, respectively, the differences between the vulnerability basic index features, time index features, and environmental index features and the network environment feature sample pairs, and a search operator is used to find the K features with the most similar results.
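The selection of the K most similar features in S202 can be illustrated with a K-nearest-neighbour graph built from a precomputed similarity matrix. This Python sketch covers only that selection step; `knn_graph` is a hypothetical helper, and the structural and attribute encoders that the patent applies afterwards are not modelled.

```python
def knn_graph(similarity, k):
    """Link each node to its k most similar other nodes.

    similarity is a square matrix (list of lists); the result maps each
    node index to a list of neighbour indices, most similar first.
    """
    n = len(similarity)
    graph = {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: similarity[i][j], reverse=True)
        graph[i] = others[:k]
    return graph
```

For example, with three features where the first two are close to each other, `knn_graph(sim, 1)` links them to each other and attaches the third to whichever of the two it resembles more.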
As a preferred mode of the present invention, the step S3 further includes the steps of:
S301. Under the guidance of the high-confidence information of the cluster graph, network-environment-aware contrastive learning is used to generate the dynamic sample weighting strategy for the network environment, according to the expression:
[formula omitted in source text]
wherein the three weights are the dynamic weights of the basic feature-network environment sample pairs, the time feature-network environment sample pairs, and the environmental feature-network environment sample pairs, and the remaining symbols are coordinate indexes;
S302. Weighting the common feature representations of the various vulnerability indexes by the corresponding weights, according to the expression:
[formula omitted in source text]
wherein the results are the graph structures of the weighted basic feature-network environment sample pairs, the weighted time feature-network environment sample pairs, and the weighted environmental feature-network environment sample pairs, and GCN denotes a graph convolutional neural network.
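The weighting in S302 followed by a GCN can be sketched as a per-sample scaling of the feature rows and one graph convolution step. The Python below assumes the widely used symmetric-normalization form ReLU(D^-1/2 (A + I) D^-1/2 X W); the patent does not state which GCN variant it uses, and all function names here are illustrative.

```python
import math


def weight_features(features, sample_weights):
    """Scale each sample's feature row by its dynamic weight."""
    return [[w * x for x in row] for row, w in zip(features, sample_weights)]


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(normalize_adjacency(adj), features), weight)
    return [[max(0.0, x) for x in row] for row in hidden]
```

A sample with a dynamic weight of zero then contributes nothing to its neighbours' aggregated representations, which is the intended effect of down-weighting unreliable pairs.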
As a preferred mode of the present invention, step S4 further includes the steps of:
S401. The agent in reinforcement learning takes the difference between the vulnerability and local environment identified by the client and the various vulnerability feature-network environment sample pairs as the reward function for training the initial evaluation score, according to the expression:
[formula omitted in source text]
wherein the expression involves the vulnerability and local environment identified by the client, a text encoder, coordinate indexes, and the squared Euclidean distance;
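A reward based on the squared Euclidean distance, as named in S401, might look like the following Python sketch. It assumes the client's vulnerability and environment have already been encoded into a vector, and it assumes the reward is the negative mean distance to the sample pairs; both the sign and the averaging are illustrative choices, not taken from the patent.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def reward(client_embedding, pair_embeddings):
    """Hypothetical reward: negative mean squared distance to the sample pairs.

    A smaller difference between the client observation and the learned
    vulnerability-environment pairs yields a larger (less negative) reward.
    """
    distances = [squared_euclidean(client_embedding, p) for p in pair_embeddings]
    return -sum(distances) / len(distances)
```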
S402. The agent calculates the initial vulnerability evaluation score S using a lightweight evaluation module, which takes the vulnerability and local environment identified by the client as input, according to the expression:
[formula omitted in source text]
wherein the expression uses the normalized exponential function (softmax), GMM denotes a Gaussian mixture model, and PCA denotes principal component analysis.
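Of the three components named in S402, the normalized exponential function is the simplest to show in isolation. The Python sketch below implements a numerically stable softmax and, purely as an assumption, turns the resulting probabilities into a score by taking the expectation over hypothetical score bins; the PCA and GMM stages are omitted.

```python
import math


def softmax(logits):
    """Normalized exponential function, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_score(logits, bin_values):
    """Hypothetical scoring: probability-weighted mean of score bins."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))
```

With bins spanning 0.0 to 10.0, the expectation always falls inside the CVSS score range, whatever the logits are.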
As a preferred mode of the present invention, step S5 further includes the steps of:
S501. Inputting the initial vulnerability score and the actual network environment into a decision module comprising a memory bank to generate the final vulnerability evaluation score, according to the expression:
[formula omitted in source text]
wherein a search operation finds, in the memory bank, the evaluation records whose actual network environment is similar to the current one; a text encoder composed of a plurality of self-attention modules takes the initial vulnerability score, the actual network environment, and the historical records in the memory bank as input; MLP denotes a multi-layer perceptron; and the expectation term is the mathematical expectation of the initial vulnerability score compared with the historical records in the memory bank under the actual network environment.
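The memory-bank lookup in S501 can be illustrated as a nearest-record search over stored environment embeddings. In this Python sketch the record layout (a dict with an `"environment"` vector) and the use of cosine similarity as the matching criterion are assumptions; the encoder and MLP that produce the final score are not modelled.

```python
import math


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_similar(memory_bank, environment, top_n=1):
    """Return the top_n records whose stored environment is most similar."""
    ranked = sorted(
        memory_bank,
        key=lambda record: cosine(record["environment"], environment),
        reverse=True,
    )
    return ranked[:top_n]
```

The retrieved records would then be passed, together with the initial score and the current environment, to the decision module.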
As a preferred mode of the present invention, the method further comprises the step of: S6. Storing the difference between the initial evaluation and the final evaluation, together with the vulnerability feature distribution, in the memory bank as self-feedback information.
As a preferred mode of the present invention, step S6 further includes the steps of:
S601. The difference between the initial evaluation score S and the final evaluation score, together with the vulnerability feature distribution identified by the client, is stored in the memory bank as self-feedback information, according to the expression:
[formula omitted in source text];
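One possible shape for the self-feedback record of S601 is sketched below in Python. The class and field names are hypothetical, and defining the difference as final minus initial is an assumption; the patent only states that the difference and the feature distribution are stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeedbackRecord:
    """One self-feedback entry: both scores plus the feature distribution."""
    initial_score: float
    final_score: float
    feature_distribution: List[float]

    @property
    def difference(self) -> float:
        # Assumed sign convention: how far the decision module moved the score.
        return self.final_score - self.initial_score


@dataclass
class MemoryBank:
    """Append-only store of feedback records."""
    records: List[FeedbackRecord] = field(default_factory=list)

    def store(self, initial_score, final_score, feature_distribution):
        record = FeedbackRecord(initial_score, final_score, list(feature_distribution))
        self.records.append(record)
        return record
```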
S602. The memory bank is compressed by means of a gating unit module and a variable graph convolution. The gating unit module takes the memory bank as input and generates a convolution stride policy, according to the expression:
[formula omitted in source text]
wherein the policy is a one-dimensional vector whose indexes correspond to different convolution strides and whose values indicate the effectiveness of using each stride; a squashing function keeps the weight of each convolution stride between 0 and 1; tanh denotes the hyperbolic tangent function; the remaining symbols denote a convolution function, a batch normalization layer, and the convolution parameters; and GAP denotes a global average pooling layer;
S603. The index of the maximum effectiveness value in the convolution stride policy is taken as the convolution stride, and the size of the memory bank is changed by controlling the convolution stride, according to the expression:
[formula omitted in source text]
wherein the result is the compressed memory bank, GCN denotes the graph convolutional neural network, the stride term is the convolution stride of the network, an index search finds the position of the maximum effectiveness value, and a size term denotes the size of the memory bank.
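Steps S602 and S603 together choose a stride from a gated policy vector and use it to shrink the memory bank. The Python sketch below is a simplified stand-in: the gate is assumed to be a sigmoid applied to a tanh of a linear map over pooled features, index 0 is assumed to mean stride 1, and plain strided subsampling replaces the variable graph convolution that the patent actually uses.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def stride_policy(pooled_features, weights):
    """Hypothetical gate: one effectiveness value in (0, 1) per candidate stride."""
    return [
        sigmoid(math.tanh(sum(w * g for w, g in zip(row, pooled_features))))
        for row in weights
    ]


def select_stride(policy):
    """Pick the stride with the highest effectiveness (index 0 is stride 1)."""
    best_index = max(range(len(policy)), key=policy.__getitem__)
    return best_index + 1


def compress(records, stride):
    """Stand-in for strided graph convolution: keep every stride-th record."""
    return records[::stride]
```

With a policy of `[0.1, 0.9, 0.3]`, the second entry wins, so the stride is 2 and a bank of six records is reduced to three.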
Compared with the prior art, the beneficial effects achieved by the technical scheme are as follows:
(1) The prior art focuses only on identifying vulnerability features and ignores the influence of the actual network environment. In addition, existing methods ignore important structural information when identifying the features of network security vulnerabilities, which reduces the representativeness of the selected features. The present method therefore provides a new contrast graph clustering method: a comprehensive similarity measure is introduced to build the connection between vulnerability features and network environment parameters, and a dynamic weighting strategy is provided to make the features of security vulnerabilities more discriminative;
(2) Existing reinforcement-learning-based vulnerability detection methods prioritize feature sampling according to a reward function, but in complex real network environments such algorithms often produce suboptimal results. The present method therefore provides a new reinforcement learning method for vulnerability assessment. First, an agent combines the contrast-graph-clustered features with the identified vulnerabilities to generate an initial vulnerability evaluation score, and a decision module generates the final vulnerability evaluation score from the initial score and the actual network environment. In addition, the difference between the initial and final results is stored in the memory bank as self-feedback information, providing valuable feedback for future assessments. This iterative process of self-feedback and persistent memory enables the agent to use informative feedback signals to improve its decision-making quickly across a variety of network environments.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment.
FIG. 2 is a diagram of a reinforcement learning-based vulnerability assessment framework in accordance with an embodiment.
Detailed Description
To describe the technical content, structural features, objects, and effects of the technical solution in detail, the following description is given with reference to specific embodiments and the accompanying drawings.
The embodiment addresses, for public network security vulnerability data, how to accurately predict the severity of a vulnerability according to the actual network environment, and how to improve the accuracy of predicting the likelihood of exploitation and of assessing high-risk vulnerabilities. As shown in FIG. 1 and FIG. 2, the embodiment provides a vulnerability assessment method based on contrast graph clustering and reinforcement learning, comprising the following steps:
S1. Inputting a public network security vulnerability data set into a multi-dimensional conditional variational autoencoder, which learns, in a latent feature space, common feature representations and specific feature representations of the various indexes of a vulnerability, and inputting network environment parameter samples into a text encoder to generate a feature representation of the network environment;
S2. Clustering the specific feature representations of the various vulnerability indexes and the feature representation of the network environment based on a similarity measure to generate a cluster graph;
S3. Under the guidance of the high-confidence information of the cluster graph, using network-environment-aware contrastive learning to generate a dynamic sample weighting strategy for the network environment, and weighting the common feature representations of the various vulnerability indexes;
S4. Having the agent take the difference between the vulnerabilities identified by the client and the vulnerability-environment sample pairs as the reward function for training the initial evaluation score, and calculating the initial evaluation score of the vulnerability using a lightweight evaluation module;
S5. Inputting the initial vulnerability score and the actual network environment into a decision module comprising a memory bank to generate the final vulnerability evaluation score.
In the implementation process of this embodiment, as shown in fig. 1, step S1 further includes the steps of:
S101. The given public network security vulnerability data set contains several types of vulnerability. Each type of vulnerability in the data set is input into the multi-dimensional conditional variational autoencoder, which comprises a basic index feature variational autoencoder, a time index feature variational autoencoder, and an environmental index feature variational autoencoder, with one expression per autoencoder:
[three formulas omitted in source text]
wherein the expressions are built from the following terms: the mathematical expectation; the basic index features, the time index features, and the environmental index features; the type description of the vulnerability; the posterior distributions of the basic, time, and environmental index features given the vulnerability type description; the posterior distribution of all vulnerability features given the various index features; the logarithm function; the prior distribution of the basic index features given the time and environmental index features; the prior distribution of the time index features given the basic and environmental index features; the prior distribution of the environmental index features given the basic and time index features; the relative entropy (KL divergence) between the posterior distribution and the prior distribution; and JSD, the Jensen-Shannon divergence, which measures the degree of difference between the posterior distribution and the prior distribution;
S102. The multi-dimensional conditional variational autoencoder learns, in the latent feature space, the common feature representation and the specific feature representation of each class of vulnerability index, according to the expression:
[formula omitted in source text]
wherein the common feature mapping layer and the specific feature mapping layer each consist of a 3 x 3 convolution; applied to the basic index features, the time index features, and the environmental index features, they yield for each index class a common feature representation and a specific feature representation in the latent space;
for network environment parameter samplesInput to text encoder->Generating a characteristic representation of the network environment, the expression being:
,
wherein the output is the feature representation of the network environment, and the text encoder is composed of a plurality of self-attention modules.
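The relative entropy term and the JSD term named in step S101 can be illustrated for discrete distributions. The following is a minimal sketch, assuming that JSD abbreviates the Jensen-Shannon divergence and that distributions are given as plain probability lists; the function names and the example values are illustrative and are not taken from the invention.

```python
import math

def kl_divergence(p, q):
    """Relative entropy KL(p || q) of two discrete distributions (lists of probabilities)."""
    # Terms with p_i == 0 contribute nothing; q_i must be positive wherever p_i is.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture of the two distributions
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical posterior and prior over three latent categories.
posterior = [0.7, 0.2, 0.1]
prior = [1 / 3, 1 / 3, 1 / 3]
print("KL :", kl_divergence(posterior, prior))
print("JSD:", js_divergence(posterior, prior))
```

Unlike the relative entropy, the Jensen-Shannon divergence is symmetric in its arguments, which is one plausible reason to pair the two measures.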
In this embodiment, step S2 further includes the steps of:
S201, a similarity measurement is performed between the specific feature representations of the various indexes of the vulnerability and the feature representation of the network environment, the similarity measurement functions being:
,
wherein the symbols denote, in order: the similarity between the basic index features and the network environment features; any two of the basic index features; the mathematical expectation of any two of the basic index features with respect to the network environment features; two hyperparameters for adjusting the weights of the sample feature attributes; the exponential function with the natural constant as its base; and the cosine similarity;
,
wherein the symbols denote, in order: the similarity between the time index features and the network environment features; any two of the time index features; the mathematical expectation of any two of the time index features with respect to the network environment features; and two hyperparameters for adjusting the weights of the time sample feature attributes;
,
wherein the symbols denote, in order: the similarity between the environmental index features and the network environment features; any two of the environmental index features; the mathematical expectation of any two of the environmental index features with respect to the network environment features; and two hyperparameters for adjusting the weights of the environmental sample feature attributes;
S202, according to the similarity measurement results, the K features with the nearest results are passed through a structure encoder and an attribute encoder to generate cluster maps; the structure encoder and the attribute encoder are both composed of graph convolutional neural networks, the expressions being:
,
,
,
wherein the symbols denote, in order: the cluster map of the relationships between the vulnerability basic index features and the network environment feature sample pairs, in which each relationship represents a basic feature-network environment sample pair; the cluster map of the relationships between the vulnerability time index features and the network environment feature sample pairs, in which each relationship represents a time feature-network environment sample pair; the cluster map of the relationships between the vulnerability environmental index features and the network environment feature sample pairs, in which each relationship represents an environmental feature-network environment sample pair; and the operator that searches for the K features with the nearest results.
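The similarity measurement of step S201 and the nearest-K selection of step S202 can be sketched as follows. This is only an illustration under stated assumptions: the exponential of a temperature-scaled cosine similarity stands in for the similarity measurement function, the single hyperparameter `tau` and all function names are hypothetical, and the graph convolutional encoders are not modelled.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def scaled_similarity(u, v, tau=0.5):
    """exp(cos(u, v) / tau); tau is a hypothetical temperature hyperparameter."""
    return math.exp(cosine_similarity(u, v) / tau)

def top_k_nearest(features, env, k):
    """Indices of the k specific-feature vectors most similar to the environment vector."""
    scores = [scaled_similarity(f, env) for f in features]
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Toy specific-feature vectors and one network environment vector.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
env = [1.0, 0.0]
nearest = top_k_nearest(features, env, k=2)
print(nearest)
```

The selected indices would then define which feature-network environment pairs become relationships in a cluster map.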
In this embodiment, step S3 further includes the steps of:
S301, under the guidance of the high-confidence information of the cluster maps, a dynamic sample weighting strategy of the network environment is generated using network-environment-aware contrastive learning, the expression being:
,
wherein the symbols denote, in order: the dynamic weights of the basic feature-network environment sample pairs; the dynamic weights of the time feature-network environment sample pairs; the dynamic weights of the environmental feature-network environment sample pairs; and the coordinate indexes. This step treats each corresponding feature-network environment sample pair as a positive sample and all remaining pairs as negative samples, and generates the dynamic sample weighting strategy of the network environment by contrastive learning.
S302, the common feature representations of the various indexes of the vulnerability are weighted by the corresponding weights from step S301, which raises the weight of highly correlated vulnerability-environment sample pairs and makes the features of the security vulnerability more discriminative, the expression being:
,
wherein the symbols denote, in order: the graph structure of the weighted basic feature-network environment sample pairs; the graph structure of the weighted time feature-network environment sample pairs; and the graph structure of the weighted environmental feature-network environment sample pairs. GCN denotes a graph convolutional neural network module, which is composed of a plurality of graph convolutional neural networks.
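Steps S301 and S302 can be illustrated with a small contrastive weighting sketch. It assumes a softmax over dot-product similarities in which the matched pair (i, i) is the positive sample and all other pairs are negatives; the temperature `tau`, the function names, and the toy vectors are hypothetical, and the subsequent GCN is omitted.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def contrastive_weights(vuln_feats, env_feats, tau=0.5):
    """Row-wise softmax over similarities; entry [i][j] is the weight of pair (i, j)."""
    weights = []
    for v in vuln_feats:
        logits = [dot(v, e) / tau for e in env_feats]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights.append([x / total for x in exps])
    return weights

def weight_common_features(common_feats, weights):
    """Scale each common feature vector by the weight of its positive pair (i, i)."""
    return [[weights[i][i] * x for x in feat] for i, feat in enumerate(common_feats)]

# Toy example: two vulnerabilities, each matched with one environment sample.
vuln = [[1.0, 0.0], [0.0, 1.0]]
envs = [[1.0, 0.0], [0.0, 1.0]]
w = contrastive_weights(vuln, envs)
weighted = weight_common_features([[2.0, 4.0], [1.0, 1.0]], w)
print(w)
print(weighted)
```

Pairs whose vulnerability and environment embeddings agree receive a weight close to 1, so their common features are preserved, while weakly correlated pairs are attenuated.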
As shown in fig. 2, in the present embodiment, step S4 further includes the steps of:
S401, the agent in reinforcement learning takes the difference between the vulnerabilities and local environment identified by the client and the various vulnerability feature-network environment sample pairs of step S302 as the reward function for training the initial evaluation score, the expression being:
,
wherein the symbols denote, in order: the vulnerabilities and local environment identified by the client; the text encoder; two coordinate indexes; and the squared Euclidean distance;
S402, the agent calculates the initial vulnerability assessment score S using a lightweight evaluation module, which takes the vulnerabilities and local environment identified by the client as input, the expression being:
,
wherein the first symbol denotes the normalized exponential function (softmax), GMM denotes a Gaussian mixture model, and PCA denotes principal component analysis; the evaluation module searches for the optimal initial vulnerability assessment score under the guidance of the reward function of step S401.
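Two of the building blocks named in step S4, the squared Euclidean distance and the normalized exponential function, can be sketched as follows. This is an illustration only: the sign convention of the reward (negative mean distance, so that closer means better) is an assumption, the function names are hypothetical, and the GMM and PCA stages of the evaluation module are not modelled.

```python
import math

def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def reward(client_embedding, pair_embeddings):
    """Negative mean squared distance to the known sample pairs (assumed sign convention)."""
    dists = [squared_euclidean(client_embedding, p) for p in pair_embeddings]
    return -sum(dists) / len(dists)

def softmax(xs):
    """Normalized exponential function over a list of raw scores."""
    peak = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embedding of a client-identified vulnerability and two reference pairs.
client = [0.0, 0.0]
pairs = [[3.0, 4.0], [0.0, 0.0]]
print(reward(client, pairs))
print(softmax([2.0, 1.0, 0.0]))
```

With this convention the reward is largest when the client-side embedding coincides with the weighted vulnerability feature-network environment pairs.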
In this embodiment, step S5 further includes the steps of:
S501, the initial vulnerability score and the actual network environment are input into a decision module that includes a memory bank to generate the final vulnerability evaluation score, the expression being:
,
The decision module is based on the Q iterative algorithm of conventional reinforcement learning, wherein the symbols denote, in order: the operation of finding, in the memory bank, evaluation records similar to the actual network environment; a text encoder composed of a plurality of self-attention modules, which takes the initial vulnerability score, the actual network environment, and the history records in the memory bank as input; and the mathematical expectation of the initial vulnerability score compared with the history records in the memory bank under the actual network environment. MLP denotes a multi-layer perceptron.
In some embodiments, the method further comprises step S6: storing the difference between the initial evaluation and the final evaluation, together with the vulnerability feature distribution, in the memory bank as self-feedback information. Specifically:
S601, the difference between the initial evaluation score S and the final evaluation score, together with the vulnerability feature distribution identified by the client, is stored in the memory bank as self-feedback information, the expression being:
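For readers unfamiliar with Q iteration, the standard tabular Q-learning update is sketched below. It shows only the generic textbook rule; the states, actions, learning rate `alpha`, and discount `gamma` are hypothetical, and the sketch does not model the memory bank, the text encoder, or the MLP of the decision module.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

# Hypothetical decision problem: adjust or keep the initial score in environment "env_A".
q = defaultdict(float)  # unseen state-action pairs start at 0.0
actions = ["raise_score", "keep_score", "lower_score"]
first = q_update(q, "env_A", "keep_score", 1.0, "env_A", actions)
second = q_update(q, "env_A", "keep_score", 1.0, "env_A", actions)
print(first, second)
```

Repeating the update moves each action value toward the reward plus the discounted value of the best follow-up action.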
;
S602, the records in the memory bank can provide valuable feedback information for future evaluations, but as the number of vulnerabilities grows the memory bank risks memory overflow; redundant information in the memory bank is therefore removed by a gating unit module and a variable graph convolution. First, the gating unit module takes the memory bank as input and generates a convolution stride strategy, the expression being:
,
wherein the symbols denote, in order: the convolution stride strategy, a one-dimensional vector in which different indexes represent different convolution strides and the value at each index indicates the effectiveness of using that convolution stride; a function that constrains the weight of each convolution stride to lie between 0 and 1; the convolution function; the batch normalization layer; and the convolution parameters. tanh denotes the hyperbolic tangent function, and GAP denotes a global average pooling layer;
S603, the index of the maximum effectiveness value in the convolution stride strategy is taken as the convolution stride, and the memory bank is changed by controlling the convolution stride, the expression being:
,
wherein the symbols denote, in order: the compressed memory bank; the convolution stride of the network; and the operator that finds the index of the maximum effectiveness value. GCN denotes the graph convolutional neural network, and the size of the convolution stride determines how much the memory bank is changed.
To verify the effectiveness of the present invention, experiments were conducted on several vulnerability assessment datasets. The Malware Training Sets dataset is a vulnerability detection dataset used primarily for malware analysis. The EMBER dataset is used to train machine learning models to statically detect malicious Windows portable executable files. The Malicious URLs dataset includes malicious URL instances from large webmail providers that provide 6000-7500 spam and phishing URL instances per day. The MAWILab dataset is a network traffic anomaly detection dataset consisting of sets of labels for traffic anomalies in the MAWI archive. The Aposemat IoT-23 dataset is a network traffic dataset from Internet of Things (IoT) devices. The results of the invention on these datasets are shown in Table 1.
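The stride selection of steps S602 and S603 can be sketched in simplified form. The sketch assumes that global average pooling followed by one scalar weight per candidate stride replaces the convolution and batch normalization layers, that a sigmoid is the function bounding each value to (0, 1), and that compression is plain strided subsampling instead of a strided graph convolution; every name and number is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def stride_strategy(bank, gate_weights):
    """Return one effectiveness value in (0, 1) per candidate stride."""
    # Global average pooling: mean of each record, then mean over all records.
    pooled = sum(sum(r) / len(r) for r in bank) / len(bank)
    return [sigmoid(math.tanh(w * pooled)) for w in gate_weights]

def compress(bank, strategy):
    """Pick the stride with the highest effectiveness and subsample the bank with it."""
    best_index = max(range(len(strategy)), key=lambda i: strategy[i])
    stride = best_index + 1  # index 0 -> stride 1, index 1 -> stride 2, ...
    return bank[::stride], stride

# Toy memory bank of six records and three candidate strides (1, 2, 3).
bank = [[float(i), float(i)] for i in range(6)]
strategy = stride_strategy(bank, gate_weights=[-1.0, 0.5, 0.1])
compressed, stride = compress(bank, strategy)
print(strategy, stride, len(compressed))
```

A larger selected stride keeps fewer records, which is how the stride controls the size of the compressed memory bank in this simplified picture.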
Table 1: experimental results of the invention in different vulnerability assessment data sets
As can be seen from Table 1, the present invention performs best on the EMBER dataset, mainly because the EMBER dataset contains more samples of attack techniques; with sufficient training samples the invention can be trained adequately, which improves its performance.
In addition, to verify the effectiveness of the reinforcement learning adopted by the invention, the invention is compared with the following evaluation decision algorithms: the linear discriminant analysis algorithm (LDA), the word mover's distance algorithm (WMD), the combined linear discriminant analysis and word mover's distance algorithm (WMD-LDA), the locality-sensitive hashing algorithm (Simhash), and the Euclidean algorithm. The experimental results are shown in Table 2.
Table 2: Experimental comparison of the invention with different evaluation decision algorithms
Table 2 shows that the vulnerability assessment method based on contrast graph clustering and reinforcement learning of the invention outperforms the other algorithms in both accuracy and F1 score.
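The two metrics used in the comparison, accuracy and F1 score, are computed as follows for a binary labelling. This is a generic sketch; the labels below are toy values chosen for illustration and are not results of the invention.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 score of the positive class for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Toy labels: 1 = vulnerable, 0 = not vulnerable.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

F1 is the harmonic mean of precision and recall, so it penalizes a method that achieves high accuracy merely by favouring the majority class.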
It should be noted that, although the foregoing embodiments have been described herein, the scope of the present invention is not limited thereby. Accordingly, any alteration or modification of the embodiments described herein that is based on the innovative concept of the present invention, any equivalent structure or equivalent process transformation derived from this description and the drawings, and any direct or indirect application of the above technical solution to other relevant technical fields all fall within the scope of protection of the invention.
Claims (8)
1. A vulnerability assessment method based on contrast graph clustering and reinforcement learning, characterized by comprising the following steps:
S1, inputting a public network security vulnerability dataset into a multi-dimensional conditional variational autoencoder, which learns the common feature representation and the specific feature representation of each index of a vulnerability in the latent feature space, and inputting network environment parameter samples into a text encoder to generate a feature representation of the network environment;
S2, clustering the specific feature representations of the various indexes of the vulnerability and the feature representation of the network environment based on a similarity measurement to generate cluster maps;
S3, under the guidance of the high-confidence information of the cluster maps, using network-environment-aware contrastive learning to generate a dynamic sample weighting strategy of the network environment, and weighting the common feature representations of the various indexes of the vulnerability;
S4, the agent taking the difference between the vulnerabilities identified by the client and the vulnerability-environment sample pairs as the reward function for training the initial evaluation score, and calculating the initial evaluation score of the vulnerability using a lightweight evaluation module;
S5, inputting the initial vulnerability score and the actual network environment into a decision module that includes a memory bank to generate the final vulnerability evaluation score.
2. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 1, wherein step S1 further comprises the steps of:
S101, a public network security vulnerability dataset is given, comprising a number of vulnerability types in total; each type of vulnerability in the dataset is input into the multi-dimensional conditional variational autoencoder, which comprises a basic index feature variational autoencoder, a time index feature variational autoencoder, and an environmental index feature variational autoencoder, the expressions being:
,,,
wherein the symbols denote, in order of appearance: the mathematical expectation; the basic index features; the time index features; the environmental index features; the type description of the vulnerability; the posterior distribution of the basic index features under the vulnerability type description; the posterior distribution of the time index features under the vulnerability type description; the posterior distribution of the environmental index features under the vulnerability type description; the posterior distribution of all vulnerability features under the various index features; the logarithm function; the prior distribution of the basic index features under the time index and environmental index features; the prior distribution of the time index features under the basic index and environmental index features; the prior distribution of the environmental index features under the basic index and time index features; and the relative entropy between the posterior distribution and the prior distribution. JSD denotes the degree of difference between the posterior distribution and the prior distribution;
S102, the multi-dimensional conditional variational autoencoder learns the common feature representation and the specific feature representation of each index of a vulnerability in the latent feature space, the expressions being:
,
wherein the two mapping layers are each composed of a 3 × 3 convolution and represent the common feature mapping layer and the specific feature mapping layer, respectively; the remaining symbols denote, in order: the common feature representation of the basic index features in the latent space; the specific feature representation of the basic index features in the latent space; the common feature representation of the time index features in the latent space; the specific feature representation of the time index features in the latent space; the common feature representation of the environmental index features in the latent space; and the specific feature representation of the environmental index features in the latent space;
the network environment parameter samples are input to a text encoder to generate the feature representation of the network environment, the expression being:
,
wherein the output is the feature representation of the network environment, and the text encoder is composed of a plurality of self-attention modules.
3. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 2, wherein step S2 further comprises the steps of:
S201, a similarity measurement is performed between the specific feature representations of the various indexes of the vulnerability and the feature representation of the network environment, the similarity measurement functions being:
,
wherein the symbols denote, in order: the similarity between the basic index features and the network environment features; any two of the basic index features; the mathematical expectation of any two of the basic index features with respect to the network environment features; two hyperparameters for adjusting the weights of the sample feature attributes; the exponential function with the natural constant as its base; and the cosine similarity;
,
wherein the symbols denote, in order: the similarity between the time index features and the network environment features; any two of the time index features; the mathematical expectation of any two of the time index features with respect to the network environment features; and two hyperparameters for adjusting the weights of the time sample feature attributes;
,
wherein the symbols denote, in order: the similarity between the environmental index features and the network environment features; any two of the environmental index features; the mathematical expectation of any two of the environmental index features with respect to the network environment features; and two hyperparameters for adjusting the weights of the environmental sample feature attributes;
S202, according to the similarity measurement results, the K features with similar results are passed through a structure encoder and an attribute encoder to generate cluster maps; the structure encoder and the attribute encoder are both composed of graph convolutional neural networks, the expressions being:
,,,
wherein the symbols denote, in order: the cluster map representing the difference between the vulnerability basic index features and the network environment feature sample pairs; the cluster map representing the difference between the vulnerability time index features and the network environment feature sample pairs; the cluster map representing the vulnerability environmental index features and the network environment feature sample pairs; and the operator that searches for the K features with similar results.
4. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 3, wherein step S3 further comprises the steps of:
S301, under the guidance of the high-confidence information of the cluster maps, a dynamic sample weighting strategy of the network environment is generated using network-environment-aware contrastive learning, the expression being:
,
wherein the symbols denote, in order: the dynamic weights of the basic feature-network environment sample pairs; the dynamic weights of the time feature-network environment sample pairs; the dynamic weights of the environmental feature-network environment sample pairs; and the coordinate indexes;
S302, the common feature representations of the various indexes of the vulnerability are weighted by the corresponding weights, the expression being:
,
wherein the symbols denote, in order: the graph structure of the weighted basic feature-network environment sample pairs; the graph structure of the weighted time feature-network environment sample pairs; and the graph structure of the weighted environmental feature-network environment sample pairs. GCN denotes a graph convolutional neural network.
5. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 4, wherein step S4 further comprises the steps of:
S401, the agent in reinforcement learning takes the difference between the vulnerabilities and local environment identified by the client and the various vulnerability feature-network environment sample pairs as the reward function for training the initial evaluation score, the expression being:
,
wherein the symbols denote, in order: the vulnerabilities and local environment identified by the client; the text encoder; two coordinate indexes; and the squared Euclidean distance;
S402, the agent calculates the initial vulnerability assessment score S using a lightweight evaluation module, which takes the vulnerabilities and local environment identified by the client as input, the expression being:
,
wherein the first symbol denotes the normalized exponential function (softmax), GMM denotes a Gaussian mixture model, and PCA denotes principal component analysis.
6. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 5, wherein step S5 further comprises the steps of:
S501, the initial vulnerability score and the actual network environment are input into a decision module that includes a memory bank to generate the final vulnerability evaluation score, the expression being:
,
wherein the symbols denote, in order: the operation of finding, in the memory bank, evaluation records similar to the actual network environment; a text encoder composed of a plurality of self-attention modules, which takes the initial vulnerability score, the actual network environment, and the history records in the memory bank as input; and the mathematical expectation of the initial vulnerability score compared with the history records in the memory bank under the actual network environment. MLP denotes a multi-layer perceptron.
7. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 6, further comprising the step of: S6, storing the difference between the initial evaluation and the final evaluation, together with the vulnerability feature distribution, in the memory bank as self-feedback information.
8. The vulnerability assessment method based on contrast graph clustering and reinforcement learning of claim 7, wherein step S6 further comprises the steps of:
S601, the difference between the initial evaluation score S and the final evaluation score, together with the vulnerability feature distribution identified by the client, is stored in the memory bank as self-feedback information, the expression being:
;
S602, redundant information in the memory bank is removed by a gating unit module and a variable graph convolution; the gating unit module takes the memory bank as input and generates a convolution stride strategy, the expression being:
,
wherein the symbols denote, in order: the convolution stride strategy, a one-dimensional vector in which different indexes represent different convolution strides and the value at each index indicates the effectiveness of using that convolution stride; a function that constrains the weight of each convolution stride to lie between 0 and 1; the convolution function; the batch normalization layer; and the convolution parameters. tanh denotes the hyperbolic tangent function, and GAP denotes a global average pooling layer;
S603, the index of the maximum effectiveness value in the convolution stride strategy is taken as the convolution stride, and the memory bank is changed by controlling the convolution stride, the expression being:
,
wherein the symbols denote, in order: the compressed memory bank; the convolution stride of the network; and the operator that finds the index of the maximum effectiveness value. GCN denotes the graph convolutional neural network, and the size of the convolution stride determines how much the memory bank is changed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410251919.1A CN117834309B (en) | 2024-03-06 | 2024-03-06 | Vulnerability assessment method based on contrast graph clustering and reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117834309A true CN117834309A (en) | 2024-04-05 |
CN117834309B CN117834309B (en) | 2024-05-28 |
Family
ID=90524450
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410251919.1A Active CN117834309B (en) | 2024-03-06 | 2024-03-06 | Vulnerability assessment method based on contrast graph clustering and reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117834309B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||