CN114900346B - Network security testing method and system based on knowledge graph - Google Patents

Info

Publication number
CN114900346B
Authority
CN
China
Prior art keywords
network security
knowledge
test
information
template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210461327.3A
Other languages
Chinese (zh)
Other versions
CN114900346A (en
Inventor
谢凌云
王馨雨
杨紫柠
潘乐炳
王艺婷
帅源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Microwave Technology CETC 50 Research Institute
Original Assignee
Shanghai Institute of Microwave Technology CETC 50 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Microwave Technology CETC 50 Research Institute
Priority to CN202210461327.3A
Publication of CN114900346A
Application granted
Publication of CN114900346B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Algebra (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a network security testing method and system based on a knowledge graph. The method comprises the following steps: step S1: extracting knowledge triples from network security domain text; step S2: storing the extracted knowledge triples in a database in a preset form to construct a network security test knowledge graph; step S3: obtaining information by querying the network security test knowledge graph; step S4: loading a network security test scheme template and generating a network security test scheme from the queried information. By combining a triple extraction model built on the Transformer Encoder structure with a conditional random field (CRF), the invention extracts the knowledge triples related to network security testing from network security text and constructs a network security test knowledge graph, so that network security testers can query test-related information from the knowledge graph, which lowers the demands on the testers' own knowledge reserve.

Description

Network security testing method and system based on knowledge graph
Technical Field
The invention relates to the field of network security, in particular to a network security testing method and system based on a knowledge graph.
Background
With the advance of technology and society, ever more devices are networked, and network security problems are increasing. A network attacker can exploit vulnerabilities in an information system and attack it in a variety of ways, seriously endangering the security of the information network. As networks have grown greatly in scale, attacks have become more frequent and attack methods more varied, posing a serious challenge to network security protection. Consequently, testing the network security performance of information systems and improving their ability to defend against network attacks is receiving more and more attention.
At present, when performing a network security test, a tester usually writes a network security test scheme by consulting relevant reference material and then carries out the test on the system under test according to the test outline, test rules and the like. However, network security spans many fields, and the data are fragmented and massive, so a tester spends a great deal of time and effort looking up network security test material. Writing a network security test scheme also places high demands on the tester's knowledge reserve: the tester must understand both the relevant network security knowledge and the system under test in order to write an accurate and reliable scheme. These problems reduce, to some extent, the efficiency with which network security tests are carried out.
Patent document CN110688456A (application number: CN 201910909082.4) discloses a knowledge graph-based vulnerability knowledge base construction method and relates to the technical field of network security. That method fuses the knowledge extracted from multiple data sources through knowledge fusion, so that knowledge from different sources undergoes heterogeneous data integration, disambiguation, processing, reasoning verification and updating under a single framework specification; data, information, methods, experience and attack and defense knowledge are thereby fused to form a vulnerability knowledge base. The present invention, by comparison, combines a triple extraction model built on the Transformer Encoder structure with a conditional random field (CRF) to extract the knowledge triples related to network security testing from network security text and construct a network security test knowledge graph.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a network security testing method and system based on a knowledge graph.
The network security testing method based on a knowledge graph provided by the invention comprises the following steps:
step S1: extracting knowledge triples from network security domain text;
step S2: storing the extracted knowledge triples in a database in a preset form to construct a network security test knowledge graph;
step S3: obtaining information by querying the network security test knowledge graph;
step S4: loading a network security test scheme template and generating a network security test scheme from the queried information.
Preferably, in said step S1:
based on the encoder network structure of a Transformer, combined with a conditional random field (CRF), extracting knowledge triples related to network security testing from the network security domain text;
analyzing the network security text, determining the entity categories to be extracted, and labeling the entity-relation triples present in each text segment in the form: subject entity, relation, object entity; the labeled text is used as training and test data for the model;
applying one-hot encoding to the labeled text to convert the input text into vectors; performing matrix operations on the input text vectors through the encoder network structure to obtain context feature vectors of the network security text sequence; using a multi-head attention mechanism to capture local features in the context that meet preset conditions; and, based on the extracted feature vectors, predicting the triples in the input text with the conditional random field;
for the knowledge triple extraction model, adjusting the model parameters according to the training results and training the model multiple times until the accuracy of knowledge extraction meets the preset requirement; the adjustable model parameters include the number of training rounds, batch size, learning rate, dropout rate and optimizer; and using the trained model to extract knowledge triples from unlabeled network security text.
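As a non-limiting illustration, the one-hot encoding of the input text mentioned in step S1 can be sketched as follows. The character-level vocabulary, the reserved `<unk>` index and the function names `build_vocab` and `one_hot_encode` are assumptions made for this sketch; the text does not specify the tokenization or the vocabulary.

```python
def build_vocab(texts):
    """Map each distinct character to an index; index 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def one_hot_encode(text, vocab):
    """Return one one-hot row vector per character of `text`."""
    size = len(vocab)
    rows = []
    for ch in text:
        row = [0] * size
        row[vocab.get(ch, 0)] = 1  # unseen characters fall back to <unk>
        rows.append(row)
    return rows


# Hypothetical sample sentence, used only to build a small vocabulary.
vocab = build_vocab(["nmap scans port"])
matrix = one_hot_encode("nmap", vocab)
```

Each row of `matrix` has exactly one non-zero entry, so the encoder receives a sequence of sparse vectors whose length equals the vocabulary size.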
Preferably, in said step S2:
storing the extracted knowledge triples in a Neo4j database to construct the network security test knowledge graph, where the storage form is either: node, attribute, attribute value; or: node, relationship, node;
determining, according to prior knowledge, the storage form in Neo4j of each extracted network security knowledge triple;
storing the processed knowledge triples in Neo4j using the Cypher language, each in its determined storage form: node, attribute, attribute value; or: node, relationship, node.
Preferably, in said step S3:
based on the constructed knowledge graph, obtaining network security entities and their attribute information through node queries, and obtaining information related to an entity node through path queries;
information related to network security testing is queried using the Cypher language, in one of two query modes:
a. node information query: an entity name is input; the MATCH clause matches the nodes in the network security test knowledge graph whose name equals the input entity name, the WHERE clause sets the query conditions, and the RETURN clause returns the entity information that satisfies them;
b. node path query: an entity name and a path name are input; the MATCH and WHERE clauses match the nodes and node paths in the network security test knowledge graph that meet the preset conditions, and the RETURN clause returns the node attribute information and the association relationships along the path.
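As a non-limiting illustration, the two query modes of step S3 can be written as parameterized Cypher strings. The `name` property and the reading of "path name" as a relationship type are assumptions of this sketch, and the function names are hypothetical.

```python
def node_query(entity_name):
    """Mode a: return the attributes of the node whose name matches the input."""
    query = "MATCH (n) WHERE n.name = $name RETURN properties(n) AS attributes"
    return query, {"name": entity_name}


def path_query(entity_name, path_name):
    """Mode b: return the neighbours reached from the entity over one relationship type."""
    query = (
        "MATCH (n)-[r]->(m) "
        "WHERE n.name = $name AND type(r) = $rel "
        "RETURN properties(n) AS source, type(r) AS relation, properties(m) AS target"
    )
    return query, {"name": entity_name, "rel": path_name}


# Hypothetical inputs, for illustration only.
q_node, p_node = node_query("example vulnerability")
q_path, p_path = path_query("example tool", "attack tool")
```

With the official Neo4j Python driver, each pair could be passed to `session.run(query, params)`; the connection setup is omitted because it depends on the deployment.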
Preferably, in said step S4:
loading a network security test scheme template using the python-docx library and obtaining the mapping between the template and the test scheme; the template content includes the object under test, the test method and the test tool;
the mapping between the template and the test scheme is expressed as a dictionary, in which the template information is the key and the knowledge queried from the knowledge graph is the corresponding value;
converting the keys and values in the dictionary into the corresponding test outline and test rules, and generating the corresponding network security test scheme.
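As a non-limiting illustration, the dictionary mapping of step S4 can be sketched on plain strings. The `{{key}}` placeholder syntax, the key names and the sample values are assumptions; the text does not describe how template fields are marked.

```python
def fill_template(paragraphs, mapping):
    """Replace each {{key}} placeholder with the value queried for that key."""
    filled = []
    for text in paragraphs:
        for key, value in mapping.items():
            text = text.replace("{{" + key + "}}", str(value))
        filled.append(text)
    return filled


# Template paragraphs covering the three content items named in step S4.
template = [
    "Object under test: {{tested_object}}",
    "Test method: {{test_method}}",
    "Test tool: {{test_tool}}",
]

# Keys are template fields; values stand in for knowledge queried from the graph.
queried = {
    "tested_object": "example router",
    "test_method": "weak password check",
    "test_tool": "example scanner",
}

scheme = fill_template(template, queried)
```

With python-docx, the paragraph strings would typically come from the `paragraphs` of a loaded `Document`, and the filled text would be written back to each paragraph's `text` before saving; that part is left out so the sketch runs without a template file.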
The invention also provides a network security testing system based on a knowledge graph, comprising the following modules:
module M1: extracting knowledge triples from network security domain text;
module M2: storing the extracted knowledge triples in a database in a preset form to construct a network security test knowledge graph;
module M3: obtaining information by querying the network security test knowledge graph;
module M4: loading a network security test scheme template and generating a network security test scheme from the queried information.
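As a non-limiting illustration, the four modules can be viewed as stages of one pipeline. The class name, the callable signatures and the stub stages below are hypothetical; a real system would plug in the extraction model, the Neo4j store, the Cypher queries and the template generator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class SecurityTestSystem:
    extract: Callable[[str], List[Triple]]     # module M1
    store: Callable[[List[Triple]], None]      # module M2
    query: Callable[[str], Dict[str, str]]     # module M3
    generate: Callable[[Dict[str, str]], str]  # module M4

    def run(self, text: str, entity: str) -> str:
        """Chain M1 to M4: extract, store, query, then generate the scheme."""
        self.store(self.extract(text))
        return self.generate(self.query(entity))


# Stub stages backed by an in-memory list instead of a graph database.
graph: List[Triple] = []
system = SecurityTestSystem(
    extract=lambda text: [("router", "has_vulnerability", "weak password")],
    store=graph.extend,
    query=lambda entity: {r: t for h, r, t in graph if h == entity},
    generate=lambda info: "; ".join(f"{k}: {v}" for k, v in info.items()),
)
plan = system.run("hypothetical input text", "router")
```

Keeping each module behind a plain callable lets the stages be tested or replaced independently.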
Preferably, in said module M1:
based on the encoder network structure of a Transformer, combined with a conditional random field (CRF), extracting knowledge triples related to network security testing from the network security domain text;
analyzing the network security text, determining the entity categories to be extracted, and labeling the entity-relation triples present in each text segment in the form: subject entity, relation, object entity; the labeled text is used as training and test data for the model;
applying one-hot encoding to the labeled text to convert the input text into vectors; performing matrix operations on the input text vectors through the encoder network structure to obtain context feature vectors of the network security text sequence; using a multi-head attention mechanism to capture local features in the context that meet preset conditions; and, based on the extracted feature vectors, predicting the triples in the input text with the conditional random field;
for the knowledge triple extraction model, adjusting the model parameters according to the training results and training the model multiple times until the accuracy of knowledge extraction meets the preset requirement; the adjustable model parameters include the number of training rounds, batch size, learning rate, dropout rate and optimizer; and using the trained model to extract knowledge triples from unlabeled network security text.
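As a non-limiting illustration, labeled entity spans can be converted into per-token training targets for a sequence model with a CRF output layer. The BIO tagging scheme, the `SUBJ`/`OBJ` labels and the sample sentence are assumptions; the text does not state how the labeled triples are encoded as tags.

```python
def bio_tags(tokens, spans):
    """Build BIO tags from (start, end, label) spans; `end` is exclusive."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the entity
    return tags


# Hypothetical labeled sentence: subject entity and object entity of one triple.
tokens = ["Router", "X", "has", "weak", "password", "vulnerability"]
spans = [(0, 2, "SUBJ"), (3, 6, "OBJ")]
tags = bio_tags(tokens, spans)
```

The resulting tag sequence has one tag per token, which is the shape a CRF layer expects as its training target.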
Preferably, in said module M2:
storing the extracted knowledge triples in a Neo4j database to construct the network security test knowledge graph, where the storage form is either: node, attribute, attribute value; or: node, relationship, node;
determining, according to prior knowledge, the storage form in Neo4j of each extracted network security knowledge triple;
storing the processed knowledge triples in Neo4j using the Cypher language, each in its determined storage form: node, attribute, attribute value; or: node, relationship, node.
Preferably, in said module M3:
based on the constructed knowledge graph, obtaining network security entities and their attribute information through node queries, and obtaining information related to an entity node through path queries;
information related to network security testing is queried using the Cypher language, in one of two query modes:
a. node information query: an entity name is input; the MATCH clause matches the nodes in the network security test knowledge graph whose name equals the input entity name, the WHERE clause sets the query conditions, and the RETURN clause returns the entity information that satisfies them;
b. node path query: an entity name and a path name are input; the MATCH and WHERE clauses match the nodes and node paths in the network security test knowledge graph that meet the preset conditions, and the RETURN clause returns the node attribute information and the association relationships along the path.
Preferably, in said module M4:
loading a network security test scheme template using the python-docx library and obtaining the mapping between the template and the test scheme; the template content includes the object under test, the test method and the test tool;
the mapping between the template and the test scheme is expressed as a dictionary, in which the template information is the key and the knowledge queried from the knowledge graph is the corresponding value;
converting the keys and values in the dictionary into the corresponding test outline and test rules, and generating the corresponding network security test scheme.
Compared with the prior art, the invention has the following beneficial effects:
1. by combining a triple extraction model built on the Transformer Encoder structure with a conditional random field (CRF), the invention extracts the knowledge triples related to network security testing from network security text and constructs a network security test knowledge graph, so that network security testers can query test-related information from the knowledge graph, which lowers the demands on the testers' own knowledge reserve;
2. using the constructed network security test knowledge graph, a tester can automatically generate a network security test scheme by loading a network security test scheme template, which raises the level of intelligence in scheme design;
3. using the generated network security test scheme, a tester can carry out network security tests quickly and efficiently, which improves the execution efficiency of network security testing.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of a network security test method;
FIG. 2 is a schematic diagram of a network security knowledge triplet extraction model;
FIG. 3 is a diagram of the internal architecture of a single Encoder;
FIG. 4 is a Self-attention structure diagram.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
Example 1:
The network security testing method based on a knowledge graph provided by the invention, as shown in FIGS. 1-4, comprises the following steps:
step S1: extracting knowledge triples from network security domain text;
step S2: storing the extracted knowledge triples in a database in a preset form to construct a network security test knowledge graph;
step S3: obtaining information by querying the network security test knowledge graph;
step S4: loading a network security test scheme template and generating a network security test scheme from the queried information.
Specifically, in the step S1:
based on the encoder network structure of a Transformer, combined with a conditional random field (CRF), extracting knowledge triples related to network security testing from the network security domain text;
analyzing the network security text, determining the entity categories to be extracted, and labeling the entity-relation triples present in each text segment in the form: subject entity, relation, object entity; the labeled text is used as training and test data for the model;
applying one-hot encoding to the labeled text to convert the input text into vectors; performing matrix operations on the input text vectors through the encoder network structure to obtain context feature vectors of the network security text sequence; using a multi-head attention mechanism to capture local features in the context that meet preset conditions; and, based on the extracted feature vectors, predicting the triples in the input text with the conditional random field;
for the knowledge triple extraction model, adjusting the model parameters according to the training results and training the model multiple times until the accuracy of knowledge extraction meets the preset requirement; the adjustable model parameters include the number of training rounds, batch size, learning rate, dropout rate and optimizer; and using the trained model to extract knowledge triples from unlabeled network security text.
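As a non-limiting illustration, prediction with a conditional random field is commonly done by Viterbi decoding over per-token tag scores and tag-to-tag transition scores. The text does not state the decoding procedure, so the algorithm choice, the three-tag set and all scores below are assumptions for this sketch.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag path.

    emissions[t][k]: score of tag k at position t.
    transitions[j][k]: score of moving from tag j to tag k.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptr = []
    for emit in emissions[1:]:
        step, new_score = [], []
        for k in range(n_tags):
            # Best previous tag for arriving at tag k.
            best_j = max(range(n_tags), key=lambda j: score[j] + transitions[j][k])
            step.append(best_j)
            new_score.append(score[best_j] + transitions[best_j][k] + emit[k])
        backptr.append(step)
        score = new_score
    best = max(range(n_tags), key=lambda k: score[k])
    path = [best]
    for step in reversed(backptr):
        path.append(step[path[-1]])
    return path[::-1]


# Tags: 0 = O, 1 = B, 2 = I. The transition O -> I is heavily penalized.
transitions = [
    [0.0, 0.0, -10.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.5],
]
emissions = [[1.0, 0.9, 0.0], [0.0, 0.0, 1.0]]
path = viterbi(emissions, transitions)
```

In this example a per-token argmax would pick O then I, whereas the transition scores steer the decoder to B then I, which is the kind of sequence-level consistency a CRF layer adds on top of the encoder features.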
Specifically, in the step S2:
storing the extracted knowledge triples in a Neo4j database to construct the network security test knowledge graph, where the storage form is either: node, attribute, attribute value; or: node, relationship, node;
determining, according to prior knowledge, the storage form in Neo4j of each extracted network security knowledge triple;
storing the processed knowledge triples in Neo4j using the Cypher language, each in its determined storage form: node, attribute, attribute value; or: node, relationship, node.
Specifically, in the step S3:
based on the constructed knowledge graph, obtaining network security entities and their attribute information through node queries, and obtaining information related to an entity node through path queries;
information related to network security testing is queried using the Cypher language, in one of two query modes:
a. node information query: an entity name is input; the MATCH clause matches the nodes in the network security test knowledge graph whose name equals the input entity name, the WHERE clause sets the query conditions, and the RETURN clause returns the entity information that satisfies them;
b. node path query: an entity name and a path name are input; the MATCH and WHERE clauses match the nodes and node paths in the network security test knowledge graph that meet the preset conditions, and the RETURN clause returns the node attribute information and the association relationships along the path.
Specifically, in the step S4:
loading a network security test scheme template using the python-docx library and obtaining the mapping between the template and the test scheme; the template content includes the object under test, the test method and the test tool;
the mapping between the template and the test scheme is expressed as a dictionary, in which the template information is the key and the knowledge queried from the knowledge graph is the corresponding value;
converting the keys and values in the dictionary into the corresponding test outline and test rules, and generating the corresponding network security test scheme.
Example 2:
Example 2 is a preferred embodiment of Example 1 and explains the invention more specifically.
A person skilled in the art may understand the network security testing method based on a knowledge graph provided by the invention as a specific implementation of the network security testing system based on a knowledge graph; that is, the system can be implemented by executing the step flow of the method.
The invention provides a network security testing system based on a knowledge graph, comprising the following modules:
module M1: extracting knowledge triples from network security domain text;
module M2: storing the extracted knowledge triples in a database in a preset form to construct a network security test knowledge graph;
module M3: obtaining information by querying the network security test knowledge graph;
module M4: loading a network security test scheme template and generating a network security test scheme from the queried information.
Specifically, in the module M1:
based on the encoder network structure of a Transformer, combined with a conditional random field (CRF), extracting knowledge triples related to network security testing from the network security domain text;
analyzing the network security text, determining the entity categories to be extracted, and labeling the entity-relation triples present in each text segment in the form: subject entity, relation, object entity; the labeled text is used as training and test data for the model;
applying one-hot encoding to the labeled text to convert the input text into vectors; performing matrix operations on the input text vectors through the encoder network structure to obtain context feature vectors of the network security text sequence; using a multi-head attention mechanism to capture local features in the context that meet preset conditions; and, based on the extracted feature vectors, predicting the triples in the input text with the conditional random field;
for the knowledge triple extraction model, adjusting the model parameters according to the training results and training the model multiple times until the accuracy of knowledge extraction meets the preset requirement; the adjustable model parameters include the number of training rounds, batch size, learning rate, dropout rate and optimizer; and using the trained model to extract knowledge triples from unlabeled network security text.
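As a non-limiting illustration, the multi-head attention used by the encoder in module M1 can be sketched in plain Python. To stay short, the sketch sets the queries, keys and values equal to the input and omits the learned projection matrices of a real Transformer Encoder; both simplifications, and the function names, are assumptions of the sketch.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


def multi_head_self_attention(x, n_heads):
    """Split the feature dimension into heads, attend in each, then concatenate."""
    d = len(x[0])
    assert d % n_heads == 0, "feature size must be divisible by n_heads"
    width = d // n_heads
    heads = [
        self_attention([row[h * width:(h + 1) * width] for row in x])
        for h in range(n_heads)
    ]
    return [[v for head in heads for v in head[i]] for i in range(len(x))]


# Hypothetical 2-token sequence with 4 features per token.
features = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
context = multi_head_self_attention(features, n_heads=2)
```

Each output row is a weighted mix of all input rows, which is how every position obtains a context feature vector that reflects the rest of the sequence.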
Specifically, in the module M2:
storing the extracted knowledge triples in a Neo4j database to construct the network security test knowledge graph, where the storage form is either: node, attribute, attribute value; or: node, relationship, node;
determining, according to prior knowledge, the storage form in Neo4j of each extracted network security knowledge triple;
storing the processed knowledge triples in Neo4j using the Cypher language, each in its determined storage form: node, attribute, attribute value; or: node, relationship, node.
Specifically, in the module M3:
based on the constructed knowledge graph, obtaining network security entities and their attribute information through node queries, and obtaining information related to an entity node through path queries;
information related to network security testing is queried using the Cypher language, in one of two query modes:
a. node information query: an entity name is input; the MATCH clause matches the nodes in the network security test knowledge graph whose name equals the input entity name, the WHERE clause sets the query conditions, and the RETURN clause returns the entity information that satisfies them;
b. node path query: an entity name and a path name are input; the MATCH and WHERE clauses match the nodes and node paths in the network security test knowledge graph that meet the preset conditions, and the RETURN clause returns the node attribute information and the association relationships along the path.
Specifically, in the module M4:
loading a network security test scheme template using the python-docx library and obtaining the mapping between the template and the test scheme; the template content includes the object under test, the test method and the test tool;
the mapping between the template and the test scheme is expressed as a dictionary, in which the template information is the key and the knowledge queried from the knowledge graph is the corresponding value;
converting the keys and values in the dictionary into the corresponding test outline and test rules, and generating the corresponding network security test scheme.
Example 3:
example 3 is a preferable example of example 1 to more specifically explain the present invention.
Aiming at the defects in the prior art, the technical problems to be solved by the invention are as follows:
1) Constructing a network security test knowledge graph by utilizing massive and fragmented network security data;
2) And generating a network security test scheme by using the constructed network security test knowledge graph.
The invention aims to provide a network security testing method based on a knowledge graph, which is convenient for testing personnel to test the network security of equipment and systems. The method comprises the following steps:
Step S100: extracting knowledge triples related to network security testing from network security domain text, based on the encoder network structure (Encoder) of a Transformer combined with a conditional random field (CRF);
Step S200: storing the extracted knowledge triples in a Neo4j database in the forms <node, attribute, attribute value> and <node, relation, node> to construct a network security test knowledge graph;
Step S300: based on the constructed knowledge graph, acquiring network security entities and their attribute information through node queries, and acquiring information related to entity nodes through path queries;
Step S400: loading a network security test scheme template, and automatically generating the corresponding network security test scheme from the queried knowledge.
The extraction of network security knowledge triples in step S100 specifically includes:
Step S101: in the network security text, the entity-relation triples present in each text segment are labeled in the form <subject entity, relation, object entity>, for example <device name, has, vulnerability name> and <tool name, attack tool, vulnerability name>, and the labeled text is used as training and test data for the model.
Step S102: the labeled text is one-hot encoded to convert the input text into vectors; the Encoder model performs matrix operations on the input text vectors to obtain context feature vectors of the network security text sequence, using the multi-head self-attention mechanism to capture key local features in the context. Finally, based on the extracted feature vectors, a CRF model predicts the <subject entity, relation, object entity> triples in the input text.
Step S103: for the knowledge triple extraction model of step S102, the model parameters are adjusted according to the training results and the model is trained multiple times, so that the accuracy of knowledge extraction meets the usage requirement. The adjustable model parameters include the number of training iterations, batch size, learning rate, dropout rate, optimization function, and so on. Finally, the trained model is used to extract knowledge triples from unlabeled network security text.
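The one-hot encoding mentioned in step S102 can be illustrated with a minimal character-level sketch. The vocabulary construction, the reserved `<unk>` index and the sample sentence are assumptions made for illustration; the source does not specify the tokenization granularity.

```python
def build_vocab(texts):
    """Map each distinct character to an integer index (index 0 is reserved for unknowns)."""
    vocab = {"<unk>": 0}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def one_hot(text, vocab):
    """Return one vector per character, with a single 1 at that character's index."""
    size = len(vocab)
    vectors = []
    for ch in text:
        row = [0] * size
        row[vocab.get(ch, 0)] = 1  # unseen characters fall back to <unk>
        vectors.append(row)
    return vectors


# Hypothetical sample text, used only to build a tiny vocabulary.
vocab = build_vocab(["nmap scans port 22"])
encoded = one_hot("port", vocab)
```

Each row of `encoded` has exactly one non-zero entry, which is what makes the representation suitable as raw input to the matrix operations of the Encoder.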
After step S100, step S200 stores the extracted knowledge triples in the graph database Neo4j and builds the network security test knowledge graph. Step S200 specifically includes:
Step S201: according to prior knowledge, the storage form of the extracted network security knowledge triples in Neo4j is determined. For example, <vulnerability name, contains, vulnerability score> belongs to the category <node, attribute, attribute value>, while <tool name, attack tool, vulnerability name> belongs to the category <node, relation, node>.
Step S202: after step S201, the processed knowledge triples are stored in Neo4j in the forms <node, attribute, attribute value> and <node, relation, node> by using Cypher commands such as CREATE and LOAD.
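The decision made in step S201 can be sketched as a simple split of triples by relation type. The set `ATTRIBUTE_RELATIONS` stands in for the prior knowledge and is hypothetical, as are the sample entity names.

```python
from collections import defaultdict

# Hypothetical prior knowledge: relations whose object is a literal attribute value.
ATTRIBUTE_RELATIONS = {"score", "version"}


def split_triples(triples, attribute_relations=ATTRIBUTE_RELATIONS):
    """Separate <node, attribute, attribute value> triples from <node, relation, node> triples."""
    properties = defaultdict(dict)  # node name -> {attribute: attribute value}
    edges = []                      # (source node, relation, target node)
    for subject, relation, obj in triples:
        if relation in attribute_relations:
            properties[subject][relation] = obj
        else:
            edges.append((subject, relation, obj))
    return dict(properties), edges


triples = [
    ("ExampleVuln-1", "score", "9.8"),
    ("ExampleScanner", "attack_tool", "ExampleVuln-1"),
]
properties, edges = split_triples(triples)
```

Grouping the attribute triples per node first makes it possible to write each node and all of its attribute values in a single statement later on.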
After step S200, step S300 queries information related to network security testing by using Cypher commands such as MATCH, WHERE and RETURN. Step S300 specifically includes two information query methods:
1. Node information query. An entity name is input, the query condition is set with the WHERE command, the MATCH command matches nodes in the network security test knowledge graph whose name equals the input entity name, and the RETURN command returns the entity information that satisfies the query condition.
2. Node path query. An entity name and a path name are input, the MATCH and WHERE commands match the nodes and node paths in the network security test knowledge graph that satisfy the conditions, and the RETURN command returns the node attribute information and association relations along the path.
After step S300, step S400 automatically generates the corresponding network security test scheme by loading a network security test scheme template and using the queried knowledge. Step S400 specifically includes:
Step S401: loading a network security test scheme template by using the python-docx library, and obtaining the mapping relation between the template and the test scheme. The template content comprises framework information such as the tested object, the testing method and the testing tool;
Step S402: representing the mapping relation between the template and the test scheme in the form of a dictionary, where the template information is the key in the dictionary and the knowledge queried from the knowledge graph is the value corresponding to that key;
Step S403: converting the keys and values in the dictionary into the corresponding test outline and detailed test rules, and generating the corresponding network security test scheme.
Example 4:
Example 4 is a preferred example of Example 1 and explains the present invention more specifically.
The technical scheme of the invention will be clearly and completely described below with reference to the accompanying drawings.
Referring to fig. 1, the network security testing method based on the knowledge graph provided by the invention comprises the following steps:
Step S100: extracting knowledge triples related to network security testing from network security domain text by means of a knowledge extraction algorithm. Step S100 includes:
Step S101: first, the network security text is analyzed to determine the entity types to be extracted, such as devices, tools and vulnerabilities; then, according to expert knowledge, the <subject entity, relation, object entity> triples in the text are labeled, for example <device name, has, vulnerability name> and <tool name, attack tool, vulnerability name>;
Step S102: a joint entity-relation extraction model is used, whose input is text and whose output is the triples (subject entity s, relation p, object entity o) in the text. The model first predicts s, then takes s as input to predict the o corresponding to s, and then takes s and o as input to predict the relation p between s and o. The extraction model for network security knowledge triples is shown in Fig. 2.
Network security data is first converted into Token, Segment and Position vectors for model input. Token is the input sequence of the text and represents the text content; Segment is the sentence information, with the first sentence represented by '1' and the second sentence by '0'; Position is the position information, representing the position index of each input character in the library. Each input of the model consists of Token + Segment + Position and, when passed to the Encoder, is converted into the Input Embedding and Position inputs of the Encoder. The structure of a single Encoder is shown in Fig. 3 and includes Self-attention, Add & Norm and Feed-Forward layers.
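How the three inputs might be combined can be sketched as follows. The sketch assumes a BERT-style element-wise sum of token, segment and position embeddings, which the source does not state explicitly; the table sizes, the random initialization and the sample ids are all hypothetical.

```python
import random


def make_table(rows, dim, seed):
    """Create a small embedding table of random values (a stand-in for trained weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def embed(token_ids, segment_ids, token_table, segment_table, position_table):
    """Sum the token, segment and position embeddings for every input position."""
    vectors = []
    for position, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        vectors.append([
            t + s + p
            for t, s, p in zip(token_table[tok], segment_table[seg], position_table[position])
        ])
    return vectors


token_table = make_table(rows=5, dim=4, seed=0)
segment_table = make_table(rows=2, dim=4, seed=1)
position_table = make_table(rows=8, dim=4, seed=2)

# Segment ids follow the convention in the text: '1' for the first sentence, '0' for the second.
vectors = embed([3, 1, 4], [1, 1, 0], token_table, segment_table, position_table)
```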
The Self-attention structure is shown in Fig. 4. The context information of each Input Embedding is read according to Position through three vectors of the same length: the query vector Q, the key vector K and the value vector V, which are calculated as

$$Q = XW^{Q},\quad K = XW^{K},\quad V = XW^{V}$$

where X is the input matrix and $W^{Q}$, $W^{K}$, $W^{V}$ are weight matrices obtained through model training. The output Y of Self-attention is

$$Y = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $\sqrt{d_k}$ is the penalty factor, with $d_k$ the dimension of the key vectors; its effect is to ensure that the product of Q and K is not too large.
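The computation of Q, K, V and the attention output can be sketched in plain Python. The sketch assumes the standard scaled dot-product form softmax(QK^T / sqrt(d_k))V with a single attention head; the helper names and the tiny identity matrices are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Return (output, attention weights) for input matrix x and the three weight matrices."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Dividing by sqrt(d_k) keeps the Q-K products from growing too large.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(identity, identity, identity, identity)
```

With identity inputs each position attends most strongly to itself, which gives a quick sanity check that the scores and the softmax are wired together correctly.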
Self-attention passes the processed data to the Add & Norm layer: Add sums the input and the output of the Self-attention layer, and Norm normalizes the sum so that the layer outputs a list of word vectors with mean 0 and variance 1. This list is then processed by the Feed-Forward layer and another Add & Norm layer to obtain new word vectors.
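The Add & Norm step can be sketched as a residual addition followed by per-vector normalization. This minimal version omits the learnable gain and bias that layer normalization usually carries, and the `eps` constant is an assumption added for numerical safety.

```python
import math


def add_and_norm(sublayer_input, sublayer_output, eps=1e-6):
    """Add the sublayer input to its output, then normalize each vector to mean 0, variance 1."""
    normalized = []
    for x_row, y_row in zip(sublayer_input, sublayer_output):
        summed = [x + y for x, y in zip(x_row, y_row)]
        mean = sum(summed) / len(summed)
        variance = sum((v - mean) ** 2 for v in summed) / len(summed)
        normalized.append([(v - mean) / math.sqrt(variance + eps) for v in summed])
    return normalized


result = add_and_norm([[1.0, 2.0, 3.0, 4.0]], [[0.5, 0.5, 0.5, 0.5]])
```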
After matrix operations through multiple Encoders, the final text feature vector is obtained. Based on this feature vector, the CRF classifier calculates a score for each possible knowledge triple sequence and outputs the sequence with the highest score as the extraction result for the network security text. The CRF calculates the score as follows:

$$L(y_1,\dots,y_m) = b(y_1) + \sum_{t=1}^{m} s_t(y_t) + \sum_{t=1}^{m-1} T(y_t, y_{t+1}) + e(y_m)$$

where $L(y_1,\dots,y_m)$ is the total score, $b(y_1)$ and $e(y_m)$ are the scores of the initial state and the end state respectively, $s_t(y_t)$ is the score when the entity label is $y_t$, and $T(y_t, y_{t+1})$ is the score of the transition from entity label $y_t$ to entity label $y_{t+1}$.
In this way, the key technique of network security knowledge extraction is realized, providing knowledge triples for the subsequent construction of the network security test knowledge graph.
Step S103: when the model described in step S102 is used to extract network security knowledge triples, parameters such as the number of training iterations, batch size, learning rate, dropout rate and optimization function are adjusted according to the accuracy, recall and F1 value of the experimental results, so that the final experimental result is optimal and the training error of the model converges quickly. Finally, the trained model is used to extract knowledge triples from unlabeled network security text.
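The evaluation used to guide parameter tuning can be sketched as a comparison of predicted and labeled triples. The sketch assumes that "accuracy" here refers to precision under exact triple matching, which is a common convention but is not confirmed by the source; the sample triples are hypothetical.

```python
def triple_prf(predicted, gold):
    """Return (precision, recall, F1) for exact-match triples."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = [("RouterX", "has", "ExampleVuln-1"), ("ExampleScanner", "attack_tool", "ExampleVuln-1")]
predicted = [("RouterX", "has", "ExampleVuln-1"), ("RouterX", "has", "ExampleVuln-2")]
precision, recall, f1 = triple_prf(predicted, gold)
```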
After step S100, step S200 stores the extracted knowledge triples in the graph database Neo4j and builds the network security test knowledge graph. Step S200 specifically includes:
Step S201: according to prior knowledge, the storage form of the extracted network security knowledge triples in Neo4j is determined. For example, <vulnerability name, contains, vulnerability score> belongs to the category <node, attribute, attribute value>, while <tool name, attack tool, vulnerability name> belongs to the category <node, relation, node>. For knowledge triples stored in the form <node, relation, node>, each type of entity is a type of node in the Neo4j database, and the relation between the two entities is the connection between the nodes. For knowledge triples stored in the form <node, attribute, attribute value>, the subject entity is a type of node in the Neo4j database, the object entity is an attribute value contained in that node, and the relation between the two is the relation between the node and the attribute value.
Step S202: after the node types, attribute types and relation types are determined in step S201, each central node is first created with the CREATE command and, based on that node, the knowledge triples of type <node, attribute, attribute value> are stored in the Neo4j database in the form "node (attribute 1: attribute value 1, attribute 2: attribute value 2, …)". Node associations are then created, and associated nodes are connected in the form "node-relation-node", which completes the storage of the knowledge triples of type <node, relation, node>.
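The storage step can be sketched as Python helpers that build parameterized Cypher statements. The node labels, the `name` property and the helper names are hypothetical; the statements are only constructed as strings here and would still have to be sent to Neo4j through a driver. Because Cypher does not accept labels or relationship types as query parameters, the sketch validates them before interpolation.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    """Reject labels and relationship types that are not plain identifiers."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe Cypher identifier: {name!r}")
    return name


def node_statement(label, name, properties):
    """Build a CREATE statement for a node together with its attribute values."""
    query = f"CREATE (n:{_check(label)} {{name: $name}}) SET n += $props"
    return query, {"name": name, "props": dict(properties)}


def relation_statement(src_label, src, relation, dst_label, dst):
    """Build a statement that connects two existing nodes as node-relation-node."""
    query = (
        f"MATCH (a:{_check(src_label)} {{name: $src}}), (b:{_check(dst_label)} {{name: $dst}}) "
        f"CREATE (a)-[:{_check(relation)}]->(b)"
    )
    return query, {"src": src, "dst": dst}


node_query_text, node_params = node_statement("Vulnerability", "ExampleVuln-1", {"score": "9.8"})
edge_query_text, edge_params = relation_statement(
    "Tool", "ExampleScanner", "ATTACK_TOOL", "Vulnerability", "ExampleVuln-1"
)
```

Passing entity names and attribute values as parameters rather than formatting them into the query text avoids quoting problems with names extracted from free text.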
After step S200, the construction of the network security test knowledge graph is complete. Step S300 then queries the constructed knowledge graph for information related to network security testing, using Cypher commands such as MATCH, WHERE and RETURN. The query methods of step S300 include:
1. Node information query. An entity name is input, the query condition is set with the WHERE command, the MATCH command matches nodes in the network security test knowledge graph whose name equals the input entity name, and the RETURN command returns the entity information that satisfies the query condition. The query can be exact, returning nodes whose name is identical to the input entity name, or fuzzy, returning nodes in the network security test knowledge graph whose name contains the input entity name.
2. Node path query. An entity name and a path name are input; the node is located in the network security test knowledge graph by the input entity name, and other nodes associated with it are then matched by the input path name. Node path queries include one-hop and multi-hop queries: a one-hop query returns nodes directly associated with the input node, while a multi-hop query returns nodes indirectly associated with the input node through multiple paths. A node path query can return all node information and path information along the entire path.
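The two query methods can be sketched as helpers that build Cypher text. The `name` property and the function names are assumptions, and the multi-hop variant is restricted to a single relationship type for simplicity, which is narrower than the multi-path queries described above.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def node_query(fuzzy=False):
    """Exact query uses '='; fuzzy query uses CONTAINS on the node name."""
    operator = "CONTAINS" if fuzzy else "="
    return f"MATCH (n) WHERE n.name {operator} $name RETURN n"


def path_query(relation, max_hops=1):
    """One-hop query when max_hops is 1, variable-length (multi-hop) query otherwise."""
    if not _IDENTIFIER.match(relation):
        raise ValueError(f"unsafe relationship type: {relation!r}")
    if max_hops < 1:
        raise ValueError("max_hops must be at least 1")
    hops = "" if max_hops == 1 else f"*1..{max_hops}"
    return (
        f"MATCH p = (n)-[:{relation}{hops}]-(m) "
        "WHERE n.name = $name "
        "RETURN nodes(p) AS path_nodes, relationships(p) AS path_relations"
    )


exact = node_query()
fuzzy = node_query(fuzzy=True)
one_hop = path_query("ATTACK_TOOL")
multi_hop = path_query("ATTACK_TOOL", max_hops=3)
```

The entity name is passed as the `$name` parameter at execution time, so the same query text can be reused for every tested object.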
Step S400 generates the corresponding network security test scheme from the network security test knowledge acquired in step S300. Step S400 specifically includes:
Step S401: loading a network security test scheme template by using the python-docx library. python-docx can parse the template to obtain information such as the first-level, second-level and third-level titles of the network test scheme template; the specific content comprises the tested object, the test method and the test tool.
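Reading the title hierarchy from the template can be sketched as follows. The sketch assumes that the template marks its titles with the built-in "Heading 1", "Heading 2", "Heading 3" paragraph styles; a template that uses custom styles would need a different test. `load_headings` requires the python-docx package, while `nest_headings` is plain Python.

```python
def load_headings(template_path):
    """Read (level, text) pairs from a .docx template; requires the python-docx package."""
    from docx import Document  # imported lazily so nest_headings works without python-docx

    headings = []
    for paragraph in Document(template_path).paragraphs:
        style = paragraph.style.name  # e.g. "Heading 1"
        if style.startswith("Heading ") and style[8:].isdigit():
            headings.append((int(style[8:]), paragraph.text))
    return headings


def nest_headings(headings):
    """Turn a flat list of (level, title) pairs into a nested dictionary skeleton."""
    root = {}
    stack = [(0, root)]  # (level, dictionary that holds titles one level deeper)
    for level, title in headings:
        while stack[-1][0] >= level:
            stack.pop()
        child = {}
        stack[-1][1][title] = child
        stack.append((level, child))
    return root


# Hypothetical template outline.
skeleton = nest_headings([(1, "Tested object"), (2, "Vulnerabilities"), (1, "Test tools")])
```

The nested skeleton has the same shape as the title hierarchy, so query results can later be attached under the matching title keys.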
Step S402: after the template framework information of the network security test scheme is obtained in step S401, related information can be queried in the network security test knowledge graph according to the test task and filled into the template framework. For a known tested object, a node information query with the name of the tested object as input returns information such as its vulnerabilities and weaknesses. A node path query then finds which tools can perform the corresponding network security tests against those vulnerabilities and weaknesses, together with how to use the tools.
When the corresponding test information has been queried, a mapping between template titles and query results is established in the form of a dictionary: each title in the template is a key in the dictionary, the queried information is the value for that title key, and the nesting in the dictionary mirrors the relationship among the first-level, second-level and third-level titles in the network test scheme template.
Step S403: after the dictionary mapping between template titles and test content is obtained in step S402, the keys and corresponding values in the dictionary are traversed in order and converted into the corresponding test outline and detailed test rules, and the corresponding network security test scheme is generated using the python-docx library. The outermost keys are the level-1 titles of the network security test scheme, the second-layer keys are the level-2 titles, and so on.
With the network security test scheme generated in step S400, a tester can carry out network security testing according to the test tool, test method, test target and other information in the scheme, without needing an extensive knowledge reserve.
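Converting the nested dictionary into a document can be sketched as a depth-first traversal in which the nesting depth becomes the title level. The plan content below is hypothetical, and `write_plan` requires the python-docx package; `flatten_plan` is plain Python and can be inspected on its own.

```python
def flatten_plan(plan, level=1):
    """Yield (level, title, body) in document order; body is None for titles that only group others."""
    for title, value in plan.items():
        if isinstance(value, dict):
            yield level, title, None
            yield from flatten_plan(value, level + 1)
        else:
            yield level, title, str(value)


def write_plan(plan, output_path):
    """Write the plan to a .docx file; requires the python-docx package."""
    from docx import Document  # imported lazily so flatten_plan works without python-docx

    document = Document()
    for level, title, body in flatten_plan(plan):
        document.add_heading(title, level=level)
        if body is not None:
            document.add_paragraph(body)
    document.save(output_path)


# Hypothetical plan: titles are keys, queried knowledge is the value.
plan = {
    "Tested object": {"Name": "RouterX", "Vulnerabilities": "ExampleVuln-1"},
    "Test tools": "ExampleScanner",
}
outline = list(flatten_plan(plan))
```

Relying on dictionary insertion order keeps the generated sections in the same order as the titles were read from the template.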
Those skilled in the art will appreciate that the systems, apparatus, and their respective modules provided herein may be implemented entirely by logic programming of method steps such that the systems, apparatus, and their respective modules are implemented as logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc., in addition to the systems, apparatus, and their respective modules being implemented as pure computer readable program code. Therefore, the system, the apparatus, and the respective modules thereof provided by the present invention may be regarded as one hardware component, and the modules included therein for implementing various programs may also be regarded as structures within the hardware component; modules for implementing various functions may also be regarded as being either software programs for implementing the methods or structures within hardware components.
The foregoing describes specific embodiments of the present application. It is to be understood that the application is not limited to the particular embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the application. The embodiments of the application and the features of the embodiments may be combined with each other arbitrarily without conflict.

Claims (6)

1. The network security testing method based on the knowledge graph is characterized by comprising the following steps of:
step S1: extracting a knowledge triplet from the network security domain text;
step S2: storing the extracted knowledge triples in a database in a preset form, and constructing a network security test knowledge graph;
step S3: acquiring information by inquiring based on a network security test knowledge graph;
step S4: loading a network security test scheme template, and generating a network security test scheme by using the queried information;
in the step S1:
based on the encoder network structure of a Transformer, extracting knowledge triples related to network security testing from network security domain text in combination with a conditional random field;
analyzing the network security text, determining the entity types to be extracted, and labeling the entity-relation triples present in each text, in the labeling form: subject entity, relation, object entity; taking the labeled text as training and test data for the model;
the labeled text is one-hot encoded to convert the input text into vectors, matrix operations are performed on the input text vectors through the encoder network structure to obtain context feature vectors of the network security text sequence, a multi-head attention mechanism is used to capture local features in the context that meet preset conditions, and, based on the extracted feature vectors, the triples in the input text are predicted by using the conditional random field;
for the model for extracting the knowledge triples, adjusting model parameters according to training results, and training the model multiple times so that the accuracy of knowledge extraction reaches preset requirements; the adjustable model parameters comprise the number of training iterations, batch size, learning rate, dropout rate and optimization function; extracting knowledge triples from unlabeled network security text by using the trained model;
in the step S4:
loading a network security test scheme template by using a Python-docx library, and obtaining a mapping relation between the template and the test scheme; the template content comprises a tested object, a testing method and a testing tool;
the mapping relation between the template and the test scheme is expressed in the form of a dictionary, the template information is the key in the dictionary, and the knowledge inquired from the knowledge graph is the value corresponding to the key in the dictionary;
Converting keys and values in the dictionary into corresponding test outline and test rules, and generating a corresponding network security test scheme;
after the template framework information of the network security test scheme is obtained, relevant information is queried in the network security test knowledge graph according to the test task and filled into the template framework; for a known tested object, the name of the tested object is input to a node information query to obtain vulnerability and weakness information of the tested object; a node path query is used to find tools capable of carrying out the corresponding network security tests against the vulnerabilities and weaknesses, and the methods of using those tools;
when corresponding test information is queried, a mapping form of a template title and a query result is established in the form of a dictionary, wherein the title in the template is a key in the dictionary, the queried corresponding information is a value corresponding to the title key, and the nesting relationship in the dictionary is the relationship among a primary title, a secondary title and a tertiary title in the network test scheme template;
after dictionary mapping of the template title and the test content is obtained, sequentially traversing the keys and the corresponding values in the dictionary, converting the keys and the values into corresponding test outline and test detail, generating a corresponding network security test scheme by using a Python-docx plug-in, wherein the key at the outermost layer is a 1-level title of the network security test scheme, and the key at the second layer is a 2-level title.
2. The network security testing method based on the knowledge-graph according to claim 1, wherein in the step S2:
storing the extracted knowledge triples into a Neo4j database, and constructing a network security test knowledge graph, wherein the storage form is as follows: nodes, attributes, attribute values; or the storage form is as follows: nodes, relations, nodes;
determining the storage form of the extracted network security knowledge triples in Neo4j according to the priori knowledge;
storing the processed knowledge triples into Neo4j respectively by using a Cypher language; the storage form is as follows: nodes, attributes, attribute values; alternatively, the storage form is: nodes, relationships, nodes.
3. The network security testing method based on the knowledge-graph according to claim 1, wherein in the step S3:
based on the constructed knowledge graph, acquiring a network security entity and attribute information thereof through node inquiry, and acquiring information related to the entity node through path inquiry;
the information related to the network security test is queried through the Cypher language, which concretely comprises two information query modes:
a. node information query: inputting entity names, setting query conditions by using a WHERE command, matching nodes which are the same as the input entity names in a network security test knowledge graph by using a MATCH command, and returning entity information meeting the preset query conditions by using a RETURN command;
b. node path query: inputting entity names and path names, matching the nodes and node paths that meet preset conditions in the network security test knowledge graph through MATCH and WHERE commands, and returning the node attribute information and association relations on the paths by using the RETURN command.
4. A knowledge-graph-based network security testing system, comprising:
module M1: extracting a knowledge triplet from the network security domain text;
module M2: storing the extracted knowledge triples in a database in a preset form, and constructing a network security test knowledge graph;
module M3: acquiring information by inquiring based on a network security test knowledge graph;
module M4: loading a network security test scheme template, and generating a network security test scheme by using the queried information;
in the module M1:
based on the encoder network structure of a Transformer, extracting knowledge triples related to network security testing from network security domain text in combination with a conditional random field;
analyzing the network security text, determining the entity types to be extracted, and labeling the entity-relation triples present in each text, in the labeling form: subject entity, relation, object entity; taking the labeled text as training and test data for the model;
the labeled text is one-hot encoded to convert the input text into vectors, matrix operations are performed on the input text vectors through the encoder network structure to obtain context feature vectors of the network security text sequence, a multi-head attention mechanism is used to capture local features in the context that meet preset conditions, and, based on the extracted feature vectors, the triples in the input text are predicted by using the conditional random field;
for the model for extracting the knowledge triples, adjusting model parameters according to training results, and training the model multiple times so that the accuracy of knowledge extraction reaches preset requirements; the adjustable model parameters comprise the number of training iterations, batch size, learning rate, dropout rate and optimization function; extracting knowledge triples from unlabeled network security text by using the trained model;
in the module M4:
loading a network security test scheme template by using a Python-docx library, and obtaining a mapping relation between the template and the test scheme; the template content comprises a tested object, a testing method and a testing tool;
the mapping relation between the template and the test scheme is expressed in the form of a dictionary, the template information is the key in the dictionary, and the knowledge inquired from the knowledge graph is the value corresponding to the key in the dictionary;
Converting keys and values in the dictionary into corresponding test outline and test rules, and generating a corresponding network security test scheme;
after the template framework information of the network security test scheme is obtained, relevant information is queried in the network security test knowledge graph according to the test task and filled into the template framework; for a known tested object, the name of the tested object is input to a node information query to obtain vulnerability and weakness information of the tested object; a node path query is used to find tools capable of carrying out the corresponding network security tests against the vulnerabilities and weaknesses, and the methods of using those tools;
when corresponding test information is queried, a mapping form of a template title and a query result is established in the form of a dictionary, wherein the title in the template is a key in the dictionary, the queried corresponding information is a value corresponding to the title key, and the nesting relationship in the dictionary is the relationship among a primary title, a secondary title and a tertiary title in the network test scheme template;
after dictionary mapping of the template title and the test content is obtained, sequentially traversing the keys and the corresponding values in the dictionary, converting the keys and the values into corresponding test outline and test detail, generating a corresponding network security test scheme by using a Python-docx plug-in, wherein the key at the outermost layer is a 1-level title of the network security test scheme, and the key at the second layer is a 2-level title.
5. The knowledge-graph-based network security test system of claim 4, wherein in said module M2:
storing the extracted knowledge triples into a Neo4j database, and constructing a network security test knowledge graph, wherein the storage form is as follows: nodes, attributes, attribute values; or the storage form is as follows: nodes, relations, nodes;
determining the storage form of the extracted network security knowledge triples in Neo4j according to the priori knowledge;
storing the processed knowledge triples into Neo4j respectively by using a Cypher language; the storage form is as follows: nodes, attributes, attribute values; alternatively, the storage form is: nodes, relationships, nodes.
6. The knowledge-graph-based network security test system of claim 4, wherein in said module M3:
based on the constructed knowledge graph, acquiring a network security entity and attribute information thereof through node inquiry, and acquiring information related to the entity node through path inquiry;
the information related to the network security test is queried through the Cypher language, which concretely comprises two information query modes:
a. node information query: inputting entity names, setting query conditions by using a WHERE command, matching nodes which are the same as the input entity names in a network security test knowledge graph by using a MATCH command, and returning entity information meeting the preset query conditions by using a RETURN command;
b. node path query: inputting entity names and path names, matching the nodes and node paths that meet preset conditions in the network security test knowledge graph through MATCH and WHERE commands, and returning the node attribute information and association relations on the paths by using the RETURN command.
CN202210461327.3A 2022-04-28 2022-04-28 Network security testing method and system based on knowledge graph Active CN114900346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210461327.3A CN114900346B (en) 2022-04-28 2022-04-28 Network security testing method and system based on knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210461327.3A CN114900346B (en) 2022-04-28 2022-04-28 Network security testing method and system based on knowledge graph

Publications (2)

Publication Number Publication Date
CN114900346A CN114900346A (en) 2022-08-12
CN114900346B true CN114900346B (en) 2023-09-19

Family

ID=82718861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210461327.3A Active CN114900346B (en) 2022-04-28 2022-04-28 Network security testing method and system based on knowledge graph

Country Status (1)

Country Link
CN (1) CN114900346B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115658913A (en) * 2022-10-17 2023-01-31 桂林电子科技大学 Electric power equipment information map construction method based on Handle identification analysis
CN117376228B (en) * 2023-11-27 2024-05-28 中国电子科技集团公司第十五研究所 Network security testing tool determining method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312501A (en) * 2021-06-29 2021-08-27 中新国际联合研究院 Construction method and device of safety knowledge self-service query system based on knowledge graph
CN114091034A (en) * 2021-11-12 2022-02-25 绿盟科技集团股份有限公司 Safety penetration testing method and device, electronic equipment and storage medium
CN114205154A (en) * 2021-12-12 2022-03-18 中国电子科技集团公司第十五研究所 Network security test method for isolation security mechanism
CN114257420A (en) * 2021-11-29 2022-03-29 中国人民解放军63891部队 Method for generating network security test based on knowledge graph

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130096946A1 (en) * 2011-10-13 2013-04-18 The Board of Trustees of the Leland Stanford, Junior, University Method and System for Ontology Based Analytics


Also Published As

Publication number Publication date
CN114900346A (en) 2022-08-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant