CN117951713A - Automatic generation method and system for abnormal sample data set for vulnerability classification evaluation - Google Patents

Publication number
CN117951713A
Authority
CN
China
Prior art keywords
abnormal sample
data set
abnormal
software
container
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410115040.4A
Other languages
Chinese (zh)
Inventor
李瑞林
李志伟
冯超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202410115040.4A
Publication of CN117951713A

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an automatic generation method and system for an abnormal sample data set for vulnerability classification evaluation. The automatic generation method comprises the following steps: parsing and storing a container template; constructing a container instance according to the container template; generating abnormal samples in the container instance through fuzzing tool software; verifying whether each generated abnormal sample is valid; destroying invalid abnormal samples; and collecting and collating the retained abnormal samples to obtain an abnormal sample data set. The method addresses the complicated steps, difficult management, and high labor cost of existing methods for building abnormal sample data sets for vulnerability classification evaluation, as well as the problem that data set accuracy is subject to human factors because the validity of abnormal samples has to be verified manually.

Description

Automatic generation method and system for abnormal sample data set for vulnerability classification evaluation
Technical Field
The invention relates to the field of computer software security vulnerabilities, in particular to an automatic generation method and system for an abnormal sample data set for vulnerability classification evaluation.
Background
An abnormal sample data set comprises a number of abnormal sample files and their related information. An abnormal sample file is usually referred to simply as an abnormal sample. An abnormal sample file is a data stream stored on a computer storage device; when target software takes it as input during execution, it triggers a vulnerability in the target software, causing the target software to crash and exit abnormally. The related information includes the source, modification time, conditions for triggering the defect or vulnerability, comments, and so on, for all abnormal sample files in the data set, and is typically stored as plain text, JSON (JavaScript Object Notation), or a similar format. By analyzing the abnormal sample files, and the execution of the target software as it reads them as input, practitioners can find and fix the vulnerabilities that the abnormal sample files trigger. In practice, a large number of abnormal samples often makes this analysis harder. Classifying abnormal samples by attributes of the vulnerabilities they trigger (vulnerability classification for short), so as to reduce the difficulty of manual analysis, is therefore one of the popular research directions in software vulnerability analysis. Research on vulnerability classification must evaluate the performance of classification methods with the aid of large-scale, high-quality abnormal sample data sets.
At present, abnormal sample data sets are mostly built from publicly available fuzzing tool evaluation data sets: fuzzing tools are run manually on these data sets, and the abnormal samples they output are collected and screened. However, this approach has non-negligible drawbacks. First, the evaluation data sets of different fuzzing tools differ considerably, and setting up the environment required to run the fuzzing tools and the target software involves complicated steps, high migration cost, and low efficiency. Second, abnormal samples must be collected and screened manually, so the accuracy of the data set is subject to human factors; as the number of samples grows, the labor and time consumed increase markedly, and so does the probability of error. Third, a large-scale abnormal sample data set contains a large number of files belonging to different categories, which makes manual management and retrieval difficult.
Disclosure of Invention
The invention aims to provide an automatic generation method and system for an abnormal sample data set for vulnerability classification evaluation, so as to construct abnormal sample data sets efficiently and with high quality, and to solve the problems of existing methods for building abnormal sample data sets: data set accuracy that is subject to human factors, complicated steps, difficult management, high cost, and low efficiency.
In order to achieve the above object, in one aspect, the present application provides an automatic generation method of an abnormal sample data set for vulnerability classification evaluation, the automatic generation method of the abnormal sample data set comprising:
generating an abnormal sample for the target software through fuzzing tool software;
acquiring a software copy corresponding to the target software, repairing a target vulnerability on the software copy by using a target vulnerability patch, and generating repaired software;
inputting the abnormal sample into the repaired software;
if the target vulnerability of the repaired software is not triggered, determining that the abnormal sample is a valid abnormal sample;
generating an abnormal sample data set according to the valid abnormal sample, the identification information of the target vulnerability patch, and the identification information of the target vulnerability.
Optionally, the abnormal sample data set automatic generation method is performed by a container instance.
Optionally, before the automatic generating method of the abnormal sample data set is performed by the container instance, the method further includes:
obtaining a container template; wherein, the container template describes operating system configuration information, software configuration information and sample configuration information;
the container instance is constructed from the container template.
Optionally, after inputting the abnormal sample into the repaired software, the method further comprises:
if the target vulnerability of the repaired software is triggered, determining that the abnormal sample is an invalid abnormal sample, and destroying all invalid abnormal samples when a destruction condition is met.
Optionally, the automatic generation method of the abnormal sample data set further comprises:
If the control instruction generation condition is currently satisfied, at least one of the following instructions is automatically generated and the corresponding operation is performed: container instance construction instructions, abnormal sample generation instructions, abnormal sample verification instructions, invalid abnormal sample destruction instructions, and data set generation instructions.
Optionally, the automatic generation method of the abnormal sample data set further comprises:
generating an abnormal sample data set generation log and storing it in a database; wherein the abnormal sample data set generation log comprises at least one of the following logs: a container instance construction log, an abnormal sample generation log, an abnormal sample verification log, an invalid abnormal sample destruction log, and a data set generation log.
In yet another aspect, the present application further provides an automatic generation system of an abnormal sample data set for vulnerability classification evaluation, the automatic generation system of an abnormal sample data set comprising:
a sample generation unit for generating an abnormal sample for the target software through fuzzing tool software;
a repair unit for acquiring a software copy corresponding to the target software, repairing the target vulnerability on the software copy by using the target vulnerability patch, and generating repaired software;
a verification unit for inputting the abnormal sample into the repaired software and, if the target vulnerability of the repaired software is not triggered, determining that the abnormal sample is a valid abnormal sample;
a data set generation unit for generating an abnormal sample data set according to the valid abnormal sample, the identification information of the target vulnerability patch, and the identification information of the target vulnerability.
Optionally, the sample generation unit, the repair unit, the verification unit, and the dataset generation unit are disposed within a container instance.
Optionally, the automatic generation system of abnormal sample data set further comprises:
a monitoring and control service module for, when a control instruction generation condition is currently satisfied, automatically generating a container instance construction instruction and sending it to the container management service module, or triggering the container management service module to generate at least one of the following instructions: an abnormal sample generation instruction, an abnormal sample verification instruction, an invalid abnormal sample destruction instruction, and a data set generation instruction;
a container management service module for constructing the container instance according to the container instance construction instruction, generating at least one of an abnormal sample generation instruction, an abnormal sample verification instruction, an invalid abnormal sample destruction instruction, and a data set generation instruction, and controlling the container instance to execute the corresponding operations.
Optionally, the container instance is further configured to generate an abnormal sample data set generation log; wherein the abnormal sample data set generation log comprises at least one of the following logs: a container instance construction log, an abnormal sample generation log, an abnormal sample verification log, an abnormal sample destruction log, and a data set generation log;
the monitoring and control service module is further configured to send the abnormal sample data set generation log to a database service module, so that the log is stored in a database through the database service module.
As the above scheme shows, the invention discloses an automatic generation method and system for an abnormal sample data set for vulnerability classification evaluation. In this scheme, the generation, verification, collation, and packaging of abnormal samples related to vulnerabilities of the target software are executed automatically, and an abnormal sample data set is finally obtained. The scheme makes full use of computer resources and generates abnormal sample data sets efficiently and at large scale. In addition, the validity verification process identifies valid abnormal samples among the generated abnormal samples accurately and quickly, which ensures the quality of the generated data set. Furthermore, the method can create at least one container instance from one or more container templates describing vulnerabilities of the target software, and execute the automatic generation method within the container instances, so that larger-scale abnormal sample data sets of different types can be obtained efficiently and automatically through a plurality of container instances that run independently and simultaneously.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow diagram of an automatic generation method of an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an automatic generation system of an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another system for automatically generating an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an overall structure of an automatic generation system of an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses an automatic generation method and an automatic generation system for an abnormal sample data set for vulnerability classification evaluation, which are used for automatically generating the abnormal sample data set in a large scale with high efficiency and high quality.
Referring to fig. 1, a flowchart of an automatic generation method of an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention is provided, where the method includes:
S101, generating an abnormal sample for the target software through fuzzing tool software;
In this embodiment, the fuzzing tool software may be AFL (American Fuzzy Lop, a classic fuzzing tool in the field of computer software security vulnerabilities) or other fuzzing tool software developed on the basis of AFL. The fuzzing tool software can generate a large number of new abnormal samples from one or more initial abnormal samples; in some special cases, it can generate a large number of new abnormal samples without any initial abnormal sample. When abnormal samples are generated, the target software must also be specified to the fuzzing tool software, and the fuzzing tool software is run in Crash Exploration mode.
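As a minimal sketch of S101, the command line for running AFL in Crash Exploration mode could be assembled as follows. The `-C`, `-i`, and `-o` options and the `@@` input placeholder are AFL's own; the function name and all paths are hypothetical and not taken from the patent.

```python
from typing import List


def build_crash_exploration_cmd(target: str, seed_dir: str, out_dir: str,
                                afl_fuzz: str = "afl-fuzz") -> List[str]:
    """Build an afl-fuzz argument list for Crash Exploration mode.

    All paths are illustrative assumptions; the patent does not specify them.
    """
    return [
        afl_fuzz,
        "-C",            # crash exploration: the seeds are crashing inputs
        "-i", seed_dir,  # directory holding the initial abnormal samples
        "-o", out_dir,   # directory where AFL writes the generated samples
        "--",
        target, "@@",    # AFL replaces @@ with the path of each input file
    ]


cmd = build_crash_exploration_cmd("/work/target", "/work/seeds", "/work/out")
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.Popen`, which avoids shell quoting issues when paths contain spaces.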
S102, acquiring a software copy corresponding to target software, and repairing target vulnerabilities on the software copy by using target vulnerability patches to generate repaired software;
In this embodiment, the target software needs to be copied to obtain a software copy of the target software, and after the copying process is completed, the target software and the software copy thereof can be completely equivalent, have the same function, and can be operated simultaneously, independently and without interference. After the software copy of the target software is obtained, the target bug on the software copy is repaired by using the target bug patch, and repaired software is generated, so that the validity of the abnormal sample is verified through the repaired software.
S103, inputting the abnormal sample into the repaired software; if the target vulnerability of the repaired software is not triggered, determining that the abnormal sample is a valid abnormal sample;
It should be noted that, in this embodiment, the initial abnormal samples input into the fuzzing tool software all trigger the same vulnerability; for clarity, the vulnerability triggered by the initial abnormal samples is referred to as the target vulnerability. However, the vulnerability triggered by a new abnormal sample generated by the fuzzing tool software is not necessarily the target vulnerability. The method therefore verifies the validity of the abnormal samples generated by the fuzzing tool software to find the valid abnormal samples, which can trigger the target vulnerability; samples that cannot trigger the target vulnerability are called invalid abnormal samples.
The target vulnerability patch, which repairs the target vulnerability, is used mainly for validity verification. Specifically, the target vulnerability patch for the target vulnerability must first be found. Once the patch has repaired the target vulnerability in the target software, the target vulnerability no longer exists there, and an abnormal sample aimed at the target vulnerability can no longer trigger it. Therefore, after the target vulnerability on the software copy of the target software has been repaired with the target vulnerability patch, the repaired software can be run with the abnormal samples generated in S101 as input. If the target vulnerability of the repaired software is not triggered, the abnormal sample is determined to be a valid abnormal sample; if it is triggered, the input abnormal sample is determined to be an invalid abnormal sample, and all invalid abnormal samples are destroyed when the destruction condition is met. A destruction operation releases the computer resources occupied by the destroyed objects and may include file deletion, storage device block reclamation, file index marking, and the like, which are not specifically limited herein. This prevents an excessive number of invalid abnormal samples from occupying storage and wasting system resources. In S103, if the number of abnormal samples generated in S101 is 0, S103 is not executed and the verification state is changed to execution completed; if the number is 1 or more, the repaired software is run with each abnormal sample as input, and the samples are judged valid or invalid one by one.
It should be noted that those skilled in the art generally agree that, once a vulnerability is triggered, the resulting crash of the target software can be clearly identified by the operating system or by specific tool software; in general, therefore, "the vulnerability is triggered" can be treated as equivalent to "the target software crashes". The method takes both the general case and special cases into account: where it cannot be clearly determined whether the target software has crashed, or the result is ambiguous, whether the vulnerability is triggered is taken as the basis for the determination.
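The verification in S103 can be sketched as below, under the general-case assumption stated above that a crash of the repaired software stands in for "the vulnerability is triggered". The sketch assumes a POSIX system, where Python's `subprocess` reports a process killed by signal N as return code -N; the function names are hypothetical, and special cases such as hangs are not handled.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def crashed(returncode: int) -> bool:
    """On POSIX, a negative return code means the process died from a signal."""
    return returncode < 0


def run_repaired(repaired_cmd: List[str], sample_path: str,
                 timeout: float = 10.0) -> int:
    """Run the repaired software once with the sample file as its argument."""
    proc = subprocess.run(repaired_cmd + [sample_path],
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL,
                          timeout=timeout)
    return proc.returncode


def verify_samples(samples: Iterable[str],
                   run: Callable[[str], int]) -> Dict[str, List[str]]:
    """Judge samples one by one: no crash on the repaired software means valid."""
    result: Dict[str, List[str]] = {"valid": [], "invalid": []}
    for sample in samples:
        key = "invalid" if crashed(run(sample)) else "valid"
        result[key].append(sample)
    return result


# Demonstration with a stand-in runner: sample "b" still crashes (SIGSEGV = 11).
fake_run = lambda sample: -11 if sample == "b" else 0
verdict = verify_samples(["a", "b", "c"], fake_run)
print(verdict)
```

In real use, `run` would be something like `lambda s: run_repaired(["/work/target_patched"], s)`, where the path is again an assumption. An empty sample list yields two empty lists, which matches the case where S101 produced no samples.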
S104, generating an abnormal sample data set according to the valid abnormal sample, the identification information of the target vulnerability patch, and the identification information of the target vulnerability.
In this embodiment, after the valid abnormal samples have been determined, the retained abnormal samples need to be collected and collated to obtain an abnormal sample data set. The process comprises at least the following steps:
(1) Retrieving all the retained valid exception samples;
(2) Positioning a temporary file storage position;
(3) Copying the retrieved valid abnormal samples to the temporary file storage location;
(4) Storing an information list of the retrieved valid abnormal samples to the temporary file storage location;
(5) Storing the identification information of the target vulnerability of the target software and the identification information of the target vulnerability patch to the temporary file storage location;
(6) Packaging the valid abnormal samples, the information list of the valid abnormal samples, the identification information of the target vulnerability, and the identification information of the target vulnerability patch in the temporary file storage location into an abnormal sample data set file, and storing that file to obtain the abnormal sample data set;
(7) Destroying the contents written into the temporary file storage positions in the steps (3) to (5).
The packaging operation in step (6) combines all of the packaged objects and stores them in a single computer file; it can be performed with general-purpose tools commonly used by those skilled in the art. The abnormal sample data set includes not only the valid abnormal samples, the identification information of the target vulnerability, and the identification information of the target vulnerability patch, but also an information list that describes, among other things, the number of valid abnormal samples in the data set.
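Steps (1) to (7) can be sketched as follows. The choice of a gzip-compressed tar archive, the JSON file names, and the identifier strings are all assumptions made for illustration; the patent only requires that everything be packed into a single file with a general-purpose tool.

```python
import json
import shutil
import tarfile
import tempfile
from pathlib import Path
from typing import Iterable


def package_dataset(valid_samples: Iterable[Path], vuln_id: str,
                    patch_id: str, out_file: Path) -> Path:
    samples = list(valid_samples)                        # (1) retained samples
    staging = Path(tempfile.mkdtemp(prefix="dataset_"))  # (2) temporary location
    try:
        for sample in samples:                           # (3) copy the samples
            shutil.copy2(sample, staging / sample.name)
        info = {"count": len(samples),
                "samples": sorted(s.name for s in samples)}
        (staging / "sample_list.json").write_text(json.dumps(info, indent=2))  # (4)
        ident = {"vulnerability": vuln_id, "patch": patch_id}
        (staging / "identification.json").write_text(json.dumps(ident, indent=2))  # (5)
        with tarfile.open(out_file, "w:gz") as tar:      # (6) pack into one file
            tar.add(staging, arcname="dataset")
    finally:
        shutil.rmtree(staging, ignore_errors=True)       # (7) destroy temp contents
    return out_file


# Usage with throwaway files standing in for verified samples.
with tempfile.TemporaryDirectory() as work:
    work_dir = Path(work)
    demo_samples = []
    for name in ("crash_000", "crash_001"):
        path = work_dir / name
        path.write_bytes(b"\x00demo")
        demo_samples.append(path)
    archive = package_dataset(demo_samples, "VULN-EXAMPLE", "PATCH-EXAMPLE",
                              work_dir / "dataset.tar.gz")
    with tarfile.open(archive) as tar:
        packaged_names = sorted(tar.getnames())
print(packaged_names)
```

Samples are copied by file name, so this sketch assumes sample names are unique across source directories.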
Through the above process, high-quality abnormal sample data sets that trigger different vulnerabilities can be generated automatically, and these data sets can be used to evaluate vulnerability classification. For example, to evaluate the classification effect of a vulnerability classification tool, an abnormal sample data set 1 for vulnerability 1, an abnormal sample data set 2 for vulnerability 2, and an abnormal sample data set 3 for vulnerability 3 can be generated through the above process, where all three vulnerabilities are in the same software. The vulnerability classification tool then classifies the abnormal samples in the three data sets, and its classification effect is evaluated according to the accuracy of the classification. If the tool classifies the abnormal samples in data set 1 as vulnerability 1, those in data set 2 as vulnerability 2, and those in data set 3 as vulnerability 3, the tool is determined to classify vulnerabilities accurately, and its classification effect is good.
In summary, in this scheme the generation, verification, collation, and packaging of abnormal samples related to vulnerabilities of the target software are executed automatically, and an abnormal sample data set is finally obtained. The scheme makes full use of computer resources and generates abnormal sample data sets efficiently and at large scale; in addition, the validity verification process identifies valid abnormal samples among the generated abnormal samples accurately and quickly, which ensures the quality of the generated data set.
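The evaluation described in this example amounts to computing classification accuracy over data sets whose labels are known by construction. A minimal sketch, with a toy classifier and hypothetical names:

```python
from typing import Callable, Dict, List


def classification_accuracy(datasets: Dict[str, List[bytes]],
                            classify: Callable[[bytes], str]) -> float:
    """Fraction of samples whose predicted vulnerability matches their data set."""
    total = sum(len(samples) for samples in datasets.values())
    if total == 0:
        raise ValueError("no samples to evaluate")
    correct = sum(1
                  for label, samples in datasets.items()
                  for sample in samples
                  if classify(sample) == label)
    return correct / total


# Toy data: each data set is keyed by the vulnerability it was generated for.
demo_datasets = {
    "vulnerability_1": [b"a1", b"a2"],
    "vulnerability_2": [b"b1"],
    "vulnerability_3": [b"c1"],
}


def toy_classifier(sample: bytes) -> str:
    """Stand-in for a real classification tool; it never predicts vulnerability 3."""
    return "vulnerability_2" if sample.startswith(b"b") else "vulnerability_1"


accuracy = classification_accuracy(demo_datasets, toy_classifier)
print(accuracy)
```

Here the toy classifier gets three of four samples right, so the accuracy is 0.75.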
Based on the method of the previous embodiment, in this embodiment, the above-described method for automatically generating an abnormal sample data set is specifically performed by a container instance.
It should be noted that the method for automatically generating the abnormal sample data set can be executed by a computer device in the ordinary way; however, in order to increase the scale at which abnormal sample data sets are generated, in this embodiment the method is executed by container instances, so that larger-scale abnormal sample data sets can be obtained efficiently and automatically through a plurality of container instances running simultaneously.
In this embodiment, before the method for automatically generating the abnormal sample data set by the container instance is performed, the method further includes: obtaining a container template, and constructing a container instance according to the container template; wherein the container template describes operating system configuration information, software configuration information, and sample configuration information.
Specifically, the container template may be provided by the user, and may be transferred over a network, on a removable storage device, or by other means, which is not specifically limited herein. The container template may exist as a computer file, as the payload of a network data packet, and so on, which is not specifically limited herein.
The operating system configuration information described in the container template includes at least: the operating system architecture and the operating system version. The software configuration information described in the container template includes at least: how to obtain the target software, where the target software is stored, the dependencies of the target software, how to run the target software, the details of the target software vulnerability, how to obtain the target software vulnerability patch, how to apply the target software vulnerability patch, how to obtain the fuzzing tool software, where the fuzzing tool software is stored, the dependencies of the fuzzing tool software, and how to run the fuzzing tool software. The sample configuration information described in the container template includes at least: how to obtain the initial abnormal samples, where the initial abnormal samples are stored, how to obtain the generated abnormal samples, where the generated abnormal samples are stored, the temporary file storage location, and other content related to generating abnormal samples. The container template may be written in human-readable plain text, machine language, and so on, which is not specifically limited herein.
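One way to picture the minimum content of a container template is as three sections with required fields. The sketch below checks a template, represented as a nested dictionary, against that list; the section and field names are hypothetical labels for the items enumerated above, not a format defined by the patent.

```python
from typing import Dict, List

# Hypothetical field names for the items the template must describe at minimum.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "operating_system": ["architecture", "version"],
    "software": [
        "target_acquisition", "target_path", "target_dependencies",
        "target_run", "vulnerability_details",
        "patch_acquisition", "patch_usage",
        "fuzzer_acquisition", "fuzzer_path", "fuzzer_dependencies",
        "fuzzer_run",
    ],
    "samples": [
        "initial_sample_acquisition", "initial_sample_path",
        "generated_sample_acquisition", "generated_sample_path",
        "temp_path",
    ],
}


def missing_fields(template: dict) -> List[str]:
    """Return 'section.field' for every required item the template omits."""
    missing = []
    for section, fields in REQUIRED_FIELDS.items():
        body = template.get(section, {})
        for field in fields:
            if field not in body:
                missing.append(f"{section}.{field}")
    return missing


partial = {"operating_system": {"architecture": "x86_64", "version": "example"}}
complete = {section: {field: "..." for field in fields}
            for section, fields in REQUIRED_FIELDS.items()}
print(len(missing_fields(partial)), len(missing_fields(complete)))
```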
After the container template is obtained, the container template needs to be analyzed and stored, and the process specifically comprises the following steps:
(1) Analyzing a container template and extracting various contents described by the container template;
(2) Sorting and reorganizing the extracted contents;
(3) Storing each item of content subjected to classification and rearrangement into a database;
(4) Storing the classified and reorganized contents into a container template warehouse.
It should be noted that the extracted contents, after being classified, sorted, and reorganized, still conform to the definition of a container template and have the same form and description manner as the original.
It will be appreciated that, in this embodiment, different container instances may be constructed from the same container template; these container instances are independent of one another and do not interfere with one another, and each can access all of the content described in the container template. Furthermore, providing different operating system environments in different container instances allows multiple pieces of software with different, or even mutually exclusive, requirements on the operating system environment and software to run simultaneously on a single computer operating system without affecting one another. Therefore, by executing a plurality of mutually independent container instances simultaneously, the method can efficiently and automatically obtain larger-scale abnormal sample data sets of different types.
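The parse-and-store steps (1) to (4) might look like the sketch below. It assumes the template arrives as JSON text, uses SQLite as the database, and treats a directory as the container template warehouse; none of these choices, nor the table and key names, come from the patent.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

SECTIONS = ("operating_system", "software", "samples")


def store_template(template_text: str, name: str,
                   db: sqlite3.Connection, warehouse: Path) -> Path:
    template = json.loads(template_text)                  # (1) parse and extract
    sections = {s: template.get(s, {}) for s in SECTIONS}  # (2) classify, reorganize
    db.execute("CREATE TABLE IF NOT EXISTS template_items "
               "(template TEXT, section TEXT, key TEXT, value TEXT)")
    with db:                                              # (3) store each item
        for section, body in sections.items():
            for key, value in body.items():
                db.execute("INSERT INTO template_items VALUES (?, ?, ?, ?)",
                           (name, section, key, json.dumps(value)))
    warehouse.mkdir(parents=True, exist_ok=True)          # (4) template warehouse
    path = warehouse / f"{name}.json"
    path.write_text(json.dumps(sections, indent=2, sort_keys=True))
    return path


demo_text = json.dumps({
    "operating_system": {"architecture": "x86_64", "version": "example-1.0"},
    "software": {"target_path": "/opt/target"},
    "samples": {"initial_sample_path": "/opt/seeds"},
})
with tempfile.TemporaryDirectory() as tmp:
    demo_db = sqlite3.connect(":memory:")
    stored = store_template(demo_text, "demo", demo_db, Path(tmp) / "warehouse")
    stored_target = json.loads(stored.read_text())["software"]["target_path"]
    row_count = demo_db.execute(
        "SELECT COUNT(*) FROM template_items").fetchone()[0]
    demo_db.close()
print(row_count, stored_target)
```

The file written to the warehouse has the same three-section shape as the input, which reflects the note above that the reorganized content is still a valid container template.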
Based on any of the above method embodiments, in this embodiment, the method for automatically generating an abnormal sample data set further includes the steps of:
If the control instruction generation condition is currently satisfied, at least one of the following instructions is automatically generated and the corresponding operation is performed: container instance construction instructions, abnormal sample generation instructions, abnormal sample verification instructions, invalid abnormal sample destruction instructions, and data set generation instructions.
Generating an abnormal sample data set generating log and storing the abnormal sample data set generating log into a database; wherein the abnormal sample dataset generation log comprises at least one of the following logs: the container instance constructs a log, an abnormal sample generation log, an abnormal sample verification log, an invalid abnormal sample destruction log and a data set generation log.
It should be noted that, in the process of executing the automatic generation method of the abnormal sample data set in this embodiment, each step of the method may be automatically executed according to the trigger of the control instruction, and each control instruction may be generated when the corresponding control instruction generation condition is satisfied, where the control instruction may include: container instance construction instructions, abnormal sample generation instructions, abnormal sample verification instructions, invalid abnormal sample destruction instructions, data set generation instructions, and the like; the control instruction generation condition may be determined to be satisfied when it is detected that the user is actively performing the related operation or issuing the instruction; the trigger condition may be preset by the user. The trigger condition consists of a series of trigger rules that may be combined and/or non-logically, these rules typically include: whether the state of the process currently being executed by the system is normal, whether the elapsed time of the process currently being executed by the system exceeds a preset threshold value, whether the number of generated abnormal samples is larger than a preset number threshold value, whether the number of effective abnormal samples is larger than a preset number threshold value, whether the occupation of computer resources by the system exceeds a limit level, whether the current time reaches a preset moment, whether the operating system sends out preset signals, and other rules formed by studying the current technical parameters of the system and the computer in which the system is positioned to obtain a conclusion with logic authenticity.
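Trigger rules that combine with AND, OR, and NOT logic can be modeled as composable predicates over a snapshot of system state. This is an illustrative sketch only; the state fields, thresholds, and rule names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SystemState:
    """Hypothetical snapshot of the parameters the rules inspect."""
    process_ok: bool
    elapsed_seconds: float
    generated_samples: int
    valid_samples: int


Rule = Callable[[SystemState], bool]


def all_of(*rules: Rule) -> Rule:   # logical AND
    return lambda state: all(rule(state) for rule in rules)


def any_of(*rules: Rule) -> Rule:   # logical OR
    return lambda state: any(rule(state) for rule in rules)


def negate(rule: Rule) -> Rule:     # logical NOT
    return lambda state: not rule(state)


def process_ok(state: SystemState) -> bool:
    return state.process_ok


def timed_out(state: SystemState) -> bool:
    return state.elapsed_seconds > 3600       # assumed threshold


def enough_generated(state: SystemState) -> bool:
    return state.generated_samples > 1000     # assumed threshold


# Example condition for issuing an abnormal sample verification instruction:
# the process is healthy AND (it has run long enough OR produced enough samples).
should_verify = all_of(process_ok, any_of(timed_out, enough_generated))

print(should_verify(SystemState(True, 10.0, 2000, 0)))
```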
While the automatic generation method is executed, a log of each step may be generated and stored in the database, and a technician can determine whether each step was carried out reliably and as expected by analyzing the stored logs. The logs may include a container instance construction log, an abnormal sample generation log, an abnormal sample verification log, an invalid abnormal sample destruction log, a data set generation log, and the like.
The overall flow of the scheme is described below in connection with each control instruction and each log:
1. If the system receives the container instance construction instruction, it constructs a container instance according to the container template. The container instance construction instruction, the time taken from receipt of the instruction to completion of execution, the result obtained on completion, the process data of constructing the container instance, and the like are stored in the database as the container instance construction log. From the data recorded in this log, it can be determined whether the construction process completed reliably, as expected, and successfully, and it can be inferred which operations the construction process performed. The process of constructing a container instance is also referred to as instantiation of the container template.
2. When the automatic generation method is executed in the container instance, the storage location of the target software is located within the container instance, and the target software is copied to obtain a software copy. After copying finishes, if an abnormal sample generation instruction is received, abnormal samples are generated by the fuzzy test tool software: the tool accesses the initial abnormal samples as described in the container template and generates abnormal samples from them. If the fuzzy test tool software receives a control instruction to stop generating abnormal samples, it stops the generation process. The stop instruction may take the form of a file, an interrupt signal provided by the operating system, and the like, and correspondingly may be issued by writing data into a specific file, by the operating system, and the like; neither is specifically limited herein.
During this process, an abnormal sample generation log is generated and stored in the database. The log includes the control instruction to start generating abnormal samples, the instruction to stop generating them, the time point at which execution started, the time point at which it stopped, the execution state, and the like. The execution state is a piece of data reflecting the configuration parameters used by the fuzzy test tool, the initial abnormal samples used, the occupation of computer resources during operation, the number of generated abnormal samples, the generation paths, the storage paths, and the like; by analyzing it, a technician can determine whether the generation process was carried out reliably and as expected. Because the operating state of the fuzzy test tool changes constantly during generation, the execution state is saved to the database periodically, multiple times over that period, whereas the control instructions and time points are each saved only once.
3. If the container instance receives the abnormal sample verification instruction, it starts the verification process for the abnormal samples. During this process, an abnormal sample verification log is generated and stored in the database. The log includes the abnormal sample verification instruction, the time taken to execute the process, the execution state, information on whether each abnormal sample is valid, and the like. The execution state is a piece of data reflecting the important parameters and operating state of the verification process; by analyzing it, a technician can determine whether the process was executed reliably and as expected. The validity information stored in the database is likewise a piece of data, reflecting whether each abnormal sample is valid; its content may take the form of a database-specific language, plain text, and the like, which is not specifically limited herein.
4. If the container instance receives the invalid abnormal sample destruction instruction, it executes the process of destroying all invalid abnormal samples. During this process, an invalid abnormal sample destruction log is generated and stored in the database. The log includes the invalid abnormal sample destruction instruction, the time taken to execute the process, the execution state, the storage paths of the destroyed abnormal samples, and the like. The execution state is a piece of data reflecting the technical parameters of the destruction operation; by analyzing it, a technician can determine whether the destruction operation was executed reliably and as expected.
5. If the container instance receives the data set generation instruction, it executes the process of generating the abnormal sample data set. During this process, a data set generation log is generated, which includes the data set generation instruction, the time taken from receipt of the instruction to completion of execution, the execution state, the storage path of the abnormal sample data set, the structure information of the abnormal sample data set, and the like. The execution state is a piece of data reflecting the technical parameters of the searching, locating, copying, storing, packing, and destroying operations; by analyzing it, a technician can determine whether each step was executed reliably and as expected. The structure information of the abnormal sample data set stored in the database is a piece of data reflecting the hierarchical structure of its content; by analyzing it, a technician can reconstruct the logical structure of the abnormal sample data set.
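The logging pattern described in the steps above, in which each control instruction is recorded once while the execution state is recorded repeatedly, might be sketched as follows. The use of SQLite, the table name `generation_log`, and the payload fields are assumptions made for this illustration; the description does not name a database product or schema.

```python
import json
import sqlite3
import time

def open_log_db(path=":memory:"):
    """Open the (hypothetical) log database and create its single table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS generation_log ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " kind TEXT NOT NULL,"         # 'instruction' or 'execution_state'
        " recorded_at REAL NOT NULL,"  # time point of the record
        " payload TEXT NOT NULL)"      # JSON-encoded details
    )
    return conn

def record(conn, kind, payload):
    """Append one log record with the current time."""
    conn.execute(
        "INSERT INTO generation_log (kind, recorded_at, payload) VALUES (?, ?, ?)",
        (kind, time.time(), json.dumps(payload)),
    )
    conn.commit()

conn = open_log_db()

# The start and stop instructions are each saved exactly once.
record(conn, "instruction", {"name": "abnormal_sample_generation", "action": "start"})

# The execution state changes while the tool runs, so it is saved periodically.
for generated in (10, 25, 40):
    record(conn, "execution_state", {"generated_samples": generated})

record(conn, "instruction", {"name": "abnormal_sample_generation", "action": "stop"})

counts = dict(conn.execute("SELECT kind, COUNT(*) FROM generation_log GROUP BY kind"))
```

Storing the details as JSON keeps one table usable for all five log types; a deployment could equally use one table per log type.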
It should be noted that, in this embodiment, a control instruction may exist as the payload of a network data packet, as data in a pipe used for inter-process communication, and the like; it may be transmitted over a network, through a named pipe, and the like; correspondingly, its sender may be a network transmission device, computer software, and the like; and the condition for sending it may be preset manually, controlled by computer software, and the like. None of these is specifically limited.
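As an illustration of instruction-driven execution, the following Python sketch decodes a control instruction from a byte payload (as it might arrive in a network packet or a pipe) and dispatches it to a handler. The JSON encoding, the instruction names, and the handler bodies are hypothetical; the embodiment leaves the form of the instruction open.

```python
import json

# Registry mapping instruction names to handler functions.
HANDLERS = {}

def handles(name):
    """Decorator that registers a handler for one control instruction."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register

@handles("construct_container_instance")
def construct_instance(params):
    return "constructing instance from template " + params["template"]

@handles("generate_abnormal_samples")
def generate_samples(params):
    return "running fuzzy test tool"

@handles("verify_abnormal_samples")
def verify_samples(params):
    return "verifying samples against repaired copy"

@handles("destroy_invalid_samples")
def destroy_samples(params):
    return "destroying invalid samples"

@handles("generate_dataset")
def generate_dataset(params):
    return "collecting valid samples into data set"

def dispatch(raw: bytes) -> str:
    """Decode one control instruction and run the matching handler."""
    message = json.loads(raw.decode("utf-8"))
    handler = HANDLERS.get(message["instruction"])
    if handler is None:
        raise ValueError("unknown control instruction: " + message["instruction"])
    return handler(message.get("params", {}))

result = dispatch(b'{"instruction": "construct_container_instance", '
                  b'"params": {"template": "demo"}}')
```

Because the handler only sees decoded parameters, the same dispatcher works whether the bytes came from a socket, a named pipe, or a file.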
In summary, driven by a series of control instructions, the method can complete the generation of the abnormal sample data set flexibly, in order, automatically, and efficiently, and finally obtain the abnormal sample data set. The destruction operation driven by the corresponding control instruction prevents excessive invalid abnormal samples and temporary files from occupying storage, thereby avoiding waste of computer resources. In addition, by generating the abnormal sample data set generation log, the method records the execution of each step in detail, so that problems arising during execution can be located through the log and their causes found by analyzing it.
The automatic generation system of the abnormal sample data set provided by the embodiment of the invention is introduced below; the system described below and the method described above may be referred to in correspondence with each other.
Referring to Fig. 2, which is a schematic structural diagram of an automatic generation system of an abnormal sample data set for vulnerability classification evaluation provided in an embodiment of the present invention, the system specifically includes:
a sample generation unit 11 for generating an abnormal sample for the target software by the fuzzy test tool software;
The repairing unit 12 is configured to obtain a software copy corresponding to the target software, and repair a target vulnerability on the software copy by using a target vulnerability patch, so as to generate repaired software;
a verification unit 13 for inputting the abnormal sample into the repaired software; if the target bug of the repaired software is not triggered, judging that the abnormal sample is a valid abnormal sample;
the data set generating unit 14 is configured to generate an abnormal sample data set according to the valid abnormal sample, the identification information of the target vulnerability patch, and the identification information of the target vulnerability.
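The cooperation of the verification unit and the data set generating unit can be sketched in Python as below. The function `triggers_on_patched` stands in for running the repaired software on a sample and observing whether the vulnerability is triggered; it, the `AbnormalSampleDataset` class, and the placeholder identifiers are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

@dataclass
class AbnormalSampleDataset:
    """Valid samples together with the identifiers the data set is built from."""
    vulnerability_id: str
    patch_id: str
    valid_samples: List[bytes] = field(default_factory=list)

def build_dataset(
    samples: Iterable[bytes],
    triggers_on_patched: Callable[[bytes], bool],
    vulnerability_id: str,
    patch_id: str,
) -> Tuple[AbnormalSampleDataset, List[bytes]]:
    """Split samples into valid and invalid ones and assemble the data set.

    A sample is valid when the repaired software does NOT trigger the
    target vulnerability on it, which is the criterion stated above.
    """
    dataset = AbnormalSampleDataset(vulnerability_id, patch_id)
    invalid = []
    for sample in samples:
        if triggers_on_patched(sample):
            invalid.append(sample)
        else:
            dataset.valid_samples.append(sample)
    return dataset, invalid

# Stand-in for the repaired software: it still fails on inputs starting with 0xFF.
def still_triggers(sample: bytes) -> bool:
    return sample.startswith(b"\xff")

dataset, invalid = build_dataset(
    [b"AAAA", b"\xffCCC", b"BBBB"],
    still_triggers,
    vulnerability_id="VULN-EXAMPLE",  # placeholder identifier
    patch_id="PATCH-EXAMPLE",         # placeholder identifier
)
```

In a real deployment the stand-in would presumably launch the repaired copy as a separate process and inspect how it terminates; that mechanism is outside what the description specifies.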
Fig. 3 is another schematic diagram of the automatic generation system of an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention. As shown in Fig. 3, the sample generation unit 11, the repair unit 12, the verification unit 13, and the data set generation unit 14 in this embodiment are disposed in a container instance 10.
Further, the system also comprises:
The monitoring and control service module 20 is configured to, when the control instruction generation condition is currently satisfied, automatically generate a container instance construction instruction and send it to the container management service module 30, or trigger the container management service module 30 to generate at least one of the following instructions: an abnormal sample generation instruction, an abnormal sample verification instruction, an invalid abnormal sample destruction instruction, and a data set generation instruction;
A container management service module 30 for constructing the container instance according to the container instance construction instruction; generating at least one of an abnormal sample generation instruction, an abnormal sample verification instruction, an invalid abnormal sample destruction instruction and a data set generation instruction, and controlling the container instance to execute corresponding operations.
Wherein the container instance 10 is further configured to generate an abnormal sample data set generation log, wherein the abnormal sample data set generation log comprises at least one of the following logs: a container instance construction log, an abnormal sample generation log, an abnormal sample verification log, an abnormal sample destruction log, and a data set generation log;
The monitoring and control service module 20 is also configured to: the abnormal sample data set generation log is sent to the database service module 40 to store the abnormal sample data set generation log to a database through the database service module 40.
As can be seen from the foregoing, the monitoring and control service module 20 obtains the data that the automatic generation method requires to be saved in the database, sends the data to the database service module 40, and controls the database service module 40 to save it. If a control instruction generation condition is satisfied, it controls the container management service module 30 to generate and send the related control instruction. The monitoring and control service module 20 may also control the container management service module 30 to parse and save container templates and to construct container instances from them, and it monitors the entire automatic generation system. The database service module 40 mainly receives the data sent by the monitoring and control service module 20 and stores it in the database. The container management service module 30 mainly generates and sends the control instructions. These instructions implement the specific steps of parsing and saving a container template, constructing a container instance from the container template, and executing the automatic generation method in the container instance: for example, copying target software or abnormal samples, destroying abnormal samples or other data, locating storage locations, saving, and packaging. They also drive operations that use content stored in the container instance (including, but not limited to, scripts, application software, runtime libraries, and system components), such as applying vulnerability patches, running the repaired copy of the target software, and running the fuzzy test tool software.
In this embodiment, the container management service module 30 is the module directly connected to the container template repository and to each container instance; all direct operations on container templates, the container template repository, and container instances are performed by the container management service module 30. It should be noted that, in addition to this core structure, a container technology is required to provide the underlying implementation of the container template, the container template repository, container template instantiation, and the container instance; that is, these are supported by the means and tools that container technology provides, such as container isolation, container images, container orchestration, the container runtime, file system mapping management, namespaces, and daemons. Fig. 4 shows the overall structure of the automatic generation system of an abnormal sample data set for vulnerability classification evaluation according to an embodiment of the present invention, in which the monitoring and control service S1, the database service S2, and the container management service S3 are the concrete forms that the monitoring and control service module 20, the database service module 40, and the container management service module 30 of Fig. 3 take in the computer operating system. The specific process implemented with container technology in this embodiment is described below with reference to Fig. 4.
1. This solution requires deployment in the computer (hereinafter referred to as "host") used by the user: the monitoring and control service S1, the database service S2, the container management service S3, and the container template repository H.
Specifically, after installing the monitoring and control service S1, the user needs to configure it so that it can: receive the instructions issued and the resources provided by the user, execute the corresponding operations, and return the related information; connect to the database service S2 and read and write the corresponding tables in the database that S2 manages; connect to the container management service S3 and, through the interface of S3, acquire information from it or control it to perform specific operations; and read and write resources in the host file system F0. After installing the database service S2, the user needs to initialize the database and create the tables that will be read and written during use. The container management service S3 can operate the file system mapping management FM provided by the container technology to map resources in the host file system F0 into the in-container file system FI and manage them uniformly, so that computer software running inside the container, when it accesses a resource in the in-container file system FI, in effect accesses the corresponding mapped resource in the host file system F0. The user also needs to build a container template repository H and connect it to the container management service S3.
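The effect of the file system mapping management FM, namely that a path inside the container resolves to a mapped resource on the host, can be illustrated with a small path-translation sketch in Python. The `FileSystemMapping` class and the example paths are hypothetical; real container technologies implement such mappings through their own mount mechanisms.

```python
from pathlib import PurePosixPath

class FileSystemMapping:
    """Translate paths between an in-container file system and the host."""

    def __init__(self, container_to_host: dict):
        # Each entry maps a container directory to the host directory behind it.
        self._pairs = [(PurePosixPath(c), PurePosixPath(h))
                       for c, h in container_to_host.items()]

    @staticmethod
    def _translate(path, pairs) -> str:
        path = PurePosixPath(path)
        # Prefer the most specific (longest) matching prefix.
        for src, dst in sorted(pairs, key=lambda p: len(p[0].parts), reverse=True):
            if path == src or src in path.parents:
                return str(dst / path.relative_to(src))
        raise KeyError(f"no mapping covers {path}")

    def to_host(self, container_path: str) -> str:
        return self._translate(container_path, self._pairs)

    def to_container(self, host_path: str) -> str:
        return self._translate(host_path, [(h, c) for c, h in self._pairs])

fm = FileSystemMapping({
    "/work/target": "/srv/fuzz/target",  # target software and patch files
    "/work/out": "/srv/fuzz/out",        # generated data sets
})
host_path = fm.to_host("/work/out/dataset/s1.bin")
container_path = fm.to_container("/srv/fuzz/target/app")
```

A lookup of this kind is also what would let a host-side service locate a finished data set on the host after it has been written inside the container.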
2. The solution also requires the user to load the container template through the container management service.
Specifically, the user issues a corresponding instruction to the monitoring and control service S1 while providing the container template file to the monitoring and control service S1, the monitoring and control service S1 forwards the file to the container management service S3, and controls the container management service S3 to parse the file and save it to the container template repository H. Further, from the information that the monitoring and control service S1 monitors and returns to the system, the user can learn detailed information about loading the container template.
3. The present solution also requires the creation of container instances using container templates.
Specifically, to establish a container instance, the user issues a corresponding instruction to the monitoring and control service S1. After receiving the instruction, S1 interacts with the container management service S3 and controls S3 to establish the container instance I1 from the established container template. In addition, while building the container instance, the container management service S3 also controls the container instance to obtain the fuzzy test tool software, the target software, and the vulnerability patch file from the in-container file system, and to create a copy of the target software. The process specifically comprises the following operations:
3.1. The container management service S3 reads the corresponding container template from the container template repository H and begins instantiating it to establish the container instance I1;
3.2. Using the file system mapping management function FM, the container management service S3 maps the corresponding resources in the host file system F0 into the in-container file system FI according to the information recorded in the container template;
3.3. The container management service S3 controls the container instance I1 to obtain the fuzzy test tool software T from the in-container file system FI and deploy it, according to the information recorded in the container template;
3.4. The container management service S3 controls the container instance I1 to obtain the target software from the in-container file system FI and copy it, yielding an original P0 and a copy P1;
3.5. The container management service S3 controls the container instance I1 to perform any other necessary operations, completing instantiation and establishing the container instance I1;
3.6. The monitoring and control service S1 collects the container instance construction log, interacts with the database service S2, and records the log in the corresponding table.
After the above operation is completed, the monitoring and control service S1 collates necessary information acquired by monitoring when interacting with the container management service S3, and returns the necessary information to the user, so that the user can learn detailed information about the established container instance.
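A rough Python sketch of steps 3.3, 3.4, and 3.6 follows: it locates the fuzzy test tool, copies the target software so that an original and a copy exist, and returns a record that could serve as the container instance construction log. The template keys (`name`, `fuzzer`, `target`), the `.copy` suffix, and the log fields are assumptions for illustration, and an ordinary directory stands in for the in-container file system.

```python
import shutil
import tempfile
import time
from pathlib import Path

def instantiate(template: dict, in_container_fs: Path) -> dict:
    """Prepare an instance from a template and return a construction log record."""
    started = time.monotonic()
    log = {"template": template["name"], "steps": []}

    # Step 3.3: locate the fuzzy test tool software T.
    fuzzer = in_container_fs / template["fuzzer"]
    if not fuzzer.exists():
        raise FileNotFoundError(fuzzer)
    log["steps"].append("fuzzer located")

    # Step 3.4: copy the target software, yielding original P0 and copy P1.
    original = in_container_fs / template["target"]
    copy = original.with_name(original.name + ".copy")
    shutil.copy2(original, copy)
    log["steps"].append("target copied")

    # Step 3.6: summarise what happened for the construction log.
    log["copy_path"] = str(copy)
    log["elapsed_s"] = time.monotonic() - started
    log["result"] = "success"
    return log

# Demonstration with stub files in a temporary directory.
root = Path(tempfile.mkdtemp())
(root / "tools").mkdir()
(root / "tools" / "fuzzer").write_text("stub")
(root / "target").mkdir()
(root / "target" / "app").write_bytes(b"stub-binary")

log = instantiate({"name": "demo", "fuzzer": "tools/fuzzer", "target": "target/app"}, root)
```

Keeping the original untouched and patching only the copy is what later allows the same sample to be run against both versions.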
4. The user issues a corresponding instruction to the monitoring and control service S1 to start producing an abnormal sample. Upon receiving this instruction, the monitoring and control service S1 interacts with the container management service S3, and the process specifically includes the following operations:
4.1. The container management service S3 sends an instruction to the container instance I1, so that I1 obtains all initial abnormal samples E0 from the in-container file system FI, sets E0 as the input of the fuzzy test tool software T, designates the original target software P0 as the target of T, and then runs T;
4.2. The container management service S3 controls the container instance I1 to continuously check the execution state of the current step, report it to the monitoring and control service S1 as monitoring information, and collect all output abnormal samples E1;
4.3. When the monitoring and control service S1 detects that the trigger condition set by the user is met, it controls the container management service S3 to send an instruction to the container instance I1 to execute the validity verification process G, namely: obtain the vulnerability patch file B from the in-container file system FI and repair the copy P1 of the target software with it; then, for each abnormal sample in E1, have the repaired copy P1 read the sample as input; if the target vulnerability of P1 is not triggered, move the sample from E1 to the valid abnormal sample space E2, otherwise move it to the invalid abnormal sample space E3;
4.4. When the monitoring and control service S1 detects that the trigger condition set by the user is met, the container management service S3 sends an instruction to the container instance I1, and the abnormal samples in E3 are handed to the garbage file processing process T, which destroys them;
4.5. When the monitoring and control service S1 detects that the trigger condition set by the user is met, the container management service S3 sends an instruction to the container instance I1; the valid abnormal samples in E2 are moved to the corresponding location in the in-container file system FI, the retained abnormal samples are collected and collated to obtain the abnormal sample data set, and the details of the operation are returned to the monitoring and control service S1. The monitoring and control service S1 collects the related information through monitoring, interacts with the database service S2, and records the information in the corresponding table.
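The movement of samples between the spaces E1, E2, and E3 in steps 4.3 to 4.5 can be sketched with ordinary directories in Python. The directory layout, the function names, and the stand-in check passed as `triggers_on_patched` are hypothetical; in the embodiment the check is performed by running the repaired copy P1 on each sample.

```python
import shutil
import tempfile
from pathlib import Path

def sort_samples(e1: Path, e2: Path, e3: Path, triggers_on_patched) -> None:
    """Step 4.3: move each sample from E1 into E2 (valid) or E3 (invalid)."""
    e2.mkdir(parents=True, exist_ok=True)
    e3.mkdir(parents=True, exist_ok=True)
    for sample in sorted(e1.iterdir()):
        dest = e3 if triggers_on_patched(sample.read_bytes()) else e2
        shutil.move(str(sample), str(dest / sample.name))

def destroy_invalid(e3: Path) -> int:
    """Step 4.4: delete every sample in E3 and report how many were removed."""
    removed = 0
    for sample in list(e3.iterdir()):
        sample.unlink()
        removed += 1
    return removed

def collect_dataset(e2: Path, dataset_dir: Path) -> list:
    """Step 4.5: move the retained samples into the data set directory."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for sample in sorted(e2.iterdir()):
        shutil.move(str(sample), str(dataset_dir / sample.name))
        names.append(sample.name)
    return names

# Demonstration with two stub samples in a temporary directory.
root = Path(tempfile.mkdtemp())
e1, e2, e3 = root / "E1", root / "E2", root / "E3"
e1.mkdir()
(e1 / "a.bin").write_bytes(b"AAAA")
(e1 / "b.bin").write_bytes(b"\xffBBB")

sort_samples(e1, e2, e3, lambda data: data.startswith(b"\xff"))
removed = destroy_invalid(e3)
dataset = collect_dataset(e2, root / "dataset")
```

Separating sorting, destruction, and collection into three functions mirrors the fact that each is driven by its own control instruction and may run at a different time.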
5. The user may also send other instructions to the monitoring and control service S1, such as:
The user issues a corresponding instruction to the monitoring and control service S1 to learn the execution state of the current system. After receiving the instruction, the monitoring and control service S1 first interacts with the database service S2, extracts relevant information from the corresponding table, and then combines the information obtained by current monitoring and returns the relevant information to the user.
The user issues a corresponding instruction to the monitoring and control service S1 to obtain an abnormal sample data set. After receiving this instruction, the monitoring and control service S1 retrieves the location of the corresponding resource in the host file system F0 by using the function FM of the container management service S3, and returns the abnormal sample data set to the user.
The user issues a corresponding instruction to the monitoring and control service S1 to stop running the container instance I1. After receiving this instruction, the monitoring and control service S1 controls the container management service S3 to end the operation of the container instance I1.
As described above, the system can make full use of computer resources and automatically generate high-quality abnormal sample data sets on a large scale. This efficient, automated mode of generation reduces manual participation and labor cost, and the standardized operation of the program lowers the probability of error and helps ensure the quality of the abnormal sample data set.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for parts that are identical or similar between embodiments, reference may be made from one to another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An automatic generation method of an abnormal sample data set for vulnerability classification evaluation, which is characterized by comprising the following steps:
Generating an abnormal sample for the target software through the fuzzy test tool software;
acquiring a software copy corresponding to the target software, repairing a target vulnerability on the software copy by using a target vulnerability patch, and generating repaired software;
inputting the abnormal sample into the repaired software;
If the target bug of the repaired software is not triggered, judging that the abnormal sample is a valid abnormal sample;
And generating an abnormal sample data set according to the effective abnormal sample, the identification information of the target vulnerability patch and the identification information of the target vulnerability.
2. The method of automatic generation of an abnormal sample data set according to claim 1, wherein the method of automatic generation of an abnormal sample data set is performed by a container instance.
3. The method for automatically generating an abnormal sample data set according to claim 2, further comprising, before said executing the method for automatically generating an abnormal sample data set by a container instance:
obtaining a container template; wherein, the container template describes operating system configuration information, software configuration information and sample configuration information;
the container instance is constructed from the container template.
4. The method for automatically generating an abnormal sample data set according to claim 1, further comprising, after inputting the abnormal sample into the repaired software:
If the target bug of the repaired software is triggered, judging the abnormal sample as an invalid abnormal sample, and destroying all the invalid abnormal samples when a destroying condition is met.
5. The abnormal sample data set automatic generation method according to any one of claims 1 to 4, characterized in that the abnormal sample data set automatic generation method further comprises:
If the control instruction generation condition is currently satisfied, at least one of the following instructions is automatically generated and the corresponding operation is performed: container instance construction instructions, abnormal sample generation instructions, abnormal sample verification instructions, invalid abnormal sample destruction instructions, and data set generation instructions.
6. The abnormal sample data set automatic generation method according to any one of claims 1 to 4, characterized in that the abnormal sample data set automatic generation method further comprises:
Generating an abnormal sample data set generation log and storing the abnormal sample data set generation log into a database; wherein the abnormal sample data set generation log comprises at least one of the following logs: a container instance construction log, an abnormal sample generation log, an abnormal sample verification log, an invalid abnormal sample destruction log and a data set generation log.
7. An automatic generation system for an abnormal sample data set for vulnerability classification evaluation, the automatic generation system comprising:
The sample generation unit is used for generating an abnormal sample aiming at the target software through the fuzzy test tool software;
The repairing unit is used for acquiring a software copy corresponding to the target software, repairing the target vulnerability on the software copy by using the target vulnerability patch and generating repaired software;
A verification unit for inputting the abnormal sample into the repaired software; if the target bug of the repaired software is not triggered, judging that the abnormal sample is a valid abnormal sample;
And the data set generating unit is used for generating an abnormal sample data set according to the effective abnormal sample, the identification information of the target vulnerability patch and the identification information of the target vulnerability.
8. The automated abnormal sample dataset generation system of claim 7, wherein,
The sample generation unit, the repair unit, the verification unit, and the dataset generation unit are disposed within a container instance.
9. The automatic generation system of abnormal sample data set according to claim 8, wherein the automatic generation system of abnormal sample data set further comprises:
The monitoring and control service module is used for automatically generating a container instance construction instruction and sending the container instance construction instruction to the container management service module or triggering the container management service module to generate at least one of the following instructions when the control instruction generation condition is currently met: an abnormal sample generation instruction, an abnormal sample verification instruction, an invalid abnormal sample destruction instruction, and a data set generation instruction;
The container management service module is used for constructing the container instance according to the container instance construction instruction; generating at least one of an abnormal sample generation instruction, an abnormal sample verification instruction, an invalid abnormal sample destruction instruction and a data set generation instruction, and controlling the container instance to execute corresponding operations.
10. The automated abnormal sample dataset generation system of claim 8, wherein,
The container instance is further configured to: generate an abnormal sample data set generation log; wherein the abnormal sample data set generation log comprises at least one of the following logs: a container instance construction log, an abnormal sample generation log, an abnormal sample verification log, an abnormal sample destruction log and a data set generation log;
The monitoring and control service module is further configured to: and sending the abnormal sample data set generation log to a database service module so as to store the abnormal sample data set generation log to a database through the database service module.
Application number: CN202410115040.4A
Publication number: CN117951713A (en)
Title: Automatic generation method and system for abnormal sample data set for vulnerability classification evaluation
Priority date: 2024-01-26
Filing date: 2024-01-26
Publication date: 2024-04-30
Legal status: Pending
Family ID: 90794034
Country: CN
Similar Documents

Publication Publication Date Title
US9632916B2 (en) Method and apparatus to semantically connect independent build and test processes
US6986125B2 (en) Method and apparatus for testing and evaluating a software component using an abstraction matrix
US8881131B2 (en) Method and apparatus for populating a software catalogue with software knowledge gathering
JP4524113B2 (en) Software distribution method and system
CN1908895B (en) System and method for application program globalization problem verification
CN113051155A (en) Control system and control method of automatic test platform
CN106227654A (en) A kind of test platform
CN113220588A (en) Automatic testing method, device and equipment for data processing and storage medium
CN109992476A (en) A kind of analysis method of log, server and storage medium
Moin et al. Bug localization using revision log analysis and open bug repository text categorization
CN112069144A (en) Method and device for collecting system logs by multi-control cluster
CN117951713A (en) Automatic generation method and system for abnormal sample data set for vulnerability classification evaluation
CN116400950A (en) DevOps element pipeline system based on version control
CN110532015B (en) On-orbit upgrading system for aerospace software
CN110321130B (en) Non-repeatable compiling and positioning method based on system call log
CN113641573A (en) Revision log-based automatic testing method and system for program analysis software
Paradkar SALT-an integrated environment to automate generation of function tests for APIs
CN114064387A (en) Log monitoring method, system, device and computer readable storage medium
CN113238956A (en) Fault analysis method, device and equipment for abnormal application and storage medium
CN112035308A (en) Method and device for generating system interface test table
CN109992475A (en) A kind of processing method of log, server and storage medium
WO2020194000A1 (en) Method of detecting and removing defects
EP1746501A1 (en) Method and apparatus for populating a software catalogue with software knowledge gathering
CN117873856A (en) Software testing method, storage medium and computer equipment
CN114817061A (en) Dependency error detection method for virtual construction script

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination